D1: Improvements to performance and scalability (cloudflare.com)
147 points by eallam on May 19, 2023 | 79 comments



Kenton Varda (tech lead) has some more notes in this twitter thread: https://twitter.com/KentonVarda/status/1659551757796515846


> Our new engine is based on intercepting SQLite's disk writes and doing clever stuff with them. It was so easy because the file format is not just well designed but amazingly well-documented.

Quite the hobby project for a lot of people too! Other folks doing this:

Rqlite https://hn.algolia.com/?query=Rqlite&sort=byDate , Dqlite https://hn.algolia.com/?query=dqlite&sort=byDate , Litestream https://hn.algolia.com/?query=Litestream&sort=byDate / LiteFS https://hn.algolia.com/?query=LiteFS&sort=byDate, marmot, mvsqlite https://hn.algolia.com/?query=mvsqlite&sort=byDate


I've been doing it for almost 10 years. :-)

https://www.philipotoole.com/9-years-of-open-source-database...


Curious if you prototyped DuckDB before deciding to invest further into SQLite.

DuckDB works great as an in-memory database (it's also the default mode).


We didn't, no.

I'm sure there are a lot of really cool local-first databases out there, but SQLite has the benefit of being incredibly widely battle-tested, with literally billions of installations worldwide. It has received thorough security research and fuzzing (it's part of Chrome's attack surface, after all). And there are tons of resources online to help people understand how to use it. Although I'm sure there are alternatives that serve certain use cases better, it's hard to imagine anything coming close for ours.

That said, the storage engine we've built is not that heavily dependent on SQLite specifically. Any database that uses a write-ahead log like SQLite does should be possible to adapt to it in the future. So maybe we'll eventually open it up to a variety of choices, or even let you bring your own as a Wasm module.


Oh, I've been informed that DuckDB uses SQLite under the hood, so maybe compiling DuckDB to Wasm and running it on top of this will be possible, we'll see.


From https://news.ycombinator.com/item?id=23290512:

> DuckDB is indeed a free columnar database system, but it is not entirely built on top of SQLite. It exposes the same front-end and uses components of SQLite (the shell and testing infrastructure), but the execution engine/storage code is new.


I expect most workloads are more OLTP/transactional than OLAP/analytical.


Correct. DuckDB is really interesting technology, but it's not a direct successor to SQLite for transactional workloads. It's also very new: there's a LOT of new code in DuckDB on top of the (heavily fuzzed) SQLite parts.

(I use it personally, but it's not the same thing as what we're building with D1)


Also, for OLAP workloads there's Workers Analytics Engine. Analytics Engine is arbitrarily scalable, as opposed to a D1-like solution. I can almost guarantee I've inserted more data into its internal variant (which is built on extremely similar underlying infrastructure) than just about any customer would think of doing, and it handles it like a champ.

Basically, we already have a product for doing DuckDB-like things, and the D1 architecture isn't great for high volume OLAP workloads.

Disclaimer: I'm an engineer at Cloudflare, but not on a developer platform team. I'm not speaking for our developer platform strategy or anything, I'm just commenting on how it looks from where I sit.


Great link & insights from Kenton, which show once again that CF is on the edge (what a pun, haha).


"up to 37x faster" is NOT before: 37.81ms, now: 1.82ms

You can't just round down 1.82ms to 1ms.


Fixing this. We ran a few benchmarks (and some were much faster than 37x), but this was a more typical case. Not our goal to inflate numbers.


I enjoy Cloudflare and have been using Workers + D1 for a few months, but I'll warn that Cloudflare's definition of beta is far more beta than what other companies mean. The API surface may change repeatedly without corresponding documentation, and their Discord can be a bit sparse on help.


> ...warn that Cloudflare's definition of beta is far more beta than what other companies mean

D1 was in open alpha prior to today, iirc.


Eek, $0.75 per GB of storage! AWS is $0.115. One of our DBs has 2.3TB in it.


Yeah, I'd imagine this will be cheaper than AWS RDS for low-storage use cases (lack of fixed monthly compute costs wins out), but more expensive for high storage. Like quite cheap for a 10-100 GB DB, quite expensive for a 1-10 TB DB.

Though they do say:

> when we enable global read replication, you won’t have to pay extra for it, nor will replication multiply your storage consumption

With AWS you'll have at least one read replica for failover, so $0.23/GB. And if you really want global read replicas, with AWS you might end up with something like a primary in North America and read replicas in South America, Europe, and Asia. That would work out to $0.46/GB, which closes the gap a bit.
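Rough arithmetic behind those per-GB numbers, as a minimal sketch (the $0.115/GB AWS figure is from the comment above; D1's $0.75/GB includes replication per Cloudflare's announcement):

  // Monthly storage cost per GB under the scenarios discussed above.
  const awsPerGb = 0.115;           // single RDS instance
  const awsFailover = awsPerGb * 2; // primary + 1 failover replica = $0.23/GB
  const awsGlobal = awsPerGb * 4;   // primary + 3 regional read replicas = $0.46/GB
  const d1PerGb = 0.75;             // replicas included, no storage multiplier
  console.log({ awsFailover, awsGlobal, d1PerGb });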


Depends on whether you do more compute than data storage. I've worked on a lot of analytical workloads running different types of reports beyond your normal CRUD.


At least at the moment, D1 is for a completely different use case: D1 currently has a limit of 100MB, which is now being increased to 1GB.


The costs as a whole have me worried. I'm not sure it'll be better than just using https://turso.tech/pricing, which already has a free tier of 8GB, and the paid tier might cost less overall.


It's 8GB in total, including replicas. One database with 2 replicas (up to 3 are free) works out to under 3GB of unique data in reality (8GB / 3 copies ≈ 2.7GB). It'll most likely cost more overall.


Better than PlanetScale at $2.50 per GB, where you also have to pay for replicas.


So this looks like basically distributed SQLite with read replication for free across Cloudflare's edge. Is that right?


So serverless meaning you don't manage the DB servers, they may or may not put your stuff on the edge, and Cloudflare maybe takes care of load balancing too?


Yes, Cloudflare handles load balancing and neat tricks like measuring the latency between your users, your code and databases etc. and moving your code around the network to make it run fast: https://blog.cloudflare.com/announcing-workers-smart-placeme...

The vision of the Supercloud is that you give us your code and we'll figure out where and how to execute it: https://blog.cloudflare.com/welcome-to-the-supercloud-and-de...


IMO serverless is more about the pricing model than whether a service is managed. Both Amazon RDS and Aurora are managed services, but Aurora is the one with the serverless pricing model.


So when would someone use something like this? I learn better by example if anyone has any.

And for reference, here’s the original D1 announcement with some additional info https://blog.cloudflare.com/introducing-d1/


You'd use it whenever you're building an application on Cloudflare Workers and you need to store data. D1 provides you with a SQL database. It's based on SQLite, so it's designed for relatively small datasets but can serve them very quickly from the edge.
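For a flavor of the API, here's a minimal sketch of a Worker reading from D1, assuming a D1 binding named DB configured for the Worker and a hypothetical users table (the D1Database type comes from @cloudflare/workers-types):

  export default {
    async fetch(request: Request, env: { DB: D1Database }): Promise<Response> {
      // Parameterized query against the D1 binding; "users" is a
      // hypothetical table used only for illustration.
      const { results } = await env.DB
        .prepare("SELECT id, name FROM users WHERE id = ?")
        .bind(1)
        .all();
      return Response.json(results);
    },
  };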

Note there are several alternatives here, too. Workers Durable Objects[0] provide a lower-level primitive for building advanced distributed systems. But D1 is easier to use for typical use cases. For blob storage you might use R2[1]. And for large databases Workers can easily integrate with several serverless database providers.[2]

[0] https://blog.cloudflare.com/introducing-workers-durable-obje... [1] https://www.cloudflare.com/products/r2/ [2] https://blog.cloudflare.com/announcing-database-integrations...


Is it a non-goal to be usable long-term for larger databases? That would force the use of something like Turso (your closest direct comparison) as a possible migration strategy, or reliance on "Smart Placement" (which from my point of view reduces the benefit of the global edge) for other serverless, non-global DBs.


Personally, I'm a firm believer that most "web app" use cases are better served by many small databases (e.g. per-user or per-document) rather than a single monolithic database. This is especially true when serving users all around the world -- per-user databases can be located near each user (both for speed and to comply with data locality laws).

What I'd like to enable here is a progression where you start out prototyping your app with a single D1 database, which is easy to use and reasonably fast. Then as you grow we provide tools to let you transition to many D1 databases sharded in a way that makes sense (e.g. per-user). Apps that want even more control can move to using full-on Durable Objects (which will soon support a SQLite database per-object).
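A rough sketch of what per-user routing can look like with today's primitives, assuming a Durable Object namespace binding named USER_SHARD (the binding name and the userId extraction are hypothetical):

  export default {
    async fetch(request: Request, env: { USER_SHARD: DurableObjectNamespace }): Promise<Response> {
      // Hypothetical: derive the shard key from the request.
      const userId = new URL(request.url).searchParams.get("user") ?? "anonymous";
      // idFromName() maps each user deterministically to one Durable Object,
      // which holds that user's private storage.
      const id = env.USER_SHARD.idFromName(userId);
      // Forward the request to that user's object.
      return env.USER_SHARD.get(id).fetch(request);
    },
  };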

That said, there are certainly many use cases out there where simple monoliths make sense, especially non-interactive data crunching. I'm not sure yet if D1 will ever be the right choice for those, but the Workers platform aims to provide many options.


Thanks for the insight, I greatly appreciate it! This definitely is a reasonable idea for many things and I'm looking forward to seeing something similar to the sharding mechanism in the future.

I've only started to think about this, and I'm thinking the hardest part will be dealing with cross-cutting concerns (in a non-auto-sharded world, manually creating multiple databases) and trying to find a way to keep each database isolated without extra burden compared to using a hosted Postgres.

As an aside, that LAN-optimized house was a gaming dream. Hope your new house is as awesome.


I've been working on a little project that uses D1 and have been hoping that this is how D1 will evolve. A DB per customer, in my case. Is the colocation in a Durable Object, so 'business logic' and SQLite can live in the same isolate for performance and security? Would that be a post-'out of beta' feature? I've been building as a monolith, thinking that the number of databases allowed in the alpha was indicative of the future.


" Apps that want even more control can move to using full-on Durable Objects (which will soon support a SQLite database per-object)."

Can you elaborate this little bit more? Im using DO today and i have a bad time sharding my data (works, but i hate it);

So i will have the option to use the standard store or/and SQLite?

If so, i dont can keep with my DO (because i have control of everything) and use SQLite for things that is bigger than what the value store supports.


Sorry, I don't quite understand what you're asking.

In the future each DO will have a private SQLite database. The key/value store will actually be redirected to store into a special table in this database, but probably new apps will just use the database and not the KV store.

Separately from that, I would like to develop tools that make sharding Durable Objects (and D1 databases) easier. Today it's a pain to do manually. This is independent from the underlying storage model, though.


I mean, our object storage is called R2, our first database offering is called D1, and if we were to offer a fully Distributed Database then D2 seems like a good name.


And you can follow up D2 with D2: Lord of Distributed.


Cloudflare has 285 PoPs. Surely you don't need your DB replicated to all of them. A few locations per continent should suffice.

For comparison, fly.io, turso's provider, has 34 locations and well-documented reliability issues.


That is a fair point. A few centralized locations will likely be more than sufficient for most use cases.


I use D1 through microfeed https://www.microfeed.org/


So I hate Cloudflare with a burning passion. At some point Cloudflare decided my home IP was bad and started flagging it all over the place, which leads to vast swathes of the internet now being inaccessible to me.

There is danger in centralized systems.


Have you tried using Privacy Pass? It is a browser extension which helps prove that your traffic is legitimate which should reduce the incidence of challenges from Cloudflare, even if your IP has bad rep. It uses advanced cryptography to do this without creating any new tracking vectors.

(Disclosure/disclaimer: I work for Cloudflare but in a different department; I'm not an expert on Privacy Pass.)


> It uses advanced cryptography to do this without creating any new tracking vectors.

Sounds like something which is designed to create hard-to-detect tracking vectors.


Note that Privacy Pass is not a Cloudflare product, it's an open protocol which Cloudflare supports.

https://datatracker.ietf.org/doc/charter-ietf-privacypass/



So this is even more insidious, and an interesting way for Apple, Google, and MS to pay Cloudflare to use its dominant market position to help safeguard them from other incumbents: you either work to get your devices approved in this scheme, or your users get more Cloudflare CAPTCHAs preventing them from having a nice web experience.

All in the name of "privacy" and "security". Though of course, the only security being served here is the ad networks' revenue stream; they are the only ones who really care whether a real device or a bot is accessing their services. For most other cases, a DDoS is a DDoS regardless of whether it was initiated by actual users (e.g. an HN hug of death) or by a bot network.


When I started using Firefox with the Temporary Containers extension, I would get absolutely bombarded by Cloudflare's captchas, to the point where if I clicked on a search result that sent me to a captcha, I would just close the tab and click the next result (which often resulted in another captcha; rinse, repeat). That's when I realized just how big Cloudflare had gotten.

I still use Temporary Containers, and lately I've noticed a sharp decline in these Cloudflare captchas. I don't know if it's because people are moving away from them, or Cloudflare just found a better way to fingerprint me.


The issue I have here is: why does Cloudflare take the blame? There are 2 parties (well, could be more) - your ISP and Cloudflare. Who says it's not your ISP that is the problematic one?

People like to blame the easiest target.


You may be able to force your ISP to give you a new IP by resetting your modem or leaving it off long enough; that may help.

But once Cloudflare hates you, you have to bow to them.

Try browsing the normal web over Tor sometime and see how bad it can get.


Change your IP? Most ISPs change it on every modem reboot. With others, you can call and request a change.


Hmm, is this roughly equivalent to Neon[1], but SQLite-based rather than Postgres?

1: https://neon.tech


(Neon CEO)

This is true. Neon, however, offers bottomless storage, and D1 is 100MB currently, going to 1GB.


Nah, D1 is in a different league.[1] One key feature of D1 is having free read replicas around the globe. That's CF's DNA, and I expect this to happen this year or Q1/24 at the latest, and again, for free (so, without any surcharge). Re storage: you of course get more than 1GB on the paid plan.

Neon doesn't have read replicas, and even if they're already working on it, I wouldn't expect it before 2025, if at all, and never at CF's pricing (Neon still charges for egress).

[1] I would rather compare D1 to Turso, or LiteFS from Fly, or PlanetScale with many read replicas


Neon also charges for compute.


This. If we wanted opaque pricing with egress, compute, and end-of-month surprises, I'd go with AWS or Google Cloud.


Interesting, Time Travel works the same as Oracle Flashback. Hope there aren't any patents to trip over.


Why and when should I use a serverless PostgreSQL instead of a PostgreSQL hosted on a server?


This isn't Postgres.


Can I connect directly to a D1 database with a SQLite URL? That would be awesome.


I am pretty excited about D1 and in general a CF fanboy, so congrats on this huge milestone. But we need more; the following is just a wish list:

- Development since the start has taken time (with good reason); I hope we soon see read replication and a higher pace

- So, some roadmap _with_ ETAs would be great; e.g., will replication come this year or next?

- I don't care about read performance, because I know that SQLite paired with CF's infrastructure will perform, but I would love to have more info on write performance; this is really SQLite's weakest point, and I haven't seen any implementation that could compete with other DBs; SQLite slows down very quickly when doing a couple of small writes/sec

Then, I'd like to see the close competition here on HN, i.e. DB providers with many read replicas (CEOs and/or devs from Turso and Fly/LiteFS), commenting on D1, how they compare against it, and what they plan in order to compete. This is too exciting a space and time to be laissez-faire.


LiteFS author here. I'd love to compare and contrast D1 with how LiteFS works, but there aren't enough technical details at the moment. I'm looking forward to future technical blog posts from Cloudflare, or perhaps they could open source their implementation.

There are a lot of ways to approach SQLite replication and one isn’t better than another necessarily. They simply have different trade-offs.


Has the DX of Workers improved?

I think Deno is lightyears ahead.


Hard to say without knowing what specific problem you've had, but we are improving all the time. Several improvements were announced just this week; take a look at the blog.

Among other things, we made Wrangler (Workers CLI tool) use the open source workerd by default for local development, so local dev should produce a much more precise simulation now (since it's literally running the same code).


And what about the DX of using Workers with Pages?

I tried to use that recently and it was a disaster. I wrote about my experience here:

https://twitter.com/pierbover/status/1641474067013271552

I then opened these two issues:

https://github.com/cloudflare/workers-sdk/issues/2962

https://github.com/cloudflare/workers-sdk/issues/2964

I ended up moving the project over to Netlify + Edge Functions. I had it all working in like 5-10 minutes, as it should be. It took me two hours to figure out why Workers weren't working in my Pages project, and I could never get Workers working properly with my Astro project.

I think you're working exclusively on the engine of Workers, which is really top-notch, but Cloudflare really needs to improve the outer layer, which affects DX considerably.


Sorry you experienced that. FWIW this announcement from Wednesday should help address the problems you ran into:

https://blog.cloudflare.com/pages-and-workers-are-converging...


I've been using workers on & off since it launched. Just tried it again recently and the local dev experience with wrangler is excellent now.


Still no compatibility with standard ORMs like Prisma either.

And the team has been aware of the issue for years now

https://github.com/cloudflare/workers-sdk/issues/2701

https://news.ycombinator.com/item?id=31341513


Sorry that D1 has been slow out of the gate. Now that we've solved the basic technical issues, we can really focus on improving DX.


You can use D1 with Kysely (the type-safe SQL query builder) via the kysely-d1 third-party dialect.

https://github.com/kysely-org/kysely

https://github.com/aidenwallis/kysely-d1
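A minimal sketch of wiring them together in a Worker, assuming kysely-d1's D1Dialect, a D1 binding named DB, and a hypothetical users table:

  import { Kysely } from "kysely";
  import { D1Dialect } from "kysely-d1";

  // Hypothetical schema, for illustration only.
  interface Database {
    users: { id: number; name: string };
  }

  export default {
    async fetch(request: Request, env: { DB: D1Database }): Promise<Response> {
      const db = new Kysely<Database>({ dialect: new D1Dialect({ database: env.DB }) });
      // Type-safe query, compiled down to a D1 prepared statement.
      const users = await db.selectFrom("users").selectAll().execute();
      return Response.json(users);
    },
  };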


You can use drizzle-orm with Cloudflare and that is fully compatible.
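For example, a minimal sketch with drizzle-orm's D1 driver (the users table and the DB binding name are hypothetical):

  import { drizzle } from "drizzle-orm/d1";
  import { sqliteTable, integer, text } from "drizzle-orm/sqlite-core";

  // Hypothetical schema, for illustration only.
  const users = sqliteTable("users", {
    id: integer("id").primaryKey(),
    name: text("name"),
  });

  export default {
    async fetch(request: Request, env: { DB: D1Database }): Promise<Response> {
      const db = drizzle(env.DB);
      const rows = await db.select().from(users); // plain objects, no ORM classes
      return Response.json(rows);
    },
  };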


Drizzle is amazing.


This is my first time using it and I've been very pleased with it so far. It keeps things simple, has solid typing & schema building, and reminds me of LINQ. I'm also a thin-models kind of person, and the fact that this is just an object without needing to build ORM classes is even better.


> standard ORMs like Prisma

Prisma isn't standard, and it's too slow anyway.


They recently added a new JSON protocol that's faster.


Still way too slow (cold start, throughput, and general latency). I was referring to the latest version, which still has a convoluted, nonsensical stack. I'd stay away from Prisma; there's enough info online.


This is Cloudflare's flavor of Fly.io?


Fly doesn't really offer distributed data.

Edit:

It does!


Is this not the same as https://fly.io/docs/litefs/ ? They mention it as 'LiteFS - Distributed SQLite'.


Ah you're right!

I thought this was a wrapper for Litestream, but apparently it's a parallel project by the same author, whom Fly hired.

https://litestream.io/

https://github.com/benbjohnson/litestream/pull/411


Fly requires you to commit compute, and Fly's compute gets expensive real fast.



