D1: Improvements to performance and scalability (cloudflare.com)
147 points by eallam on May 19, 2023 | 79 comments



Kenton Varda (tech lead) has some more notes in this twitter thread: https://twitter.com/KentonVarda/status/1659551757796515846


> Our new engine is based on intercepting SQLite's disk writes and doing clever stuff with them. It was so easy because the file format is not just well designed but amazingly well-documented.

Quite the hobby project for a lot of people too! Other folks doing this:

Rqlite https://hn.algolia.com/?query=Rqlite&sort=byDate , Dqlite https://hn.algolia.com/?query=dqlite&sort=byDate , Litestream https://hn.algolia.com/?query=Litestream&sort=byDate / LiteFS https://hn.algolia.com/?query=LiteFS&sort=byDate, marmot, mvsqlite https://hn.algolia.com/?query=mvsqlite&sort=byDate


I've been doing it for almost 10 years. :-)

https://www.philipotoole.com/9-years-of-open-source-database...


Curious if you prototyped DuckDB before deciding to invest further into SQLite.

DuckDB works great as an in-memory database (it's also the default mode).


We didn't, no.

I'm sure there are a lot of really cool local-first databases out there, but SQLite has the benefit of being incredibly widely battle-tested, with literally billions of installations worldwide. It has received thorough security research and fuzzing (it's part of Chrome's attack surface, after all). And there are tons of resources online to help people understand how to use it. Although I'm sure there are alternatives that serve certain use cases better, it's hard to imagine anything coming close for ours.

That said, the storage engine we've built is not that heavily dependent on SQLite specifically. Any database that uses a write-ahead log like SQLite does should be possible to adapt to it in the future. So maybe we'll eventually open it up to a variety of choices, or even let you bring your own as a Wasm module.


Oh, I've been informed that DuckDB uses SQLite under the hood, so maybe compiling DuckDB to Wasm and running it on top of this will be possible, we'll see.


From https://news.ycombinator.com/item?id=23290512:

> DuckDB is indeed a free columnar database system, but it is not entirely built on top of SQLite. It exposes the same front-end and uses components of SQLite (the shell and testing infrastructure), but the execution engine/storage code is new.


I expect most workloads are more OLTP/transactional than OLAP/analytical.


Correct. DuckDB is really interesting technology, but it's not a direct successor to SQLite for transactional workloads. It's also very new: there's a LOT of new code in DuckDB on top of the (heavily fuzzed) SQLite parts.

(I use it personally, but it's not the same thing as what we're building with D1)


Also, for OLAP workloads there's Workers Analytics Engine. Analytics Engine is arbitrarily scalable, as opposed to a D1-like solution. I can almost guarantee I've inserted more data into its internal variant (which is built on extremely similar underlying infrastructure) than just about any customer would think of doing, and it handles it like a champ.

Basically, we already have a product for doing DuckDB-like things, and the D1 architecture isn't great for high volume OLAP workloads.

Disclaimer: I'm an engineer at Cloudflare, but not on a developer platform team. I'm not speaking for our developer platform strategy or anything, I'm just commenting on how it looks from where I sit.


Great link & insights from Kenton, which show once again that CF is on the edge (what a pun, haha).


"up to 37x faster" is NOT before: 37.81ms, now: 1.82ms

You can't just round down 1.82ms to 1ms.


Fixing this. We ran a few benchmarks (and some were much faster than 37x), but this was a more typical case. Not our goal to inflate numbers.


I enjoy Cloudflare and have been using Workers + D1 for a few months, but I'll warn that Cloudflare's definition of beta is far more beta than what other companies mean. The API surface may change repeatedly without corresponding documentation, and their Discord can be a bit sparse on help.


> ...warn that Cloudflare's definition of beta is far more beta than what other companies mean

D1 was in open alpha prior to today, iirc.


Eek, $0.75 per GB of storage! AWS is $0.115. One of our DBs has 2.3TB in it.


Yeah, I'd imagine this will be cheaper than AWS RDS for low-storage use cases (lack of fixed monthly compute costs wins out), but more expensive for high storage. Like quite cheap for a 10-100 GB DB, quite expensive for a 1-10 TB DB.

Though they do say:

> when we enable global read replication, you won’t have to pay extra for it, nor will replication multiply your storage consumption

With AWS you'll have at least one read replica for failover, so $0.23/GB. And if you really want global read replicas, with AWS you might end up with something like a primary in North America and read replicas in South America, Europe, and Asia. That would work out to $0.46/GB, which closes the gap a bit.
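Rough arithmetic behind those per-GB numbers, as a minimal sketch (the $0.115/GB AWS figure is from the comment above; D1's $0.75/GB includes replication per Cloudflare's announcement):

  // Monthly storage cost per GB under the scenarios discussed above.
  const awsPerGb = 0.115;           // single RDS instance
  const awsFailover = awsPerGb * 2; // primary + 1 failover replica = $0.23/GB
  const awsGlobal = awsPerGb * 4;   // primary + 3 regional read replicas = $0.46/GB
  const d1PerGb = 0.75;             // replicas included, no storage multiplier
  console.log({ awsFailover, awsGlobal, d1PerGb });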


Depends on whether you do more compute than data storage. I've worked on a lot of analytical workloads running different types of reports beyond your normal CRUD.


At least at the moment, D1 is for a completely different use case: D1 currently has a limit of 100MB, which is now being increased to 1GB.


The costs as a whole have me worried. I'm not sure it'll be better than just using https://turso.tech/pricing, which already has a free tier of 8GB, and the paid tier might cost less overall.


It's 8GB in total, including replicas. One database with 2 replicas (up to 3 are free) works out to under 3GB of unique data in reality (8GB / 3 copies ≈ 2.7GB). It'll most likely cost more overall.


Better than PlanetScale at $2.50 per GB, where you also have to pay for replicas.


So this looks like basically distributed SQLite with read replication for free across Cloudflare's edge. Is that right?


So serverless meaning you don't manage the DB servers, they may or may not put your stuff on the edge, and Cloudflare maybe takes care of load balancing too?


Yes, Cloudflare handles load balancing and neat tricks like measuring the latency between your users, your code and databases etc. and moving your code around the network to make it run fast: https://blog.cloudflare.com/announcing-workers-smart-placeme...

The vision of the Supercloud is that you give us your code and we'll figure out where and how to execute it: https://blog.cloudflare.com/welcome-to-the-supercloud-and-de...


IMO serverless is more about the pricing model than whether a service is managed. Both Amazon RDS and Aurora are managed services, but Aurora is the one with the serverless pricing model.


So when would someone use something like this? I learn better by example if anyone has any.

And for reference, here’s the original D1 announcement with some additional info https://blog.cloudflare.com/introducing-d1/


You'd use it whenever you're building an application on Cloudflare Workers and you need to store data. D1 provides you with a SQL database. It's based on SQLite, so it's designed for relatively small datasets but can serve them very quickly from the edge.
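For a flavor of the API, here's a minimal sketch of a Worker reading from D1, assuming a D1 binding named DB configured for the Worker and a hypothetical users table (the D1Database type comes from @cloudflare/workers-types):

  export default {
    async fetch(request: Request, env: { DB: D1Database }): Promise<Response> {
      // Parameterized query against the D1 binding; "users" is a
      // hypothetical table used only for illustration.
      const { results } = await env.DB
        .prepare("SELECT id, name FROM users WHERE id = ?")
        .bind(1)
        .all();
      return Response.json(results);
    },
  };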

Note there are several alternatives here, too. Workers Durable Objects[0] provide a lower-level primitive for building advanced distributed systems. But D1 is easier to use for typical use cases. For blob storage you might use R2[1]. And for large databases Workers can easily integrate with several serverless database providers.[2]

[0] https://blog.cloudflare.com/introducing-workers-durable-obje... [1] https://www.cloudflare.com/products/r2/ [2] https://blog.cloudflare.com/announcing-database-integrations...


Is it a non-goal to be usable long-term for larger databases? That would force the use of something like Turso (your closest direct comparison) as a possible migration strategy, or reliance on "Smart Placement" (which from my point of view reduces the benefit of the global edge) for other serverless, non-global DBs.


Personally, I'm a firm believer that most "web app" use cases are better served by many small databases (e.g. per-user or per-document) rather than a single monolithic database. This is especially true when serving users all around the world -- per-user databases can be located near each user (both for speed and to comply with data locality laws).

What I'd like to enable here is a progression where you start out prototyping your app with a single D1 database, which is easy to use and reasonably fast. Then as you grow we provide tools to let you transition to many D1 databases sharded in a way that makes sense (e.g. per-user). Apps that want even more control can move to using full-on Durable Objects (which will soon support a SQLite database per-object).
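A rough sketch of what per-user routing can look like with today's primitives, assuming a Durable Object namespace binding named USER_SHARD (the binding name and the userId extraction are hypothetical):

  export default {
    async fetch(request: Request, env: { USER_SHARD: DurableObjectNamespace }): Promise<Response> {
      // Hypothetical: derive the shard key from the request.
      const userId = new URL(request.url).searchParams.get("user") ?? "anonymous";
      // idFromName() maps each user deterministically to one Durable Object,
      // which holds that user's private storage.
      const id = env.USER_SHARD.idFromName(userId);
      // Forward the request to that user's object.
      return env.USER_SHARD.get(id).fetch(request);
    },
  };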

That said, there are certainly many use cases out there where simple monoliths make sense, especially non-interactive data crunching. I'm not sure yet if D1 will ever be the right choice for those, but the Workers platform aims to provide many options.


Thanks for the insight, I greatly appreciate it! This definitely is a reasonable idea for many things and I'm looking forward to seeing something similar to the sharding mechanism in the future.

I've only started to think about this, and I'm thinking the hardest part will be dealing with cross-cutting concerns (in a non-auto-sharded world, manually creating multiple databases) and trying to find a way to keep each database isolated without extra burden compared to using a hosted Postgres.

As an aside, that LAN-optimized house was a gaming dream. Hope your new house is as awesome.


I've been working on a little project that uses D1 and have been hoping that this is how D1 will evolve. A DB per customer, in my case. Is the colocation in a Durable Object, so 'business logic' and SQLite can live in the same isolate for performance and security? Would that be a post-'out of beta' feature? I've been building as a monolith, thinking that the number of databases allowed in the alpha was indicative of the future.


" Apps that want even more control can move to using full-on Durable Objects (which will soon support a SQLite database per-object)."

Can you elaborate this little bit more? Im using DO today and i have a bad time sharding my data (works, but i hate it);

So i will have the option to use the standard store or/and SQLite?

If so, i dont can keep with my DO (because i have control of everything) and use SQLite for things that is bigger than what the value store supports.


Sorry, I don't quite understand what you're asking.

In the future each DO will have a private SQLite database. The key/value store will actually be redirected to store into a special table in this database, but probably new apps will just use the database and not the KV store.

Separately from that, I would like to develop tools that make sharding Durable Objects (and D1 databases) easier. Today it's a pain to do manually. This is independent from the underlying storage model, though.


I mean, our object storage is called R2, our first database offering is called D1, and if we were to offer a fully Distributed Database then D2 seems like a good name.


And you can follow up D2 with D2: Lord of Distributed.


Cloudflare has 285 PoPs. Surely you don't need your DB replicated to all of them. A few locations per continent should suffice.

For comparison, fly.io, turso's provider, has 34 locations and well-documented reliability issues.


That is a fair point. A few centralized locations will likely be more than sufficient for most use cases.


I use D1 through microfeed https://www.microfeed.org/


So I hate Cloudflare with a burning passion. At some point Cloudflare decided my home IP was bad and started flagging it all over the place, which leads to vast swathes of the internet now being inaccessible to me.

There is danger in centralized systems.


Have you tried using Privacy Pass? It is a browser extension which helps prove that your traffic is legitimate which should reduce the incidence of challenges from Cloudflare, even if your IP has bad rep. It uses advanced cryptography to do this without creating any new tracking vectors.

(Disclosure/disclaimer: I work for Cloudflare but in a different department; I'm not an expert on Privacy Pass.)


> It uses advanced cryptography to do this without creating any new tracking vectors.

Sounds like something which is designed to create hard-to-detect tracking vectors.


Note that Privacy Pass is not a Cloudflare product, it's an open protocol which Cloudflare supports.

https://datatracker.ietf.org/doc/charter-ietf-privacypass/



So this is even more insidious, and an interesting way for Apple, Google, and MS to pay Cloudflare to use its dominant market position to help safeguard them from other incumbents: you either work to get your devices approved in this scheme, or your users get more Cloudflare CAPTCHAs preventing them from having a nice web experience.

All in the name of "privacy" and "security". Though of course, the only security being served here is the ad networks' revenue stream; they are the only ones who really care whether a real device or a bot is accessing their services. For most other cases, a DDoS is a DDoS regardless of whether it was initiated by actual users (e.g. an HN hug of death) or by a bot network.


When I started using Firefox with the Temporary Containers extension, I would get absolutely bombarded by Cloudflare's captchas, to the point where if I clicked on a search result that sent me to a captcha, I would just close the tab and click the next result (which often resulted in another captcha; rinse, repeat). That's when I realized just how big Cloudflare had gotten.

I still use Temporary Containers, and lately I've noticed a sharp decline in these Cloudflare captchas. I don't know if it's because people are moving away from them, or Cloudflare just found a better way to fingerprint me.


The issue I have here is: why does Cloudflare take the blame? There are 2 parties (well, could be more) - your ISP and Cloudflare. Who says it's not your ISP that is the problematic one?

People like to blame the easiest target.


You may be able to force your ISP to give you a new IP by resetting your modem or leaving it off long enough; that may help.

But once Cloudflare hates you, you have to bow to them.

Try browsing the normal web over Tor sometime and see how bad it can get.


Change your IP? Most ISPs change it on every modem reboot. With others, you can call and request a change.


Hmm, is this roughly equivalent to Neon[1], but SQLite-based rather than Postgres?

1: https://neon.tech


(Neon CEO)

This is true. Neon, however, offers bottomless storage, and D1 is 100MB currently, going to 1GB.


Nah, D1 is in a different league.[1] One key feature of D1 is having free read replicas around the globe. That's CF's DNA, and I expect this to happen this year or Q1/24 at the latest, and again, for free (so, without any surcharge). Re storage: you of course get more than 1GB on the paid plan.

Neon doesn't have read replicas, and even if they're already working on it, I wouldn't expect it before 2025, if at all, and never at CF's pricing (Neon still charges for egress).

[1] I would rather compare D1 to Turso, or LiteFS from Fly, or PlanetScale with many read replicas


Neon also charges for compute.


This. If we wanted opaque pricing with egress, compute, and end-of-month surprises, I'd go with AWS or Google Cloud.


Interesting, Time Travel works the same as Oracle Flashback. Hope there aren't any patents to trip over.


Why and when should I use a serverless PostgreSQL instead of a PostgreSQL hosted on a server?


This isn't Postgres.


Can I connect directly to a D1 database with a SQLite URL? That would be awesome.


I am pretty excited about D1 and in general a CF fanboy, so congrats on this huge milestone. But we need more; the following is just a wish list:

- Development since the start has taken time (with good reason); I hope we soon see read replication and a higher pace

- So, some roadmap _with_ ETAs would be great; e.g., will replication come this year or next?

- I don't care about read performance, because I know that SQLite paired with CF's infrastructure will perform, but I would love to have more info on write performance; this is really SQLite's weakest point, and I haven't seen any implementation that could compete with other DBs; SQLite slows down very quickly when doing a couple of small writes/sec

Then, I'd like to see the close competition here on HN, i.e. DB providers with many read replicas (CEOs and/or devs from Turso and Fly/LiteFS), commenting on D1, how they compare against it, and what they plan in order to compete. This is too exciting a space and time to be laissez-faire.


LiteFS author here. I'd love to compare and contrast D1 with how LiteFS works, but there aren't enough technical details at the moment. I'm looking forward to future technical blog posts from Cloudflare, or perhaps they could open source their implementation.

There are a lot of ways to approach SQLite replication and one isn’t better than another necessarily. They simply have different trade-offs.


Has the DX of Workers improved?

I think Deno is lightyears ahead.


Hard to say without knowing what specific problem you've had, but we are improving all the time. Several improvements were announced just this week; take a look at the blog.

Among other things, we made Wrangler (Workers CLI tool) use the open source workerd by default for local development, so local dev should produce a much more precise simulation now (since it's literally running the same code).


And what about the DX of using Workers with Pages?

I tried to use that recently and it was a disaster. I wrote about my experience here:

https://twitter.com/pierbover/status/1641474067013271552

I then opened these two issues:

https://github.com/cloudflare/workers-sdk/issues/2962

https://github.com/cloudflare/workers-sdk/issues/2964

I ended up moving the project over to Netlify + Edge Functions. I had it all working in like 5-10 minutes, as it should be. It took me two hours to figure out why Workers weren't working in my Pages project, and I could never get Workers working properly with my Astro project.

I think you're working exclusively on the engine of Workers, which is really top-notch, but Cloudflare really needs to improve the outer layer, which affects DX considerably.


Sorry you experienced that. FWIW this announcement from Wednesday should help address the problems you ran into:

https://blog.cloudflare.com/pages-and-workers-are-converging...


I've been using workers on & off since it launched. Just tried it again recently and the local dev experience with wrangler is excellent now.


Still no compatibility with standard ORMs like Prisma either.

And the team has been aware of the issue for years now

https://github.com/cloudflare/workers-sdk/issues/2701

https://news.ycombinator.com/item?id=31341513


Sorry that D1 has been slow out of the gate. Now that we've solved the basic technical issues, we can really focus on improving DX.


You can use D1 with Kysely (the type-safe SQL query builder) via the kysely-d1 third-party dialect.

https://github.com/kysely-org/kysely

https://github.com/aidenwallis/kysely-d1
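A minimal sketch of wiring them together in a Worker, assuming kysely-d1's D1Dialect, a D1 binding named DB, and a hypothetical users table:

  import { Kysely } from "kysely";
  import { D1Dialect } from "kysely-d1";

  // Hypothetical schema, for illustration only.
  interface Database {
    users: { id: number; name: string };
  }

  export default {
    async fetch(request: Request, env: { DB: D1Database }): Promise<Response> {
      const db = new Kysely<Database>({ dialect: new D1Dialect({ database: env.DB }) });
      // Type-safe query, compiled down to a D1 prepared statement.
      const users = await db.selectFrom("users").selectAll().execute();
      return Response.json(users);
    },
  };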


You can use drizzle-orm with Cloudflare and that is fully compatible.
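For example, a minimal sketch with drizzle-orm's D1 driver (the users table and the DB binding name are hypothetical):

  import { drizzle } from "drizzle-orm/d1";
  import { sqliteTable, integer, text } from "drizzle-orm/sqlite-core";

  // Hypothetical schema, for illustration only.
  const users = sqliteTable("users", {
    id: integer("id").primaryKey(),
    name: text("name"),
  });

  export default {
    async fetch(request: Request, env: { DB: D1Database }): Promise<Response> {
      const db = drizzle(env.DB);
      const rows = await db.select().from(users); // plain objects, no ORM classes
      return Response.json(rows);
    },
  };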


Drizzle is amazing.


This is my first time using it and I've been very pleased with it so far. It keeps things simple, has solid typing & schema building, and reminds me of LINQ. I'm also a thin-models kind of person, and the fact that this is just an object without needing to build ORM classes is even better.


> standard ORMs like Prisma

Prisma isn't standard, and it's too slow anyway.


They recently added a new JSON protocol that's faster.


Still way too slow (cold start, throughput, and general latency). I was referring to the latest version, which still has a convoluted, nonsensical stack. I'd stay away from Prisma; there's enough info online.


This is Cloudflare's flavor of Fly.io?


Fly doesn't really offer distributed data.

Edit:

It does!


Is this not the same as https://fly.io/docs/litefs/ ? They mention it as 'LiteFS - Distributed SQLite'.


Ah you're right!

I thought this was a wrapper for Litestream, but apparently it's a parallel project by the same author, whom Fly hired.

https://litestream.io/

https://github.com/benbjohnson/litestream/pull/411


Fly requires you to commit compute, and Fly's compute gets expensive real fast.



