
I am definitely interested in a streaming backup solution. Right now, our application state is scattered across many independent SQLite databases and files.

We would probably have to look at a rewrite under a unified database schema to leverage something like this (at least for the business state we care about). To me, streaming replication implies serializing our total business state, and that has some implications for performance.

Also, for us, backup to the cloud is a complete non-starter. We would have to have our customers set up a second machine within the same network (not necessarily same building) to receive these backups due to the sensitive nature of the data.

What I really want to do is keep all the same services & schemas we have today, but build another layer on top so that we can have business services directly aware of replication concerns. For instance, I might want to block on some targeted replication activity rather than let it complete asynchronously. Then, instead of a primary/backup, we can just have 4-5 application nodes operating as a cluster with some sort of scheme copying important entities between nodes as required. We already moved to GUIDs for a lot of identity due to configuration import/export problems, so that problem is solved already. There are very few areas of our application that actually require consensus (if we had multiple participants in the same environment), so this is a compelling path to explore.




You can stream backups of multiple database files with Litestream. Right now you have to explicitly name them in the Litestream configuration file, but in the future it will support using a glob or file pattern to pick up multiple files automatically.

As for cloud backup, that's just one replica type. It's the most common one, so that's the one I usually mention. Litestream also supports file-based backups, so you could do a streaming backup to an NFS mount instead. There's an HTTP replica type coming in v0.4.0 that's mainly for live read replication (e.g. distributing your query load out to multiple servers), but it could also be used as a backup method.

As for synchronous replication, that's on the roadmap but I don't have an exact timeline. It'll probably be v0.5.0. The idea is that you can wait to confirm that data is replicated before returning a confirmation to the client.

We have a Slack[1] as well as a bunch of docs on the site[2] and an active GitHub project page. I do office hours[3] every Friday too if you want to chat over Zoom.

[1]: https://join.slack.com/t/litestream/shared_invite/zt-n0j4s3c...

[2]: https://litestream.io/

[3]: https://calendly.com/benbjohnson/litestream


I really like what I am seeing so far. What is the rundown on how synchronous replication would be realized? Feels like I would have to add something to my application for this to work, unless we are talking about modified versions of SQLite or some other process hooking approach.


Litestream maintains a WAL position so it would need to expose the current local WAL position & the highest replicated WAL position via some kind of shared memory—probably just a file similar to SQLite's "-shm" file. The application can check the current position when a transaction starts and then it can block until the transaction has been replicated. That's the basic idea from a high level.
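
To make that concrete, here is a rough sketch of what the application side could look like. None of this is an existing Litestream API: the position file, its layout (two little-endian 64-bit integers), and the function names are all made up for illustration. The writer function just stands in for whatever the replication process would publish.

```python
import os
import struct
import time

# Hypothetical layout (an assumption, not a Litestream format):
# two little-endian unsigned 64-bit integers,
# (local WAL position, highest replicated WAL position).
POSITION_FORMAT = "<QQ"


def write_positions(path, local, replicated):
    """Stand-in for the replication process publishing its positions."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(struct.pack(POSITION_FORMAT, local, replicated))
    # Atomic rename so readers never observe a half-written file.
    os.replace(tmp, path)


def read_positions(path):
    """Return (local, replicated) as read from the position file."""
    with open(path, "rb") as f:
        return struct.unpack(POSITION_FORMAT, f.read())


def wait_until_replicated(path, target, timeout=5.0, poll_interval=0.01):
    """Poll until the replicated position reaches target.

    Returns True once replicated >= target, False if the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        _, replicated = read_positions(path)
        if replicated >= target:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

In the application, you would note the local position that covers your transaction (in this sketch, by calling `read_positions` right after the commit, which is my assumption about where to sample it) and then call `wait_until_replicated` with that value before confirming to the client. Polling is only the simplest option; a real shared-memory design could presumably notify waiters instead.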



