Hacker News

The big asterisk is, this only works if your database never changes.


Not sure if the parent edited their post after you stated this, but note that they explained a technique to accommodate database updates / changes / edits.


What do you mean? The author (say, the Wikipedia owners) can change the db as they usually would (using UPDATE queries, say). Those write queries touch as few disk pages as possible. In the torrent world, this means only a minimal set of pieces is modified and needs to be downloaded by users.
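
To make the "few disk pages" claim concrete, here is a rough sketch that snapshots a SQLite file before and after a single-row UPDATE and counts which pages differ. The table name, row count, and row size are made up for illustration, and it assumes SQLite's default rollback-journal mode (in WAL mode the change would sit in a separate file until a checkpoint).

```python
import os
import sqlite3
import tempfile


def read_pages(path, page_size):
    """Split the database file into fixed-size pages."""
    with open(path, "rb") as f:
        data = f.read()
    return [data[i:i + page_size] for i in range(0, len(data), page_size)]


def changed_pages_after_update(rows=5000):
    """Return (indices of pages that changed, total page count)."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "articles.db")  # hypothetical database
        con = sqlite3.connect(path)
        con.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT)")
        con.executemany(
            "INSERT INTO articles (id, body) VALUES (?, ?)",
            ((i, "x" * 200) for i in range(rows)),
        )
        con.commit()
        page_size = con.execute("PRAGMA page_size").fetchone()[0]
        before = read_pages(path, page_size)

        # Rewrite one row with a value of the same length.
        con.execute(
            "UPDATE articles SET body = ? WHERE id = ?",
            ("y" * 200, rows // 2),
        )
        con.commit()
        con.close()
        after = read_pages(path, page_size)

    changed = [i for i, (a, b) in enumerate(zip(before, after)) if a != b]
    return changed, len(after)


changed, total = changed_pages_after_update()
print(f"{len(changed)} of {total} pages changed: {changed}")
```

Torrent pieces are usually much larger than a SQLite page, so each changed page would dirty the whole piece that contains it, but the number of affected pieces stays small either way.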


Last I checked, you can't update a torrent. So if Wikipedia changes even a single letter, you'd need to download all the data once more.


No, the pieces you already downloaded can be reused for the new torrent download. Unchanged pieces keep the same hash, so they can be reused for the new digest: http://bittorrent.org/beps/bep_0038.html
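
A minimal sketch of the reuse idea, assuming BitTorrent v1 style SHA-1 piece hashes and an in-place edit that does not shift later bytes. The piece length and helper names are illustrative, and this is not how any particular client implements BEP 38.

```python
import hashlib

PIECE_LEN = 16 * 1024  # illustrative; real torrents often use larger pieces


def piece_hashes(data, piece_len=PIECE_LEN):
    """SHA-1 digest of each fixed-size piece, in order."""
    return [
        hashlib.sha1(data[i:i + piece_len]).digest()
        for i in range(0, len(data), piece_len)
    ]


def pieces_to_fetch(old_hashes, new_hashes):
    """Indices of pieces in the new torrent that are not already on disk."""
    have = set(old_hashes)
    return [i for i, h in enumerate(new_hashes) if h not in have]


# 256 KiB of non-repeating data stands in for the old database file.
old = b"".join(i.to_bytes(4, "big") for i in range(65536))
new = bytearray(old)
new[100000] ^= 0xFF  # flip a single byte in place

missing = pieces_to_fetch(piece_hashes(old), piece_hashes(bytes(new)))
print(missing)  # only the piece containing byte 100000
```

An edit that inserts or deletes bytes would shift every later piece boundary, which is why a format that rewrites fixed-size pages in place matters here.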

This is also why sqlite is a good choice because it's highly optimized to do the least amount of changes to its "pieces" when an update occurs.

If you're implementing this behavior, trying to manage all kinds of different queries, building a querying engine on top of that, optimizing for efficiency and reliability, you're effectively rewriting a database. Sure you can do it, but why not take advantage of battle-tested off-the-shelf stuff for things like "databases" (sqlite) and/or "distributing data" (torrent)?


Actually, there is a solution to this. Just combine https://www.bittorrent.org/beps/bep_0030.html (Merkle-tree-based hashing) with https://www.bittorrent.org/beps/bep_0039.html (feed-URL-based updates), and in some settings also https://www.bittorrent.org/beps/bep_0047.html (specifically the padding files, so that flat files inside a torrent can also be shared efficiently in arbitrary combinations of non-partial files).
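
The padding-file part can be sketched as simple offset arithmetic: a pad entry is inserted after any file that does not end on a piece boundary, so the next file starts at a piece-aligned offset and its pieces hash the same regardless of which other files share the torrent. The function name and the ".pad/N" entry names below are placeholders for illustration, not taken from a real client.

```python
def layout_with_padding(files, piece_len):
    """Return (name, offset, size) entries with pad entries inserted.

    files is a list of (name, size) tuples in torrent order.
    """
    layout = []
    offset = 0
    for index, (name, size) in enumerate(files):
        layout.append((name, offset, size))
        offset += size
        is_last = index == len(files) - 1
        pad = (-offset) % piece_len  # bytes needed to reach the next boundary
        if pad and not is_last:
            layout.append((f".pad/{pad}", offset, pad))
            offset += pad
    return layout


for entry in layout_with_padding([("a", 10), ("b", 16), ("c", 5)], piece_len=16):
    print(entry)
```

With a piece length of 16, file "b" starts at offset 16 and "c" at offset 32, so neither shares a piece with its neighbour.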


All those BEPs are in "Draft" status. Okay, libtorrent implements two of them. But also, BEP 39 (Updating Torrents Via Feed URL) doesn't really fit very well into the fully distributed setting because of the centralized URL part.

So now to update the torrent file you need a mechanism for having a mutable document you can update in a distributed but signed way. Or you could make an append only feed of sequential torrent urls... oh wait.

My point is: Hyperdrive's scope is sufficiently different from your proposed solution that yes, you could probably rely on existing tools (and I have much love for bittorrent based solutions!) but it starts feeling like shoehorning the problem into a solution that doesn't quite fit.


The distributed-but-signed way is there in https://www.bittorrent.org/beps/bep_0046.html (Updating Torrents Via DHT Mutable Items).

That draft status is of little practical relevance, though, if nothing has changed for years and no one has voiced well-founded criticism of the technical details.

I do agree though that Hyperdrive is different from what the bittorrent ecosystem has to offer. I too like not reinventing the wheel where that's not necessary, as you recommend there. I'll leave you the list of BEPs for further reading, in case you're interested: https://www.bittorrent.org/beps/bep_0000.html


I've been keeping an eye on that list for a long time. There's some really cool stuff in there, and I think bittorrent has really been within reach of being "simply good enough for most applications" for quite some time now. And the massive user base is of course a good thing there, especially if you're talking more about archival projects.



