
Isn't it a fusion of HSM https://en.wikipedia.org/wiki/Hierarchical_storage_managemen... and continuous backup?

What about data that changes often locally and is part of a coherent set, the classic case being a file used by a database engine to store data? We nearly always need to mirror or back up a consistent version of it (the state just after a successful outermost transaction; in the SQL world, the top-level "COMMIT"), but AFAIK HSM+backup software currently cannot detect such a state. It could try trapping existing system calls (fsync and co.) in order to copy data to the remote storage in a sync'ed state, but this is not robust because their semantics are not "upon return of this call the whole dataset (in all files) is consistent".

Moreover, if the application using the DB engine is not perfect, such an inconsistency may reside at the application level: after a COMMIT the file is consistent for the DB engine, but not for the application.

I wonder if some users of such HSM+backup software felt some major disappointment after restoring an inconsistent version of such a file. Even a minor loss (garbled index) may be hard to detect and lead to a "fork" of the data.

A dedicated system function called to signal "in my set of opened files the data are consistent" would be useful but is AFAIK missing, and even if someone adds it to some libc/kernel it will only be useful when the application code actually calls it.

The kludge is a procedure along the lines of "order the engine to sync the data; throttle the engine into 'no write' mode; create a RO snapshot; back up the snapshot; unthrottle the engine; delete the snapshot", which does not seem exactly "transparent".
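To make that procedure concrete, here is a minimal Python sketch of the same six steps in the same order. Everything in it is hypothetical: `FakeEngine`, `FakeVolume`, `set_no_write`, `create_ro_snapshot` and `delete_snapshot` are stand-ins that only record calls, not the API of any real database or volume manager. The one thing it adds is a `finally` clause, so the engine is unthrottled and the snapshot removed even if the backup step fails.

```python
from typing import Callable, List, Optional


class FakeEngine:
    """Hypothetical stand-in for a DB engine; it only records the calls it receives."""

    def __init__(self, log: List[str]) -> None:
        self.log = log

    def sync(self) -> None:
        self.log.append("sync")

    def set_no_write(self, enabled: bool) -> None:
        self.log.append("throttle" if enabled else "unthrottle")


class FakeVolume:
    """Hypothetical stand-in for a snapshot-capable volume."""

    def __init__(self, log: List[str]) -> None:
        self.log = log

    def create_ro_snapshot(self) -> str:
        self.log.append("create snapshot")
        return "snap-1"

    def delete_snapshot(self, snapshot_id: str) -> None:
        self.log.append("delete snapshot")


def consistent_backup(engine, volume, backup: Callable[[str], None]) -> None:
    """Sync, throttle, snapshot, back up, unthrottle, delete the snapshot."""
    engine.sync()
    engine.set_no_write(True)
    snapshot_id: Optional[str] = None
    try:
        snapshot_id = volume.create_ro_snapshot()
        backup(snapshot_id)
    finally:
        # Always release the engine, even if the snapshot or the backup failed.
        engine.set_no_write(False)
        if snapshot_id is not None:
            volume.delete_snapshot(snapshot_id)
```

Note that the engine stays in 'no write' mode for the whole duration of the backup, which is part of why the procedure is not transparent to the application.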




In such a case, you’re better off with a database engine that streams its journal or transaction log to an object store.
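As a rough illustration of that idea, here is a toy Python sketch in which each committed transaction is shipped to the object store as one log record, and a restore replays the records in order. All names (`InMemoryObjectStore`, `LogShippingEngine`, `restore`, the `wal/` prefix) are made up for the example, and a dict stands in for the real object store; an actual engine would batch records, handle upload failures, and take periodic base backups.

```python
import json
from typing import Dict, List


class InMemoryObjectStore:
    """Hypothetical object store: a dict standing in for a remote bucket."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data

    def get(self, key: str) -> bytes:
        return self.objects[key]

    def list_keys(self, prefix: str) -> List[str]:
        return sorted(k for k in self.objects if k.startswith(prefix))


class LogShippingEngine:
    """Toy key-value engine that ships one log record per committed transaction."""

    def __init__(self, store: InMemoryObjectStore, prefix: str = "wal/") -> None:
        self.store = store
        self.prefix = prefix
        self.data: Dict[str, str] = {}
        self.next_seq = 0

    def commit(self, writes: Dict[str, str]) -> None:
        record = json.dumps(writes, sort_keys=True).encode("utf-8")
        # Zero-padded sequence numbers keep lexicographic order equal to commit order.
        key = f"{self.prefix}{self.next_seq:012d}"
        self.store.put(key, record)  # ship the record before applying it locally
        self.data.update(writes)
        self.next_seq += 1


def restore(store: InMemoryObjectStore, prefix: str = "wal/") -> Dict[str, str]:
    """Rebuild the state by replaying shipped records in commit order."""
    data: Dict[str, str] = {}
    for key in store.list_keys(prefix):
        data.update(json.loads(store.get(key).decode("utf-8")))
    return data
```

Because only whole transactions are ever shipped, any prefix of the log restores to a state the engine itself considers consistent, which is exactly what a file-level sync cannot guarantee.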

Don’t perform data operations at the wrong layer.


Indeed, and this is my point: such tools cannot be generic ("works with any file") and also transparent ("plug & play").


Yes, but those are the preconditions to user adoption.


Author here. Thanks for the Wikipedia link. I think that the software is trying to implement HSM but I didn't know that this is what it's called.

With Zero, all local data is eventually synced to the cloud but usually this only happens after the local file is idle for a while.
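For readers wondering what "idle for a while" could look like, here is a minimal Python sketch of one possible selection rule. It is not Zero's actual code: the function name `files_ready_to_sync`, the modification-time map and the 300-second default are all assumptions made for illustration.

```python
from typing import Dict, List


def files_ready_to_sync(
    last_modified: Dict[str, float],
    now: float,
    idle_seconds: float = 300.0,  # assumed threshold, purely illustrative
) -> List[str]:
    """Return the paths whose last modification is at least idle_seconds old."""
    return sorted(
        path
        for path, mtime in last_modified.items()
        if now - mtime >= idle_seconds
    )
```

A file that is written more often than the threshold never becomes idle under such a rule, which is the database-file case raised earlier in the thread.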



