It's like any time we take a step forward in one area we have to reinvent the la...

vidarh · on June 1, 2016

Persistent storage isn't a hard problem. Distributed, well-performing, scalable and consistent storage is a hard problem.

dap · on June 1, 2016

Actually, persistent storage is fairly hard in itself. Look at what ZFS does to ensure data integrity in the face of phantom writes, dropped writes, bad controllers, and other implicit, non-fatal failures.
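
To make the integrity point concrete, here is a minimal sketch of the general idea: keep each block's checksum somewhere other than the block itself, and verify it on every read. This is not ZFS code. The `ChecksummedStore` class and its dictionaries are hypothetical stand-ins for a disk and its metadata, and SHA-256 is used only because it is in Python's standard library.

```python
import hashlib


class ChecksummedStore:
    """Toy block store that keeps each block's checksum apart from its data.

    Hypothetical illustration only: the two dicts stand in for on-disk
    data blocks and for the metadata that points at them.
    """

    def __init__(self):
        self.blocks = {}      # block id -> raw bytes (the "disk")
        self.checksums = {}   # block id -> digest (the "metadata")

    def write(self, block_id, data):
        self.blocks[block_id] = bytes(data)
        self.checksums[block_id] = hashlib.sha256(data).digest()

    def read(self, block_id):
        data = self.blocks[block_id]
        # A dropped or misdirected write leaves data that no longer
        # matches the checksum recorded in the metadata.
        if hashlib.sha256(data).digest() != self.checksums[block_id]:
            raise IOError(f"checksum mismatch on block {block_id}")
        return data


store = ChecksummedStore()
store.write(1, b"hello")
store.read(1)                  # verifies and returns the data
store.blocks[1] = b"hellp"     # simulate silent corruption on the "disk"
# store.read(1) would now raise IOError instead of returning bad data
```

Detection is only half of the job: without a redundant copy to read instead, the store can report the corruption but not repair it.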

vidarh · on June 1, 2016

ZFS still also has the issue of having to perform well. You have a point, but ZFS is still trivial compared to a proper distributed filesystem, and you could achieve the same reliability much more easily than ZFS does if you sacrificed the performance.

thinkersilver · on June 1, 2016

The ClusterHQ guys behind Flocker found this out the hard way [0]. Initially Flocker was meant to provide a container data migration tool on top of ZFS; now it is a front end to more established storage systems like Cinder, vSan, EBS, GPD and so on.

[0] https://docs.clusterhq.com/en/latest/faq/#what-happened-to-z...

dap · on June 1, 2016

Absolutely -- I didn't mean to imply that ZFS even comes close to solving the distributed parts of the problem, but rather that a distributed storage system does have to address the problems of putting bits on disk.

zzzcpan · on June 1, 2016

> Actually, persistent storage is fairly hard in itself

Don't we have distributed data stores precisely because it's impossible to guarantee persistence locally? It's kind of a way to not bother trying to solve the impossible, but to achieve some guarantees on a different level.

wnoise · on June 1, 2016

That's one use case, but far more common is reducing latency and increasing bandwidth for distributed computing acting on a shared set of data.

nickpsecurity · on June 1, 2016

Persistent storage on today's filesystems and complex hardware is a hard problem. All kinds of failures can happen during any write. Some are obvious, while others silently corrupt data. There have been decades of work on approaches to dealing with this, with a variety of tradeoffs. Picking the right one for a widely-deployed, portable, distributed app is tricky by itself.

They're aiming to do a lot more than that. ;)
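
As one example of the approaches to write failures mentioned above, a common pattern on POSIX systems is to write a temporary file, fsync it, rename it over the target, and then fsync the directory. The sketch below assumes a POSIX filesystem where rename within a directory is atomic. `atomic_write` is a hypothetical helper name, and this covers crashes mid-write only, not silent corruption by the hardware.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at `path` with `data` so that a crash leaves
    either the old contents or the new contents, never a partial mix.

    Sketch only: assumes POSIX rename semantics on the target filesystem.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem),
    # otherwise the rename below is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # push file contents to stable storage
        os.replace(tmp_path, path)  # atomically swap in the new file
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The tradeoff is the one the thread keeps returning to: every fsync costs latency, so this buys crash safety at the expense of write performance.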