Atomic file/directory renames/moves are the fundamental feature of JuiceFS, which makes it truly a file system rather than a proxy to S3. Please check the docs for all the compatibility details [1].
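To illustrate the difference (the mount path here is a made-up example, not anything JuiceFS-specific):

    import os

    # On a POSIX file system (a local disk, a JuiceFS mount, ...) a rename/move is a
    # single atomic metadata operation, no matter how much data sits underneath.
    os.rename("/mnt/jfs/staging", "/mnt/jfs/live")

    # On raw S3 there is no rename: "moving" a prefix means copying every object to a
    # new key and deleting the old ones, which is neither atomic nor cheap.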
It's doable to run a MinIO gateway on top of a CephFS mount point, but that would have performance issues, especially for multipart upload and copy. That's why we put MinIO and the JuiceFS client together and use some internal APIs to do zero-copy uploads.
It's important to note that S3 does not have any Durability SLA. We promise Durability and take it extremely seriously, but there is no SLA. It's much more of an SLO.
Also, "durability" is not a property you can delegate to another service. Plenty of corruption is caused in transit, not just at rest.
If your system handles the data in any way, you must compute and validate checksums.
If you do not have end-to-end checksums for the data, you do not get to claim your service adopts S3's Durability guarantees.
S3 has that many 9s because your data is checksummed by the SDK. Every service that touches that data in any way recomputes and validates that (or a bracketed) checksum. Soup to nuts. All the way to when the data gets read out again.
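To make "end-to-end" concrete, here is a minimal sketch of the pattern (not AWS's actual mechanism; the real SDKs typically use digests like CRC32C or MD5 and handle this automatically):

    import hashlib

    def checksum(data: bytes) -> str:
        # SHA-256 stands in for whatever digest the client and service agree on.
        return hashlib.sha256(data).hexdigest()

    # Writer side: compute the checksum before the bytes ever leave the client.
    payload = b"customer data"
    expected = checksum(payload)
    # upload(payload, checksum=expected)   # hypothetical upload call

    # Every hop that handles the bytes, and the final reader, recomputes and compares
    # before trusting them; a mismatch means corruption in transit or at rest.
    received = payload  # stand-in for what eventually comes back from storage
    assert checksum(received) == expected, "data corrupted somewhere along the path"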
And there is a lot more to Durability than data corruption. Protections against accidental deletions, mutations, or other data loss events come into play too. How good is your durability SLO when you accidentally overwrite one customer’s data with another’s?
Check out some of the talks S3 has given on what Durability actually means, then maybe investigate how durable your service is.
ps: I haven’t looked at the code yet, but plan to. Maybe I’m being presumptuous and your service is fully secured. I’ll let you know if I find anything!
pps: I work for Amazon, but all my opinions are my own and do not necessarily reflect my employer's. I don't speak for Amazon in any way :D
As you allude to in your response, that's usually referred to as durability, not reliability. The home page could probably use an update there to reflect that terminology.
It's an average: presumably they don't smear files across disks byte by byte, since that would be insane. But with drives randomly breaking, at some point every copy of at least one file will go at once. With, say, a terabyte of files over a thousand years, you'd expect to lose a total of about 100 KB worth of files. So probably not even one, with some small chance of losing half a drive.
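For a rough sanity check on that arithmetic (a sketch; the annual loss rate is an assumption, and S3's advertised eleven nines would correspond to 1e-11 per year):

    # Expected loss = data volume x annual loss rate x years (assumed simple model).
    data_bytes = 1e12   # ~1 TB of files
    years = 1000
    for annual_loss_rate in (1e-10, 1e-11):   # ten vs. eleven nines of durability
        expected_loss_kb = data_bytes * annual_loss_rate * years / 1e3
        print(f"{annual_loss_rate:.0e}/yr -> ~{expected_loss_kb:.0f} KB expected loss")
    # ~100 KB at 1e-10/yr, ~10 KB at 1e-11/yr -- either way, probably not even one whole file.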
It's unavoidable that too many disk failures in quick succession lead to data loss. For example, if you store two copies, your durability rests on being able to detect a disk failure and create another copy before the sole remaining copy dies as well.
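A toy model of that race (the numbers are made-up assumptions, just to show the shape of the problem):

    # With two copies, data survives only if re-replication finishes before the
    # last remaining copy fails too.
    mttf_hours = 1_000_000   # assumed mean time to failure of one drive (~114 years)
    repair_hours = 24        # assumed time to detect the failure and re-copy the data
    p_lose_last_copy = repair_hours / mttf_hours   # rough chance during the window
    print(f"chance of losing the last copy during the repair window: ~{p_lose_last_copy:.1e}")
    # Faster detection and repair, or a third copy, is what buys the extra nines.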
From your article: "For example, Azure Files files and folders have default permissions of 0777, with root as owner, and it does not support modification" - that sounds like CIFS. I can't tell you much about Azure Files NFS, but I do know it supports basic file rights and setting of uid/gid; my company uses it in Azure for SAP.
Azure has Azure Files and Azure NetApp Files; the latter is provided by NetApp. Azure Files was used in the article. Maybe you are using NetApp Files?
We will update the article to make it clear, thanks!
One more thing: there are two NFS implementations, one v3 and one v4; the v3 one is on top of blobs, the v4 one is on top of Azure Files. We use the v4 one and have never tried the blob one :)
[1] https://github.com/juicedata/juicefs#posix-compatibility