
Atomic file/directory renames and moves are a fundamental feature of JuiceFS; they are what make it truly a file system rather than a proxy to S3. Please check the docs for all the compatibility details [1].

https://github.com/juicedata/juicefs#posix-compatibility
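
For illustration, here is a minimal sketch of what that buys you (the paths are hypothetical, on a JuiceFS mount point): a rename is a single atomic metadata operation, so a reader sees either the old path or the new one, never a half-moved file.

    package main

    import (
        "log"
        "os"
    )

    func main() {
        // Hypothetical paths on a JuiceFS mount point.
        err := os.Rename("/jfs/staging/report.csv", "/jfs/published/report.csv")
        if err != nil {
            log.Fatal(err)
        }
        // On a plain S3 proxy the same "move" is a server-side copy
        // followed by a delete: two operations, and not atomic.
    }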


It's doable to run a MinIO gateway on top of a CephFS mount point, but that will have performance issues, especially for multipart upload and copy. That's why we put MinIO and the JuiceFS client together and use some internal APIs to do zero-copy uploads.
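
To illustrate the problem (an illustrative sketch only, not actual MinIO or JuiceFS code): a gateway sitting on a plain POSIX mount has to implement CopyObject by streaming every byte through itself.

    package main

    import (
        "io"
        "log"
        "os"
    )

    // naiveCopyObject: every byte flows disk -> gateway -> disk.
    func naiveCopyObject(srcPath, dstPath string) error {
        src, err := os.Open(srcPath)
        if err != nil {
            return err
        }
        defer src.Close()

        dst, err := os.Create(dstPath)
        if err != nil {
            return err
        }
        defer dst.Close()

        _, err = io.Copy(dst, src)
        return err
    }

    func main() {
        // Hypothetical paths on a CephFS mount.
        if err := naiveCopyObject("/mnt/cephfs/bucket/a", "/mnt/cephfs/bucket/b"); err != nil {
            log.Fatal(err)
        }
    }

With shared metadata, the gateway can instead point the new object at the existing data blocks; that is what we mean by zero-copy.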


Yes, the data can be encrypted [1] by the client before being sent to S3, but the metadata is not encrypted.

[1] https://juicefs.com/docs/community/security/encrypt
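
The general shape is client-side encryption before upload; here is a minimal AES-GCM sketch of the idea (illustrative only, not our exact scheme; see the doc above for the real design):

    package main

    import (
        "crypto/aes"
        "crypto/cipher"
        "crypto/rand"
        "fmt"
        "log"
    )

    // encryptBlock seals one data block with AES-GCM so that only
    // ciphertext ever reaches the object store.
    func encryptBlock(key, plaintext []byte) ([]byte, error) {
        block, err := aes.NewCipher(key) // key must be 16/24/32 bytes
        if err != nil {
            return nil, err
        }
        gcm, err := cipher.NewGCM(block)
        if err != nil {
            return nil, err
        }
        nonce := make([]byte, gcm.NonceSize())
        if _, err := rand.Read(nonce); err != nil {
            return nil, err
        }
        // Prepend the nonce so the client can decrypt after download.
        return gcm.Seal(nonce, nonce, plaintext, nil), nil
    }

    func main() {
        key := make([]byte, 32) // demo key only; real keys come from a key manager
        if _, err := rand.Read(key); err != nil {
            log.Fatal(err)
        }
        ct, err := encryptBlock(key, []byte("block data"))
        if err != nil {
            log.Fatal(err)
        }
        fmt.Printf("ciphertext: %d bytes\n", len(ct))
    }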


The button to switch languages is at the bottom of the top-right menu; we will fix that.


The button doesn't appear on the mobile theme, it seems (went to https://juicefs.com/docs/zh/community/introduction/ after clicking the "Community edition docs" link on the home page).

Maybe consider also changing the link on the English homepage to the English documentation?


Apache Ozone is not POSIX compatible, even with the File System Optimized format [1].

https://ozone.apache.org/docs/current/feature/prefixfso.html


That's a fair point. Thanks!


99.99999999% reliability means you will not lose more than one byte in every 10 GB in a year.

JuiceFS uses S3 as the underlying data storage, so S3 provides this durability SLA.
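
As a quick back-of-the-envelope check of that claim:

    package main

    import "fmt"

    func main() {
        const durability = 0.9999999999 // ten nines, per byte per year
        const bytes = 10e9              // 10 GB
        expectedLoss := (1 - durability) * bytes
        fmt.Printf("expected loss: %.1f bytes/year\n", expectedLoss) // ~1.0
    }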


Important to note that S3 does not have any Durability SLA. We promise Durability and take it extremely seriously, but there is no SLA. It's much more of an SLO.


Also, “durability” is not a property you can delegate to another service. Plenty of corruption happens in transit, not just at rest.

If your system handles the data in any way, you must compute and validate checksums.

If you do not have end-to-end checksums for the data, you do not get to claim your service adopts S3’s Durability guarantees.

S3 has that many 9s because your data is checksummed by the SDK. Every service that touches that data in any way recomputes and validates that (or a bracketed) checksum. Soup to nuts, all the way to when the data gets read out again.
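
The pattern is simple to state (a generic sketch, not the S3 SDK's actual code; CRC32C is just one of several algorithms the SDKs support): compute a checksum when data enters the system, and re-validate it at every hop, including the final read.

    package main

    import (
        "fmt"
        "hash/crc32"
        "log"
    )

    var castagnoli = crc32.MakeTable(crc32.Castagnoli)

    // checksum is computed once, when the data enters the system...
    func checksum(data []byte) uint32 {
        return crc32.Checksum(data, castagnoli)
    }

    // ...and validate runs at every subsequent hop, so corrupt bytes
    // are rejected instead of being silently passed along.
    func validate(data []byte, want uint32) error {
        if got := checksum(data); got != want {
            return fmt.Errorf("checksum mismatch: got %08x, want %08x", got, want)
        }
        return nil
    }

    func main() {
        data := []byte("object bytes")
        sum := checksum(data)
        if err := validate(data, sum); err != nil {
            log.Fatal(err)
        }
        fmt.Println("ok")
    }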

And there is a lot more to Durability than data corruption. Protections against accidental deletions, mutations, or other data loss events come into play too. How good is your durability SLO when you accidentally overwrite one customer’s data with another’s?

Check out some of the talks S3 has given on what Durability actually means, then maybe investigate how durable your service is.

https://youtu.be/P1gGFYS9LRk

ps: I haven’t looked at the code yet, but plan to. Maybe I’m being presumptuous and your service is fully secured. I’ll let you know if I find anything!

pps: I work for amazon but all my opinions are my own and do not necessarily reflect my employer’s. I don’t speak for Amazon in any way :D


As you allude to in your response, that's usually referred to as durability, not reliability. The home page could probably use an update there to reflect that terminology.


That doesn't sound like a very practical metric, since losing one byte often makes the whole dataset useless (encryption, checksum failures).


It's an average: presumably they don't smear files across disks byte by byte, since that would be insane. But with drives randomly breaking, at some point every copy of at least one file will go at once. With, say, a terabyte of files over a thousand years, you'd expect to lose about 100 KB of data in total (10^12 bytes × 10^-10 per year × 1000 years). So probably not even one file, with some small chance of losing half a drive.


I think the probability of losing any data in 100 TB would be a better metric.


As in there's no durability guarantee for the data? I can expect data loss at a rate of 1 byte per 10 GB per year?


It's unavoidable that too many disk failures in quick succession lead to data loss. For example, if you store two copies, your durability rests on being able to detect a disk failure and create another copy before the sole remaining version dies as well.
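
A rough model of that race (assumed numbers, purely illustrative): with a per-disk annual failure rate and a repair window T, the chance the surviving copy dies before re-replication finishes is about 1 - exp(-rate * T).

    package main

    import (
        "fmt"
        "math"
    )

    func main() {
        afr := 0.02                     // assumed 2% annual failure rate per disk
        repairYears := 6.0 / (24 * 365) // assumed 6-hour detect-and-recopy window
        // Chance the surviving copy also dies inside the repair window.
        pLoss := 1 - math.Exp(-afr*repairYears)
        fmt.Printf("p(loss per incident) ~ %.2e\n", pLoss)
        // Shrinking the repair window (or adding a third copy) is what
        // buys more nines.
    }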


"What do you mean you mean it can't recover from a 100% disk failure rate?

At least it's all in RAID 0, so the data's safe."


Juicedata Inc is a US company, registered in Delaware. The founding team is Chinese.

ps, I'm the founder of Juicedata.


We compared Azure's NFS in the article; I believe CIFS/Samba would be much worse.


From your article: "For example, Azure Files files and folders have default permissions of 0777, with root as owner, and it does not support modification". That sounds like CIFS. I can't tell you much about Azure Files NFS, but I do know it supports basic file rights and setting of uid/gid; my company uses it in Azure for SAP.


Azure has Azure Files and Azure NetApp Files; the latter is provided by NetApp. Azure Files was used in the article. Maybe you are using NetApp Files?

We will update the article to make it clear, thanks!


Yeah, I forgot there is Azure NetApp Files as well; Azure has at least 3 NFS implementations :) But no, I was talking about Azure Files NFS: https://learn.microsoft.com/en-us/azure/storage/files/files-...


One more thing: there are two NFS implementations, one v3 and one v4. The v3 one is on top of blobs, the v4 one is on top of Azure Files. We use the v4 one, never tried the blob one :)


Yes, we picked the default one in the docs, which should be NFS v3. We will redo the test against NFS v4 and update the article, thanks!


Yes, JuiceFS is not a good choice for PG, unless you don't care about performance.

One interesting use case is the backup of MySQL [1].

[1] https://juicefs.com/docs/cloud/backup_mysql_in_juicefs/


JuiceFS Cloud supports ACLs, but the open source one does not support it yet.


Any way to mount with a forced UID/GID for all files?

Useful in container scenarios.


There is an experimental feature to do this; we're still working on it.


> but the open source one does not support it yet.

Is this on the roadmap?

