
Founder of JuiceFS here, congrats on the launch! I'm super excited to see more people doing creative things in the using-S3-as-a-file-system space. When we started JuiceFS back in 2017, we applied to YC twice but had no luck.

We are still working hard on it, hoping we can help people with different workloads using different tech!


Wow, thanks for coming out! I hope that you're heartened to see the number of people who immediately think of JuiceFS when they see our launch. I totally agree with you, storage is such an interesting space to work in, and I'm excited that there are so many great products out there to fit the different needs of customers.


As someone who is already happily using JuiceFS, perhaps you can provide a short list of differences (conceptual and/or technical). Thanks for a great product.


I'm a happy and satisfied JuiceFS user here, so I too would be interested in the differences between these. Is Regatta's key point caching?


Also a user of JuiceFS: we replaced a GlusterFS cluster with it a few years ago. It's far cheaper and easier to scale, with no issues and no changes needed in the applications that had been using GlusterFS.


I know that I've answered this question a couple times in the thread, so I don't know if my words add extra value here. But, I agree that it would be interesting to hear what Davies is thinking.


Yes, your input into the thread cleared many things up for me, thanks!


This post covers another topic: how to back up the metadata of JuiceFS in a readable format (JSON) and restore it into an empty database.


The recovery process is similar to what a database does after a crash: it loads the most recent snapshot from disk and applies any newer transaction logs.
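
Roughly, it looks like this (a simplified Python sketch of the idea, not the actual JuiceFS code; the snapshot and log formats here are made up):

    import json

    def recover(snapshot_path, log_path):
        # Load the most recent metadata snapshot (a JSON dump on disk).
        with open(snapshot_path) as f:
            meta = json.load(f)
        last_txid = meta.get("_txid", 0)

        # Replay only the transactions logged after the snapshot was taken.
        with open(log_path) as f:
            for line in f:
                entry = json.loads(line)
                if entry["txid"] <= last_txid:
                    continue  # already contained in the snapshot
                apply_tx(meta, entry)
                last_txid = entry["txid"]
        meta["_txid"] = last_txid
        return meta

    def apply_tx(meta, entry):
        # Hypothetical log entries: each one sets or deletes a metadata key.
        if entry["op"] == "set":
            meta[entry["key"]] = entry["value"]
        elif entry["op"] == "del":
            meta.pop(entry["key"], None)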


JuiceFS relies on the object store to provide integrity for the data. Besides that, JuiceFS stores the checksum of each object as tags in S3 and verifies them when downloading the objects.

Inside the metadata service, it uses a Merkle tree (hash of hashes) to verify the integrity of the whole namespace (including the IDs of the data blocks) across Raft replicas. Once we store the hash (4 bytes) of each object in the metadata, that should provide integrity for the whole namespace.
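
As a toy illustration of the hash-of-hashes idea (flattened to a single level, not the real metadata-service code):

    import hashlib

    def h(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    # entries: sorted (path, block_ids) pairs describing the whole namespace.
    def namespace_root(entries):
        leaves = [h(f"{path}:{','.join(blocks)}".encode()) for path, blocks in entries]
        return h(b"".join(leaves))  # the "hash of hashes"

    # Two Raft replicas can compare this single digest instead of shipping
    # the whole namespace; any divergence changes the root hash.
    replica_a = [("/a.txt", ["blk-1", "blk-2"]), ("/b.txt", ["blk-3"])]
    replica_b = [("/a.txt", ["blk-1", "blk-2"]), ("/b.txt", ["blk-3"])]
    assert namespace_root(replica_a) == namespace_root(replica_b)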


Does JuiceFS allow the user to specify the hash of a file when uploaded? And then to read that hash back later?

Otherwise there’s no end-to-end integrity check.


The S3 API allows the user to specify the hash of the content as an HTTP header; it is verified by the JuiceFS gateway and persisted into JuiceFS as the ETag.

With the POSIX or HDFS APIs, there is unfortunately no equivalent way to do that.
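
For the S3 gateway path, for example (a boto3 sketch; the endpoint and bucket below are placeholders):

    import base64, hashlib
    import boto3  # any S3 client works; endpoint/bucket are placeholders

    s3 = boto3.client("s3", endpoint_url="http://localhost:9000")  # JuiceFS S3 gateway
    body = b"hello world"
    md5_b64 = base64.b64encode(hashlib.md5(body).digest()).decode()

    # The Content-MD5 header is checked on upload; a mismatch rejects the write.
    s3.put_object(Bucket="mybucket", Key="hello.txt", Body=body, ContentMD5=md5_b64)

    # Read the stored hash back later through the ETag.
    etag = s3.head_object(Bucket="mybucket", Key="hello.txt")["ETag"]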


surely you mean that the FS should calculate the hash on file creation/update, not take some random value from the user. but I agree that a FS that maintains file-content hash should allow clients to query it.


No, the FS should verify the hash on creation/update. Otherwise corruption during creation/update would just cause the hash to match the corrupted data.


The company may want the product to be difficult to use so that it can sell more support; that's not a win-win for both sides.


Agreed, the all-in-one solution (Ceph) should be better if you have to set up all the components yourself.

If you already have the infra (databases and object stores), then JuiceFS is the easiest way to get a distributed file system.


JuiceFS is similar to HDFS/CephFS/Lustre, so it MUST have a component to manage metadata, similar to the NameNode in HDFS or the MDS in CephFS; that point of failure is the problem we have to address.

The underlying blob store is similar to the DataNodes or OSDs in other distributed file systems. It could be a little slower than them because of the middle layers, but the overall performance is determined by the disks.

So we can expect performance similar to HDFS/CephFS, and the benchmark results confirm that.


If you really want a file system experience over GCS, please try JuiceFS [1], which scales to 10 billion files pretty well with TiKV or FoundationDB as the meta engine.

PS: I'm the founder of JuiceFS.

[1] https://github.com/juicedata/juicefs


The description says S3. Does it also support GCS?


The architecture image shows GCS and others, so I suspect it does.

https://github.com/juicedata/juicefs#architecture


Usually the meta engine and the object storage can each scale horizontally by themselves; JuiceFS is middleware that talks to these two services.

To serve S3 requests, you can set up multiple S3 gateways and put a load balancer in front of them.
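
For example (a sketch, assuming two gateway instances with placeholder endpoints, both configured against the same file system, i.e. the same meta engine and object store):

    import boto3  # endpoints below are placeholders for two gateway instances

    gw0 = boto3.client("s3", endpoint_url="http://gateway-0:9000")
    gw1 = boto3.client("s3", endpoint_url="http://gateway-1:9000")

    # The gateways are interchangeable: an object written through one can be
    # read back through the other (or through a load balancer that fronts
    # both), since the data lives in the shared object store and the
    # metadata in the shared meta engine.
    gw0.put_object(Bucket="mybucket", Key="demo.txt", Body=b"hello")
    assert gw1.get_object(Bucket="mybucket", Key="demo.txt")["Body"].read() == b"hello"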


> Usually the meta engine and the object storage can each scale horizontally by themselves; JuiceFS is middleware that talks to these two services.

Thanks for confirming this -- I spent a bunch of time reading and was wondering why I couldn't find anything... this answers my question.

I think I misunderstood JuiceFS -- it's more like a better rclone than it is a distributed file system. It's a swiss army knife for mounting remote filesystems.

Assuming you're using a large object service (S3, GCP, Backblaze, etc.), the scale issue is expected to be solved. If you're using a local filesystem or a local MinIO instance, for example, then you have to solve that problem yourself.

> To serve S3 requests, you can set up multiple S3 gateways and put a load balancer in front of them.

This is exactly the question I had -- it occurred to me that if I run 2 S3 gateways, even if they share the metadata store, they might be looking at resources that only one of them can serve.

So in this situation:

    [metadata (redis)]-->[ram]
      |
    [s3-0]-->[local disk]
      |
    [s3-1]-->[local disk]
In that situation, if a request came in to s3-0 for data that was stored at s3-1, the request would fail, correct? Because s3-0 has no way of redirecting the read request to s3-1.

This could work if you had an intelligent router sitting in front of the s3s (so you could send reads/writes to the one known to have the data), but by default, I'm assuming your writes would fail.

Oh, I have one more question -- can you pass options to the sshfs module? It seems like you can just append `?SSHOptionHere=x` to `--bucket`, but I'm not sure (e.g. `--bucket user@box?SshOption=1`).


Agreed, it's very hard; that's why GFS and HDFS gave up some parts of POSIX compatibility.

Per CAP, it's addressed by combining different meta engines (CP systems: Redis, MySQL, TiKV) with different object stores (AP systems). When the meta engine is not available, operations on JuiceFS will block for a while and finally return EIO. When the object store returns 404 (object not found), which means it is not consistent with the meta engine, the request will be retried for a while and may return EIO if the object store does not recover.

The file format is carefully designed to work around consistency issues from the object store and the local cache. Each part of the data is written into the object store and the local cache under a unique ID, so you will never get stale data as long as the metadata is correct [1].

Within a mount point, JuiceFS provides read-after-write consistency. Across mount points, JuiceFS provides open-after-close consistency, which should be enough for most applications and offers a good balance between consistency and performance.

[1] https://juicefs.com/docs/community/architecture/#how-juicefs...
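
The write pattern described above, as a minimal sketch (dict stand-ins for the object store and the meta engine, not the real client code):

    import uuid

    object_store = {}   # stand-in for S3/GCS/...
    meta = {}           # stand-in for the meta engine

    def write(path, data):
        block_id = f"chunks/{uuid.uuid4()}"   # fresh ID: blocks are never overwritten
        object_store[block_id] = data          # upload the block first
        meta[path] = [block_id]                # then commit the new block list

    def read(path):
        # A reader that trusts the metadata can never see stale bytes,
        # because old block IDs are never reused for new data.
        return b"".join(object_store[b] for b in meta[path])

    write("/a.txt", b"v1")
    write("/a.txt", b"v2")   # the update goes to a brand-new object key
    assert read("/a.txt") == b"v2"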

