
> JuiceFS, written in Go, can manage tens of billions of files in a single namespace.

At that scale, I care about integrity. Can someone working in this space please have a real integrity story as part of the offering? Give each object (object, version pair, perhaps) a cryptographic hash of the contents, and make that hash be part of the inventory. Allow the entire bucket to opt in to mandatory hashing. Let me ask the system to do a scrub in which it verifies those cryptographic hashes.

If this blows up metadata to 164 bytes per object, so be it. But the hash can probably get away with being stored with data, not metadata, as long as there is a mechanism to inventory those hashes. Keeping them in memory doesn’t seem necessary.

Even S3 has only desultory support for this. A lot of competitors have nothing of the sort.
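The scrub described above is easy to sketch. Everything here (the inventory shape, the choice of SHA-256, the `fetch` callback) is hypothetical, not any vendor's actual API:

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def scrub(inventory, fetch):
    """Verify every (key, expected_hash) pair in the inventory.

    `fetch` returns an object's current bytes; the result is the list
    of keys whose contents no longer match their recorded hash.
    """
    corrupted = []
    for key, expected in inventory:
        if sha256_hex(fetch(key)) != expected:
            corrupted.append(key)
    return corrupted

# Toy in-memory "bucket" to exercise the scrub.
bucket = {"a": b"hello", "b": b"world"}
inv = [(k, sha256_hex(v)) for k, v in bucket.items()]
bucket["b"] = b"w0rld"  # simulate silent corruption
print(scrub(inv, bucket.__getitem__))  # -> ['b']
```

The point of keeping the hash in the inventory rather than only with the data is exactly this: the scrubber can walk the inventory without trusting the data path it is auditing.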




JuiceFS relies on the object store to provide integrity for data. Beyond that, JuiceFS stores the checksum of each object as a tag in S3 and verifies it when downloading objects.

Inside the metadata service, it uses a Merkle tree (a hash of hashes) to verify the integrity of the whole namespace (including the IDs of data blocks) across Raft replicas. Once we store the hash (4 bytes) of each object in the metadata, it should provide integrity for the whole namespace.
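The "hash of hashes" idea can be sketched in a few lines. This is a generic illustration with SHA-256 and last-leaf duplication on odd levels, not JuiceFS's actual scheme:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold leaf hashes pairwise until a single root remains."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:            # duplicate the last hash on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Two replicas agree iff their roots match; one changed block id breaks it.
a = [b"inode:1", b"inode:2", b"block:42"]
b = [b"inode:1", b"inode:2", b"block:43"]
print(merkle_root(a) == merkle_root(a))  # True
print(merkle_root(a) == merkle_root(b))  # False
```

Comparing a single root hash is what makes this cheap to run between replicas: divergence anywhere in the namespace surfaces as a one-value mismatch.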


Does JuiceFS allow the user to specify the hash of a file when uploaded? And then to read that hash back later?

Otherwise there’s no end-to-end integrity check.


The S3 API allows the user to specify the hash of the content as an HTTP header; it is verified by the JuiceFS gateway and persisted into JuiceFS as the ETag.

With the POSIX API or HDFS, there is unfortunately no such mechanism.
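For the S3 path, the header in question is `Content-MD5` (a base64-encoded 128-bit MD5, per the S3 API). The gateway-side check can be sketched like this; the function name is illustrative, not JuiceFS code:

```python
import base64
import hashlib

def verify_content_md5(body: bytes, content_md5_header: str) -> bool:
    """Return True iff the body matches the Content-MD5 header
    (base64-encoded MD5 digest, as in the S3 PutObject API)."""
    digest = hashlib.md5(body).digest()
    return base64.b64encode(digest).decode() == content_md5_header

body = b"some object bytes"
header = base64.b64encode(hashlib.md5(body).digest()).decode()
print(verify_content_md5(body, header))          # True  -> accept, store as ETag
print(verify_content_md5(b"corrupted", header))  # False -> reject the upload
```

Because the client computes the hash before the bytes ever leave its machine, a match at the gateway gives the end-to-end guarantee the parent comment is asking about.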


Surely you mean that the FS should calculate the hash on file creation/update, not take some arbitrary value from the user. But I agree that an FS that maintains file-content hashes should allow clients to query them.


No, the FS should verify the hash on creation/update. Otherwise corruption during creation/update would just cause the hash to match the corrupted data.


I was under the impression that minio with erasure coding did that? ( https://min.io/docs/minio/linux/operations/concepts/erasure-... )


Looks like minio added this in 2022:

https://github.com/minio/minio/pull/15433


The Quobyte DCFS does end-to-end CRC32 for each 4k block of data. All metadata and communication are also CRC-protected, though at different frame boundaries.


IPFS, MaidSAFE and even BitTorrent have had this from day 1, and they are distributed and not under the control of one company.

FileCoin is used to pay for storage, but for now it seems storage is mostly free because of block rewards.


And they are slow, expensive, and subject to node-based 33% attacks.

Hedera Hashgraph is none of these.


BitTorrent is free, as fast as my network connection allows, and verifies integrity. I disagree that it falls under your description.


You're right, I was flippant with my descriptors.


Is BitTorrent subject to Sybil attacks too?


All distributed networks are vulnerable to Sybil attacks (unless you can ensure provenance somehow, out-of-band), but unless you can break the hash function, all that gets you with BitTorrent is denial of service (and traffic interception, I suppose, but that should already be part of your threat model).


I see, so there isn't a risk of downloading malicious content, just that you might not get the file. Thanks.


There is absolutely a risk of downloading malicious data. But the protocol will reject it for failing integrity checks. That doesn't mean the software will reject it.


Are you saying mainstream torrent clients don't check the hash? As far as I know, not only do they, but they ban peers who have sent them wrong data more than once. So you could DoS them for a bit with lots of peers sending bad data, but you need a lot of IPs to do that, because you'll quickly get all of them banned. And unless you are doing this through residential proxies, people will learn your ranges and block you by default.

Maybe there's a DoS you could do with uTP by spoofing someone else's IP and getting them to ban a real peer, but you'd presumably have to get in between them requesting blocks and reply with bad ones, which realistically means you are a MitM, so you could DoS them more directly by just dropping their traffic.

Or if you mean more generally that a malicious packet could reach a client and exploit a memory bug or something, that applies to literally anything on a network.
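The per-piece check the clients do is simple: the `.torrent` metainfo carries a concatenation of SHA-1 digests, one per piece, and each downloaded piece is hashed and compared before being accepted. A minimal sketch (with a toy piece length; real torrents use large power-of-two pieces):

```python
import hashlib

PIECE_LEN = 4  # tiny, for illustration only

def piece_hashes(data: bytes) -> bytes:
    """Concatenated SHA-1 digests, as stored in a .torrent's `pieces` field."""
    return b"".join(
        hashlib.sha1(data[i:i + PIECE_LEN]).digest()
        for i in range(0, len(data), PIECE_LEN)
    )

def verify_piece(index: int, piece: bytes, pieces: bytes) -> bool:
    expected = pieces[index * 20:(index + 1) * 20]  # SHA-1 digests are 20 bytes
    return hashlib.sha1(piece).digest() == expected

meta = piece_hashes(b"goodgood")        # trusted metainfo, obtained out of band
print(verify_piece(0, b"good", meta))   # True  -> piece accepted
print(verify_piece(1, b"evil", meta))   # False -> piece dropped, peer penalized
```

This is why a Sybil attacker who can't break SHA-1 preimages can only waste bandwidth: bad pieces never reach the assembled file.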


Suppose you have a torrent client that saves chunks to the filesystem before performing integrity checks. Suppose also that you have an antivirus program that scans every newly-created file for malware… and someone sent you 42.zip. Sure, the torrent client will reject it later, but the damage has already been done.

This specific scenario is unlikely (most antivirus programs can cope with zip bombs, these days), but computers are complex. Other edge-cases might exist. Torrenting is safer than downloading something from your average modern website, but in practice it's nowhere as safe as the theoretical limit.


RocksDB supports hashing at multiple levels (key, value, files) because Meta also realized the importance of integrity. It also supports verifying them in bulk.

Presumably filesystems built over rocksdb also support this.


The next bit:

> Its metadata engine uses an all-in-memory approach and achieves remarkable memory optimization

Sounds like a fun thing to recover after a crash.



This post covers another topic: how to back up the metadata of JuiceFS in a readable format (JSON) and restore it into an empty database.


The recovery process is similar to what a database does after a crash: it loads the most recent snapshot from disk and applies any newer transaction logs.
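That snapshot-plus-log replay can be sketched as follows. The JSON shapes and sequence numbers here are invented for illustration, not JuiceFS's on-disk format:

```python
import json

def recover(snapshot_json: str, log_lines: list[str]) -> dict:
    """Rebuild in-memory state: load the snapshot, then replay only the
    transaction-log entries newer than the snapshot's sequence number."""
    snap = json.loads(snapshot_json)
    state, seq = snap["state"], snap["seq"]
    for line in log_lines:
        entry = json.loads(line)
        if entry["seq"] <= seq:       # already covered by the snapshot
            continue
        state[entry["key"]] = entry["value"]
        seq = entry["seq"]
    return state

snapshot = json.dumps({"seq": 2, "state": {"/a": "inode1"}})
log = [
    json.dumps({"seq": 2, "key": "/a", "value": "inode1"}),  # skipped
    json.dumps({"seq": 3, "key": "/b", "value": "inode2"}),  # replayed
]
print(recover(snapshot, log))  # {'/a': 'inode1', '/b': 'inode2'}
```

Recovery time is then bounded by the size of the log tail since the last snapshot, not by the size of the whole namespace.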


Do you use the hash of the items you're storing as the key/filename?


I think you could achieve this on Hedera's hashgraph, an ABFT DLT.



