
A few factual inaccuracies in here that don't affect the general thrust. For example, the claim that S3 uses a 5:9 sharding scheme. In fact they use many different sharding schemes, and iirc 5:9 isn't one of them.

The main reason is that a ratio of 1.8 physical bytes to 1 logical byte is awful for HDD costs. You can get that down significantly, and you get wider parallelism and better availability guarantees to boot (consider: if a whole AZ goes down, how many shards can you lose before an object is unavailable for GET?).
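
To make that AZ question concrete (numbers purely illustrative, not a claim about S3's actual layout), here is a quick Python sketch of a hypothetical 5-of-9 scheme spread evenly over 3 AZs:

    # Hypothetical 5-of-9 scheme, 3 shards per AZ (illustrative only).
    n, k, azs = 9, 5, 3
    per_az = n // azs              # 3 shards placed in each AZ
    survivors = n - per_az         # 6 shards still reachable if one AZ is down
    margin = survivors - k         # 1 extra shard loss tolerated
    print(n / k, margin)           # 1.8 physical:logical, margin of 1

So under that hypothetical layout, one AZ outage plus two more failed drives is enough to make an object unreadable.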





See timestamp 42:20 at https://youtu.be/NXehLy7IiPM?si=QQEOMCt7kOBTMaGK

The way it’s worded makes me think that’s the scheme they’re using. Curious to hear what you know



The page loads, then quickly shifts up as some video loads, and the content is gone

Web 3 in a nutshell

Wow that is very annoying. Here is a better page

https://www.vastdata.com/blog/introducing-rack-scale-resilie...


Naively it seems difficult to decrease the 1.8x ratio while simultaneously increasing availability. The less duplication, the greater the risk of data loss if an AZ goes down? (I thought AWS promises you have a complete independent copy in all 3 AZs, though?)

To me, though, the idea that to read a single 16MB chunk you actually need to read roughly 3MB from each of 5 different hard drives, and that this is faster, is baffling.


Availability zones are not durability zones. S3 aims for objects to still be available with one AZ down, but not more than that. That does actually impose a constraint on the ratio relative to the number of AZs you shard across.

If we assume 3 AZs, then you lose 1/3 of your shards when an AZ goes down. You could do at most 6:9, which is a 1.5 byte ratio. But that's unacceptable: you know you will temporarily lose shards to HDD failure, and with 6:9 losing an AZ leaves exactly the 6 shards you need, so a single additional HDD failure makes the object unavailable. So 1.5 is the floor, and you can't actually reach it.

To lower the ratio from 1.8, it's necessary to increase the denominator (the number of shards necessary to reconstruct the object). This is not possible while preserving availability guarantees with just 9 shards.

Note that Cloudflare's R2 makes no such guarantees, and so does achieve a more favorable cost with their erasure coding scheme.

Note also that if you increase the number of shards, it becomes possible to lower the ratio without sacrificing availability. Example: with 18 shards we can choose 11:18, which gives us about 1.64 physical bytes per logical byte. And it still takes 1 AZ + 2 shards to make an object unavailable.
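
A quick sketch of the arithmetic for both cases, assuming shards are spread evenly across the 3 AZs (my assumption for the sake of the example):

    # k-of-n erasure coding, shards spread evenly across 3 AZs (sketch).
    def margin_after_az_loss(k, n, azs=3):
        """Extra shard losses tolerated after one whole AZ is down."""
        return (n - n // azs) - k

    for k, n in [(6, 9), (11, 18)]:
        print(f"{k}-of-{n}: ratio {n / k:.2f}, "
              f"margin {margin_after_az_loss(k, n)}")
    # 6-of-9:   ratio 1.50, margin 0 -> one more HDD failure and it's unreadable
    # 11-of-18: ratio 1.64, margin 1 -> still tolerates one more failed shard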

You can extrapolate from there to develop other sharding schemes that would improve the ratio and improve availability!
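
For example (same assumptions as above: 3 AZs, even spread, keep at least a 1-shard margin after an AZ outage), you can scan wider schemes and watch the ratio fall:

    # For each shard count n, pick the largest k that still leaves a
    # 1-shard margin after losing a whole AZ, and report the ratio.
    for n in range(9, 37, 3):
        k = (n - n // 3) - 1       # widest k with margin >= 1
        print(f"{k}-of-{n}: ratio {n / k:.3f}")
    # 5-of-9: 1.800 ... 11-of-18: 1.636 ... 23-of-36: 1.565

(The flip side, as the 16MB comment above hints at, is that wider stripes touch more drives per read.)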

Another key hidden assumption is that you don't worry about correlated shard loss except in the AZ down case. HDDs fail, but these are independent events. So you can bound the probability of simultaneous shard loss using the mean time to failure and the mean time to repair that your repair system achieves.
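
As a rough sketch of that bound (all numbers here are made up for illustration; real MTTF/MTTR would come from your fleet and repair pipeline):

    from math import comb

    def p_at_least(shards, losses, p):
        """P(at least `losses` of `shards` independent shards are down at once)."""
        return sum(comb(shards, i) * p**i * (1 - p)**(shards - i)
                   for i in range(losses, shards + 1))

    # Each shard is down with probability ~ MTTR / MTTF (hypothetical values).
    p = 24 / (4 * 365 * 24)        # 24 h repair, ~4-year drive MTTF
    # 11-of-18 with all AZs up: 8 simultaneous shard losses -> unavailable.
    print(p_at_least(18, 8, p))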



