How do they verify that a host can actually store all of the data?

Taek · on July 16, 2019

Ajedi32 is not correct.

Data is verified probabilistically on the Sia network. The blockchain has access to the Merkle root of the data that the host is supposed to be storing. The blockchain will request that the host provide a randomly chosen 64-byte segment of the data, along with a Merkle proof that the segment is included under that Merkle root.

If the host can provide the data and the proof, the host is rewarded as though they've demonstrated that they have all of the data. If the host cannot provide those 64 bytes along with a proof, the host is punished as though they are not storing any of the data.
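A rough sketch of that challenge and proof, for anyone who wants to see the shape of it. This is not Sia's actual code: the BLAKE2b hash, the 0x00/0x01 leaf/node prefixes, the rule for odd nodes, and every function name below are assumptions made for illustration, and it assumes non-empty data.

```python
import hashlib
import secrets

SEGMENT_SIZE = 64  # bytes per challenged segment, as described above


def leaf_hash(segment: bytes) -> bytes:
    # Assumed domain separation: 0x00 prefix for leaves.
    return hashlib.blake2b(b"\x00" + segment, digest_size=32).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Assumed domain separation: 0x01 prefix for interior nodes.
    return hashlib.blake2b(b"\x01" + left + right, digest_size=32).digest()


def split_segments(data: bytes) -> list:
    return [data[i:i + SEGMENT_SIZE] for i in range(0, len(data), SEGMENT_SIZE)]


def build_levels(segments: list) -> list:
    """Return every level of the tree, leaves first, root last."""
    level = [leaf_hash(s) for s in segments]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(node_hash(level[i], level[i + 1]))
            else:
                nxt.append(level[i])  # odd node is promoted unchanged
        level = nxt
        levels.append(level)
    return levels


def merkle_root(data: bytes) -> bytes:
    return build_levels(split_segments(data))[-1][0]


def prove(data: bytes, index: int):
    """Host side: return the challenged segment and its sibling hashes."""
    segments = split_segments(data)
    proof = []
    i = index
    for level in build_levels(segments)[:-1]:
        sibling = i ^ 1
        if sibling < len(level):
            # (sibling_is_left, sibling_hash)
            proof.append((sibling < i, level[sibling]))
        i //= 2
    return segments[index], proof


def verify(root: bytes, segment: bytes, proof: list) -> bool:
    """Verifier side: recompute the root from the segment and the proof."""
    h = leaf_hash(segment)
    for sibling_is_left, sibling in proof:
        h = node_hash(sibling, h) if sibling_is_left else node_hash(h, sibling)
    return h == root


if __name__ == "__main__":
    data = secrets.token_bytes(1000)
    root = merkle_root(data)                       # all the verifier keeps
    index = secrets.randbelow(len(split_segments(data)))  # random challenge
    segment, proof = prove(data, index)            # host's response
    print(verify(root, segment, proof))
```

The verifier only needs the 32-byte root, and the proof grows with the logarithm of the number of segments, which is why a single small challenge can stand in for the whole file.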

brokenmachine · on July 17, 2019

How often are they queried, and what percentage of the payment is the punishment?

A bad host could store only, say, 1/3 of the data they claim to be storing, and play the odds that the requested random chunk falls within the part they kept.
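To put numbers on that gamble, here is a back-of-the-envelope sketch. It assumes each challenge picks a segment uniformly at random and independently of the others; the reward and collateral figures are made up, and the function names are hypothetical.

```python
def pass_probability(stored_fraction: float, challenges: int = 1) -> float:
    """Chance that a host storing only `stored_fraction` of the data
    passes `challenges` independent, uniformly random challenges."""
    return stored_fraction ** challenges


def expected_payoff(stored_fraction: float, reward: float, collateral: float) -> float:
    """Expected outcome of a single challenge: the host earns `reward`
    on a pass and loses `collateral` on a failure (hypothetical units)."""
    p = pass_probability(stored_fraction)
    return p * reward - (1 - p) * collateral


if __name__ == "__main__":
    # Hypothetical figures: reward and collateral are both 30 units.
    print(pass_probability(1 / 3, challenges=1))
    print(pass_probability(1 / 3, challenges=5))
    print(expected_payoff(1 / 3, reward=30, collateral=30))
```

Under those assumptions the gamble only breaks even when the stored fraction equals collateral / (reward + collateral), so the answer depends heavily on how large the penalty is relative to the payment.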

b_tterc_p · on July 17, 2019

How does punishment work? What stops one bad actor from agreeing to collect infinite data from a variety of sources and tanking both the trust and profitability of data hosts?

Also, what about bandwidth constraints on the host end?

Taek · on July 17, 2019

When a host agrees to accept data, they put up out-of-pocket money. This makes it expensive for a bad actor to accept an infinite amount of data, as each piece of data requires more collateral to be put forward by the host.

Before a renter creates a contract with a host, the renter will perform some measurements on the host and determine if the host is suitable. A renter in China will choose different hosts than a renter in the US, because the latencies and throughputs of each host will be different.

b_tterc_p · on July 17, 2019

Is there somewhere I can read about the punishments in more detail? E.g. how often the quizzes are, what the penalty is for getting it wrong / not being available for the answer?

Ajedi32 · on July 16, 2019

If I recall correctly, the client pre-computes hashes of random fragments of each file then "quizzes" the hosts on that information periodically. If they're not actually storing a copy of the data, they won't be able to compute the resulting hash. (I may be misremembering some of the details, but I believe that's how it works in principle.)
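Roughly, the scheme I'm describing would look something like the sketch below. This is a generic precomputed-quiz audit written from memory, not a claim about Sia's actual protocol; SHA-256, the 64-byte fragment length, and all the function names are assumptions for illustration.

```python
import hashlib
import secrets


def fragment_hash(data: bytes, offset: int, length: int) -> str:
    return hashlib.sha256(data[offset:offset + length]).hexdigest()


def make_quizzes(data: bytes, count: int, length: int = 64) -> list:
    """Client side, before upload: pick random fragments and keep only
    (offset, length, expected_hash) for each one."""
    quizzes = []
    for _ in range(count):
        offset = secrets.randbelow(max(1, len(data) - length + 1))
        quizzes.append((offset, length, fragment_hash(data, offset, length)))
    return quizzes


def host_answer(stored: bytes, offset: int, length: int) -> str:
    """Host side: hash the requested fragment of whatever it really stored."""
    return fragment_hash(stored, offset, length)


def audit(quizzes: list, stored: bytes) -> bool:
    """Client side, later: every answer must match the precomputed hash."""
    return all(
        host_answer(stored, offset, length) == expected
        for offset, length, expected in quizzes
    )
```

Each quiz is only good once, since the host could cache the answer after seeing it, so the client needs to precompute as many as it expects to ask.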

b_tterc_p · on July 19, 2019

What if the client lies about the precomputed hash quizzes?