
Forgive me if I'm naive, but can file hashes be spoofed in any way? I'm thinking of uploading a bunch of files that hash to random values, then downloading the de-duplicated originals that happen to share those hashes.

Could someone more knowledgeable in this area tell me if this is a credible threat?




You're looking for two files f1 and f2 that hash to the same thing? And hoping that f1 is randomly generated by you, but f2 is someone else's original file which can be downloaded because of dedup?

In general, finding f1 ≠ f2 (where you get to choose both) such that h(f1) = h(f2) is called finding a collision. Hash functions like the SHA-2 family are considered collision-resistant: finding a collision is computationally infeasible. http://en.wikipedia.org/wiki/Collision_resistance

Of course, if f1 = f2, then you've magically guessed the original file, which is highly unlikely. What you're asking for is harder than finding a collision: f2 is fixed by someone else, so you need a second preimage for a hash you don't get to choose. For any well-designed hash function this is impossible for all practical purposes (it requires on the order of 2^120 computations or more).
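To make the gap between the two attacks concrete, here is a toy sketch. It assumes a deliberately weakened hash (SHA-256 truncated to 16 bits) purely so that both searches finish; the function names and the truncation are illustrative, not anything a real service uses. With b bits, a collision takes about 2^(b/2) tries, while matching a fixed target takes about 2^b.

```python
import hashlib
from itertools import count


def h(data: bytes, bits: int = 16) -> int:
    """Toy hash: the top `bits` bits of SHA-256 (truncated so the search is feasible)."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int = 16):
    """Attacker chooses both inputs: expect roughly 2**(bits/2) tries."""
    seen = {}  # truncated hash -> first message that produced it
    for i in count():
        msg = b"msg-%d" % i
        tag = h(msg, bits)
        if tag in seen:
            return seen[tag], msg, i + 1
        seen[tag] = msg


def find_second_preimage(target: bytes, bits: int = 16):
    """Target is fixed by someone else: expect roughly 2**bits tries."""
    want = h(target, bits)
    for i in count():
        msg = b"guess-%d" % i
        if msg != target and h(msg, bits) == want:
            return msg, i + 1


if __name__ == "__main__":
    a, b, tries = find_collision()
    print("collision after", tries, "tries")
    m, tries = find_second_preimage(b"someone else's file")
    print("second preimage after", tries, "tries")
```

At 16 bits both finish quickly; at 256 bits the same loops would need on the order of 2^128 and 2^256 tries respectively.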


And is the hashing done by the client or server-side? Client-side hashing would make spoofing even easier.


Client-side. If you upload a very popular 500+ MB file (try a popular Linux distribution ISO), it will sync instantly.
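A rough model of why that upload is instant, assuming the client hashes the file and asks the server whether it already holds that content. The real protocol isn't public, so `DedupServer`, `client_upload`, and the choice of SHA-256 are all hypothetical:

```python
import hashlib


class DedupServer:
    """Hypothetical content-addressed store; not the real service's API."""

    def __init__(self):
        self.blobs = {}     # digest -> file bytes
        self.accounts = {}  # user -> {filename: digest}

    def has_blob(self, digest: str) -> bool:
        return digest in self.blobs

    def store_blob(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blobs[digest] = data
        return digest

    def link(self, user: str, name: str, digest: str) -> None:
        # Attach already-stored content to a user's account.
        if digest not in self.blobs:
            raise KeyError("unknown digest")
        self.accounts.setdefault(user, {})[name] = digest

    def download(self, user: str, name: str) -> bytes:
        return self.blobs[self.accounts[user][name]]


def client_upload(server: DedupServer, user: str, name: str, data: bytes) -> int:
    """Upload with client-side hashing; returns how many bytes were actually sent."""
    digest = hashlib.sha256(data).hexdigest()
    sent = 0
    if not server.has_blob(digest):
        server.store_blob(data)
        sent = len(data)
    server.link(user, name, digest)
    return sent


if __name__ == "__main__":
    server = DedupServer()
    iso = b"pretend this is a large ISO" * 1000
    print(client_upload(server, "alice", "distro.iso", iso))  # first uploader sends everything
    print(client_upload(server, "bob", "distro.iso", iso))    # second uploader sends nothing
```

In this model the second uploader only ever transmits the digest, which is what makes the sync look instant.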


So has someone written a client where you just enter the hashes of popular files you're interested in and get them synced to your Dropbox?


As far as I know, not yet. It's not even publicly known how exactly the deduplication API works.

For example: is the hash enough to "prove" you have a file? Or do you need the file size (and potentially other properties, such as the first block) as well?

The only way to find this out would be to look at the protocol that the proprietary client uses.
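The two possibilities can be sketched side by side. This is a guess at what such a check could look like, not the actual API: `Store`, both `claim_*` methods, and the 4096-byte block size are made up for illustration.

```python
import hashlib
import hmac

BLOCK = 4096  # assumed block size, purely illustrative


class Store:
    """Hypothetical dedup store with two ways of 'proving' you have a file."""

    def __init__(self):
        self.blobs = {}  # digest -> file bytes

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blobs[digest] = data
        return digest

    def claim_hash_only(self, digest: str) -> bool:
        # Anyone who learns the digest can claim the file.
        return digest in self.blobs

    def claim_with_metadata(self, digest: str, size: int, first_block: bytes) -> bool:
        # Also require the file size and the first block of the content.
        data = self.blobs.get(digest)
        if data is None:
            return False
        return size == len(data) and hmac.compare_digest(first_block, data[:BLOCK])


if __name__ == "__main__":
    store = Store()
    original = b"A" * 5000 + b"B" * 5000
    digest = store.put(original)

    # Someone who only knows the digest:
    print(store.claim_hash_only(digest))                           # hash alone is enough here
    print(store.claim_with_metadata(digest, 10000, b"?" * BLOCK))  # fails without the real block
```

Even the stricter check only proves knowledge of the size and the first block, not of the whole file, so it raises the bar rather than closing the hole.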


Ten years ago, europeonline launched a service where your downlink traffic was routed through a satellite connection.

They also had a service that let you pre-download files from HTTP/FTP servers, and then you could request an offline broadcast from their servers to your home PC.

To avoid refetching the same file again and again, they optimized by checking only the file name and file size. So someone came along and wrote a fake FTP server script that just replied with a listing of the things you wanted to download, and they were instantly added to your account. You only had to know the file name and file size.


Spoofing is the easy part, since the hashing is done in the client. You'd just have to figure out the protocol for requesting a file with a known hash (and probably other metadata).

However, "guessing" the hash of a file that you don't have is not easy. The chance of getting a file by trying random hashes is vanishingly small.
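For a sense of scale, here is a back-of-the-envelope estimate. It assumes a 256-bit hash and uses made-up numbers for the number of stored files and the guessing rate; the result is a union bound, so the true probability is no larger.

```python
import math


def log2_guess_success(stored_files: float, guesses: float, hash_bits: int = 256) -> float:
    """log2 of an upper bound on the chance that any random digest hits a stored file.

    Each guess matches with probability stored_files / 2**hash_bits,
    so `guesses` tries succeed with probability at most guesses * stored_files / 2**hash_bits.
    """
    return math.log2(stored_files) + math.log2(guesses) - hash_bits


if __name__ == "__main__":
    stored = 1e12                            # hypothetical: a trillion distinct files
    seconds = 100 * 365 * 24 * 3600          # a century of guessing
    guesses = 1e12 * seconds                 # hypothetical: a trillion guesses per second
    print(log2_guess_success(stored, guesses))
```

With these assumptions the bound comes out around 2^-145, which is why knowing a real hash matters and blind guessing does not.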



