
> as ISPs wouldn't just dish out private info like that without a warrant

ISPs regularly sell private data to the highest bidder. Similarly with payroll providers and whatnot (a non-trivial fraction of my paystubs -- not just salary, but withholding, exempt tax status, ... -- are available to anyone with a few dollars; historically, it _seems_ like the only buyers have been employers trying to see if their salary offers aligned with my expectations).



They'll sell anonymized data in aggregate but no, you can't just go to an ISP and buy the user behind an IP without a court order.


They sell "anonymized" data, not just "aggregated". The only missing link is tying that back to a real person (i.e., they haven't solved differential privacy; they've just given the illusion that they're not selling personal data). Tying it back to a real person is easy though because the non-anonymized fields (age, gender, salary, zip-code, ...) are uniquely identifying for most individuals and are available for sale tied back to a real human from other sources which you can fuzzily join the ISP data into.
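To make the fuzzy join concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the records, the field names, the tolerances, and the helper names `is_match` and `reidentify` are all assumptions, not anything an ISP or data broker actually ships. The point is only that exact matches on gender and ZIP code plus loose matches on age and salary are enough to attach a name when exactly one candidate survives.

```python
# Hypothetical "anonymized" records: names removed, quasi-identifiers left in.
anonymized = [
    {"person_id": "a1", "age": 34, "gender": "F", "zip": "78701", "salary": 91000},
    {"person_id": "b2", "age": 34, "gender": "M", "zip": "78701", "salary": 64000},
    {"person_id": "c3", "age": 52, "gender": "F", "zip": "10027", "salary": 120000},
]

# Hypothetical identified records bought from some other source.
identified = [
    {"name": "Alice Example", "age": 34, "gender": "F", "zip": "78701", "salary": 90000},
    {"name": "Bob Example", "age": 35, "gender": "M", "zip": "78701", "salary": 65000},
    {"name": "Carol Example", "age": 52, "gender": "F", "zip": "10027", "salary": 118000},
]


def is_match(anon, ident, age_tol=1, salary_tol=0.05):
    """Exact match on gender and ZIP; fuzzy match on age and salary.

    The tolerances are arbitrary assumptions: the two sources may have been
    collected at different times, so age and salary can drift a little.
    """
    return (
        anon["gender"] == ident["gender"]
        and anon["zip"] == ident["zip"]
        and abs(anon["age"] - ident["age"]) <= age_tol
        and abs(anon["salary"] - ident["salary"]) <= salary_tol * ident["salary"]
    )


def reidentify(anonymized, identified):
    """Map person_id -> name wherever exactly one identified record matches."""
    result = {}
    for anon in anonymized:
        candidates = [ident["name"] for ident in identified if is_match(anon, ident)]
        if len(candidates) == 1:  # ambiguous or empty matches are skipped
            result[anon["person_id"]] = candidates[0]
    return result


if __name__ == "__main__":
    for person_id, name in reidentify(anonymized, identified).items():
        print(person_id, "->", name)
```

With real data the candidate set is obviously much larger, but each extra field shrinks it quickly, which is why a handful of "harmless" columns tends to be enough.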

It's similar to how bitcoin transactions (before mixers and whatnot) were de-anonymized. You have the secret information (an identity), the public information (transaction history), and you're able to fuzzily join that public information to other public sources containing the secret information to also have secret information tied with the original "anonymous" source.


Just to re-emphasize, because I think it's really poorly understood: most "anonymized" data is a few additional data points away from being re-identified.

Data re-identification was already happening in 2006 (just one example below). And now there's exponentially more data available to use for this purpose.

> We apply our de-anonymization methodology to the Netflix Prize dataset, which contains anonymous movie ratings of 500,000 subscribers of Netflix, the world’s largest online movie rental service. We demonstrate that an adversary who knows only a little bit about an individual subscriber can easily identify this subscriber’s record in the dataset. Using the Internet Movie Database as the source of background knowledge, we successfully identified the Netflix records of known users, uncovering their apparent political preferences and other potentially sensitive information.

[1] https://www.cs.utexas.edu/~shmat/shmat_oak08netflix.pdf
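For a feel of how little background knowledge this takes, here is a toy sketch in Python. It is not the paper's algorithm, only loosely in its spirit: the data, the function names (`movie_weights`, `score`, `best_match`), the `1 / log(1 + count)` weighting, and the `min_gap` threshold are all assumptions picked for illustration. The idea is that agreeing on an obscure movie counts for more than agreeing on a popular one, and a match is only accepted when the best record clearly beats the runner-up.

```python
import math

# Hypothetical "anonymous" ratings: record id -> {movie: rating}.
records = {
    "r1": {"Movie A": 5, "Movie B": 3, "Obscure C": 4},
    "r2": {"Movie A": 5, "Movie B": 3, "Obscure D": 2},
    "r3": {"Movie A": 4, "Obscure C": 1},
}

# What the adversary already knows about the target, e.g. from public reviews.
known = {"Movie A": 5, "Obscure C": 4}


def movie_weights(records):
    """Weight each movie by rarity: fewer raters means a larger weight."""
    counts = {}
    for ratings in records.values():
        for movie in ratings:
            counts[movie] = counts.get(movie, 0) + 1
    return {movie: 1.0 / math.log(1 + n) for movie, n in counts.items()}


def score(ratings, known, weights, tol=1):
    """Sum the weights of known movies whose rating agrees within tol."""
    total = 0.0
    for movie, rating in known.items():
        if movie in ratings and abs(ratings[movie] - rating) <= tol:
            total += weights.get(movie, 0.0)
    return total


def best_match(records, known, min_gap=0.5):
    """Return the best-scoring record id, or None if it does not stand out."""
    weights = movie_weights(records)
    ranked = sorted(
        ((score(ratings, known, weights), rid) for rid, ratings in records.items()),
        reverse=True,
    )
    if not ranked:
        return None
    best_score, best_id = ranked[0]
    runner_up = ranked[1][0] if len(ranked) > 1 else 0.0
    # Only claim a match when the winner clearly separates from the rest.
    return best_id if best_score - runner_up >= min_gap else None


if __name__ == "__main__":
    print(best_match(records, known))
```

Knowing only the popular movie is useless here (every record ties, so the function returns `None`), but adding a single obscure title is enough to single out one record.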


Yes you can. Companies have deals with ISPs for individual real-time mobile and home browsing data. If you pay enough it has real names; otherwise it has a person ID and household ID along with other data that makes it easy to associate with the real person or household.


American ISPs injected tracking codes into their users' HTTP traffic so they could get paid by advertisers. I would not speak in absolutes about that, especially because anonymizing data is a hard problem which even well-intentioned people have made mistakes with.



