
I'm not a privacy defeatist, but I am a fingerprinting defeatist. Here's why: I don't think it's realistic for caching and anti-fingerprinting to co-exist, and given those two options users will always pick the former because the latter would be perceived as slow. The classic example is:

<script src="foobar.js"></script>

Here foobar.js just returns something like "var id=0x1234567". A user who doesn't want to be fingerprinted cannot cache this script, because its contents could have been uniquely generated for them.
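To make the problem concrete, here is a minimal sketch in Python. TrackingServer and Browser are hypothetical names invented for illustration, and the model ignores everything about real HTTP caching except "store the body, replay it later".

```python
import itertools


class TrackingServer:
    """Hypothetical server: every uncached request gets a fresh identifier."""

    def __init__(self):
        self._next_id = itertools.count(0x1234567)

    def serve_foobar_js(self):
        return f"var id={hex(next(self._next_id))}"


class Browser:
    """Toy browser with an optional cache keyed by URL only."""

    def __init__(self, server, use_cache):
        self.server = server
        self.use_cache = use_cache
        self.cache = {}

    def load(self, url):
        if self.use_cache and url in self.cache:
            return self.cache[url]  # fast, but replays the unique id
        body = self.server.serve_foobar_js()
        if self.use_cache:
            self.cache[url] = body
        return body
```

With the cache on, every later load of foobar.js returns the same id, which is exactly the stable identifier a tracker wants. With the cache off, each load gets a new id but pays for a network round trip.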




It would be simple enough to rely on resource hashes so that the browser never has to contact the origin of a file, and then to skip (or explicitly approve) fetching any hash that you can't get from your preferred shared caches. A unique file then simply gets ignored by the most anonymity-conscious users.
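A rough sketch of that lookup in Python, assuming SHA-256 as the resource hash. SharedHashCache, resolve, and the allow_origin_fetch switch are hypothetical names for illustration, not an existing browser API.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class SharedHashCache:
    """Hypothetical shared cache that stores files under their content hash."""

    def __init__(self):
        self._by_hash = {}

    def publish(self, data: bytes) -> str:
        digest = sha256_hex(data)
        self._by_hash[digest] = data
        return digest

    def get(self, digest: str):
        return self._by_hash.get(digest)


def resolve(digest, shared_cache, allow_origin_fetch=False, fetch_from_origin=None):
    """Return the file for `digest`, or None if it is skipped."""
    data = shared_cache.get(digest)
    if data is not None:
        return data  # served without contacting the origin
    if allow_origin_fetch and fetch_from_origin is not None:
        data = fetch_from_origin()  # the user approved contacting the origin
        if sha256_hex(data) != digest:
            raise ValueError("origin returned content that does not match the hash")
        return data
    return None  # unknown hash: skip it rather than reveal ourselves
```

A widely used library would be found in the shared cache, while a per-user file has a hash nobody else has published, so in the strict mode it is never requested.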

I think there is a chicken-and-egg problem here, combined with IP incentives not to fix how web browsers work.

I hoped the Great Firewall would have accidentally fixed this by making it preferable to refer to hashes that can be found in peer caches, regardless of CDN status.


I think you're forgetting that resources cannot be shared if they're marked private. You cannot get account_statement.js from your shared cache, it has to be unique to you.

There is no easy fix to this.


Your example is correctly marked private, so it is already covered by my previous comment.

The most anonymity-conscious users would realize they are trying to do banking in what is supposed to be their anonymous session, and would never fetch the file. Or did they somehow miss that they were entering auth details?

The way things need to work on the web involves choices that you apply differently (or that IE applies for Windows users). The defeatist response is not to implement any choices: a typical user will want a small number of PII sites, so let's have only PII mode and autofill their details into forms on any blog!


I'm not following your argument. Do you agree that there are resources that cannot be saved in a shared cache?


There are resources that shouldn't be saved in a shared cache and shouldn't be seen in an anonymous session. It is not a coincidence that they are both and it is great that they are explicitly marked.


But, if the server sends you a private resource, this implies the server must already know your identity. Being private it's not supposed to send it to anyone else, so it needs to identify you in order to know "does this user have access to this resource?".

Am I missing something here?


Yes: private doesn't necessarily mean authenticated. It just means the response shouldn't be saved in a shared cache.

For example, "weather_at_your_geoip.js".
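As a small sketch of what "marked private" means in practice, the check below reads a Cache-Control header value and decides whether a shared cache may keep the response. The function name is made up for illustration, and the parsing is deliberately simplified: it only looks at the private and no-store directives and treats any form of private as blocking.

```python
def may_store_in_shared_cache(cache_control: str) -> bool:
    """Return False when the header forbids storage in a shared cache.

    Simplified: ignores quoted arguments, request headers, and every
    directive other than `private` and `no-store`.
    """
    directives = {
        part.split("=", 1)[0].strip().lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    return not ({"private", "no-store"} & directives)
```

Note that nothing here involves a login: a response such as weather_at_your_geoip.js can be sent with "private" purely because its contents depend on who asked.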


The script tag could have a mandatory file hash attribute.
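Browsers already support an optional version of this as Subresource Integrity, where the tag carries an integrity attribute of the form "sha384-" followed by the base64 digest. Here is a sketch of computing and checking such a value in Python; the helper names are my own, and making the attribute mandatory is the proposal here, not current behaviour.

```python
import base64
import hashlib


def sri_integrity(body: bytes, algorithm: str = "sha384") -> str:
    """Build an integrity value such as 'sha384-<base64 digest>'."""
    digest = hashlib.new(algorithm, body).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def script_tag(src: str, body: bytes) -> str:
    """Render a script tag that pins the expected contents of `src`."""
    return f'<script src="{src}" integrity="{sri_integrity(body)}"></script>'


def matches(body: bytes, integrity: str) -> bool:
    """Check a downloaded (or cached) body against the pinned value."""
    algorithm, _, _ = integrity.partition("-")
    return sri_integrity(body, algorithm) == integrity
```

The hash only pins what the page author wrote into the tag. A server that generates both the page and the script can still emit a different hash for every visitor.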


Then you would need to know the (dynamically generated) contents in advance?


The server generates the content, so it knows the file hash in advance.


What's to prevent the server from sending you a unique hash?


How would it know that it's "you"?


You could cache files per requesting domain. That way repeat visits to the same site are fast, but nothing is leaked cross-domain.
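A minimal sketch of that idea in Python: the cache key is the pair (site being visited, resource URL) instead of the URL alone. PartitionedCache and the fetch callback are hypothetical names, and the fetch stands in for a tracker that hands out a new id on every real request.

```python
class PartitionedCache:
    """Toy cache keyed by (top-level site, resource URL)."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable taking a URL, returning the body
        self._entries = {}

    def load(self, top_level_site: str, url: str) -> str:
        key = (top_level_site, url)
        if key not in self._entries:
            # First time this site embeds the resource: go to the network.
            self._entries[key] = self._fetch(url)
        return self._entries[key]
```

A second visit to the same site hits the cache, while the same tracker script embedded on a different site is fetched again and so carries a different id.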


That still allows same-site fingerprinting, though, and as long as there are ways to transmit data across origins, that doesn't really suffice.


Why not cache it for the user, but go ahead and retrieve it in the background?


I don't understand the question; it's either cached or not.

If it's not cached, then it has to be fetched synchronously for anything depending on its value to work - so it's slow.

If it is cached, then the cached value is known.
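Here is a toy model in Python of "serve from cache, refresh in the background", to show why it doesn't help. RevalidatingCache is a hypothetical name, and the background step is run by hand instead of on a real thread.

```python
class RevalidatingCache:
    """Toy cache: answer from the stored copy, refresh it afterwards."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable taking a URL, returning the body
        self._stored = {}
        self._pending = []  # URLs waiting for a background refresh

    def load(self, url: str) -> str:
        if url in self._stored:
            self._pending.append(url)
            return self._stored[url]  # the page sees the old value
        body = self._fetch(url)  # cold cache: synchronous, slow fetch
        self._stored[url] = body
        return body

    def run_background_refresh(self) -> None:
        for url in self._pending:
            self._stored[url] = self._fetch(url)
        self._pending.clear()
```

The page on the second visit still reads the value stored during the first visit, so those two visits are linkable even though a fresh copy arrives afterwards.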


> Why not cache it for the user, but go ahead and retrieve it in the background?

If the JS triggers another download, and the browser requests the second resource before the initial JS is done downloading...


The point is that other scripts on the page can find out that id=0x1234567, and use that value for tracking requests. Since you've cached it, they can track you across sessions.


The idea is that if you visit two sites, and on both sites you use the same token, the ad-provider (or whoever) can associate you across the two origins because of the cached token.



