Over the years I've noticed this privacy defeatism, among other excuses, used to push back against security and privacy enhancements.
Another is security chicken and egg. "Because A isn't secure, there's no point in securing B. And because B isn't secure, there's no point in securing A." I've seen arguments along those lines used against improving the security of, for example, DNS and SNI. Or just DNS in general. "There's no point in securing DNS because your ISP can see what IP addresses you connect to."
The all-or-nothing argument has been used to argue against HTTPS. "State actors like China can circumvent HTTPS, since they control trusted CAs! So why use HTTPS at all?"
Then there are the arguments against security because "it makes development harder!" People have argued against mandatory HTTPS saying that it makes developing websites harder.
The list goes on.
Luckily it seems that the pro-security and pro-privacy positions succeed in the end, for the most part. But those who use these pathological arguments have certainly delayed things longer than they should have.
I'm not a privacy defeatist, but I am a fingerprinting defeatist. Here's why: I don't think it's realistic for caching and anti-fingerprinting to co-exist, and given those two options users will always pick the former because the latter would be perceived as slow.
The classic example is:
<script src="foobar.js"></script>
where foobar.js just returns something like "var id=0x1234567". A user who doesn't want to be fingerprinted cannot cache this script, because it could be uniquely generated per user.
It would be simple enough to rely on resource hashes so you never actually contact the sources of files, and then to skip (or explicitly approve) fetching any hash you can't get from your preferred shared caches. Thus a unique file gets ignored by the most anonymity-conscious.
I think there is a chicken and egg here combined with IP incentives to not fix how web browsers work.
I hoped the great firewall would have accidentally fixed this by making it preferable to refer to hashes that can be found in peer caches regardless of CDN status.
I think you're forgetting that resources cannot be shared if they're marked private. You cannot get account_statement.js from your shared cache, it has to be unique to you.
Your example actually isn't incorrectly marked private, so that is already covered in my previous comment.
The most anonymity-conscious would realize they are trying to do banking in what is supposed to be their anonymous session and never fetch the file... assuming they somehow missed that they were entering auth details?
The way things need to work on the web involve choices that you apply differently (or IE applies for Windows users.) The defeatist response is not to implement any choices. A typical user will want a small number of PII sites, so let's have only PII mode and autofill their details into forms in any blog!
There are resources that shouldn't be saved in a shared cache and shouldn't be seen in an anonymous session. It is not a coincidence that they are both and it is great that they are explicitly marked.
But, if the server sends you a private resource, this implies the server must already know your identity. Being private it's not supposed to send it to anyone else, so it needs to identify you in order to know "does this user have access to this resource?".
The point is that other scripts on the page can find out that id=0x1234567, and use that value for tracking requests. Since you've cached it, they can track you across sessions.
The idea is that if you visit two sites, and on both sites you use the same token, the ad-provider (or whoever) can associate you across the two origins because of the cached token.
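The cached-token tracking described above can be sketched as a small simulation. This is purely illustrative: `makeServer` and `makeBrowser` are made-up names standing in for a tracking endpoint and a browser's HTTP cache, not any real API.

```javascript
// Simulation of tracking via a cached, per-user script response.
// The "server" issues a unique id the first time a client fetches
// foobar.js, like the <script src=foobar.js> example above.
function makeServer() {
  let nextId = 0x1234567;
  return {
    fetchScript() {
      return `var id=0x${(nextId++).toString(16)}`;
    },
  };
}

// The "browser" caches the response, so every later page load --
// on any origin -- sees the same id, which page scripts can read
// and attach to tracking requests.
function makeBrowser(server) {
  let cached = null;
  return {
    loadPage() {
      if (cached === null) cached = server.fetchScript();
      return cached;
    },
  };
}

const server = makeServer();
const alice = makeBrowser(server);
const bob = makeBrowser(server);

const aliceOnSiteA = alice.loadPage();
const aliceOnSiteB = alice.loadPage();
console.log(aliceOnSiteA === aliceOnSiteB); // true: same user, linkable across origins
console.log(aliceOnSiteA === bob.loadPage()); // false: the id distinguishes users
```

The cache behaves exactly like a cookie here, which is why an anti-fingerprinting browser can't safely cache uniquely generated resources.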
I've been thinking that the "State actors like..." argument is rather weak because it seems like the more we learn about "state actors" from anywhere, the more it seems like they can get around anything. If they happen to be focused on you or your organization, it seems like it's just game over no matter what you try to do.
Luckily, I guess, securing yourself against state level actors keeps you safe from everyone else that's trying to do bad things, so, win/win?
A "state actor" focusing on you should be considered at least as dangerous (and semi-impossible) to guard against as a dedicated adversary that wants to infiltrate you specifically. And just as with a dedicated adversary, that's no justification for not making things hard for the average snoop or opportunistic criminal and what they can easily come up with.
We don't have locks on our front doors to make it impossible for unwanted people to get inside our houses; we have them to keep out the average person, deter the interested criminal, and hopefully give us warning when a break-in is happening or has happened.
> Luckily, I guess, securing yourself against state level actors
To be clear, there is no securing yourself against state level actors. You can harden yourself, and reduce your target area and/or footprint, but there's no way to actually make yourself secure.
> To be clear, there is no securing yourself against state level actors. You can harden yourself, and reduce your target area and/or footprint, but there's no way to actually make yourself secure.
Yes. Your best bet is hiding. Once you're identified, it's basically over. It's just another facet of the state monopoly on force.
Resource monopoly is rather a subset of force monopoly, no?
I mean, consider guns. In the US, citizens can own guns. And armed private security is OK, too. But there's consistent push-back against private militias, armies, etc. Unless they work only outside the country, in acceptable contexts.
> To be clear, there is no securing yourself against state level actors. You can harden yourself, and reduce your target area and/or footprint, but there's no way to actually make yourself secure.
You can, if you gain the support of an opposing, comparable state actor. That's the strategy Snowden employed. But 100% security is still a myth, not only against state actors... reality itself is just not like that. Unless you can hide inside a black hole.
Ironically, HTTPS deployment in China is going ... not badly at all.
First- and second-tier websites already switched it on long ago, because in China we had a long tradition of ISPs sniffing and modifying users' traffic to (for example) inject their own affiliate codes into web requests. Those websites hate it, of course.
On the "CAs are controlled by the state" side, some privacy enthusiasts in China already removed those CAs from their systems long ago. Go give them some hugs and support: https://github.com/chengr28/RevokeChinaCerts
And of course, in the context of CAs, what's truly important is transparency in the audit process and continuous monitoring. Catching them red-handed will end their business[0] once and for all. And so far, most of them are rule-abiding.
Absolutely. Heard that with stateless IPv6 address autoconfiguration (SLAAC). "If you can fingerprint the browser, why does it matter that one can use the MAC-derived component of the IPv6 address to track people across networks?" Heard on HN...
> I don't see the flaw in that one. It does seem kind of extra to secure DNS. What's the argument in favor?
If ISPs control your DNS, then they can block domains they don't want you to reach or redirect unknown domains to their own ad pages like Comcast has in the past. If you use someone else's DNS, your ISP could still block network packets sent to certain IP addresses, but they wouldn't want to block all AWS IP addresses just to block one site on AWS.
> when Chinese CA's abuse their position they get blacklisted [in Chrome]
AFAIK, Chrome has very small market share in China. Most Chinese users use chimeric browsers that combine Chromium and IE's Trident engines. Many Chinese bank and government sites require NPAPI plugins for custom crypto, so the popular browsers can seamlessly switch a tab from Chromium to Trident for sites that are known to require plugins.
>If you use someone else's DNS, your ISP could still block network packets sent to certain IP addresses, but they wouldn't want to block all AWS IP addresses just to block one site on AWS.
Well, that's sort of what Russia did to try to ban Telegram...
Russia may decide they're OK with blocking AWS. Comcast probably isn't going to do that. Is it really essential that we solve the Russia problem before we solve the Comcast problem?
Even on the Russia side of things there are advantages.
If you use your ISP's DNS, they can block a specific site or service with very high granularity and zero cost. If you don't, they can still block things, but they have to resort to cruder methods that risk annoying customers/citizens and degrading their overall network health.
Over time these extra annoyances might become a kind of death by 1000 cuts, allowing ISPs and countries that don't censor to outperform them in ways that even ordinary, privacy/freedom ambivalent people care about.
This is all quite proper. It's healthy for censorship (even when it's possible) to have costs. Where possible, we should try to build technological infrastructure so that networks that censor won't be able to compete on an even ground with the networks that don't.
Well, there isn't much benefit to securing DNS or using third-party name servers, unless you also use VPN services and/or Tor to prevent your ISP from seeing what sites you access. However, securing DNS can somewhat interfere with geolocation by websites, given mechanisms that point you to the nearest servers.
But if you are using VPNs and/or Tor, it's crucial to secure DNS. Tor exits do DNS lookups for clients. And properly configured VPN services do as well.
Even so, can't ISPs still see what sites you connect to? Or is everything after the initial connection to Cloudflare hidden in HTTPS? And how many sites are typically reachable through a given Cloudflare IPv4 address?
In TLS 1.2 (what a good browser and site uses today) the Server Name Indication sent from your client is plaintext, as is the certificate sent back by the server and the choices made by both sides during key agreement.
In TLS 1.3 (not yet officially published as a standard but basically finished and already drafts are used by Firefox, Chrome, and Cloudflare) only SNI remains plaintext.
So yes, the FQDN you're connecting to will still be revealed, but unlike the certificate, an adversary can't trust the SNI value: you could be lying about it, if the remote server lets you.
If you scroll back a few days, HN discussed the Internet Draft about SNI encryption (it's a problem statement rather than a proposed solution). So they want it; it's just not clear how it could be done (there are lots of bad/ineffective options).
"There's no point in securing DNS because your ISP can see what IP addresses you connect to."
Perhaps the ISP has fingerprinted the pages of every website on every shared IP address so they can easily determine which website the user is visiting? Why rely on unencrypted DNS or unencrypted SNI when it is so trivial to map IP addresses to domains?
A poor example, but consider this single IP address 216.239.36.21 with a reverse lookup returning a 23M list of 1,283,151 domains.
The biggest barrier for privacy is not a technical one. It's that most people don't care. We see outrage in a few intellectual or tech-savvy niches, like HN, but the average Joe has many, many other priorities before this.
I see a parallel between this and digital rights management. Eliminating probabilistic fingerprinting for browsers is always going to be a technical arms race.
"your ISP can see what IP addresses you connect to"
Why not make some browser extensions that randomly connect to all sorts of websites and download random web pages?
There's plenty of bandwidth for us to do that these days, and your real browsing habits should be able to hide in the noise.
Of course, for something more sophisticated, the "random" browsing should statistically match real web browsing, to make traffic analysis more difficult.
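A minimal sketch of such a decoy-traffic scheduler, under stated assumptions: the URL list and timing range are invented for illustration, and a uniform random choice like this would NOT statistically match real browsing — the comment above is right that a real extension would need a much smarter traffic model.

```javascript
// Hypothetical decoy-browsing scheduler. A real extension would
// actually fetch these pages on the computed schedule.
const DECOYS = [
  "https://example.com/",
  "https://example.org/news",
  "https://example.net/blog",
];

// Pick a decoy URL and a randomized delay before the next fetch.
function nextDecoy(rng = Math.random) {
  const url = DECOYS[Math.floor(rng() * DECOYS.length)];
  const delayMs = 5000 + Math.floor(rng() * 55000); // 5-60 s apart, arbitrary
  return { url, delayMs };
}

const { url, delayMs } = nextDecoy();
console.log(url, delayMs);
```

Against an adversary doing serious traffic analysis, the hard part is entirely in making the decoy distribution indistinguishable from real browsing, not in the plumbing shown here.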
I’m a privacy defeatist because the vast majority of people who argue for things like “https everywhere” and secure DNS nevertheless still use CC payments and buy most of their things online, which practically defeats the whole purpose: your CC-processor or bank now know when and if you’re going to have sex (you bought some condoms with your CC), know your political and ideological affiliations (based on the books you’ve purchased online) etc etc.
It’s all pretty much a farce, I’d call it “voodoo privacy”.
It's ironic that you responded with exactly the kind of points whose spirit the parent rebukes.
The issues you are raising are indeed real issues, that will be eventually addressed. And that "eventually" largely depends on the pressure that privacy-minded consumers create.
And just because they are not addressed yet, it doesn't mean we shouldn't care about fixing other weak links.
It’s not ironic at all, I’ve read about this discourse about data privacy for at least 10 years, nothing has changed for the better since then, quite the contrary. I for one take active steps in combating it (I very rarely use my CC for purchases, I buy almost nothing online etc), it’s tiring to be lectured on privacy by people who act one way and say a totally different thing.
Just because one person or entity (the CC company in this case) has all my data doesn’t mean every other person or entity should have it too. The fewer that have your data the better and the more it becomes possible to slowly improve your privacy level.
Yes, my CC company can infer a lot about me. Yes, they may even be selling that information to others. That doesn’t mean that I want anybody who can MITM my requests to be able to see what I’m doing (or, worse, send me forged content for whatever purpose).
That is a big reason I stopped using chrome. I block all the cookies and scripts I can but they were still collecting data from other shit and serving me obvious directed ads.
There are two ways to defeat fingerprinting:
1) Be common: have your data match what others are sending
2) Be unique on every page request: scramble your data
Browsers in category (2) are not trackable, even though they appear to be fingerprintable. Each page request is a different fingerprint.
What is unclear from the article is whether the studies mentioned (EFF, INRIA) considered that and tested that. Does anyone know? Because privacy does not require non-unique fingerprints, it just requires untrackability.
Firefox has two about:config settings: one stops canvas fingerprinting, the other notifies you.
Other settings stop giving out a list of plugins or fonts. And e.g. uBlock or uMatrix can rotate your user agent.
It’s not perfect, but with a few tweaks Firefox is much much harder to fingerprint - though IPv6 tends to undo all of that because many providers assign a prefix per customer which will never change and can be used by anyone to correlate.
> why browsers can still stop fingerprinting (emphasis mine)
> privacy defenses don’t need to be perfect. Many researchers and engineers think about privacy in all-or-nothing terms
So browsers can't stop fingerprinting but they can reduce it, and the article author's title falls in the same all-or-nothing trap that they criticize others for. Too often practicality is viewed as defeatism. I don't remember defeatism, I remember practical limitations to stopping all uniqueness vectors but realization that some can be stopped.
I remember somebody who worked on Firefox saying that even the idea of not revealing the OS and hardware platform by default received too much opposition.
If you are not uniquely identifiable but 99% of users are, that alone is enough to identify you. That's why the US Navy funded the Tor project: if their people were the only ones who could hide from other governments, then those other governments would know who the unidentified people were.
This is also a big issue with NoScript and similar plugins. A very small percentage of real users (ignoring bots and headless browsers and such) intentionally disable JavaScript, so they're painting a target on themselves while trying to do the opposite.
This is true. However, most users of NoScript/uMatrix/uBlock are blocking the script that actually does the fingerprinting. So while the server could infer information because the script didn't run, usually they get no notification and tracking simply fails.
For sure. I use NoScript for browser security, not for fingerprint protection. I would still definitely recommend people use NoScript/uMatrix/etc. It's just an unfortunate side effect.
The article mentions that the biggest impact comes from limiting the highest-entropy information.
It wouldn't be hard to rank the various attributes by entropy and then you - individually - could calculate the probability that you'd be uniquely identifiable.
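A back-of-envelope version of that calculation, assuming illustrative per-attribute entropy values (the bit counts and population below are made up, not measured):

```javascript
// Rough uniqueness estimate: sum the entropy each leaked attribute
// contributes, and compare with the bits needed to single out one
// user in the population. All numbers here are assumptions.
const attributeBits = {
  userAgent: 10.0,
  fonts: 13.9,
  plugins: 15.4,
  timezone: 3.0,
  screen: 4.8,
};

const totalBits = Object.values(attributeBits).reduce((s, b) => s + b, 0);

// With N users, singling one out takes about log2(N) bits.
const population = 2e9; // ~2 billion browsers, assumed
const bitsNeeded = Math.log2(population);

console.log(totalBits.toFixed(1), "bits leaked vs", bitsNeeded.toFixed(1), "bits needed");
```

If the leaked total comfortably exceeds the needed bits, a user with typical attributes is likely unique, and the table makes visible why suppressing the highest-entropy attributes (plugins, fonts) buys the most.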
As the percentage of users that are identifiable drops, eventually the value isn't worth the cost. If only 5% of users are identifiable, that may be low enough that tracking isn't worth doing at all.
Note that identifiable comes in degrees. If they can't tell me from my wife, that doesn't matter for most purposes.
Dropping the average number means that trackers make less money which means that they have less resources to invest in tracking which means that there's less of a chance that you specifically will be tracked.
And the author sees it as an indicator that preventing fingerprinting is possible:
> Only a third of users had unique fingerprints, despite the researchers' use of a comprehensive set of 17 fingerprinting attributes.
To me, the 17 attributes do not seem comprehensive at all. For example, they don't make use of the user's IP. So much can be derived from the IP: carrier, approximate location, etc. They also don't use the local IP, which is leaked via WebRTC.
They also don't seem to measure the performance of CPU, RAM and GPU when performing different tasks.
But yes: Browsers should do more to prevent fingerprinting. But it seems they have no inclination to do so. That they don't plug the WebRTC hole that leaks the local IP is a strong indicator for me that privacy is low on the list of the browser vendors. Or maybe not on the list at all.
There's generally a tradeoff between usability and performance, and resistance to fingerprinting. If your browser has WebGL enabled, the machine (not just the browser) can be fingerprinted. If it caches resources, adversaries can discover browsing history.
Re WebGL, I gather that canvas fingerprints are based on graphics hardware and drivers. So all browsers on a given machine have the same canvas fingerprint. I've used https://browserleaks.com/webgl for testing. And with VMs, it's even worse. I found that all Debian-family VMs on a given host have the same canvas fingerprint. But Windows, macOS, CentOS, Arch and TrueOS (aka PC-BSD) VMs each have distinct fingerprints.
About cached resources, it's my impression that adversaries can exploit XSS vulnerabilities for detection. Most simply, you just measure load time.
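The load-time probe mentioned above can be sketched as a classifier. This is a simulation, not a working attack: the threshold and timings are invented, and a real attacker page would time an actual cross-origin load (e.g. an image's onload) rather than call a function with a canned number.

```javascript
// Hypothetical cache-timing classifier: a resource served from the
// browser cache completes far faster than one fetched over the
// network. The 20 ms threshold is an assumption for illustration.
function classifyLoad(loadTimeMs, thresholdMs = 20) {
  return loadTimeMs < thresholdMs ? "probably cached" : "probably not cached";
}

console.log(classifyLoad(3));   // fast load: likely already in cache
console.log(classifyLoad(180)); // slow load: likely a network fetch
```

Since "in cache" implies "visited before", a page that times loads of resources unique to other sites can probe your browsing history, which is the privacy cost of caching.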
I don't know. In my testing, I don't recall that I even used related Debian and Ubuntu releases. So I doubt that just updating the graphics driver would change the fingerprint.
However, I was using VirtualBox VMs, so it's possible that my results were artifacts caused by restricted choice in virtual graphics drivers. Testing that would be rather tedious, and I'd appreciate correction.
> But there’s another pill that’s harder to swallow: the recent study was able to test users in the wild only because the researchers didn’t ask or notify the users. [2] With Internet experiments, there is a tension between traditional informed consent and validity of findings, and we need new ethical norms to resolve this.
So the result of their privacy-advocating study was itself only obtainable by breaching the privacy of their participants. (I.e. making them participants without them knowing or consenting to it)
In particular, canvas would become useless in an opt-in scenario and would just foster permission fatigue (people clicking "Allow" just to make those dialogs go away).
Another option might be to define closely (pixel-by-pixel) how a canvas should look after a specific action. That way vendors would have less room for 'their way' of drawing things, but the result would look the same everywhere and would be useless for fingerprinting.
Similarly, one could define a list of fonts which every browser should ship with, and all other fonts would be loaded from a server. It would eliminate the 'you have special fonts installed' problem completely.
The canvas wouldn’t be opt in, only the pixel reading feature would be - and that’s an edge case that I can’t imagine being used in even one in a thousand uses of Canvas. Just drawing to a canvas has no privacy implications and covers almost all uses.
There's many amazing things only possible in canvas because you can read the pixels directly (especially in displaying complex pixel art), so removing that capability out-right would be highly disappointing.
I suppose a better solution would be to mark the canvas as tainted if any text is drawn onto it, so that pixel reading capabilities are blocked. Another aggressive solution could be to redraw the entire canvas without gpu acceleration if pixel data is requested. Together these would severely limit canvas fingerprinting.
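The taint-on-text idea can be sketched with a mock context. This runs nowhere near a real DOM: the method names merely mirror CanvasRenderingContext2D, and the returned "pixel buffer" is a placeholder.

```javascript
// Mock 2D context implementing the proposal above: drawing text
// taints the canvas, and tainted canvases refuse pixel reads.
function makeGuardedContext() {
  let tainted = false;
  return {
    fillRect(x, y, w, h) {
      // Plain shapes render identically everywhere; no taint.
    },
    fillText(text, x, y) {
      tainted = true; // text rendering leaks font/antialiasing details
    },
    getImageData(x, y, w, h) {
      if (tainted) throw new Error("canvas tainted: pixel read blocked");
      return new Uint8ClampedArray(w * h * 4); // stand-in RGBA buffer
    },
  };
}

const ctx = makeGuardedContext();
ctx.fillRect(0, 0, 10, 10);
console.log(ctx.getImageData(0, 0, 2, 2).length); // 16: reads allowed before any text
ctx.fillText("fingerprint me", 0, 0);
try {
  ctx.getImageData(0, 0, 2, 2);
} catch (e) {
  console.log(e.message); // read blocked once text has been drawn
}
```

This mirrors how browsers already taint canvases after cross-origin image draws, so the precedent for the mechanism exists; the open question is how much legitimate pixel-reading code a text-taint rule would break.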
Or just block it and display a warning that the user can accept before the operation is allowed. And as you say: all canvas impls should come in 2 flavors, one “reference” software impl that guarantees the exact same pixels given the exact same operations. Using any other impl should block readpixel or redraw with the reference impl before reading.
I realize there are edge use cases that need pixel reading; I just don't agree with the idea that they are more important than limiting tracking. So I think the canvas-limiting mode (used in FF safe mode) should be the default.
"Another lesson is that privacy defenses don't need to be perfect. Many researchers and engineers think about privacy in all-or-nothing terms: a single mistake can be devastating, and if a defense won't be perfect, we shouldn't deploy it at all."
This "all-or-nothing" perspective is rampant on www forums discussing computer topics and certainly HN is no exception. It is particularly acute in any discussions of "privacy" or "security".
There are countless examples.
Earlier this week the topic of SNI rose again to HN's front page.
A minor percentage [1] of TLS-enabled websites require SNI. An unfortunate side effect of SNI is that it makes it easier for third parties to observe which websites users are accessing via TLS, because it sends domain names unencrypted in the first packet.
Forum commenters will thus argue that because there are other, more difficult means for some third parties to observe these domain names, e.g., through traffic analysis, the unencrypted SNI is therefore not an issue worth addressing.
All-or-nothing. If the privacy achieved by some proactive measure is not "perfect" then to these commenters it is worthless.
But the HN front page reference suggested otherwise: It was an RFC describing how the IETF is taking a proactive measure, trying to "fix" SNI, encrypting it to prevent third parties from using it in ways detrimental to users.
There is an easier proactive measure. The popular browsers send SNI by default, even if the website does not require it. The default behaviour is to accommodate a minority of TLS-enabled websites at the expense of all users, including those who may not be using this minority of websites.
To make an analogy to fingerprinting, imagine sending 17 unique identifiers with every HTTP transaction when, say, only 5 are actually needed. The all-or-nothing perspective adopted by forum commenters would dictate that it makes no sense to reduce the number unless the number can be reduced to zero.
Amongst the security folks there is a concept sometimes called "defense in depth". Commenters in discussions about security often agree there is no such thing as "perfect" security and they cannot rely on a single, "silver bullet". They must use multiple tactics.
Is privacy somehow different? There are many tactics users can take that, cumulatively, can make things more difficult for the data collectors.
[1] Survey of websites currently appearing on HN
Number of unique urls: 367
Number of http urls: 43
Number of https urls: 324
Number of https urls requiring SNI: 38
Number of https urls requiring correct SNI: 26
"Requiring correct SNI" means SNI must match Host header.
Summary
One can fetch 286 of the 324 https urls currently posted on HN with a HTTP client that does not send SNI.
An additional 12 can be retrieved by sending a decoy SNI name that does not match the Host header.
What is needed is a community-written and community-funded privacy browser. Mozilla, the billion-dollar non-profit, and Firefox, their so-called privacy-first browser, are the biggest exfiltrators of user data. Their default ~5000 user settings are all geared towards feeding them and Google a real-time feed of user activity. They have moved from being Google's data bi.ch to a major data-collection thug themselves. As such, fingerprinting becomes at best a tangential issue.
Sets up a straw man of "privacy defeatism", then knocks it down with an analysis of some French website's logs--as though any of this were a technological problem in the first place.
This is not a case of "defense in depth" or "something is better than nothing." This is the case of a surveillance capitalism asset performing to spec.
Privacy in the browser is/was f-ed, plain and simple. Patch one hole, make two more. That's how it's been for at least 20 years now. I see no reason for that to change until the browser is obsolesced by something even better at fleecing the peasants.