Hacker News

This doesn't work so well when you a) disallow tracking cookies and b) don't use your ISP's DNS servers.

I used to work directly with these big ISPs, and I can assure you they aren't very sophisticated. Even if they log every IP you send a packet to, and they know every domain that points to each of those IPs, they can't know exactly what you requested or which particular website you visited, provided those websites use HTTPS.




If you're not using DNS over HTTPS (or DNS over TLS), they're still tracking your DNS queries regardless of which server you use, since plain DNS travels unencrypted over port 53.


I wonder how big a hosts file could get before it degraded performance?

You can still do DNS over VPN. That is, I could set up an unbound cache on a VPS and secure my traffic to that server using wireguard or openvpn. This just pushes the eyeballs out of my home and into someone else's datacenter, though.
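A minimal unbound configuration along those lines might look like this sketch; the tunnel addresses (10.0.0.1, 10.0.0.0/24) are placeholder assumptions for the WireGuard side, not anything prescribed:

```
# /etc/unbound/unbound.conf — minimal sketch; addresses are placeholders
server:
    interface: 10.0.0.1           # listen only on the tunnel address
    access-control: 10.0.0.0/24 allow

forward-zone:
    name: "."
    forward-addr: 1.1.1.1         # or run full recursion by omitting this zone
```

Dropping the forward-zone entirely makes unbound recurse on its own, which trades the upstream resolver's eyeballs for queries visible to each authoritative server.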


I'm guessing that the time required to open the hosts file for reading would eclipse the time required to parse it and do a linear search on the entries. I would think the file would have to be measured in the hundreds of megabytes before performance became a significant consideration, but I haven't benchmarked it.
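As a rough sketch of that intuition, one can build a large synthetic hosts file in memory and time a naive linear scan over it (the entries and numbers are illustrative only; real resolvers parse the file once and cache it, so this overstates the per-lookup cost):

```python
# Rough sketch: cost of a naive linear scan over a large synthetic hosts file.
# All entries are made up; real resolvers cache the parsed file.
import time

# Build ~100k synthetic "0.0.0.0 <name>" entries.
lines = ["0.0.0.0 ad%d.example.com" % i for i in range(100_000)]
hosts = "\n".join(lines)

def lookup(name, text):
    """Linear search, roughly what a naive hosts-file consumer would do."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[1] == name:
            return parts[0]
    return None

t0 = time.perf_counter()
addr = lookup("ad99999.example.com", hosts)  # worst case: last entry
elapsed = time.perf_counter() - t0
print(addr, "in %.3f s" % elapsed)
```

Even the worst case stays well under a second at this size, which supports the guess that file-open and I/O costs dominate long before the scan does.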


Just set up a pihole/dnsmasq. You'll need to point that at your desired DNS (I use 1.1.1.1, but if I were more paranoid I'd tunnel it)
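A minimal dnsmasq configuration for that setup might look like the sketch below; the listen address is a placeholder for your LAN interface, not a recommendation:

```
# /etc/dnsmasq.conf — minimal sketch; 192.168.1.2 is a placeholder LAN address
no-resolv                    # ignore /etc/resolv.conf, use only the servers below
server=1.1.1.1               # upstream resolver (tunnel this if you don't trust the path)
server=1.0.0.1
listen-address=192.168.1.2
cache-size=10000
```

Note that without a tunnel or DoH/DoT in front, the queries dnsmasq forwards upstream are still visible to anyone on the path.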


This is true, but it requires deep packet inspection, which usually isn't on by default. They might enable it for specific clients under some circumstances, but I haven't heard of ISPs logging that level of detail permanently. I suppose they could run a service that just inspects DNS packets, pulls out the domains, and correlates them with each client, but I haven't heard of that being deployed in the wild (although perhaps that's changed).

Their DNS servers, however, are definitely doing this, and they sell that data to third parties.

TLS SNI also leaks the domain name, but again that would require deep packet inspection to extract and correlate with the clients. Definitely possible, but probably not deployed.
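To illustrate how cheap that extraction is once a box does look at the packets, here's a hedged sketch: it fabricates a minimal, fake TLS ClientHello containing only an SNI extension (not a valid handshake), then parses the server name back out the way a passive observer could:

```python
# Sketch: the server name sits in plaintext in the ClientHello's SNI extension.
# build_client_hello() fabricates a minimal fake hello (NOT a real handshake);
# extract_sni() parses it the way a passive DPI box could.
import struct

def build_client_hello(hostname):
    name = hostname.encode()
    # server_name extension body: list length, name_type=0 (host_name), name
    sni_body = struct.pack("!HBH", len(name) + 3, 0, len(name)) + name
    ext = struct.pack("!HH", 0, len(sni_body)) + sni_body  # type 0 = server_name
    exts = struct.pack("!H", len(ext)) + ext
    body = (
        b"\x03\x03"             # client_version: TLS 1.2
        + b"\x00" * 32          # random (zeroed here)
        + b"\x00"               # session_id length = 0
        + b"\x00\x02\x13\x01"   # one cipher suite
        + b"\x01\x00"           # one compression method (null)
        + exts
    )
    handshake = b"\x01" + struct.pack("!I", len(body))[1:] + body  # ClientHello
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake

def extract_sni(record):
    # Skip record header (5) + handshake header (4) + version (2) + random (32).
    p = 5 + 4 + 2 + 32
    p += 1 + record[p]                                    # session_id
    p += 2 + struct.unpack_from("!H", record, p)[0]       # cipher suites
    p += 1 + record[p]                                    # compression methods
    end = p + 2 + struct.unpack_from("!H", record, p)[0]  # extensions block
    p += 2
    while p < end:
        etype, elen = struct.unpack_from("!HH", record, p)
        p += 4
        if etype == 0:  # server_name: skip list len (2) + name type (1)
            name_len = struct.unpack_from("!H", record, p + 3)[0]
            return record[p + 5 : p + 5 + name_len].decode()
        p += elen
    return None

print(extract_sni(build_client_hello("duckduckgo.com")))
```

The parsing side is a few dozen bytes of offset arithmetic, which is why hardware DPI can do it at line rate.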


DNS blocking is an affordable technology deployed in many countries. For example, in the UK, the sort of large ISP advertised on TV does DNS blocking.

With DNS blocking if you try to look up a "forbidden" FQDN you get back either a bogus NXDOMAIN or A records chosen by the blocker.
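As an illustrative sketch of what that tampering looks like to a client (all resolver answers below are hardcoded stand-ins, not live lookups), detecting it amounts to comparing the local resolver's answer against an independent path and flagging divergence:

```python
# Sketch: flag a DNS answer as suspicious when the local resolver diverges
# from an independent reference path. All answers are hardcoded stand-ins.

def looks_blocked(local_answer, reference_answer):
    """Each argument is a list of A records, or None for NXDOMAIN."""
    if local_answer is None and reference_answer:
        return True   # bogus NXDOMAIN: the name resolves fine elsewhere
    if local_answer and reference_answer:
        # blocker-chosen A records: no overlap with the real answer set
        return not set(local_answer) & set(reference_answer)
    return False

# A name the blocker NXDOMAINs even though it really resolves:
print(looks_blocked(None, ["203.0.113.10"]))               # suspicious
# A name redirected to a block page:
print(looks_blocked(["198.51.100.1"], ["203.0.113.10"]))   # suspicious
# An honest answer:
print(looks_blocked(["203.0.113.10"], ["203.0.113.10"]))
```

In practice CDNs can legitimately return different A records per vantage point, so a real detector would need more signal than raw set overlap; the sketch only shows the shape of the comparison.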

DoH bypasses DNS blocking pretty cheaply. DNSSEC would detect the tampering and stop resolution, but doesn't bypass it. Tor bypasses it, but at considerable cost.

The UK government's (eventually indefinitely delayed) plans to institute mandatory censorship of the Internet relied on DNS blocking as their backstop. The idea was that if anybody anywhere in the world didn't voluntarily agree to obey the censorship rules, they'd be blocked in the UK. The government would just accept that some proportion of users would install Tor to bypass that restriction. DoH means "some proportion of users" potentially becomes "everybody with a modern browser", and that was not palatable.


DPI is available on inexpensive routers now and has been an option on Cisco/Fortigate/Palo Alto/Ubiquiti gear for ages. I have no doubt that it is heavily used at most ISPs.


There's a big difference between data being available and data being put to use. ISPs have no incentive to put a bunch of effort into collecting data from the 1% or fewer of their customers who don't use their DNS servers. Thus, it's extremely unlikely they run packet inspection on every packet just to collect browsing history from the privacy conscious.


I'm curious about a case like duckduckgo, which uses https, but puts your search term in the url.


The path and query string are only transmitted over TLS, so it should be fine so long as you never use HTTP without TLS.


Indeed, let's break it down:

The scheme (https) is implied but isn't transmitted anywhere. An adversary can infer you used HTTPS because the connection went to port 443 and looks like TLS traffic.

The hostname (www.duckduckgo.com) is somewhat implied by the destination IP address on the connection, and is also transmitted in the clear as part of TLS Server Name Indication so that the receiving server knows which service you wanted. In TLS 1.2 and earlier the site's certificate is also transmitted in the clear (but this is fixed in TLS 1.3). Encrypting SNI is a work-in-progress.

The path and query string are encrypted. An adversary can't discover what they are, nor can they tamper with them successfully. The total overall amount of data sent is not hidden, but clients can (though most do not) add padding to hide exactly how much of this was "real".

The fragment identifier (#foo) is not transmitted at all; it remains only on the client (web browser).

Headers, body, and so on of both request and response are encrypted, and the same caveat applies about an adversary knowing how much data was transmitted and when.
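The breakdown above can be summarized in code. This sketch uses Python's urllib.parse to split a URL and label what a passive on-path observer sees for an HTTPS request; the labels mirror the points above, as a summary rather than any formal spec:

```python
# Sketch: which parts of an HTTPS URL a passive on-path observer can see.
# The labels summarize the breakdown above; they are not a formal spec.
from urllib.parse import urlsplit

def observer_view(url):
    u = urlsplit(url)
    return {
        "scheme":   (u.scheme, "inferred (port 443 + TLS-shaped traffic)"),
        "hostname": (u.hostname, "visible (destination IP + cleartext SNI)"),
        "path":     (u.path, "encrypted"),
        "query":    (u.query, "encrypted (only total size leaks)"),
        "fragment": (u.fragment, "never sent; stays in the browser"),
    }

view = observer_view("https://duckduckgo.com/?q=secret+search#results")
for part, (value, visibility) in view.items():
    print(f"{part:9} {value!r:24} {visibility}")
```

So for the DuckDuckGo case upthread: the observer learns you talked to duckduckgo.com, but the search term in the query string travels only inside the TLS tunnel.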


This is a really good summary! A few months back I was hunting for an answer as to whether url/query parameters were encrypted with HTTPS. The answer space was strangely sparse for something so critical.


The answer is obvious† to somebody who knows how it works, so it's probably in a category of questions where there's not a lot of overlap between people who might ask and people who might be in a position to answer.

I remember answering a Stack Overflow question about how Domain Validation for certificates in the Web PKI works, and thinking that just a year or two earlier the answer would have been hazy and probably marked unsatisfactory even though it was true, because it was so vague and few people would be in a position to confirm it. As it happened, they'd asked after the Ten Blessed Methods were formally required, so those were the answer, written down in black and white in a document I could offer as a reference for anyone to look at.

† The hostname part is less obvious than the rest, it's fair to say, because HTTP's Host header is just a header and thus encrypted; you need extra insight to realise SNI needs to exist.


None of it really matters though if the certificate authorities are compromised.


How so?

If a CA is compromised (which has been very rare, even under historically more lax oversight than today), the bad guys would need to issue themselves a cert from the compromised CA, get it logged, and then use an on-path position to MITM you with that certificate and capture your URL. Presumably they wanted that URL very badly, because this is a really expensive approach.

The CA does not have the ability to decrypt eavesdropped HTTPS traffic for example.


Come here to Turkey: we have a very sophisticated, multilevel tracking and censorship infrastructure (deployed, but not yet used to its full potential).



