Fingerprinting VPNs with Custom Router Firmware [pdf]

dgl · 2024-06-10T12:18:03 1718021883

I find the premise of the paper strange -- if the router is untrusted it can do many things, not just statistical traffic analysis at a high level (e.g. DNS/HTTP may give away apps downloading a list of VPN endpoints).

In 2024 IPv6 isn't mentioned, even in the future work section. While privacy addressing is often used, the address is usually rotated too infrequently to prevent tracking what a device is doing over a few hours; privacy addressing aims to stop tracking over days. A router can easily see the real MAC address in the neighbour table, but right now a client device using a VPN over IPv6 is potentially trackable even beyond the local router, which seems more interesting than their local-only threat model.

I wonder if any VPN clients force renewing the IPv6 privacy address, combined with careful firewall rules to avoid leaking the device's other address(es)? I suspect many clients/people just disable IPv6 out of paranoia though.

jiveturkey · 2024-06-10T19:47:32 1718048852

of course an untrusted router can do many things, it's one reason you use a VPN at all. your provider's router is ... untrustworthy. i believe the point of the paper is that even if you use VPN, the router can detect that and classify you. VPN is illegal in some countries so this is relevant. let's avoid a "security must be absolute" mindset. like, thing A is not worth doing because thing B is still flawed. the paper seeks to address a specific thing A and things B,C,D are out of scope.

in the paper abstract they specifically and only talk about the CPE router, which can identify LAN clients uniquely, but a very similar thing applies to the upstream edge/border router. I believe (I only scanned it, didn't read it in detail) the paper focuses on the CPE router because the fingerprinting load is distributed and free in that case, and most often the provider owns the CPE anyway. so this is a reasonable place to focus on. however they've neglected netflow records (from the provider-owned upstream border/edge router) which can similarly be analyzed offline, at a leisurely pace, with arbitrary resources thrown at them, and unlike a CPE firmware change this cannot be detected. so i don't know how important the "router" part is.

the ability to detect VPN itself isn't novel or even interesting, but i guess their claim is in presenting a traffic analysis that requires little sophistication and few resources, something lightweight enough that it can be run at the CPE.

godelski · 2024-06-10T16:43:48 1718037828

You're probably often on an untrusted router. At least every time you're outside your house, and for many people even in their house.

I don't get why just because an untrusted router can do many things that that means you can't talk about one of those things. Sure, there may even be more interesting things, but to who? And why should that stop a conversation about other things? How would you talk about those things in any detail if not one by one? I can neither speak, nor read, nor write in parallel.

gruez · 2024-06-10T13:34:54 1718026494

AFAIK most VPNs are ipv4 only and disable ipv6, so it's a non-issue.

Workaccount2 · 2024-06-10T15:26:27 1718033187

Wouldn't VPNs want to be ipv6 so they could constantly use unique IPs?

gruez · 2024-06-10T16:47:58 1718038078

Why would they need unique IPs? They're already using NAT on ipv4 because they're sharing one server with tens/hundreds of users. Moreover, giving unique IPs per user might be better for networking purposes (ie. no need for NAT), but sucks from a product perspective, because it makes the users individually trackable. For a privacy product it makes little sense to do so.

myspeed · 2024-06-10T13:10:42 1718025042

I built a libp2p tunnel-based VPN (Kadugu VPN) and open sourced it. It's a custom solution; it can't be fingerprinted. The generated traffic uses a QUIC-based transport to reach the other end, just like any browser-based web traffic. In future, the market will be flooded with custom VPN solutions that can be easily set up. Fingerprinting will be difficult with these solutions.

person4268 · 2024-06-10T14:25:28 1718029528

Not if the network blocks QUIC in its entirety to force clients to fall back to HTTP/1. I'm pretty sure the network I'm currently on does this.

But it looks like the type of fingerprinting in the article relies on the fact that VPN-connected devices send their data mostly to only one host, which using QUIC won't help with. You'd need to add some sort of "noisemaking" functionality that sends bogus packets outside the tunnel, or possibly route VPN traffic across multiple nodes before forwarding to the actual VPN server, à la Tor (as they propose in the conclusion).
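A rough back-of-the-envelope sketch of the "noisemaking" cost, in Python. It assumes (this is not taken from the paper) that the detector flags a client when the share of its packets going to a single destination exceeds some threshold; `cover_packets_needed` and `max_share` are hypothetical names for illustration.

```python
import math

def cover_packets_needed(tunnel_packets, max_share):
    """Packets to send outside the tunnel so the VPN server's share of
    all packets is at most max_share.

    Assumed detector model: share = tunnel / (tunnel + cover).
    Solving share <= max_share gives cover >= tunnel * (1 - max_share) / max_share.
    """
    if not 0 < max_share <= 1:
        raise ValueError("max_share must be in (0, 1]")
    return math.ceil(tunnel_packets * (1 - max_share) / max_share)

# To keep the VPN server at or below half of all packets, the cover traffic
# has to be about as large as the tunnel traffic itself.
n = cover_packets_needed(1000, 0.5)
```

Under that assumption the overhead grows quickly as the threshold drops, which is why spreading the tunnel itself over several nodes may be the cheaper option.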

j4hdufd8 · 2024-06-10T14:29:52 1718029792

> which using QUIC won't help with

curious why not?

person4268 · 2024-06-10T15:07:30 1718032050

You're still sending packets to the same IP address. QUIC can't obfuscate that itself, all packets have to get routed over IP in the end. The paper relies on very little but that fact.

If one wanted to block VPN connections, they easily could do so by running such detection and then blocking all UDP (QUIC is built on UDP) traffic from the host to the suspected VPN server, too.

What QUIC helps with, in the context of dealing with DPI firewalls, is really just the obfuscation/encryption of as much connection info as possible, such as the SNI/Host in the context of an HTTP server, which normally is sent in plain text even with SSL/TLS (though ESNI efforts are starting to fix this)

generalizations · 2024-06-10T17:51:39 1718041899

So maybe the solution is to double-send all the encrypted packets - once to the VPN endpoint, and once to the original target (but encrypted so it doesn't reach). Or maybe instead of the intended target, to some randomized selection of targets. You wouldn't get responses, but maybe that doesn't matter.

hawshemi · 2024-06-10T14:32:09 1718029929

https://v2.hysteria.network/

majke · 2024-06-10T13:39:55 1718026795

> The key idea behind our threat model is that end devices using a VPN connection will, by default, send all their traffic to the same destination (the VPN server) identified by its public IP address – Refer to Figure 1. On the other hand, non-VPN traffic is typically sent to a mix of different destinations; e.g. websites, weather widget, OS update server, etc.

It's about looking at the cardinality / entropy of target IPs.
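A minimal sketch of what "cardinality / entropy of target IPs" could look like, in Python. This is not the paper's actual classifier; the function names and the sample addresses (documentation ranges) are made up for illustration.

```python
import math
from collections import Counter

def destination_entropy(dest_ips):
    """Shannon entropy, in bits, of the destination-IP distribution
    for one client. One entry in dest_ips per observed packet."""
    counts = Counter(dest_ips)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def destination_cardinality(dest_ips):
    """Number of distinct destination IPs seen for one client."""
    return len(set(dest_ips))

# Hypothetical traffic: a VPN-like client sends almost everything to one host,
# a non-VPN client spreads its packets over many hosts.
vpn_like = ["203.0.113.7"] * 98 + ["192.0.2.53"] * 2
mixed = ["198.51.100.%d" % (i % 20) for i in range(100)]

low = destination_entropy(vpn_like)   # close to 0 bits
high = destination_entropy(mixed)     # log2(20) bits, 20 equally used hosts
```

A router would compute this per LAN client over some time window and flag the low-entropy ones; where to put the cutoff is exactly the part that needs real data.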

benlivengood · 2024-06-10T16:16:38 1718036198

Further, any traffic that doesn't fit the statistical properties of most other users is going to be an easy outlier to spot, and VPN users will tend to cluster with each other. E.g., if there's no traffic to any search engine or social media then VPN use is statistically more likely (or it's a point of sale device, etc.)

jmnicolas · 2024-06-10T14:56:17 1718031377

So run a torrent client that doesn't use the VPN on the same ip and you're good to go?

sakebomb · 2024-06-10T13:37:16 1718026636

Interesting concept. This would work for the moment, until the providers hear about this type of detection and change the session sizes or intervals. I am unsure what this type of detection would achieve other than knowing that someone is using OpenVPN. Doing this at scale would be even more difficult.

There are quite a few workarounds which would defeat this testing methodology. Since they only tested OpenVPN, and with WireGuard becoming a bigger player (plus the Tailscales/Headscales of the world), this detection would never work.

I am surprised they didn't try this with Tor, since those people are likely to be more serious about their anonymity and privacy.

fulafel · 2024-06-11T04:48:41 1718081321

Related: https://mullvad.net/en/blog/introducing-defense-against-ai-g...

yergi · 2024-06-10T15:28:12 1718033292

This is not new or novel.

crest · 2024-06-10T17:37:50 1718041070

Do you have a source for prior art you can share?

serverlord · 2024-06-11T17:43:39 1718127819

This is interesting and scary.

gruez · 2024-06-10T13:36:04 1718026564

What's the point of this? I thought all the mainstream VPN protocols are trivially detectable via DPI?

bobbob1921 · 2024-06-10T14:07:40 1718028460

Agreed, this seems more like a timing/correlation type of attack (or threat). Assuming that the same company has the ability to change the firmware on these home routers (your ISP, let's say), it seems to me it would be much easier for that company to just look at the same timing data on the ISP's own routers. (I.e., customer IP address 24.3.1.7 has maintained an unusually long session with public IP address 70.4.1.8, which is a known VPN server / VPN provider, thus that customer must be using a VPN.)

Or, taking it a step further, that customer's IP address is communicating with known VPN server ports (or, if port 443 is being used for VPN comms, then back to the original premise: long sessions / packet counts to a single public IP address = VPN is being used).
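A sketch of those ISP-side checks over flow records, in Python. The `Flow` record, the thresholds, and the contents of the server list are assumptions for illustration; the two addresses are the ones from the example above, and 1194 and 51820 are the default OpenVPN and WireGuard ports.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    # Hypothetical, simplified flow record (not a real NetFlow schema).
    src: str
    dst: str
    dst_port: int
    duration_s: float
    packets: int

KNOWN_VPN_SERVERS = {"70.4.1.8"}   # assumed list; address from the example above
KNOWN_VPN_PORTS = {1194, 51820}    # OpenVPN and WireGuard default ports

def vpn_suspects(flows, min_duration_s=3600.0, min_share=0.9):
    """Map each customer IP to the set of reasons it looks like a VPN user."""
    by_src = {}
    for f in flows:
        by_src.setdefault(f.src, []).append(f)

    suspects = {}
    for src, fs in by_src.items():
        reasons = set()
        per_dst = {}
        for f in fs:
            per_dst[f.dst] = per_dst.get(f.dst, 0) + f.packets
            if f.dst in KNOWN_VPN_SERVERS:
                reasons.add("known-server")
            if f.dst_port in KNOWN_VPN_PORTS:
                reasons.add("known-port")
        total = sum(per_dst.values())
        if total > 0:
            # Fallback premise: one destination dominates and the session is long.
            top_dst, top_packets = max(per_dst.items(), key=lambda kv: kv[1])
            longest = max(f.duration_s for f in fs if f.dst == top_dst)
            if top_packets / total >= min_share and longest >= min_duration_s:
                reasons.add("long-single-destination")
        if reasons:
            suspects[src] = reasons
    return suspects

flows = [
    Flow("24.3.1.7", "70.4.1.8", 443, 7200.0, 100000),
    Flow("24.3.1.7", "192.0.2.1", 53, 1.0, 10),
    Flow("24.3.1.9", "198.51.100.1", 443, 30.0, 500),
    Flow("24.3.1.9", "198.51.100.2", 443, 40.0, 500),
]
result = vpn_suspects(flows)
```

Here only 24.3.1.7 is flagged, both for talking to a listed server and for the long single-destination session on port 443; none of this needs anything on the customer's router.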