[flagged] NSA tracks Google ads to find Tor users (cnet.com)
105 points by fury999io 5 months ago | hide | past | favorite | 69 comments



Why would you exhume an 11-year-old article full of half-information about the state of a fast-paced arms race?


Why are tons of old articles reposted daily?


[2013]


I'm not understanding this. If I use the "Tor Browser Bundle" and never use that browser for anything but Tor, and never log in to anything on that browser, how can they track me?


Tor is not resilient against timing correlation attacks.

Suppose Alfred hosts a Tor onion describing relativistic physics.

Suppose Bob cautiously uses Tor to consult such information on a regular basis.

Then, depending on privileged access or leverage on the internet backbone, multiple approaches can be used:

A) suppose some regions randomly suffer internet or power blackouts. Obviously a Tor onion that keeps interacting with the Tor network is not in that region, while a Tor onion disconnected for the duration of such an event is possibly/probably in one of those regions. The same holds for Tor Browser users.

B) instead of waiting for spontaneous events they can be elicited (costly in case of internet blackouts, very costly in case of energy blackouts).

C) instead of disabling participation, one can randomly stall it: if ISPs at both ends cooperate or are compromised, network packets can be intentionally given known pseudorandom delays on top of the spontaneous delays. By calculating the correlation of the delays one can identify which Tor user IP address is frequenting which Tor onion host IP address. This works even if the added delays are smaller than the spontaneous delays: the spontaneous delays are uncorrelated with the injected delays, so their correlation with the injected sequence averages towards 0, whereas the delays observed on the matching flow contain the injected sequence and therefore correlate with it. The number of packets needed for true positives to rise above the noise floor depends on the relative sizes of the spontaneous variation in delays and the injected delays. If the injected delays are smaller, it will take many more packets before true positives rise above the noise floor.
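
A minimal simulation of approach C, as a sketch only: the delay distributions, the packet count and the number of candidate flows are made-up assumptions, not measured Tor values, and `correlation_scores` is a hypothetical helper name. It shows that a known injected delay sequence can single out the matching flow even when the injected delays are much smaller than the natural jitter.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def correlation_scores(n_packets, injected_scale=1.0, natural_scale=5.0,
                       n_flows=20, seed=0):
    """Correlate a known injected delay sequence against every candidate flow.

    Flow 0 is (by assumption) the flow that really carries the injected
    delays; all other flows only show spontaneous jitter.
    """
    rng = random.Random(seed)
    # Delays the attacker adds on purpose and therefore knows exactly.
    injected = [rng.uniform(0, injected_scale) for _ in range(n_packets)]
    scores = []
    for flow in range(n_flows):
        # Spontaneous delays, independent of the injected sequence.
        observed = [rng.gauss(50, natural_scale) for _ in range(n_packets)]
        if flow == 0:
            observed = [o + d for o, d in zip(observed, injected)]
        scores.append(pearson(injected, observed))
    return scores

if __name__ == "__main__":
    scores = correlation_scores(20000)
    best = max(range(len(scores)), key=lambda i: scores[i])
    print("flow with the highest correlation:", best)
```

Lowering `n_packets` or `injected_scale` makes the target flow harder to tell apart from the rest, which is the noise-floor trade-off described above.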

This article is from the time of the Snowden leaks, more than 10 years ago.

The moment they have correlated the traffic on your ISP's end, with the traffic on the specific Tor onion's ISP's end, they can just ask your ISP for your true name.

In this case the experts were convinced cookies were used, which is conceivably correct for a fraction of the users. The cookies and ads were probably multifunctionally abused: tracking random browsing, spam email for lucky hits, propagation delay injection of the advertisement packets, ...


This is understood, but that doesn't make "Google Ads" a way to exploit this.


Something I have been meaning to write up for a long time but never got around to:

I assume the reader knows the basics of asymmetric cryptography. For the sake of brevity and simplicity let us consider RSA, even though that's not the onion encryption Tor uses. I assume the reader is familiar with the mathematics behind RSA and the basic proof that decrypting the encrypted number results in the original number, so familiarity with modular arithmetic, modular exponentiation, etc. is assumed...

I assume the reader knows the basic concept of onion routing: the sender of a packet chooses an arbitrary path through routing nodes, whose public keys are known, and first encrypts the packet for the exit node's public key, then encrypts that for the next-to-last node's public key, and so on in a backwards fashion to finally encrypt the onion packet for the first routing node's public key. At each layer a bit of metadata is encrypted along so the routing nodes know only the next node to send their decryption to. So the N-layer encrypted packet is sent to the first routing node, which decrypts the first layer, splits the metadata from the N-1-times encrypted packet, and sends the latter to the next node mentioned in the metadata.

From the perspective of an ISP or 3 letter agency monitoring the traffic of a specific intermediate routing node, they see encrypted packets arrive, and encrypted packets leaving.

Let me first state the obvious, but which I will NOT rely on:

If the eavesdropper were to possess the capability to break RSA, they could trivially decrypt the packet and associate the incoming packets to the outgoing packets. (let us ignore that if they could break RSA, they could just decrypt the whole layered onion of the packet at once...).

To transliterate to math:

EavesDropperAbleToBreakRSA => EavesDropperAbleToTrackPackets

Given "A => B" and "not A", one is unable to prove "not B", although it is tempting to jump to that conclusion. B can be true while A is false; it would just mean that the eavesdropper could track packets in an alternative manner, but how?

Let's go back to our hypothetical naive RSA implementation of Tor:
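
The logical point can be checked mechanically. A small sketch (the `implies` helper is just an illustration) that lists every truth assignment consistent with the premises "A => B" and "not A", and shows that B is left undetermined:

```python
from itertools import product

def implies(a, b):
    """Material implication: "a => b" is false only when a is true and b is false."""
    return (not a) or b

# Keep only the worlds in which both premises hold: "A => B" and "not A".
consistent = [(a, b) for a, b in product([False, True], repeat=2)
              if implies(a, b) and not a]

# B takes both values across the consistent worlds, so "not B" does not follow.
possible_b = {b for _, b in consistent}
print(consistent)   # [(False, False), (False, True)]
print(possible_b == {False, True})   # True
```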

Is it really necessary to break RSA to match incoming and outgoing packets of an intermediate node?

Of course not: imagine first for simplicity that the node only received 2 incoming packets, and 2 outgoing packets.

This means the eavesdropper sees 2 incoming k+1-times encrypted packets and 2 outgoing k-times encrypted packets, which happen to be the decryptions of the incoming packets. Why break RSA if the outgoing packets ARE the decryptions? One merely needs to re-encrypt the outgoing packets with the proper metadata, using the routing node's public key, and one should end up with exactly one of the 2 incoming packets. So consolidating ISP powers, or other attackers able to monitor network traffic on a sufficient number of nodes, can simply track packets in the onion network. Effectively the k+1-times encrypted packet is an RSA signature of the k-times encrypted packet!!!
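
A toy sketch of that matching step for the hypothetical naive scheme only. It assumes textbook RSA with no padding and no randomness (so encryption is deterministic), a tiny illustrative key, and that the routing metadata is already folded into each integer payload; `match_packets` is a made-up helper name. With randomized padding the re-encryption would not reproduce the incoming bytes.

```python
# Toy textbook RSA key for one routing node (illustration only, far too small).
P, Q = 61, 53
N = P * Q                                # public modulus
E = 17                                   # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent (Python 3.8+)

def encrypt(m, e=E, n=N):
    """Deterministic textbook RSA encryption with the public key."""
    return pow(m, e, n)

def decrypt(c, d=D, n=N):
    """What the routing node does with its private key."""
    return pow(c, d, n)

def match_packets(incoming, outgoing, e=E, n=N):
    """Pair each outgoing packet with the incoming packet it came from,
    using nothing but the node's public key."""
    seen = set(incoming)
    pairs = {}
    for out in outgoing:
        candidate = encrypt(out, e, n)   # re-encrypt the observed plaintext
        if candidate in seen:
            pairs[out] = candidate
    return pairs

if __name__ == "__main__":
    inner = [1234, 2021]                               # k-times encrypted payloads
    incoming = [encrypt(m) for m in inner]             # what arrives at the node
    outgoing = [decrypt(c) for c in reversed(incoming)]  # node forwards in shuffled order
    print(match_packets(incoming, outgoing))
```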

Suppose a random route is 5 hops long and that there are 30 routing nodes (not realistic but insightful as we will see).

Suppose only the entry node packet and the exit node packet are logged, but not the intermediate traffic. How computationally expensive would it be to guess and verify the route?

That would be 30 times 29 times 28 times 27 times 26 combinations. Each combination would consist of 5 encryptions/signature checks. Very feasible to brute force.

The reason this is insightful is that a dominant eavesdropper missing observability on a small number of links can brute force these without having to break RSA, and still verifiably confirm the actual route. It would only need to consider the public keys of nodes on which it lacks observability. So this becomes expensive much more quickly for entities that have less eavesdropping infrastructure than for dominant eavesdroppers.

A security researcher who understands this potential ploy in onion routing networks will have a hard time proving the exploit in practice, because the researcher lacks the eavesdropping powers that ISPs and 3 letter agencies possess.
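
The arithmetic for that estimate, as a short sketch (`brute_force_cost` is a hypothetical helper; the 30 nodes and 5 hops are the toy numbers from above):

```python
import math

def brute_force_cost(nodes, hops):
    """Number of ordered routes of `hops` distinct nodes, and the number of
    re-encryptions needed to test every one of them."""
    routes = math.perm(nodes, hops)      # 30 * 29 * 28 * 27 * 26 for (30, 5)
    return routes, routes * hops

routes, encryptions = brute_force_cost(30, 5)
print(routes)        # 17100720
print(encryptions)   # 85503600
```

So the worst case is on the order of 10^8 public-key operations for this toy network.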


Because their targets are not nerds using Tor to access their own machines or random 4ch clones.

They target politicians, whistleblowers and journalists.

If you have ever volunteered for organizations helping those people, you quickly learn that this group is not very tech literate, has cheap, limited devices, and skips instructions.


I’m not aware of any sandbox escape attack for the Tor browser, but I am no expert. If there is one, even limited, it’d probably be enough to figure out a way to track you down.


The only way to anonymize it enough, while also defeating any attempt at cookie/malware injection, would in my view be to create a VM with the strict minimum needed to run Tor Browser and clone it for single use. That is, a script that clones the VM, opens it, and lets you use Tor Browser; then, as you close the browser, the VM is also closed and deleted. The script could also create the next VM, changing bits here and there for added anonymization (OS and browser signature, screen and window size, mouse settings, etc.) while the old one is still running, to save time.


This is largely the motivation for TAILS: https://en.wikipedia.org/wiki/Tails_(operating_system)

And also Whonix: https://en.wikipedia.org/wiki/Whonix

Using the DVD image in a VM would largely suffice for most users. For even more security, you would use the live image on a throwaway laptop at a coffee shop or something, but that's not exactly practical for everyday use.


I remember I used immutable VMs for a lot of testing - those reset to the last saved state when shut down and made a lot of tests more easily repeatable. This is a sensible thing to do for privacy as well - any cookies you gain during your session you shed on reboot.



Does Tor Browser disable cookies? I think not.

You don’t have to login to be given a cookie that’s then stored and tracked across each new IP that Tor cycles through.


This is trivially searchable. Tor Browser doesn't store cookies.

https://support.torproject.org/glossary/cookie/


You only have to mess up once.

Google has programs where they can identify budding extremists and correlate behavior to medical diagnoses without HIPAA exposure.

If your secret weird shit that you’re doing with Tor is of interest, they’ll eventually get a profile. Using Tor is like setting off the bat signal.


what is this program called?

The fact that we tolerate this shit is unbelievable.


I believe it’s called Opioid 360 now. It’s a collab with Deloitte.

They did similar work with ad campaigns to defuse individuals who were in danger of becoming extremists, etc.


> The fact that we tolerate this shit is unbelievable.

The fact that we tolerate it is relatively expected; the fact that Snowden leaked 90% of this stuff a decade ago and nobody cares is what's unbelievable. We kinda deserve to be surveilled if this kind of apathy is what dominates our behavior and executive function.


> nobody cares is what's unbelievable

Approximately zero Americans think this affects them, and the usual tropes "If you have nothing to hide, why do you use curtains?" result in "That's different, duh."

A few are split whether they care Ring video is available to law enforcement, most think it's a benefit.

Interestingly, all care quite a lot whether an AirBnB host has cameras inside the property. Privacy suddenly matters.

At the same time, most shrug if you assert the government has all their emails and social messaging, in a "What are you gonna do?" and "If they want to read all that more power to 'em…" way.

For people to care about most anything abstract, they must believe it affects them personally, and be able to both picture and believe a credible bad outcome.

While the AirBnB creep works, it seems everything else is filed under "That's about someone else, not me."


Yes, literally unbelievable because it doesn't exist.


> "The NSA then cookies that ad, so that every time you go to a site, the cookie identifies you. Even though your IP address changed [because of Tor], the cookies gave you away," he said.

In other words, just using tails will solve this issue because every session gives you a clean environment.
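
A toy model of the cookie trick from the quote and of why a clean environment per session defeats it. Everything here is an illustrative assumption: `AdServer` is a made-up stand-in for an ad network, the cookie jar is a plain dict, and the IP addresses come from the reserved documentation ranges.

```python
import itertools
from collections import defaultdict

class AdServer:
    """Hypothetical ad server that tags browsers with a persistent ID cookie."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.sightings = defaultdict(set)    # cookie id -> source IPs seen with it

    def serve(self, source_ip, cookie_jar):
        # Set the cookie on first contact, read it back on every later request.
        if "uid" not in cookie_jar:
            cookie_jar["uid"] = next(self._ids)
        self.sightings[cookie_jar["uid"]].add(source_ip)

def max_linked_ips(server):
    """Largest number of distinct IPs the server can tie to one cookie."""
    return max(len(ips) for ips in server.sightings.values())

exit_ips = ["198.51.100.1", "203.0.113.7", "192.0.2.9"]   # changing exit addresses

# Same cookie jar reused across sessions: one ID follows the user across all IPs.
persistent = AdServer()
jar = {}
for ip in exit_ips:
    persistent.serve(ip, jar)

# Fresh cookie jar per session: every visit looks like a new browser.
fresh = AdServer()
for ip in exit_ips:
    fresh.serve(ip, {})

print(max_linked_ips(persistent))   # 3
print(max_linked_ips(fresh))        # 1
```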


Key takeaways.

“The NSA buys ads from ad display companies like Google and seeds them around Tor's access points.”

"On the off chance that [the spam recipient] renders the HTML or clicks a link, [the NSA] can connect your e-mail address to your browser," he explained, which the NSA would have already connected to an IP address. "Using Tor or any proxy wouldn't prevent it."


If the takehome message is "run an ad-blocker with your Tor Browser, to be safe", hopefully bad people believe that, and good people don't.


Yet another reason to purge cookies often :)

Every time I log into a site that I want to buy something from, I always clear cache, cookies, and logins before and after using that site.

Yes, it can be a PITA, but I think that stops other sites from seeing which websites you really care about.


Install a cookie autodelete extension. That will let you whitelist cookies you want for persistent logins and discard the rest. They can usually be configured to purge on tab closing.


Autodelete is old tech if you ask me. If you open the sites before the autodelete happens, then the tracking still happens. Temporary Containers is an addon that solves this elegantly.


This may be dumb:

Wouldn't an "AI-Container-as-a-Browser" be nifty:

Create a container that runs an AI agent that does your browsing "for you", whereby it does the connection, cookie management, and anonymous Tor wrapping as required/set/needed, such that you have an abstraction between you and all your browsing, and the browser can dev/null the ads, never let them render, and poison the reply with synthetic data, crafting all packets that go back to the cookie providers?

I also want it to auto-crawl and delete PII from all ad / identity brokers / white-pages / scam-spam. A "Delete me from the internet" bot.

I really really want this.


That really is like how Stallman is described to use the internet, just with AI.

Regarding deletion, did you know about this service? https://joindeleteme.com/


Yeah but no. I want to know how to instruct gptbots to spit out actionable code snippets that I can run to have stuff deleted. I don't want to pay an Opterly or join-delete-me a monthly subscription to delete my footprint.

I know their services are valuable, I'm not against them - I am saying that in the age of GPT-code-slave-bots I'd rather learn the process of figuring out how to tell the AI what I want and I also iteratively learn through the process.

Its wonderful being able to explore ideas so fluidly with the GPTs even though we know/discover their limitations, mal-intent, and other filters/guardrails/alignments and allegiances


> I am saying that in the age of GPT-code-slave-bots I'd rather learn the process of figuring out how to tell the AI what I want and I also iteratively learn through the process.

Why?

It's your choice, but everyone knows AI is like poison for deterministic problem-solving. Learning how to better rely on an unreliable machine only guarantees that you're feeble when you have to do something without it. Like relying on autopilot when you don't know how to fly a plane, or trying to get HAL-9000 to open the airlock when you weren't trained on the manual process.

Using AI to automate takedown requests is just pointless. The only reason automated takedowns work is that their automated messages are canned and written by lawyers with a user as the signatory. If you have AI agents write custom and non-binding requests to people that hold your data, nobody will care. At that point you may as well copy-and-paste the messages yourself and save the hassle of correcting a brainless subordinate.

> Its wonderful being able to explore ideas so fluidly with the GPTs even though we know/discover their limitations, mal-intent, and other filters/guardrails/alignments and allegiances

It's as if the first-world has rediscovered the joy of reading, after a brief affair with smartphones, media paranoia and a couple election cycles dominated by misinformation bubbles. Finally, an unopinionated author with no lived-experience to frame their perspective! It's just what we've all been waiting for (besides the bias).


You misunderstand what I am saying. I LOVE reading and learning.

What I said was that I like to tell the bots to give me a Python snippet to do a chore, explain to me how they are doing it and teach me along the way, and document the functions so I can learn, read them, and know what they do.

For example, an HNer posted their VanillaJSX Show HN today, and it had some interesting UI demos, so I am right now building a Flask app that uses VanillaJSX elements for a player.js tied to yt-dlp, to have a locally hosted YouTube front-page player that will download the YouTube video locally and display my downloads on a page with a player and the VanillaJSX player elements, just so I can see if I can.


But why would you use AI for that when it's inferior to the Flask docs and the VanillaJSX example code? I've written projects with ChatGPT in the driver's seat before, you end up spending more time debugging the AI than your software. For topics like history or math I would expressly argue that relying on AI as a study partner only makes you dumber. The way it represents code really isn't too far off either.

Again, it's your choice how to spend your time. I just cannot fathom the idea of learning from an LLM, a tool designed to expand on wrong answers and ignore the discrimination between fantasy and reality. It feels to me like this stems from a distrust in human-written literature, or at least a struggle to access it. Maybe this is what it feels like getting old.


When one doesn't know how to frame the questions succinctly, a GPT can structure a vague request into a digestible response... plus, you act like someone using the LLM is completely devoid of their own discernment capabilities or intelligence.

When I pick up a hammer, I don't expect it to build the actual house. But when I intentionally and willfully use it on its designated task, it's the same as wielding a GPT/AI: you have to be really specific.

I admit I totally agree about having to debug the code that it generates. But I know how to goad it toward my intent of learning a thing.

Also - I wrote up extensively about using a discernment lattice to attempt to corral the AI's "expertness" as much as possible to keep it on a subject.

I also force it to use, cite and describe sources I tell it to use when I am telling it to be an expert.

https://i.imgur.com/Fi5GYRl.png

https://i.imgur.com/wlne9pT.png

https://i.imgur.com/Ij8qgsQ.png


Blocking third-party cookies will end most tracking. Block their JS and you get the benefit of a faster browser.


I do that as well. The full setup is Firefox with Strict protection, Multi-Account containers, Temporary Containers, Privacy Settings, Decentraleyes, and uBO. What's unfortunate is I start getting sites that don't like this treatment at all. IKEA straight-up tells me that I'm probably a bot so it won't let me log in, but other sites have random broken functionality as well.


If you’d like to, feel free to reach out to me on robin.whittleton@ingka.ikea.com so that we can try to fix that IKEA behaviour. Or file an issue on webcompat.com and I’ll track it from there.


What is the benefit of an extension over just configuring your browser to delete cookies by default?


> That will let you whitelist cookies you want for persistent logins and discard the rest


What browser can't do that?


I don't see a way to exclude certain site cookies from deletion in Chrome. Which browser have you seen with this option?


Cookie Autodelete can trigger on tab close rather than app exit


I use Firefox with Temporary Containers. Each tab is a brand new context automatically. Tabs don't talk to each other, each of them is separate - although, if you want, you can open a new tab in the same context, and even make permanent contexts. Closed tabs' contexts get purged after some minutes.


I've opted to just incognito the majority of my browsing. Likewise it's annoying to keep logging in and sometimes I wish I had something from my history. On the other hand, I never have to worry about cookies


Except even incognito persists and shares cookies across all incognito sessions for as long as at least one incognito window is open. Cookies will not be erased until you close every window.


Yep. So if I'm in a mood for music, YouTube learns what vibe I'm going for. Full algorithmic power bends to my will. Then I close the browsers and get a clean slate. Perfect


Firefox has an old slightly broken feature that wipes all cookies, except from a set of origins you whitelist. It actually saved me about half a gigabyte of pure cookie nonsense and made website loading quite a bit faster. Soon after I set it up, they announced their third-party cookie sandboxing, but I still think there's no reason to keep all the adtech trash on your computer in any capacity.

about:preferences -> Privacy & Security -> Cookies and Site Data -> [x] Delete cookies and site data when Firefox is closed -> Manage Exceptions

Just don't forget to back up your `~/.mozilla/firefox/*.default-release/cookies.sqlite*` beforehand.
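
A small sketch of that backup step. It assumes the Linux profile layout given in the path above and is best run with Firefox closed so the files are not changing mid-copy; `backup_firefox_cookies` and its parameters are hypothetical names, not part of Firefox.

```python
import glob
import os
import shutil

def backup_firefox_cookies(dest_dir, profiles_root="~/.mozilla/firefox"):
    """Copy cookies.sqlite* from every *.default-release profile into dest_dir.

    Files are grouped per profile directory name; returns the copied paths.
    """
    pattern = os.path.join(os.path.expanduser(profiles_root),
                           "*.default-release", "cookies.sqlite*")
    copied = []
    for src in glob.glob(pattern):
        profile = os.path.basename(os.path.dirname(src))
        target_dir = os.path.join(dest_dir, profile)
        os.makedirs(target_dir, exist_ok=True)
        # copy2 keeps timestamps and returns the destination path.
        copied.append(shutil.copy2(src, target_dir))
    return copied

if __name__ == "__main__":
    for path in backup_firefox_cookies(os.path.expanduser("~/firefox-cookie-backup")):
        print("backed up", path)
```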


Imagine the world without internet ads. Journalism wouldn't be a click bait race to the bottom, news would still be relatively unbiased, and the nsa would have one less massive vector to track you with.

I'm honestly just waiting for people to realise that online ads are the root cause of most of the things people complain about.

Fake news? Check

Surveillance state? Check

Screen addiction? Check

Lack of nuance in any debate? Check

Unsavoury geopolitical influence? Check

The advertising industry somehow manages to stay relevant, despite the fact that their business is literally the same as the dictionary definition of brain washing.

Ah well, old man yells at clouds...


You can even imagine an alternate reality where the central ad serving entity emerged as pull-based instead of push-based.

Ads can, in theory, serve a useful purpose, informing individuals of products and services which would legitimately make their lives better (e.g., I bought a low-end immersion blender a year or two ago, didn't know they existed too far beforehand, and am quite happy with the ease/safety improvements over any other blending strategy I used to have, especially for bulk and/or hot liquids, especially compared to what I paid and how much space it takes, but without _some_ kind of ad I might never have known about the product (not a perfect example, since I learned about them from a friend, but hypothetically)).

The push-based ad ecosystem has a tacit assumption that people don't want the products and services being sold. That's a mostly true assumption, but instead of the solution being filtering to better products, well-vetted products, avoiding added-cost-without-added-benefit lookalike products, not advertising outright frauds, ..., the industry has opted for more invasive ways of forcing us to watch things we won't ever care about and siphoning invasive tax/healthcare/... information to slightly reduce the miss-rate in ad serving.

That's probably inevitable without regulation (it's cheaper to bully people into watching ads than to improve your ad inventory, with the side benefit that as an ad network you profit when suckers fall for the frauds too, plus it's easier to charge the company making money instead of the end consumer, so a profit-focused company will naturally swim that direction). As an alternative business model though, imagine great search tools on top of a pool of better ad inventory, where you could choose the demographic info and interests you wanted to be considered for a particular search session instead of having that inferred from your browsing history and the raw copies of your paystubs your employer is likely selling.


I guarantee you would have learnt about that immersion blender without Adtech.

I'm not saying the industry can't be useful, I'm saying that it's broken.


The root cause? You've missed the giant elephant that's standing right there in the room. It's strange how the elephant managed to make itself invisible. I agree the elephant shit is really bad and really stinks, we should do something about the shit, but we should also do something about the elephant.


What's the elephant? I feel like we agree, but we're talking past each other.


Either a “veiled” political jab, or someone implying that money is the elephant, and as soon as we cure that pesky human nature thing all the problems are solved.


liberalism isn't human nature, thinking it's an unavoidable immutable natural law instead of just another ideology we decided upon and can change, is exactly why it's invisible to people.


(neo)liberalism


Sure. But it's also an Internet without a lot of the web sites people enjoy.

If it ends the "brainwashing" it would be because people would not be on the Internet at all. And maybe that's a net good for the world. But here are you and me, on a web site that is itself basically an advertisement for a VC firm.


There's an even deeper foundation to this problem: copyright.

Artists must get paid, or they will either starve or stop making art.

This is the fundamental threat we have structured our entire civilization around. Art must be labor. Labor must have monetary value. Without income, people must starve and die.

To support this system, we have the most untenable law of the digital age: copyright. The most trivial act, to copy data, shall be monopolized.

But copyright didn't stop there. It grew. We use it to censor. We use it to moderate. We use it to end fraud. We use it to prevent libel. We use it to guarantee collaboration of work. Copyright has become the swiss army knife of law.

When a dull knife slips, it cuts deep.


How would removing ads help with any of that stuff? Desire for power and influence are not byproducts of ads.


Ads are an assault on the mind


Ads literally dictate what you see online. They don't create the desire for power and influence, but they do create the structure for achieving power and influence.


You think ads are the only way for power and influence to be achieved?


Of course not, they're just the easiest.


But ads are an inevitable effect, no matter if they are online or "offline", except if you think that ads should be prohibited which would turn to a different discussion.


Oh, we're in that discussion. I have yet to see a valid argument for advertising of any sort that outweighs the negatives that the industry currently displays.

We're well beyond the arguments for the recommendation graphs and the open market. Adtech, as it's currently practiced, is basically rage farming in disguise.

Gotta maximise that emotional quotient.


My comment was neither against nor in favor of ads. I have a bad feeling about ads like everyone else, but that doesn't mean I think they should be prohibited.


> We're well beyond the arguments for the recommendation graphs and the open market.

Why? If people prefer these options, I don't see how forcing them into an alternative is any better. People on X could enjoy a free experience with no ads if they used Mastodon instead, but they actively seek out X. They want an ads-included package because they feel that the value is stronger than the alternatives.

I despise ads, but I don't think it's fair to characterize ads as market abuse any more than paid services are abusive. People consider the deal fair, they don't care about surveillance capitalism and they want to watch their YouTube video.


We passively seek out community, and X is one such. The advertising ROI on X is probably dropping rapidly as we speak. I regard this as a good thing.


I don't have to imagine it, every device I browse the web on has uBlock Origin and I've refused to use anything else for the past 5 years.


[flagged]


This is a three hour advertorial for antipsychotic drugs presented as two homeless crazies waiting for a bus.

I actually thought Thiel was more coldly rational than this nonsense, so I appreciate the enlightenment after reading the transcript.

Note/Disclaimer: I have sufficient Gates Foundation inner knowledge to know this is just nuts.



