According to the "Full Disclosure - The Internet Dark Age" paper (1), the N.S.A. can route a target who is using Tor into their "own" Tor network, where they control all the nodes, thus compromising the target's security. Does anybody know how one can be sure they are in the legitimate Tor network and not a fake one?
>Does anybody know how one can be sure they are in the legitimate Tor network and not a fake one?
I'm a bit uncertain that document is completely credible - it makes some claims that appear to be very wild, even given what we've seen so far.
But let's assume they're right about their Tor claim. Even if the NSA placed me on a controlled Tor network, I could still read my Gmail, because the NSA would have to get around Google's SSL.
Some bad endpoints have been known to try to MITM SSL before, but always with self-signed certificates, so if the user clicked through the warning, it was down to them.
Of course, if our endpoints are run by the NSA, and the NSA has pressured other CAs to sign *.google.com, we can only depend on cert pinning. But it's the NSA, so how do we know they haven't already backdoored my browser to change the pinned fingerprints...
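To be concrete about what pinning buys you here: it reduces to comparing the certificate the server actually presents against a fingerprint recorded out-of-band in advance. A minimal sketch (the pinned value is a placeholder, and this is not how any particular browser implements it):

```python
import hashlib
import socket
import ssl

# Hypothetical known-good fingerprint, recorded out-of-band in advance.
PINNED_SHA256 = "0" * 64

def matches_pin(cert_der: bytes, pinned_hex: str) -> bool:
    # Even a CA-signed MITM cert fails this check: its public key differs,
    # so its DER encoding (and hence its hash) cannot match the pin.
    return hashlib.sha256(cert_der).hexdigest() == pinned_hex

def fetch_cert_der(host: str, port: int = 443) -> bytes:
    # Grab the peer certificate in DER form via a normal TLS handshake.
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port)) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)
```

A pinning client refuses the connection whenever `matches_pin(fetch_cert_der(host), PINNED_SHA256)` is false, no matter which CA signed the presented chain. Which is why subverting the pin list in the browser itself is the remaining avenue.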
But at the end of the day, the latest I'd heard from the Snowden revelations is that Tor is a major thorn in the NSA's side, so I'm inclined to believe that over this breathless PDF that makes all sorts of outlandish claims.
EDIT: From that PDF:
> When the DSL connection is established a covert DHCP request is sent to a secret military network owned by the U.S. Government D.O.D. You are then part of that U.S. D.O.D. military network, this happens even before you have been assigned your public IP address from your actual ISP.
That stretches the bounds of credulity - it assumes that a giant government-mandated conspiracy exists between ISPs, device manufacturers, networking developers... even with what we've seen so far it's impossible to swallow.
The entire premise of that paper is misguided. It's fairly common for large providers, e.g., BT, Sprint, T-Mobile, to use several of the non-Internet-connected DOD /8s for management addresses once they've exhausted RFC 1918 space.
Ah, that seems sensible - because if the chain is me -> a -> b -> badsite.onion, and the NSA owns a & b, I'm encrypting to each of those node's keys; and in the absence of a central Torland CA, I can't trust anything but what's visible.
So even if badsite.onion used TLS, I'd be forced to verify their certificate offline, or risk:
me -> a -> b -> badsite.onion (NSA fakery) <=> torchain -> badsite.onion (real)
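To make the "I'm encrypting to each of those node's keys" point concrete, here's a toy of the layering, with XOR standing in for real onion crypto and the hop keys invented. Whoever holds a hop's session key (e.g. an NSA-run a or b) can peel that hop's layer, which is exactly why a fully attacker-owned path is fatal:

```python
def xor_layer(data: bytes, key: bytes) -> bytes:
    # XOR keystream as a stand-in for a hop's symmetric session key.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

HOP_KEYS = {"a": b"session-key-a", "b": b"session-key-b"}  # made up

def build_onion(payload: bytes, path: list) -> bytes:
    # The client wraps once per hop; the innermost layer is the last hop's.
    for hop in reversed(path):
        payload = xor_layer(payload, HOP_KEYS[hop])
    return payload

def peel(onion: bytes, hop: str) -> bytes:
    # Each relay removes exactly one layer with its own key.
    return xor_layer(onion, HOP_KEYS[hop])
```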
Remember that the URL acts as the public key. If you got the URL from a reputable source, then there's no way you could manage to get into that situation. Just like SSL, you're assured that the destination is who you think it is.
Surely the Tor protocol provides encryption equivalent to SSL/TLS, and thus only the exit node, or someone with the exit node's private key, can read any traffic for the hidden service?
Without those servers, if we assume the NSA owned the network, the entire point would be moot. With those servers... I guess the NSA would have to fuck with your Tor client or steal those servers' private keys.
The Tor consensus is digitally signed. That entire 'paper' looks like sensationalist garbage. A paragraph on each topic with no sources or analysis does not qualify as research.
> I also do not think their claims of Tor subversion hold water. From what I understand of Tor, directory information (including nodes' key fingerprints) is ultimately verified by the hard-coded keys of very few "trusted" operators of authoritative directory servers. So long as the Tor software isn't compromised, no MITM, regardless of where it's effected, will be able to subvert the user's circuit construction (of course, barring bugs in Tor and exploits higher up in the software stack). At least, that's my understanding.
However, I do assume they are very interested in capturing both short-term directory signing keys and the long-term keys which verify those short-term keys. This means the Tor servers listed as "dirservers" in src/or/config.c, but especially the nodes tor26, dizum, Tonga, gabelmoo, maatuska, dannenberg, and Faravahar, as they appear located outside the US and so are probably "game on" for the NSA. I also assume the NSA's "partners" will try the same with the authoritative directory servers in the US, which the NSA would undoubtedly use to enhance their capabilities. Those who run authoritative directory servers need to be very careful (cough Appelbaum cough).
The paper's main conclusion, "NSA/GCHQ can surreptitiously selectively MITM users' Internet connections", appears to be valid (see FOXACID servers and NSA's "man-on-the-side" attack), but this is a well-known threat model.
In my non-professional opinion, the most likely way NSA/GCHQ would attack Tor appear to be through:
1) Honeypot nodes -- Tor relays which act normally, but silently record timing and circuit routing information ("circuit 2981: $253DFF1838A2B7782BE7735F74E50090D46CA1BC -> $1041213B53CCF586093BB65D9CC4BC0B9656EF17 @t=1386775503.00312")
2) "Upstream" data analysis of regular node connections -- build statistical models of likelihood of a client circuit routing from input node X to output node Y by observing timing of transmission rate bursts
3) CNE of non-US Tor nodes -- Pwn foreign nodes to turn regular nodes into honeypots
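A crude sketch of what the statistical model in 2) is correlating; the timestamps, threshold, and scoring below are invented, and real traffic confirmation is far more sophisticated:

```python
def correlation_score(in_times, out_times, max_delay=0.5):
    """Fraction of ingress packets that have some egress packet within
    max_delay seconds after them (a toy stand-in for real statistics)."""
    matched = sum(
        1
        for t_in in in_times
        if any(0 <= t_out - t_in <= max_delay for t_out in out_times)
    )
    return matched / len(in_times)

# Invented timestamps: one egress series really is the ingress flow
# shifted by ~120 ms of network delay; the other is unrelated traffic.
ingress = [0.00, 0.40, 1.10, 1.95, 2.60]
egress_same_flow = [t + 0.12 for t in ingress]
egress_unrelated = [0.70, 1.60, 3.30]

same = correlation_score(ingress, egress_same_flow)    # scores high
other = correlation_score(ingress, egress_unrelated)   # scores lower
```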
And so I wonder: it appears little can be done about 1) and 3) above what's already being done (other than emphasizing to relay operators that Tor relays should be run on an isolated system with as few open services as possible). Tor maintainers seem to see vulnerability to 2) as a fundamental problem with low-latency mixnets, and haven't taken steps (AFAIK) to complicate that analysis. Would it be effective to do something like the following?
1) Propagate relay-to-relay latency and jitter statistics along with client circuit data
2) Have each relay build a model of incoming latency and jitter statistics for each other relay
3) Associate each new client circuit on a relay with a latency and jitter bin selected uniformly at random from the models built in 2), excluding the model associated with the true incoming relay
4) Delay all packets in that circuit by the associated latency time, +/- jitter calculated from a probability distribution created in the associated model of 2)
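A toy sketch of steps 1)-4) above, with the per-relay latency/jitter numbers invented and a uniform distribution standing in for whatever the real models would use:

```python
import random

# Step 2): this relay's model of incoming (mean latency s, jitter s)
# per peer relay; values are invented for illustration.
LATENCY_MODELS = {
    "relayA": (0.040, 0.010),
    "relayB": (0.120, 0.030),
    "relayC": (0.085, 0.020),
}

def pick_decoy_model(true_incoming_relay, rng=random):
    # Step 3): choose uniformly among the *other* relays' models.
    candidates = [r for r in LATENCY_MODELS if r != true_incoming_relay]
    return LATENCY_MODELS[rng.choice(candidates)]

def packet_delay(model, rng=random):
    # Step 4): mean latency +/- jitter, drawn from a uniform distribution.
    mean, jitter = model
    return max(0.0, mean + rng.uniform(-jitter, jitter))

# All packets on one circuit share the decoy bin chosen at circuit setup.
circuit_model = pick_decoy_model("relayA")
delays = [packet_delay(circuit_model) for _ in range(5)]
```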
This might be in combination with a change that removes or smooths bandwidth spikes by inserting dummy traffic between relays.
With both strategies, a) latency would be worse, and b) total available bandwidth would be lower. It looks like there's considerably more advertised bandwidth than there is used bandwidth, so b) might be OK. But a) might be a big problem for usability.
One more harebrained scheme: have each relay pick a few 'buddy' relays whose total advertised available buddy bandwidth sums to ~50% of this relay's estimated bandwidth. For each new client circuit, flip a coin and determine whether this circuit's traffic will be forwarded directly, or given to a buddy to forward. Similarly, the buddies each do the same with this relay. Associate 'chosen' client circuits with a buddy that tends to make a symmetric and smooth bi-directional data flow between the buddies.
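The buddy scheme might look roughly like this; the relay names, bandwidths, and the greedy selection rule are all made up for illustration:

```python
import random

def pick_buddies(self_bandwidth, candidates):
    """Greedily pick buddies (highest advertised bandwidth first) until
    their total reaches ~50% of this relay's estimated bandwidth."""
    buddies, total = [], 0
    for relay, bw in sorted(candidates.items(), key=lambda kv: -kv[1]):
        if total >= self_bandwidth / 2:
            break
        buddies.append(relay)
        total += bw
    return buddies

def forward_via(self_name, buddies, rng=random):
    # Per new client circuit: coin flip between forwarding directly
    # and handing the circuit to one of the buddies.
    if rng.random() < 0.5:
        return self_name
    return rng.choice(buddies)
```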
OK, stream of consciousness over... I really got off on a tangent. I'm very interested to see why my lame schemes won't work, so please point out why I'm wrong (as I assume I probably am) :).
The problem with a timing attack is that you need to put a large number of nodes into the network, and the Tor stewards monitor and blacklist suspicious arrivals. If you wanted to do this, you'd need to use a diverse set of IPs gradually over time. It's definitely possible though, I agree.
But an observer who can see traffic between most Tor nodes doesn't need to run Tor relays to do attack scenario 2). I'm not sure how far that gets you, though. Maybe 2) in combination with just a few honeypot nodes would be most effective. And in any case, NSA can certainly rent an essentially-unlimited number of cloud server instances under a front company, no?
>And in any case, NSA can certainly rent an essentially-unlimited number of cloud server instances under a front company, no?
You might not even need to. Depending on how many ISPs have QUANTUM boxes installed you could just squat on unallocated IPs and route traffic back to Fort Meade, yea?
> I also do not think their claims of Tor subversion hold water. From what I understand of Tor, directory information (including nodes' key fingerprints) is ultimately verified by the hard-coded keys of very few "trusted" operators of authoritative directory servers.
I'm very fascinated by this. Do you have any links to a faq/technical entry that focuses on these directory servers and their application?
There is a small set (say, around 5-10) of semi-trusted directory
authorities. A default list of authorities is shipped with the Tor
software. Users can change this list, but are encouraged not to do so,
in order to avoid partitioning attacks.
Every authority has a very-secret, long-term "Authority Identity Key".
This is stored encrypted and/or offline, and is used to sign "key
certificate" documents. Every key certificate contains a medium-term
(3-12 months) "authority signing key", that is used by the authority to
sign other directory information. (Note that the authority identity
key is distinct from the router identity key that the authority uses
in its role as an ordinary router.)
Routers periodically upload signed "router descriptors" to the
directory authorities describing their keys, capabilities, and other
information. Routers may also upload signed "extra info documents"
containing information that is not required for the Tor protocol.
Directory authorities serve router descriptors indexed by router
identity, or by hash of the descriptor.
Routers may act as directory caches to reduce load on the directory
authorities. They announce this in their descriptors.
Periodically, each directory authority generates a view of
the current descriptors and status for known routers. They send a
signed summary of this view (a "status vote") to the other
authorities. The authorities compute the result of this vote, and sign
a "consensus status" document containing the result of the vote.
Directory caches download, cache, and re-serve consensus documents.
Clients, directory caches, and directory authorities all use consensus
documents to find out when their list of routers is out-of-date.
(Directory authorities also use vote statuses.) If it is, they download
any missing router descriptors. Clients download missing descriptors
from caches; caches and authorities download from authorities.
Descriptors are downloaded by the hash of the descriptor, not by the
relay's identity key: this prevents directory servers from attacking
clients by giving them descriptors nobody else uses.
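The trust rule the excerpt describes boils down to something like the following sketch. Authority names stand in for the real hard-coded identity keys, and `verify_signature` is a placeholder for real cryptographic verification; this is not Tor's actual code:

```python
# Stand-ins for the identity-key fingerprints shipped with the client.
HARDCODED_AUTHORITIES = {"moria1", "tor26", "dizum", "gabelmoo", "dannenberg"}

def verify_signature(consensus_bytes, signature, authority):
    # Placeholder: a real client verifies a cryptographic signature made
    # with the authority's (certified) medium-term signing key.
    return signature == b"sig-by-" + authority.encode()

def consensus_is_trusted(consensus_bytes, signatures):
    """signatures: dict mapping authority name -> signature blob.
    Accept the consensus only if a majority of known authorities signed."""
    valid = sum(
        1
        for authority, sig in signatures.items()
        if authority in HARDCODED_AUTHORITIES
        and verify_signature(consensus_bytes, sig, authority)
    )
    return valid > len(HARDCODED_AUTHORITIES) // 2
```

The upshot for the MITM claim: an attacker who controls the wire but not a majority of authority keys can withhold or replay a consensus, but can't forge one that passes this check.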
I agree, I2P is interesting. I don't know why it doesn't get more press or support. According to the website, some reps of the I2P team will be at 30C3 in 2 weeks' time, btw.
There are 7 directory authorities in Tor; if they don't function, the Tor network is dead. So that is just seven people you would need to abduct and torture (to take over the Tor network), or 7 drone strikes to kill all Tor traffic.
I am unsure, but I think this is not the case for I2P.
EDIT:
So that might be a reason for the NSA to support Tor over I2P.
Given the need for critical mass, is there any way some or all of the advantages of i2p could be retrofitted to tor, or are the differences more fundamental?
Two such tunnels are used when Alice and Bob communicate, for a total of 4 nodes for the communication between Alice and Bob.
With Tor hidden services, it is the exact same scheme, but with 4 nodes per tunnel rather than 3 (or 3 nodes rather than 2, if you do not count Alice/Bob as nodes). In total, 6 nodes are used for the communication between Alice and Bob. (https://www.torproject.org/docs/hidden-services.html.en)
Please explain how this is incorrect. Are the tunnels as described in the I2P documentation just illustrations of how a tunnel might look, rather than how it actually is in practice? How does it solve the problem that Tor fixes with guard nodes (as this is the context in the above comments and article)?
I am very interested in I2P, but the above details/questions have held me back.
Since the above tech-intro documentation seemed contradictory, I looked around and found a better page at http://www.i2p2.de/how_tunnelrouting.html which explains what you said. Tunnels have a max length of 7, and the length is easily configurable in a session config file.
Thanks, I'm going to take a second look now that that's cleared up.
The client always selects the route on any onion-routing network. They have to, because only they know the route. It's been a while since I played with I2P, but I'm 95% certain you can push the circuit length up.
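For reference, in I2P the tunnel length is set via I2CP session options; the option names below are from the I2P docs, but where you set them depends on the application, and the values are illustrative:

```ini
# Longer tunnels: 5 hops +/- 1 in each direction (defaults are shorter).
inbound.length=5
inbound.lengthVariance=1
outbound.length=5
outbound.lengthVariance=1
```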
I don't like how this title oversimplifies the causes of the growth down to the NSA scandal. I'd guess Silk Road also played an important role in this increase, and it is not mentioned.
Whether using Tor on a daily basis is workable depends on how people surf the web.
If you are one of those people who click on news articles all day long, maybe read comments, then the added latency doesn't really matter. You go to HN and open all the interesting articles in new tabs. By the time you reach the end of the HN front page, the first article has loaded.
(1) http://cryptome.org/2013/12/Full-Disclosure.pdf