According to the "Full Disclosure - The Internet Dark Age" paper (1), the N.S.A. can route a target who is using Tor into their "own" Tor network, where they control all the nodes, thus compromising the target's security. Does anybody know how one can be sure they are in the legitimate Tor network and not a fake one?
>Does anybody know how one can be sure they are in the legitimate Tor network and not a fake one?
I'm a bit uncertain that document is completely credible - it makes some claims that appear to be very wild, even given what we've seen so far.
But let's assume they're right about their Tor claim. Even if the NSA placed me on a controlled Tor network, I could still read my Gmail, because the NSA would have to get around Google's SSL.
Some bad endpoints have been known to try to MITM SSL before, but always with self-signed certificates, so if the user clicked through the warning, it was down to them.
Of course, if our endpoints are run by the NSA, and the NSA has pressured other CAs to sign *.google.com, we can only depend on cert pinning. But it's the NSA, so how do we know they haven't already backdoored my browser to change the pinned fingerprints...
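To be concrete about what pinning buys you here: it reduces to comparing the certificate the server actually presents against a fingerprint recorded out-of-band in advance. A minimal sketch (the pinned value is a placeholder, and this is not how any particular browser implements it):

```python
import hashlib
import socket
import ssl

# Hypothetical known-good fingerprint, recorded out-of-band in advance.
PINNED_SHA256 = "0" * 64

def matches_pin(cert_der: bytes, pinned_hex: str) -> bool:
    # Even a CA-signed MITM cert fails this check: its public key differs,
    # so its DER encoding (and hence its hash) cannot match the pin.
    return hashlib.sha256(cert_der).hexdigest() == pinned_hex

def fetch_cert_der(host: str, port: int = 443) -> bytes:
    # Grab the peer certificate in DER form via a normal TLS handshake.
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port)) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)
```

A pinning client refuses the connection whenever `matches_pin(fetch_cert_der(host), PINNED_SHA256)` is false, no matter which CA signed the presented chain. Which is why subverting the pin list in the browser itself is the remaining avenue.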
But at the end of the day, the latest I'd heard from the Snowden revelations is that Tor is a major thorn in the NSA's side, so I'm inclined to believe that over this breathless PDF that makes all sorts of outlandish claims.
EDIT: From that PDF:
> When the DSL connection is established a covert DHCP request is sent to a secret military network owned by the U.S. Government D.O.D. You are then part of that U.S. D.O.D. military network, this happens even before you have been assigned your public IP address from your actual ISP.
That stretches the bounds of credulity - it assumes that a giant government-mandated conspiracy exists between ISPs, device manufacturers, networking developers... even with what we've seen so far it's impossible to swallow.
The entire premise of that paper is misguided. It's fairly common for large providers, e.g., BT, Sprint, T-Mobile, to use several of the non-Internet-connected DOD /8s for management addresses once they've exhausted RFC 1918 space.
Ah, that seems sensible - because if the chain is me -> a -> b -> badsite.onion, and the NSA owns a & b, I'm encrypting to each of those node's keys; and in the absence of a central Torland CA, I can't trust anything but what's visible.
So even if badsite.onion used TLS, I'd be forced to verify their certificate offline, or risk:
me -> a -> b -> badsite.onion (NSA fakery) <=> torchain -> badsite.onion (real)
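To make the "I'm encrypting to each of those node's keys" point concrete, here's a toy of the layering, with XOR standing in for real onion crypto and the hop keys invented. Whoever holds a hop's session key (e.g. an NSA-run a or b) can peel that hop's layer, which is exactly why a fully attacker-owned path is fatal:

```python
def xor_layer(data: bytes, key: bytes) -> bytes:
    # XOR keystream as a stand-in for a hop's symmetric session key.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

HOP_KEYS = {"a": b"session-key-a", "b": b"session-key-b"}  # made up

def build_onion(payload: bytes, path: list) -> bytes:
    # The client wraps once per hop; the innermost layer is the last hop's.
    for hop in reversed(path):
        payload = xor_layer(payload, HOP_KEYS[hop])
    return payload

def peel(onion: bytes, hop: str) -> bytes:
    # Each relay removes exactly one layer with its own key.
    return xor_layer(onion, HOP_KEYS[hop])
```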
Remember that the URL acts as the public key. If you got the URL from a reputable source, then there's no way you could manage to get into that situation. Just like SSL, you're assured that the destination is who you think it is.
Surely the Tor protocol provides encryption equivalent to SSL/TLS, and thus only the exit node, or someone with the exit node's private key, can read any traffic for the hidden service?
Without those servers, if we assume the NSA owned the network, the entire point would be moot. With those servers... I guess the NSA would have to fuck with your Tor client or steal those servers' private keys.
The Tor consensus is digitally signed. That entire 'paper' looks like sensationalist garbage. A paragraph on each topic with no sources or analysis does not qualify as research.
> I also do not think their claims of Tor subversion hold water. From what I understand of Tor, directory information (including nodes' key fingerprints) is ultimately verified by the hard-coded keys of very few "trusted" operators of authoritative directory servers. So long as the Tor software isn't compromised, no MITM, regardless of where it's effected, will be able to subvert the user's circuit construction (of course, barring bugs in Tor and exploits higher up in the software stack). At least, that's my understanding.
However, I do assume they are very interested in capturing both short-term directory signing keys and the long-term keys which verify those short-term keys. This means the Tor servers listed as "dirservers" in src/or/config.c, but especially the nodes tor26, dizum, Tonga, gabelmoo, maatuska, dannenberg, and Faravahar, as they appear located outside the US and so are probably "game on" for the NSA. I also assume the NSA's "partners" will try the same with the authoritative directory servers in the US, which the NSA would undoubtedly use to enhance their capabilities. Those who run authoritative directory servers need to be very careful (cough Appelbaum cough).
The paper's main conclusion, "NSA/GCHQ can surreptitiously selectively MITM users' Internet connections", appears to be valid (see FOXACID servers and NSA's "man-on-the-side" attack), but this is a well-known threat model.
In my non-professional opinion, the most likely way NSA/GCHQ would attack Tor appear to be through:
1) Honeypot nodes -- Tor relays which act normally, but silently record timing and circuit routing information ("circuit 2981: $253DFF1838A2B7782BE7735F74E50090D46CA1BC -> $1041213B53CCF586093BB65D9CC4BC0B9656EF17 @t=1386775503.00312")
2) "Upstream" data analysis of regular node connections -- build statistical models of likelihood of a client circuit routing from input node X to output node Y by observing timing of transmission rate bursts
3) CNE of non-US Tor nodes -- Pwn foreign nodes to turn regular nodes into honeypots
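A crude sketch of what the statistical model in 2) is correlating; the timestamps, threshold, and scoring below are invented, and real traffic confirmation is far more sophisticated:

```python
def correlation_score(in_times, out_times, max_delay=0.5):
    """Fraction of ingress packets that have some egress packet within
    max_delay seconds after them (a toy stand-in for real statistics)."""
    matched = sum(
        1
        for t_in in in_times
        if any(0 <= t_out - t_in <= max_delay for t_out in out_times)
    )
    return matched / len(in_times)

# Invented timestamps: one egress series really is the ingress flow
# shifted by ~120 ms of network delay; the other is unrelated traffic.
ingress = [0.00, 0.40, 1.10, 1.95, 2.60]
egress_same_flow = [t + 0.12 for t in ingress]
egress_unrelated = [0.70, 1.60, 3.30]

same = correlation_score(ingress, egress_same_flow)    # scores high
other = correlation_score(ingress, egress_unrelated)   # scores lower
```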
And so I wonder: it appears little can be done about 1) and 3) above what's already being done (other than emphasizing to relay operators that Tor relays should be run on an isolated system with as few open services as possible). Tor maintainers seem to see vulnerability to 2) as a fundamental problem with low-latency mixnets, and haven't taken steps (AFAIK) to complicate that analysis. Would it be effective to do something like the following?
1) Propagate relay-to-relay latency and jitter statistics along with client circuit data
2) Have each relay build a model of incoming latency and jitter statistics for each other relay
3) Associate each new client circuit on a relay with a latency and jitter bin selected uniformly at random from the models built in 2), excluding the model associated with the true incoming relay
4) Delay all packets in that circuit by the associated latency time, +/- jitter calculated from a probability distribution created in the associated model of 2)
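A toy sketch of steps 1)-4) above, with the per-relay latency/jitter numbers invented and a uniform distribution standing in for whatever the real models would use:

```python
import random

# Step 2): this relay's model of incoming (mean latency s, jitter s)
# per peer relay; values are invented for illustration.
LATENCY_MODELS = {
    "relayA": (0.040, 0.010),
    "relayB": (0.120, 0.030),
    "relayC": (0.085, 0.020),
}

def pick_decoy_model(true_incoming_relay, rng=random):
    # Step 3): choose uniformly among the *other* relays' models.
    candidates = [r for r in LATENCY_MODELS if r != true_incoming_relay]
    return LATENCY_MODELS[rng.choice(candidates)]

def packet_delay(model, rng=random):
    # Step 4): mean latency +/- jitter, drawn from a uniform distribution.
    mean, jitter = model
    return max(0.0, mean + rng.uniform(-jitter, jitter))

# All packets on one circuit share the decoy bin chosen at circuit setup.
circuit_model = pick_decoy_model("relayA")
delays = [packet_delay(circuit_model) for _ in range(5)]
```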
This might be in combination with a change that removes or smooths bandwidth spikes by inserting dummy traffic between relays.
With both strategies, a) latency would be worse, and b) total available bandwidth would be lower. It looks like there's considerably more advertised bandwidth than there is used bandwidth, so b) might be OK. But a) might be a big problem for usability.
One more harebrained scheme: have each relay pick a few 'buddy' relays whose total advertised available buddy bandwidth sums to ~50% of this relay's estimated bandwidth. For each new client circuit, flip a coin and determine whether this circuit's traffic will be forwarded directly, or given to a buddy to forward. Similarly, the buddies each do the same with this relay. Associate 'chosen' client circuits with a buddy that tends to make a symmetric and smooth bi-directional data flow between the buddies.
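The buddy scheme might look roughly like this; the relay names, bandwidths, and the greedy selection rule are all made up for illustration:

```python
import random

def pick_buddies(self_bandwidth, candidates):
    """Greedily pick buddies (highest advertised bandwidth first) until
    their total reaches ~50% of this relay's estimated bandwidth."""
    buddies, total = [], 0
    for relay, bw in sorted(candidates.items(), key=lambda kv: -kv[1]):
        if total >= self_bandwidth / 2:
            break
        buddies.append(relay)
        total += bw
    return buddies

def forward_via(self_name, buddies, rng=random):
    # Per new client circuit: coin flip between forwarding directly
    # and handing the circuit to one of the buddies.
    if rng.random() < 0.5:
        return self_name
    return rng.choice(buddies)
```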
OK, stream of consciousness over... I really got off on a tangent. I'm very interested to see why my lame schemes won't work, so please point out why I'm wrong (as I assume I probably am) :).
The problem with a timing attack is that you need to put a large number of nodes into the network, and the Tor stewards monitor and blacklist suspicious arrivals. If you wanted to do this, you'd need to use a diverse set of IPs gradually over time. It's definitely possible though, I agree.
But an observer who can see traffic between most Tor nodes doesn't need to run Tor relays to do attack scenario 2). I'm not sure how far that gets you, though. Maybe 2) in combination with just a few honeypot nodes would be most effective. And in any case, NSA can certainly rent an essentially-unlimited number of cloud server instances under a front company, no?
>And in any case, NSA can certainly rent an essentially-unlimited number of cloud server instances under a front company, no?
You might not even need to. Depending on how many ISPs have QUANTUM boxes installed you could just squat on unallocated IPs and route traffic back to Fort Meade, yea?
> I also do not think their claims of Tor subversion hold water. From what I understand of Tor, directory information (including nodes' key fingerprints) is ultimately verified by the hard-coded keys of very few "trusted" operators of authoritative directory servers.
I'm very fascinated by this. Do you have any links to a faq/technical entry that focuses on these directory servers and their application?
There is a small set (say, around 5-10) of semi-trusted directory
authorities. A default list of authorities is shipped with the Tor
software. Users can change this list, but are encouraged not to do so,
in order to avoid partitioning attacks.
Every authority has a very-secret, long-term "Authority Identity Key".
This is stored encrypted and/or offline, and is used to sign "key
certificate" documents. Every key certificate contains a medium-term
(3-12 months) "authority signing key", that is used by the authority to
sign other directory information. (Note that the authority identity
key is distinct from the router identity key that the authority uses
in its role as an ordinary router.)
Routers periodically upload signed "router descriptors" to the
directory authorities describing their keys, capabilities, and other
information. Routers may also upload signed "extra info documents"
containing information that is not required for the Tor protocol.
Directory authorities serve router descriptors indexed by router
identity, or by hash of the descriptor.
Routers may act as directory caches to reduce load on the directory
authorities. They announce this in their descriptors.
Periodically, each directory authority generates a view of
the current descriptors and status for known routers. They send a
signed summary of this view (a "status vote") to the other
authorities. The authorities compute the result of this vote, and sign
a "consensus status" document containing the result of the vote.
Directory caches download, cache, and re-serve consensus documents.
Clients, directory caches, and directory authorities all use consensus
documents to find out when their list of routers is out-of-date.
(Directory authorities also use vote statuses.) If it is, they download
any missing router descriptors. Clients download missing descriptors
from caches; caches and authorities download from authorities.
Descriptors are downloaded by the hash of the descriptor, not by the
relay's identity key: this prevents directory servers from attacking
clients by giving them descriptors nobody else uses.
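The trust rule the excerpt describes boils down to something like the following sketch. Authority names stand in for the real hard-coded identity keys, and `verify_signature` is a placeholder for real cryptographic verification; this is not Tor's actual code:

```python
# Stand-ins for the identity-key fingerprints shipped with the client.
HARDCODED_AUTHORITIES = {"moria1", "tor26", "dizum", "gabelmoo", "dannenberg"}

def verify_signature(consensus_bytes, signature, authority):
    # Placeholder: a real client verifies a cryptographic signature made
    # with the authority's (certified) medium-term signing key.
    return signature == b"sig-by-" + authority.encode()

def consensus_is_trusted(consensus_bytes, signatures):
    """signatures: dict mapping authority name -> signature blob.
    Accept the consensus only if a majority of known authorities signed."""
    valid = sum(
        1
        for authority, sig in signatures.items()
        if authority in HARDCODED_AUTHORITIES
        and verify_signature(consensus_bytes, sig, authority)
    )
    return valid > len(HARDCODED_AUTHORITIES) // 2
```

The upshot for the MITM claim: an attacker who controls the wire but not a majority of authority keys can withhold or replay a consensus, but can't forge one that passes this check.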
I agree, I2P is interesting. I don't know why it doesn't get more press or support. According to the website, some reps of the I2P team will be at 30C3 in 2 weeks' time, btw.
There are 7 directory authorities in Tor; if they don't function, the Tor network is dead. So that is just seven people you would need to abduct and torture (to take over the Tor network), or 7 drone strikes to kill all Tor traffic.
I am unsure, but I think this is not the case for I2P.
EDIT:
So that might be a reason for the NSA to support Tor over I2P.
Given the need for critical mass, is there any way some or all of the advantages of i2p could be retrofitted to tor, or are the differences more fundamental?
Two such tunnels are used when Alice and Bob communicate, for a total of 4 nodes for the communication between Alice and Bob.
With Tor hidden services, it is the exact same scheme, but with 4 nodes per tunnel rather than 3 (or 3 nodes rather than 2, if you do not count Alice/Bob as nodes). In total, 6 nodes are used for the communication between Alice and Bob. (https://www.torproject.org/docs/hidden-services.html.en)
Please explain how this is incorrect. Are the tunnels as described in the I2P documentation just illustrations of how a tunnel might look, rather than how it actually is in practice? How does it solve the problem that Tor fixes with guard nodes (as this is the context in the above comments and article)?
I am very interested in I2P, but the above details/questions have held me back.
Since the above tech-intro documentation seemed contradictory, I looked around and found a better page at http://www.i2p2.de/how_tunnelrouting.html which explains what you said. Tunnels have a max length of 7, and the length is easily configurable in a session config file.
Thanks, I'm going to take a second look now that that's cleared up.
The client always selects the route on any onion-routing network. They have to, because only they know the route. It's been a while since I played with I2P, but I'm 95% certain you can push the circuit length up.
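For reference, in I2P the tunnel length is set via I2CP session options; the option names below are from the I2P docs, but where you set them depends on the application, and the values are illustrative:

```ini
# Longer tunnels: 5 hops +/- 1 in each direction (defaults are shorter).
inbound.length=5
inbound.lengthVariance=1
outbound.length=5
outbound.lengthVariance=1
```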
I don't like how this title oversimplifies the causes of the growth down to the NSA scandal. I'd guess Silk Road also played an important role in this increase, and it is not mentioned.
Whether using Tor on a daily basis is workable depends on how people surf the web.
If you are one of those people who click on news articles all day long, maybe read comments, then the added latency doesn't really matter. You go to HN and open all the interesting articles in new tabs. By the time you reach the end of the HN front page, the first article has loaded.
(1) http://cryptome.org/2013/12/Full-Disclosure.pdf