Timing attack, 6.66% faster

tptacek · on July 22, 2014

Doing 16 bits at a time instead of 8 isn't going to make Ruby string comparison timing tractable; the timing difference you're looking at is less than a nanosecond per byte, because Ruby's string comparison is essentially just C memcmp. That's below the noise floor even on a LAN.†

Even for signals above the noise floor, ie, higher than the "WAN" threshold Crosby and Wallach worked out, actually performing these attacks is very difficult. Among other things, you'll probably need to goof around with your network drivers and write your own timer code.

Timing attacks against simple string comparison might belong to a small class of attacks that get worse over time, not better, because networks do not get less noisy as fast as computers get faster and shrink the signal.

Unfortunately, not all timing attacks devolve to simple string compare. Look for instance at the Lucky 13 attack on TLS; the distinction being profiled there isn't a single byte comparison, but rather entire invocations of a hash core function.

† Here's a link to the most recent measurements I could find: http://twitter.com/tqbf/status/491607410537406465/photo/1

homakov · on July 22, 2014

I didn't mean to claim "it will work just great" I claimed "it is supposed to be 6% more efficient of what we have now".

In theory, this particular comparison attack is entirely based on amount of operations, and by making it 6% longer we make it 6% easier to detect, am I right? My point is 6% is quite a significant improvement and instead of using regular technique I see in blog posts everywhere on the Internet, we should mention the possibility of tree-like patterns. Furthermore, for [01] alphabet it is gonna be about 90% improvement.

I'm hundred percent sure the technique was discovered by someone long time ago, but I'm confused I never saw that in casual articles (I read bunch of them recently).

tptacek · on July 22, 2014

It's hard to find any articles about practical experiences with string comparison timing attacks over networks, probably because so many of the results of those experiments are going to be negative.

There definitely are circumstances where it's a concern, and, worse yet, some of those circumstances happen below the language layer, so it's not enough to say "Ruby string comparison isn't exploitable"; depending on the runtime, things could be as bad as they were in Java.

So it's still important to doc those flaws --- when they're meaningful --- but a lot of the time we set them "sev:low" or even "sev:info".

homakov · on July 22, 2014

Thanks for the link, I truly agree attack is very difficult to exploit, most of the time just impossible. I did some tests locally long time ago https://gist.github.com/homakov/4953145 and signal was extremely weak.

murbard2 · on July 22, 2014

For N large enough, there is no noise floor. In some cases, you could realistically be attacking a system for months or years to obtain a key.

xyzzy123 · on July 22, 2014

There is the threshold at which each measurement is giving you less information than brute-force. At that point, you can say "this is not vulnerable to a timing attack" even if a timing side-channel is present.

The larger the search space, the more noise you would be willing to put up with....

EDIT: I'm not sure what you mean by N. If you mean N as the size of the search space rather than N as the number of requests, I agree with you!

murbard2 · on July 23, 2014

You actually do both. Each measurement induces a probability distribution over the space upon which you bruteforce.

Also not that timing attacks are quadratic in the noise to signal ratio and linear in key length which bruteforce is exponential in keylengths. So that threshold is likely many orders of magnitude smaller than you imagine.

homakov · on July 22, 2014

In theory it works like a charm, but did anyone actually attacked for months? Speaking of web apps, locally it can work fine but in reality the more people are requesting the server the longer it will take to process our request. Days turn into years, and you have no idea if you're on the wrong path.

anuragbiyani · on July 22, 2014

A server can easily add an extra random wait to all unauthorized responses, making such attacks almost infeasible (at least for web-app case).

homakov · on July 22, 2014

1) extra random is not going to help, in long term any "noise" disappears and signal can be seen. 2) such attacks are infeasible most of the time, so we don't need to worry

anuragbiyani · on July 22, 2014

I agree that actual "signal" can still be learned over time, but if we chose additional random wait time from a distribution unknown to attacker (or even better, randomly select the distribution itself, and shuffle often), we can reduce the efficiency of attack dramatically.

But another possible way is to add a deterministic amount of additional wait time (instead of random), i.e., "lie" about computation time of all inputs consistently. If designed properly, this won't leak any signal (assuming the strategy on how to "lie" is chosen randomly itself and shuffled often, so attacker doesn't learn it).

Strategy of waiting for maximum time across all possible requests (as mentioned in your post) is sort of a special case of this, where one (deterministically) add a wait time of W - C (where W = Worst case time across all inputs, and C = actual time taken for this input). But we can use a less aggressive approach to hide the actual computation time as well.

squigs25 · on July 22, 2014

Another reason to hash your passwords!

homakov · on July 22, 2014

It has nothing to do with passwords and hashes. API keys are not hashed normally. And all kinds of signatures. It's about way you compare strings from params with secret strings

emidln · on July 22, 2014

I can think of a couple examples where hashing passwords directly contributes to leaking information recoverable by examining timing (e.g. enumerating django privilege status for a site's users).

Edit: not to say that you should store your password plaintext. Hashing (well, properly using a KDF) isn't a panacea.

tshadwell · on July 22, 2014

Perhaps another reason to use a fixed time string equality function

yeukhon · on July 22, 2014

You mean checking 123 == 1234 actually going through all 4 characters instead of immediate stop?

mikeash · on July 22, 2014

I believe it's OK to bail out early if the lengths aren't equal, but otherwise yes, you'd want e.g. 1234 == 1345 to continue to the end even though it's obvious from the second character that they aren't equal.

bmm6o · on July 22, 2014

Bailing early reveals the information that the lengths are different. In most cases the attacker knows the length of the target string and so this test will never pass. You can probably imagine a scenario where the length isn't known and step 1 is to determine the target length.

squigs25 · on July 23, 2014

Everyone who down voted this: 1) I know this would NEVEEERRRRRRR happen, but if there was a website that didn't hash a password, the timing attack would be applicable. 2) It worth mentioning that a timing attack can be frequently foiled if hashing is used on the server side. I.E. Google gives me an API key, stores the hash on there side. I send the API key to Google they hash it and ocmpare the hash to their stored hash.

I think it's crazy how fat fingered people have become with the down vote trigger finger. I think it's 100% crazy to discus the timing attack without considering how it could be thwarted.