And BoringSSL has excellent Rust bindings (the "boring" crate) that support everything BoringSSL does, with a really good, easy-to-use API. And if you absolutely need a pure Rust implementation, there's the "superboring" crate.
Are there any languages that properly support this, i.e. where you can express at the language level that the compiler should refrain from being too clever? If not at the language level, are there compilers you can ask to keep an eye on the timing?
That's true in the sense of "time invariance is not guaranteed as a first class feature". But as a practical matter this is as simple as choosing an invariant algorithm and disabling optimization for the module in question.
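In Rust, for example, the "disable optimization for that module" part can be a per-package profile override in Cargo.toml. A sketch, with a made-up crate name (and this is a mitigation, not a guarantee that the backend won't still transform the code):

    # Keep the rest of the build optimized, but compile the (hypothetical)
    # constant-time crate at opt-level 0 so the optimizer is less likely to
    # rewrite its carefully written code paths.
    [profile.release.package.my-ct-bignum]
    opt-level = 0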
And even if they do, you're an optimizer or processor issue away from your precious time-invariant code becoming less so, to the point that you leak bits.
Adding noise may make it a bit harder but does not fundamentally change anything. The measurements, especially over a network, are noisy to begin with and adding a delay will only add another noise source.
Aside from that, time invariance is the wrong term; what you want is constant time. Time-invariant means it does not change over time; what you want here is data-invariant timing, i.e. the runtime not depending on the data being processed.
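A toy example of the distinction, using nothing beyond the standard library (and assuming the compiler and CPU don't undo it): an early-exit == leaks where the first mismatch is, while the comparison below touches every byte regardless of the data.

    // Data-invariant comparison: accumulate differences over every byte
    // instead of returning at the first mismatch, so the loop's runtime
    // does not depend on the secret contents.
    fn ct_eq(a: &[u8], b: &[u8]) -> bool {
        if a.len() != b.len() {
            return false;
        }
        let mut diff = 0u8;
        for (x, y) in a.iter().zip(b) {
            diff |= x ^ y;
        }
        diff == 0
    }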
Determine, say, N = 10 points in the sensitive part of the algorithm. On startup / library initialization, assign N small random delays to these points. The delays stay the same in every invocation, creating a stable and unique timing profile. This profile is hard / impossible to reproduce on a different machine for analysis and finding correlations. To make things funnier, the random delay parameters should randomly change every few weeks on long-running installations. This would render previous timing stats invalid, making known-plaintext attacks more difficult.
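Taken literally, the proposal might look something like this sketch (the 0-50µs range and N = 10 are arbitrary stand-ins, and the `rand` crate is assumed):

    use rand::Rng; // assumption: the `rand` crate is available
    use std::time::Duration;

    // Per-machine timing profile: N = 10 delay points whose durations are
    // drawn once at startup and then reused unchanged on every invocation.
    struct TimingProfile {
        delays: [Duration; 10],
    }

    impl TimingProfile {
        fn new() -> Self {
            let mut rng = rand::thread_rng();
            let delays =
                std::array::from_fn(|_| Duration::from_micros(rng.gen_range(0..50u64)));
            Self { delays }
        }

        // Called at the i-th sensitive point of the algorithm.
        fn pause(&self, i: usize) {
            std::thread::sleep(self.delays[i]);
        }
    }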
Realistic attacks tend to involve doing many different runs of the algorithm with different values of whatever input is timing-dependent. That includes this attack, where you’re doing many different TLS connections which result in different messages being decrypted, and the runtime depends on the message.
If you couldn’t vary the timing-dependent input, then you could effectively only measure a single duration value per target; doing multiple runs would just reduce noise. It’s very unlikely that a single duration would have enough information to recover a full key. So this is already a hopeless scenario for the attacker.
But if you can vary the timing-dependent input, then you can compare timing between different runs on the actual target machine; there’s no need to reproduce the exact timing profile on your own machine (which is very difficult anyway outside of small embedded CPUs). So a random delay per machine does not help.
Adding delays to sensitive parts of the algorithm would either be equivalent to adding a single overall delay (if the number of delay invocations is constant) or make the attack much easier (if it isn't).
To get time invariance I have to have guarantees provided by everything from the compiler down to the hardware implementation. If I exclusively, explicitly, and completely control all of those layers, good to go... But that means, at the very least, that I'm not publishing a library for public consumption.
To get time invariance all I need is a clock. Responses are normalized to some value greater than the slowest computation. Thus, I don't need exclusive, explicit and complete control of all layers, including whatever inevitable evolution those layers incur without my knowledge.
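A minimal sketch of that normalization (the 20 ms budget is a made-up constant that would have to exceed the slowest real computation, and sleep granularity is a practical caveat):

    use std::time::{Duration, Instant};

    // Fixed response budget, assumed to exceed the slowest computation.
    const RESPONSE_BUDGET: Duration = Duration::from_millis(20);

    // Run the operation, then sleep out whatever is left of the budget so
    // the caller always observes (roughly) the same total duration.
    fn normalized<T>(op: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let result = op();
        if let Some(remaining) = RESPONSE_BUDGET.checked_sub(start.elapsed()) {
            std::thread::sleep(remaining);
        }
        result
    }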
Barring any further threat model inflation involving ammeters, spectrum analyzers or what have you, how is this not a sufficient solution?
What is a 'computation' exactly, and how do you know the duration of the slowest one? I imagine the complexity spirals as you try to answer those two questions...
Execution of whatever one proposes to code in a time invariant manner. In this case: "RSA decryption and signing operations."
> I imagine the complexity spirals
Measuring things and normalizing operation time doesn't appear terribly complex to me. How can endlessly reworking subtle algorithms for new compilers and hardware seem less complex than throttling well understood implementations with a clock?
Also, I asked a question. My question was sincere: how would using a clock to normalize operation time not be sufficient?
Measuring and normalizing the operations isn't necessarily straightforward - suppose we're in a streaming context (common for encrypting things for the internet), in a real-world environment where computation time might depend on what other applications are running, or the temperature of the CPU; the max operation time will vary over time and with possibly unobservable changes in circumstances.
It's easy to spitball a seemingly 'good-enough' solution to this, but crypto doesn't seem to be a place where 'good-enough' is actually good enough.
RSA has been known to be problematic for a very long time at this point; hopefully it just fades into obscurity soon and we can move on to less error-prone ciphers.
The paper also describes how all programming languages that use BigInt data types are likely to be affected. It also specifically refers to the ECDSA TLS attack from 2015, as well as the Minerva/Raccoon attacks which work in a similar way.
As isogeny key exchange algorithms are likely to be all vulnerable after the SIKEp434, SIKEp503, SIKEp610, and SIKEp751 fiasco(s) of last year, I'm curious as to what kind of alternatives you suggest?
Honestly, at this point I'd say anyone who makes these kinds of claims, that there even are "less error-prone ciphers", just hasn't worked long enough in cryptography to have learned otherwise. And I'd go so far as to say it's probably not possible at all, due to how we design CPU hardware and its memory interactions.
Even if it's not the software (and it runs in constant time to prevent analysis), there have been lots of acoustic side-channel attacks that just analyze the sounds the transistors on the CPU make. So all your random no-op instructions are kind of useless against that, too.
That is only because the way RSA key exchange has been implemented everywhere involves padding. RSA-KEM[1] has been available since forever and no padding is required, though no one uses it for some reason.
RSA-KEM is also trivial to implement as it's virtually identical to textbook RSA, with the sole restriction that m has to be randomly sampled from 1 .. n-1.
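A (decidedly non-constant-time) sketch of the encapsulation side, assuming the num-bigint crate (with its rand feature) and sha2, with SHA-256 standing in for a proper KDF:

    use num_bigint::{BigUint, RandBigInt}; // num-bigint with the "rand" feature
    use sha2::{Digest, Sha256};

    // RSA-KEM encapsulation, textbook style: pick a random m in 1..n-1,
    // send c = m^e mod n, and derive the shared symmetric key from m.
    // No padding scheme is involved; decapsulation recovers m = c^d mod n
    // and applies the same derivation.
    fn rsa_kem_encapsulate(n: &BigUint, e: &BigUint) -> (BigUint, Vec<u8>) {
        let mut rng = rand::thread_rng();
        let m = rng.gen_biguint_range(&BigUint::from(1u8), n); // uniform in [1, n)
        let c = m.modpow(e, n);                                // c = m^e mod n
        let key = Sha256::digest(m.to_bytes_be()).to_vec();    // stand-in KDF(m)
        (c, key)
    }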
You can read more about it here: https://rustcrypto.zulipchat.com/#narrow/stream/260047-RSA/t...