I don't want to snipe but it's a peeve of mine: every article about AES talks about SubBytes and ShiftRows or whatever, which is information you will never, ever use, even if your career takes a turn into cryptography engineering, but nobody talks about block cipher modes, which are basically the most important thing you can know about block cryptography.
Exactly. This article is titled "Understanding how AES encryption works", but it doesn't describe how AES encryption works! It only describes how AES works.
AES is a block cipher, basically an approximation of a "Pseudorandom Permutation" (PRP).[1]
A PRP can be used for secure (authenticated) encryption, eg in AES-CCM or AES-GCM. PRPs can be used for semi-secure (unauthenticated) encryption, eg AES-CTR or AES-CBC. PRPs can be used to make a Message Authentication Code, eg CMAC. PRPs can be used to make a hash function (no direct example with AES, but Whirlpool uses a modified AES, and one could also use it in a Sponge construction, and Keccak uses an internal PRP). PRPs can be used to make a CSPRNG (random number generator), eg AES-DRBG. PRPs can even be used to make a public-key signature scheme: make a hash function (say, via a Sponge construction) and then make a Merkle hash-based signature scheme! The only common operation I don't know how to make with AES (or another PRP) as the main component is a Key Encapsulation Mechanism.
Calling this an article on how "AES Encryption" works is a bit like writing an article "How Programming in C works" and then describing NAND gates, ALUs, instruction decoding, and other bits of how a CPU works. It's the wrong level entirely given the title.
[1] A PRP is a sort of keyed super-shuffle. I'll use decks of cards for this analogy. A PRP is a bit like shuffling 52 decks of cards (in a deterministic way based on the state of an input deck) and taking the top card from each, then returning the resulting 52 cards as a "deck" of output. There can be repeated cards. A deck of all one card (say, ace of spades) will come out totally different, while a normal shuffle would just get you back 52 ace of spades. The determinism lets you repeat this process by knowing the input deck's state.
To be fair, I'm not in cryptography engineering and had to understand all those details for an implementation security review.
But I completely agree on the block cipher modes thing, a lot of people don't realize what they are doing and just use whatever sounds good, even if it is ECB and it is not relevant in the context.
Can you say something the circumstances that lead to a security review of something that implements its own AES and that implementation is reviewed by (self-described) nonspecialists?
With modern CPU-cores hitting 2x AES-instructions per clocktick, CBC-mode is going obsolete. CBC-mode cannot perform 2-AES iterations in parallel. You need to use CTR-mode.
CTR-mode however has been largely subsumed by GCM (Galois Counter Mode), and all of a sudden you need to learn Galois fields anyway (which AES is an excellent case-study in GF(2^8)).
------
AVX512 performs 4x AES-instructions per clock tick by the way, to fill up the entire 512-bit register. You're only reaching that parallelism with CTR mode or GCM mode.
I disagree that serpent is superior to AES. The evidence that it's secure is much flimsier than that for AES. It's seen far less analysis, and suffers from many of the same implementation issues as AES (hard to make both constant time and fast without hardware support, and has no hardware support). I'd have agreed with you 10 years ago, but there have been a lot more (failed) attack attempts on AES in the intervening decade than there have been on Serpent, which increase my confidence in AES.
yea, I agree - this is being shared a lot on threads about cryptography on HN but, anyway, I'll just share it again, can't harm. The (fantastic) challenges on http://cryptopals.com/ do a pretty great job in explaining and outlining the differences between various cipher modes
Block ciphers are modelled formally as indexed permutation groups, which you do round-by-round, irrespective of the mode of operation. ShiftRows and SubBytes heavily impact this formal model while the mode of operation is not. Your thinking is too practical and "close to code" and as usual, it makes the false claim that the mathematical rigor is unnecessary in cryptology
Why do you think tptacek's thinking "makes the false claim that the mathematical rigor is unnecessary in cryptology"? It doesn't seem to me like either the article or the comment talk about mathematical rigor.
But it’s an article about AES, which is the block cipher. Block cipher modes are important to a larger solution built on AES, but they are orthogonal and not even specific to AES.
No, this is about "How AES encryption works". Emphasis mine. AES internals are interesting, but they're not specific to how AES encryption works. EG AES-CTR-DRBG is a CSPRNG using AES, not an encryption system. AES-CMAC is a MAC using AES, not an encryption system. AES can be used for lots of non-encryption tasks, so the author should have either dropped "encryption" from the title or included information on what's needed to turn AES from a block cipher into an encryption algorithm.
The implementation of AES isn't even that interesting; it's not especially clever and doesn't leverage any deep mathematical principles. You can learn a lot more reading about cipher modes, as you mention. See GCM for a very interesting one.
While we're here, Christof Paar's 'Introduction to Cryptography' lecture series (freely available on YouTube here in English: https://www.youtube.com/channel/UC1usFRN4LCMcfIV7UjHNuQg, originally delivered at Ruhr University Bochum) is a really excellent explanation of the internals of a number of crytographic algorithms. The lecture on Galois Fields (https://www.youtube.com/watch?v=x1v2tX4_dkQ) is especially useful in understanding AES, and explains where some of the operations described in the blog linked here actually come from (i.e: S-Boxes used in AES are derived from inversion in finite fields).
We used Christof Paar's book in my crypto class this past semester. The book was great (much better than my professor) and the explanations of how number theory makes up the majority of the math our modern encryption uses were great. I had never heard of a Galois field until that course, and his text book did a great job explaining what they are and why it makes AES possible.
In general, encryption algorithms are so neat. We take some obscure (at least to me) math topics and we can use them in ways to create these incredible algorithms.
Right now, the most exciting thing in the world of crypto for me is the NIST competition for a post-quantum crypto algorithm. I'm very excited to see what comes out on top.
I love the shoutout. As a grad student at WPI somewhere around 2000 or so when AES was first, I took his "Cryptography and Data Security" course and my end of class project was a complete implementation (in C) of Rijndael. (I later found out that others in the class chose to implement portions of the cipher and not the whole thing.) It was definitely one of my favorite grad classes.
That tutorial looks really cool. I'm slightly concerned with the tutorial leaving the implementation details to the user, and the tutorial basically saying:
"Remember: to test your function, you can use the test vectors from the appendix A.1 of the AES standard. "
Bad AES implementations suffer from variety of attacks including cache timing attacks etc. and since you've worked for NCC etc. I'm sure you know a ton about this including about cache timing attacks. The guide isn't ready yet, and given that there's upcoming parts about linear/differential cryptanalysis of AES, it might actually be beneficial to have a weak implementation to attack later to show why AES shouldn't be written from scratch.
While the writing about the attacks is in the works, in your opinion, would it be worth underlining that the implementation the reader has produced should not be used in production, and that the user has only produced an educational textbook version to learn the basics about the cipher?
I don‘t think this tutorials aim is to use this implementation in production. I can understand the security risks but it‘s easier to understand and maybe improve a technology when one takes it apart first.
Agreed. I hope the guide goes as far as to start explaining how AES-NI can be utilized, and how constant time implementations can be written. I think the only improvements we can expect for AES (considering it's standardized) is wider deployment of proper implementations of it.
Looks really nice and educational. I'm a big fan of reeinventing the wheel for understanding certain primitives better. I did the same for ECC. Once it's done it's really liberating because you don't have to consider these things black magic anymore.
> I don't want to snipe but it's a peeve of mine: every article about AES talks about SubBytes and ShiftRows or whatever, which is information you will never, ever use, even if your career takes a turn into cryptography engineering, but nobody talks about block cipher modes, which are basically the most important thing you can know about block cryptography.
This article is called, "Understanding how AES works". Are you upset that the author wrote such a paper? This paper isn't intended to explain how to use AES. Rather it is intended to satisfy the readers curiosity about how it works.
This is called "Understanding How AES Encryption Works", but it doesnt describe that.
AES is a "block cipher", a primitive that approximates a pseudorandom permutation (PRP) and can be used to build a variety of functions. E.g. AEADs like AES-CCM and AES-GCM, encryption like AES-CTR, message authentication codes like CMAC, hash functions like Whirlpool (that uses a modified AES), and random number generators like AES-DRBG. In theory one could even build a public-key signature scheme from AES: use it in a Sponge construction to make a hash function, then use that to construct a hash-based signature scheme. The only thing I don't know how to build using just a PRP is a Key Encapsulation Mechanism.
This article is a bit like writing "How To Program In C" and talking about NAND gates and ALUs and instruction decoders.
Wait, this is the official Go implementation of AES for when AES-specific hardware instructions aren't available? It's not bitsliced. Doesn't that make it vulnerable to timing attacks?
Is there any software-base implementation of AES that is NOT vulnerable to timing attacks and other side channel weakness and with reasonable performance (to the limit of the hardware)?
Yes, that's the generic solution. They claim ~7 cycles per byte for 4096-byte blocks on the newest CPU tested, but I don't know what the performance would be on CPUs from 2020. (For context, per https://eprint.iacr.org/2018/392, AES-NI is more than ten times faster.)
Of course, AES-NI is generally preferable if you have it; and people use ChaCha on mobile platforms that don't have AES-NI but do have NEON (or another SIMD).
> It involves multiplication operations in a finite field, hence this step is a bit tough to describe. See Wikipedia for more details.
Woah woah woah!!!! MixColumns is incredibly important to understanding AES! I don't think it should be glossed over, or just pointed to Wikipedia (which is... sub-par IMO... as an explanation source).
Lets break things down:
1. Galois Fields / Finite Fields are a special number system. Instead of "choosing better numbers", Mathematicians "choose better addition/multiplies". That's right, you change the definition of addition / multiply to better suit your mathematical needs.
2. All operations in a finite-field self-feed back into the same finite-field... I joke that its a "human centipede" of math because you can just keep feeding yourself the same crap! Addition, Subtraction, Multiplication, Division, Logarithm, Exponent, Square-roots, Cube-Roots, etc. etc. All operations are GUARANTEED to return to the finite field specified. In the case of AES, the 2^8 field (256 "numbers", usually labeled 0 through 255) is chosen. No matter how crazy the math gets, you always return to the finite-field at every step.
2.5 -- Technically, they're not actually numbers... they're polynomials. But because they're represented by 0x00 through 0xFF, you can think of them as numbers with weird add/multiply rules.
2.75 -- Knuth notes that real numbers are just polynomials anyway. 525600 == 5 * 10^5 + 2 * 10^4 + 5 * 10^3 + 6 * 10^2. If you're having issues thinking about "GF polynomials are pretending to be numbers", just think about normal numbers, which always have a polynomial representation. The radix-point / decimal-point is just where the 10^0 is located, and then 10^-1, 10^-2 (etc. etc) move forward. Then, instead of having "10" as a specified radix, the radix is now "x" (the polynomial's variable).
3. Finite Field division is very, very similar to "normal" division. As you may remember from elementary school, division "mixes up the numbers real good". Well, in Finite Field arithmetic, all divisions can be optimized to a multiplication. This matches your elementary-school level thinking: 5/7 is "5 divided by 7", but ALSO "5 times 1/7th" in normal math. The same is true in Finite Fields, EXCEPT 5/7th is actually a number (erm... polynomial) in the 0x00 to 0xFF space. Also 5/7 == 5 * (1/7) == 5 * 7^-1.
3.5 -- The magic of making 5/7 == 5 * 1/7 == 5 * 7^1 is WHY cryptographers use Galois Fields. When the math / arithmetic becomes more important than the numbers themselves, its very natural to just switch to GF-field representation.
4. Well... hold on. We have GF(2^8) "numbers" (erm... 8-bit polynomials) but AES is over 128-bits. Well... GF(2^8) is more efficient to implement in software because you only need a lookup table of size 256. (From a software perspective: you can either make addition or multiplication efficient on computers. The other operation needs a lookup table. Most programmers choose "XOR" to be the efficient add, and then a lookup table for multiply/divide).
4.5 Because we're stuck with GF(2^8) (because it's the mid 90s and GF-instructions don't exist on CPUs yet and you want tiny lookup tables that fit inside of tiny L1 caches of tiny 90s computers), we extend the GF(2^8) == 8-bit by making a 4x4 matrix (128-bits total for the full 4x4 matrix, each column a 32-bit integer).
5. Instead of just doing one or two multiply / divide operations per element, lets "mix up the numbers real good" with a Matrix-multiplication.
6. As you may remember from linear algebra class: the inverse of a matrix doesn't necessarily exist. But Galois Fields make it easier to find matrix-inverses. In particular, division is always possible, so its far easier to find an inverse of a matrix.
6.5 Assume we were using "normal 8-bit integers" instead of GF(2^8), and we have a simple [[1 0] [0 2]] 2x2 Matrix. To invert the matrix, you need to divide by 2, but what is 1/2 in integer math? Well, it doesn't exist (0.5, or "one half" is NOT an integer), so you run into problems pretty quickly. GF(2^8) has a definition for 1/2, because all addition/subtraction/multiplication/division/logarithms/exponents/square-roots/etc.etc. have a precise solution.
I dunno if its just "how I learned it", but experiments over the GF(5) prime field, followed by the GF(2) then GF(2^x) extension fields has always made the most sense in my brain.
GF(5) is a prime field and is much easier to think about. You have 0, 1, 2, 3, 4 as your numbers (and they're "true numbers", not yet polynomials).
The most natural generator is 2.
* 2^0 == 1 mod 5
* 2^1 == 2 mod 5
* 2^2 == 4 mod 5
* 2^3 == 8 mod 5 == 3
* 2^4 == 2^3 * 2 == 3 * 2 == 6 mod 5 == 1 == 2^0
Because 2^4 == 2^0 == 1, you have your loop. All non-zero elements are created by this sequence (because 2 is an appropriate generator). Define multiplication to be consistent with these numbers (and it happens to line up with normal multiplication).
Extend the cycle over the negative numbers, and you remain consistent:
* 2^0 == 1 mod 5
* 2^-1 == 3 mod 5
* 2^-2 == 4 mod 5
* 2^-3 == 2 mod 5
* 2^-4 == 1 mod 5 == 2^0
By extending into the negative exponents, we've invented division that's 100% consistent with all other math. Notice that 2^-1 == 3, which is 2's inverse.
2^2 == 2^-2, which is its own inverse. Notice that 4 * 4 == 2^2 * 2^2 == 2^4 == 1. Any number times 4 twice equals itself.
Remember, GF(5) is just simple mod-5 arithmetic. We've redefined multiplication to the above attributes, but it works out how you'd expect. Lets take some random examples of multiplicative inverses / division in GF(5), but "extended" over normal integers outside of the modular space to show you this really is amazing and it works.
* 3 * 2 == 6 mod 5 == 1.
* 4 * 2 * 3 == 24 mod 5 == 4 (notice: 2 * 3 is 1, so when 4 * 1 == 4)
* 4 * 4 == 16 mod 5 == 1
* 2 * 4 * 4 == 32 mod 5 == 2 (notice: 4 is equal to 1/4. So 2 * 4/4 == 2 * 4 * 4 == 2)
-----
A similar exercise can be done over addition. Then a similar exercise can be done over distributive property and even polynomials / quadratic equation / cubics and more!
Logarithms, Square Roots. Everything. All the math you ever learned: shrunk down into the exact space of {0, 1, 2, 3, 4} numbers.
--------
GF(5) is awkward for computers however. To make this "reasonable" for computers, we need to convert it into bits and bytes. Shrink the space down to GF(2) (0 and 1), then extend the space into GF(2^8). Extension fields (taking a "power" of the prime) are... complicated. Very complicated.
An extension field is the GF-version of "complex numbers". You invent a polynomial over "x" (or some indeterminate variable). Complex numbers call it "i", electrical engineers call it "j". In the Complex world, i^4 == 1 (the 4th root of 1 is i). But in GF-version, the variable depends on the size of your field. A GF(2^8) will have x^255 == 1... so you have a much larger "loop" so to speak, but otherwise functions very similar to the Complex i.
But hopefully the experiments in GF(5) show why cryptographers love to use GF() fields in general. Its very useful to have all numbers "loop" back into themselves.
Edit: it looks like we've had to ask you this many times. Would you please review https://news.ycombinator.com/newsguidelines.html and take the intended spirit of this site more to heart? I don't want to ban you, but comments like this poison the community here.
heya dang, long time no see. How you doing man? Glad to see you still doing a great job, keep it up. When was last time you slapped me? Must be over a year, right?
OK, promise you this - in 2021 you'll have no worries about me, but c'mon, let me have one slip in 2022, yes? Just to see you're still around ;).