Understanding how AES encryption works

tptacek · on Dec 21, 2020

I don't want to snipe but it's a peeve of mine: every article about AES talks about SubBytes and ShiftRows or whatever, which is information you will never, ever use, even if your career takes a turn into cryptography engineering, but nobody talks about block cipher modes, which are basically the most important thing you can know about block cryptography.

SAI_Peregrinus · on Dec 21, 2020

Exactly. This article is titled "Understanding how AES encryption works", but it doesn't describe how AES encryption works! It only describes how AES works.

AES is a block cipher, basically an approximation of a "Pseudorandom Permutation" (PRP).[1]

A PRP can be used for secure (authenticated) encryption, eg in AES-CCM or AES-GCM. PRPs can be used for semi-secure (unauthenticated) encryption, eg AES-CTR or AES-CBC. PRPs can be used to make a Message Authentication Code, eg CMAC. PRPs can be used to make a hash function (no direct example with AES, but Whirlpool uses a modified AES, and one could also use it in a Sponge construction, and Keccak uses an internal PRP). PRPs can be used to make a CSPRNG (random number generator), eg AES-DRBG. PRPs can even be used to make a public-key signature scheme: make a hash function (say, via a Sponge construction) and then make a Merkle hash-based signature scheme! The only common operation I don't know how to make with AES (or another PRP) as the main component is a Key Encapsulation Mechanism.

Calling this an article on how "AES Encryption" works is a bit like writing an article "How Programming in C works" and then describing NAND gates, ALUs, instruction decoding, and other bits of how a CPU works. It's the wrong level entirely given the title.

[1] A PRP is a sort of keyed super-shuffle. I'll use decks of cards for this analogy. A PRP is a bit like shuffling 52 decks of cards (in a deterministic way based on the state of an input deck) and taking the top card from each, then returning the resulting 52 cards as a "deck" of output. There can be repeated cards. A deck of all one card (say, ace of spades) will come out totally different, while a normal shuffle would just get you back 52 ace of spades. The determinism lets you repeat this process by knowing the input deck's state.

RaitoBezarius · on Dec 21, 2020

To be fair, I'm not in cryptography engineering and had to understand all those details for an implementation security review. But I completely agree on the block cipher modes thing, a lot of people don't realize what they are doing and just use whatever sounds good, even if it is ECB and it is not relevant in the context.

arcticbull · on Dec 21, 2020

The example image in the wikipedia section on ECB is pretty damning:

https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation...

pvg · on Dec 21, 2020

Can you say something the circumstances that lead to a security review of something that implements its own AES and that implementation is reviewed by (self-described) nonspecialists?

maqp · on Dec 21, 2020

It's always this: AES-CBC with

1) no authentication whatsoever, or

2) fixed IV, "but it's random, I threw the dices myself!!", or

3) fixed key, no forward secrecy of any kind

dragontamer · on Dec 21, 2020

With modern CPU-cores hitting 2x AES-instructions per clocktick, CBC-mode is going obsolete. CBC-mode cannot perform 2-AES iterations in parallel. You need to use CTR-mode.

CTR-mode however has been largely subsumed by GCM (Galois Counter Mode), and all of a sudden you need to learn Galois fields anyway (which AES is an excellent case-study in GF(2^8)).

------

AVX512 performs 4x AES-instructions per clock tick by the way, to fill up the entire 512-bit register. You're only reaching that parallelism with CTR mode or GCM mode.

cryptXTS · on Dec 21, 2020

cryptsetup -c serpent-xts-plain64 -s 512 -h sha512 -i 123456 luksFormat /dev/ice /path/to/8M/dev/urandom/keyfile

(serpent wasn't chosen by NSA because it's the best, if you want inferior AES, then change serpent to AES.)

SAI_Peregrinus · on Dec 21, 2020

I disagree that serpent is superior to AES. The evidence that it's secure is much flimsier than that for AES. It's seen far less analysis, and suffers from many of the same implementation issues as AES (hard to make both constant time and fast without hardware support, and has no hardware support). I'd have agreed with you 10 years ago, but there have been a lot more (failed) attack attempts on AES in the intervening decade than there have been on Serpent, which increase my confidence in AES.

BillyTheKing · on Dec 21, 2020

yea, I agree - this is being shared a lot on threads about cryptography on HN but, anyway, I'll just share it again, can't harm. The (fantastic) challenges on http://cryptopals.com/ do a pretty great job in explaining and outlining the differences between various cipher modes

jganetsk · on Dec 21, 2020

I disagree.

- There is plenty of talk of block cipher modes. What do you mean "nobody talks about" them?

- Block cipher modes are critically important but a lot of them are fairly straightforward and standard.

- Understanding how to design a pseudorandom permutation is where most of the ingenuity of cryptography is.

- If you are doing cryptanalysis, you definitely want to know about SubBytes and ShiftRows, etc.

shuringai · on Dec 21, 2020

Block ciphers are modelled formally as indexed permutation groups, which you do round-by-round, irrespective of the mode of operation. ShiftRows and SubBytes heavily impact this formal model while the mode of operation is not. Your thinking is too practical and "close to code" and as usual, it makes the false claim that the mathematical rigor is unnecessary in cryptology

Thorrez · on Dec 21, 2020

Why do you think tptacek's thinking "makes the false claim that the mathematical rigor is unnecessary in cryptology"? It doesn't seem to me like either the article or the comment talk about mathematical rigor.

some_furry · on Dec 21, 2020

If anyone is interested in this topic, here's a comparison between the different modes/constructions of AES and other popular ciphers: https://soatok.blog/2020/07/12/comparison-of-symmetric-encry...

kortilla · on Dec 21, 2020

But it’s an article about AES, which is the block cipher. Block cipher modes are important to a larger solution built on AES, but they are orthogonal and not even specific to AES.

SAI_Peregrinus · on Dec 21, 2020

No, this is about "How AES encryption works". Emphasis mine. AES internals are interesting, but they're not specific to how AES encryption works. EG AES-CTR-DRBG is a CSPRNG using AES, not an encryption system. AES-CMAC is a MAC using AES, not an encryption system. AES can be used for lots of non-encryption tasks, so the author should have either dropped "encryption" from the title or included information on what's needed to turn AES from a block cipher into an encryption algorithm.

centimeter · on Dec 21, 2020

The implementation of AES isn't even that interesting; it's not especially clever and doesn't leverage any deep mathematical principles. You can learn a lot more reading about cipher modes, as you mention. See GCM for a very interesting one.

opheliate · on Dec 21, 2020

While we're here, Christof Paar's 'Introduction to Cryptography' lecture series (freely available on YouTube here in English: https://www.youtube.com/channel/UC1usFRN4LCMcfIV7UjHNuQg, originally delivered at Ruhr University Bochum) is a really excellent explanation of the internals of a number of crytographic algorithms. The lecture on Galois Fields (https://www.youtube.com/watch?v=x1v2tX4_dkQ) is especially useful in understanding AES, and explains where some of the operations described in the blog linked here actually come from (i.e: S-Boxes used in AES are derived from inversion in finite fields).

jjice · on Dec 21, 2020

We used Christof Paar's book in my crypto class this past semester. The book was great (much better than my professor) and the explanations of how number theory makes up the majority of the math our modern encryption uses were great. I had never heard of a Galois field until that course, and his text book did a great job explaining what they are and why it makes AES possible.

In general, encryption algorithms are so neat. We take some obscure (at least to me) math topics and we can use them in ways to create these incredible algorithms.

Right now, the most exciting thing in the world of crypto for me is the NIST competition for a post-quantum crypto algorithm. I'm very excited to see what comes out on top.

fatnoah · on Dec 21, 2020

I love the shoutout. As a grad student at WPI somewhere around 2000 or so when AES was first, I took his "Cryptography and Data Security" course and my end of class project was a complete implementation (in C) of Rijndael. (I later found out that others in the class chose to implement portions of the cipher and not the whole thing.) It was definitely one of my favorite grad classes.

baby · on Dec 21, 2020

shameless plug: if you want to learn how it works, implement it! There's a tutorial for that: https://davidwong.fr/blockbreakers/aes.html

maqp · on Dec 21, 2020

That tutorial looks really cool. I'm slightly concerned with the tutorial leaving the implementation details to the user, and the tutorial basically saying:

"Remember: to test your function, you can use the test vectors from the appendix A.1 of the AES standard. "

Bad AES implementations suffer from variety of attacks including cache timing attacks etc. and since you've worked for NCC etc. I'm sure you know a ton about this including about cache timing attacks. The guide isn't ready yet, and given that there's upcoming parts about linear/differential cryptanalysis of AES, it might actually be beneficial to have a weak implementation to attack later to show why AES shouldn't be written from scratch.

While the writing about the attacks is in the works, in your opinion, would it be worth underlining that the implementation the reader has produced should not be used in production, and that the user has only produced an educational textbook version to learn the basics about the cipher?

larusso · on Dec 21, 2020

I don‘t think this tutorials aim is to use this implementation in production. I can understand the security risks but it‘s easier to understand and maybe improve a technology when one takes it apart first.

maqp · on Dec 21, 2020

Agreed. I hope the guide goes as far as to start explaining how AES-NI can be utilized, and how constant time implementations can be written. I think the only improvements we can expect for AES (considering it's standardized) is wider deployment of proper implementations of it.

DavidBuchanan · on Dec 21, 2020

This is my "textbook" implementation of AES128, which follows the FIPS specification as closely as possible (in python): https://github.com/DavidBuchanan314/aes-playground

It also includes from-scratch Galois field arithmetic functions.

xurukefi · on Dec 21, 2020

Looks really nice and educational. I'm a big fan of reeinventing the wheel for understanding certain primitives better. I did the same for ECC. Once it's done it's really liberating because you don't have to consider these things black magic anymore.

blendergeek · on Dec 21, 2020

> I don't want to snipe but it's a peeve of mine: every article about AES talks about SubBytes and ShiftRows or whatever, which is information you will never, ever use, even if your career takes a turn into cryptography engineering, but nobody talks about block cipher modes, which are basically the most important thing you can know about block cryptography.

This article is called, "Understanding how AES works". Are you upset that the author wrote such a paper? This paper isn't intended to explain how to use AES. Rather it is intended to satisfy the readers curiosity about how it works.

SAI_Peregrinus · on Dec 21, 2020

This is called "Understanding How AES Encryption Works", but it doesnt describe that.

AES is a "block cipher", a primitive that approximates a pseudorandom permutation (PRP) and can be used to build a variety of functions. E.g. AEADs like AES-CCM and AES-GCM, encryption like AES-CTR, message authentication codes like CMAC, hash functions like Whirlpool (that uses a modified AES), and random number generators like AES-DRBG. In theory one could even build a public-key signature scheme from AES: use it in a Sponge construction to make a hash function, then use that to construct a hash-based signature scheme. The only thing I don't know how to build using just a PRP is a Key Encapsulation Mechanism.

This article is a bit like writing "How To Program In C" and talking about NAND gates and ALUs and instruction decoders.

bccdee · on Dec 21, 2020

Wait, this is the official Go implementation of AES for when AES-specific hardware instructions aren't available? It's not bitsliced. Doesn't that make it vulnerable to timing attacks?

riobard · on Dec 21, 2020

Is there any software-base implementation of AES that is NOT vulnerable to timing attacks and other side channel weakness and with reasonable performance (to the limit of the hardware)?

arcticbull · on Dec 21, 2020

I was under the impression there were, but I'm no expert. [1]

[1] https://www.iacr.org/archive/ches2009/57470001/57470001.pdf

JoachimSchipper · on Dec 21, 2020

Yes, that's the generic solution. They claim ~7 cycles per byte for 4096-byte blocks on the newest CPU tested, but I don't know what the performance would be on CPUs from 2020. (For context, per https://eprint.iacr.org/2018/392, AES-NI is more than ten times faster.)

Of course, AES-NI is generally preferable if you have it; and people use ChaCha on mobile platforms that don't have AES-NI but do have NEON (or another SIMD).

bccdee · on Dec 24, 2020

Yes, but you have to use bitslicing for the s-boxes.

dragontamer · on Dec 21, 2020

> It involves multiplication operations in a finite field, hence this step is a bit tough to describe. See Wikipedia for more details.

Woah woah woah!!!! MixColumns is incredibly important to understanding AES! I don't think it should be glossed over, or just pointed to Wikipedia (which is... sub-par IMO... as an explanation source).

Lets break things down:

1. Galois Fields / Finite Fields are a special number system. Instead of "choosing better numbers", Mathematicians "choose better addition/multiplies". That's right, you change the definition of addition / multiply to better suit your mathematical needs.

2. All operations in a finite-field self-feed back into the same finite-field... I joke that its a "human centipede" of math because you can just keep feeding yourself the same crap! Addition, Subtraction, Multiplication, Division, Logarithm, Exponent, Square-roots, Cube-Roots, etc. etc. All operations are GUARANTEED to return to the finite field specified. In the case of AES, the 2^8 field (256 "numbers", usually labeled 0 through 255) is chosen. No matter how crazy the math gets, you always return to the finite-field at every step.

2.5 -- Technically, they're not actually numbers... they're polynomials. But because they're represented by 0x00 through 0xFF, you can think of them as numbers with weird add/multiply rules.

2.75 -- Knuth notes that real numbers are just polynomials anyway. 525600 == 5 * 10^5 + 2 * 10^4 + 5 * 10^3 + 6 * 10^2. If you're having issues thinking about "GF polynomials are pretending to be numbers", just think about normal numbers, which always have a polynomial representation. The radix-point / decimal-point is just where the 10^0 is located, and then 10^-1, 10^-2 (etc. etc) move forward. Then, instead of having "10" as a specified radix, the radix is now "x" (the polynomial's variable).

3. Finite Field division is very, very similar to "normal" division. As you may remember from elementary school, division "mixes up the numbers real good". Well, in Finite Field arithmetic, all divisions can be optimized to a multiplication. This matches your elementary-school level thinking: 5/7 is "5 divided by 7", but ALSO "5 times 1/7th" in normal math. The same is true in Finite Fields, EXCEPT 5/7th is actually a number (erm... polynomial) in the 0x00 to 0xFF space. Also 5/7 == 5 * (1/7) == 5 * 7^-1.

3.5 -- The magic of making 5/7 == 5 * 1/7 == 5 * 7^1 is WHY cryptographers use Galois Fields. When the math / arithmetic becomes more important than the numbers themselves, its very natural to just switch to GF-field representation.

4. Well... hold on. We have GF(2^8) "numbers" (erm... 8-bit polynomials) but AES is over 128-bits. Well... GF(2^8) is more efficient to implement in software because you only need a lookup table of size 256. (From a software perspective: you can either make addition or multiplication efficient on computers. The other operation needs a lookup table. Most programmers choose "XOR" to be the efficient add, and then a lookup table for multiply/divide).

4.5 Because we're stuck with GF(2^8) (because it's the mid 90s and GF-instructions don't exist on CPUs yet and you want tiny lookup tables that fit inside of tiny L1 caches of tiny 90s computers), we extend the GF(2^8) == 8-bit by making a 4x4 matrix (128-bits total for the full 4x4 matrix, each column a 32-bit integer).

5. Instead of just doing one or two multiply / divide operations per element, lets "mix up the numbers real good" with a Matrix-multiplication.

6. As you may remember from linear algebra class: the inverse of a matrix doesn't necessarily exist. But Galois Fields make it easier to find matrix-inverses. In particular, division is always possible, so its far easier to find an inverse of a matrix.

6.5 Assume we were using "normal 8-bit integers" instead of GF(2^8), and we have a simple [[1 0] [0 2]] 2x2 Matrix. To invert the matrix, you need to divide by 2, but what is 1/2 in integer math? Well, it doesn't exist (0.5, or "one half" is NOT an integer), so you run into problems pretty quickly. GF(2^8) has a definition for 1/2, because all addition/subtraction/multiplication/division/logarithms/exponents/square-roots/etc.etc. have a precise solution.

some_furry · on Dec 21, 2020

Addendum:

If anyone's curious about Galois Theory, this looks like a good place to start: https://www.gfuzz.de/AES_1.html

dragontamer · on Dec 21, 2020

I dunno if its just "how I learned it", but experiments over the GF(5) prime field, followed by the GF(2) then GF(2^x) extension fields has always made the most sense in my brain.

GF(5) is a prime field and is much easier to think about. You have 0, 1, 2, 3, 4 as your numbers (and they're "true numbers", not yet polynomials).

The most natural generator is 2.

* 2^0 == 1 mod 5

* 2^1 == 2 mod 5

* 2^2 == 4 mod 5

* 2^3 == 8 mod 5 == 3

* 2^4 == 2^3 * 2 == 3 * 2 == 6 mod 5 == 1 == 2^0

Because 2^4 == 2^0 == 1, you have your loop. All non-zero elements are created by this sequence (because 2 is an appropriate generator). Define multiplication to be consistent with these numbers (and it happens to line up with normal multiplication).

For example:

* 2^4 * 2^3 == 1 * 3 == 2^7 == 2^4 * 2^3 == 2^0 * 2^3 == 3.

-------

Extend the cycle over the negative numbers, and you remain consistent:

* 2^0 == 1 mod 5

* 2^-1 == 3 mod 5

* 2^-2 == 4 mod 5

* 2^-3 == 2 mod 5

* 2^-4 == 1 mod 5 == 2^0

By extending into the negative exponents, we've invented division that's 100% consistent with all other math. Notice that 2^-1 == 3, which is 2's inverse.

2^2 == 2^-2, which is its own inverse. Notice that 4 * 4 == 2^2 * 2^2 == 2^4 == 1. Any number times 4 twice equals itself.

Remember, GF(5) is just simple mod-5 arithmetic. We've redefined multiplication to the above attributes, but it works out how you'd expect. Lets take some random examples of multiplicative inverses / division in GF(5), but "extended" over normal integers outside of the modular space to show you this really is amazing and it works.

* 3 * 2 == 6 mod 5 == 1.

* 4 * 2 * 3 == 24 mod 5 == 4 (notice: 2 * 3 is 1, so when 4 * 1 == 4)

* 4 * 4 == 16 mod 5 == 1

* 2 * 4 * 4 == 32 mod 5 == 2 (notice: 4 is equal to 1/4. So 2 * 4/4 == 2 * 4 * 4 == 2)

-----

A similar exercise can be done over addition. Then a similar exercise can be done over distributive property and even polynomials / quadratic equation / cubics and more!

Logarithms, Square Roots. Everything. All the math you ever learned: shrunk down into the exact space of {0, 1, 2, 3, 4} numbers.

--------

GF(5) is awkward for computers however. To make this "reasonable" for computers, we need to convert it into bits and bytes. Shrink the space down to GF(2) (0 and 1), then extend the space into GF(2^8). Extension fields (taking a "power" of the prime) are... complicated. Very complicated.

An extension field is the GF-version of "complex numbers". You invent a polynomial over "x" (or some indeterminate variable). Complex numbers call it "i", electrical engineers call it "j". In the Complex world, i^4 == 1 (the 4th root of 1 is i). But in GF-version, the variable depends on the size of your field. A GF(2^8) will have x^255 == 1... so you have a much larger "loop" so to speak, but otherwise functions very similar to the Complex i.

But hopefully the experiments in GF(5) show why cryptographers love to use GF() fields in general. Its very useful to have all numbers "loop" back into themselves.

unnouinceput · on Dec 21, 2020

Oaw, that was an awful read. I think I prefer pure math, as described in original paper than ...this, whatever was that.

The worse article I've read this month.

dang · on Dec 21, 2020

Ok, but please don't be a jerk on HN.

Edit: it looks like we've had to ask you this many times. Would you please review https://news.ycombinator.com/newsguidelines.html and take the intended spirit of this site more to heart? I don't want to ban you, but comments like this poison the community here.

unnouinceput · on Dec 22, 2020

heya dang, long time no see. How you doing man? Glad to see you still doing a great job, keep it up. When was last time you slapped me? Must be over a year, right?

OK, promise you this - in 2021 you'll have no worries about me, but c'mon, let me have one slip in 2022, yes? Just to see you're still around ;).