> All the sensible textbook titles were already taken. Actual joy not guaranteed.
Like a lot of people (I imagine) I made it through a CS bachelors program not really ‘getting’ the discrete math combinatorics part. Crypto is an area where those concepts really really matter. It’s great to see this resource available!
For what it’s worth, I had the damndest time with Calculus until I got a good professor. When he explained it, everything clicked. A good teacher makes a massive difference. It might be worth giving it another go.
If the whole of math were represented by the surface of the planet, "discrete math" would be more than half of it. Calculus (i.e. differentiation/integration on the reals) on the other hand, would be a city. Perhaps a very populated city, but just one.
It's a shame that match curriculum for non-math-majors is an all-roads-lead-to-calculus affair. I think we scare a lot of potentially talented people away from math with that approach.
I'm not sure why you think that? Basically all of physics, chemistry, and other physical sciences is calculus. Calculus is the mathematics of rates of change, and basically all physical science is the study of change.
Discrete math is very important to computer science. But in the rest of the world, calculus (and differential equations) dominate.
Maybe we're having a definition difference. My understanding of the term "discrete math" is that it refers to any math whose characters aren't part of a continuum.
Anything to do with numbers besides the reals. Anything to to with finite sets. Anything to do with groups, rings, polynomials, trees, graphs, ordinals, lattices, compass-straightedge-constructions, polygons, knots, categories, sheaves, topological spaces, vector spaces, and anything to do with oddities like map coloring or plane tiling, and a lot else too.
It's a course in "everything else" for students that are being pigeonholed into a mathematical specialization by the fact that it's been fashionable to use real numbers to describe the world since Newton.
I'd have to ask around but I know a few algebraic topologists and I'd bet that if pressed to describe their work as more-continuous or more-discrete they'd first tell you that this is a silly way to classify subfields in math and then they'd says it's probably more on the discrete side since it's all about categorizing topological spaces based on whether they have certain discrete properties:
- separable
- countable
- metric
- compact
Sure the spaces themselves might be continuous, but topology is for telling those spaces apart based on where and how continuity fails. It's a discretization of things formerly suspected to be the the same.
I mean the notion of compactness is inspired pretty clearly from the continuum. I also have a degree in pure math and disagree strongly with your characterization of topology, which is literally defined up to homeomorphism which is itself defined by it's continuity. Arguably the whole point of topology is characterized by that which is preserved by continuous deformation, which clearly is inspired by more continuous math. I think you are using incredibly strong language and in a misguided way.
There's something about real analysis that makes me uncomfortable. I've been struggling for years to put my finger on it. Whatever it is, topology doesn't have it.
I had mistakenly decided that it was an overappreciation of continuity, but I think it must be something else.
Maybe it's the differential structure itself? Or an (over?)emphasis on the study of functions from space to space instead of the study of the space itself? Or the specificity of calculus (the study of one specific space) instead of the generality of topology (thinking about a lot of spaces and comparing them to each other)
Hmm, I'll have to ponder those. Specificity seems closest. Whatever it is, it's not a rational critique. Despite the discomfort, I'm also fascinated by it because one should not have an emotional response to specific types of math, but I very much do.
Something about the homework in Real Analysis left me feeling angry. Not because it was difficult or presented poorly, but because it was somehow... untrustworthy? As if my betters had decided which ideas were the good ones and the only thing left for me to do was optimize along the one dimension that they had assigned me. I realize that this is nonsense, but I can't seem to shake it.
It was especially bad in quantum mechanics where they made us do it the hard way (using calculus) before showing us the easy way (using noncommutative algebra).
"Calculus" is just the most accessible corner of the broader field of Analysis. And a lot of "discrete" mathematics ends up drawing upon analysis (especially complex analysis) as you delve deeper.
How about we play a game where you make a list of mathematical objects which are continuous, and I make a list of ones that aren't.
I'll start with sets and you can start with the real numbers, and we'll see whose list is longer.
To a certain degree the whole game is nonsense, there's a countable infinity of axioms and a countable infinity of theorems that follow therefrom, so we're never going to get anywhere rigorous with this, but just like it's not unreasonable to say that there are more multiples of 2 than there are of 2000, I think it's fair to say that continuity games on the reals represents a relatively small share when compared to mathematics in general.
In my school, to get the CS degree you only need as much calculus as needed to get to the second Physics course, which covers electromagnetism. You don't need a deep understanding of calculus. A lot of applications of calculus in the course can be handled by rote memorization.
I took extra calculus classes from the math department. Those are way harder than what you need for introductory physics.
Not op but for fwiw, in my university you only needed about as much calculus for ComSci as you would learn in IB/AP High school courses with a good prof anyway. Maybe a bit more but it was mostly a repeat.
(I recall one day in 3rd year calculus course realizing "wait a second... I don't need to be here!". I just kept signing up for calculus every year since 11th grade and suddenly realized I don't need, don't want, and don't like the class nor the prof I was in, and that was one pain less I could instill upon myself :)
As others have mentioned 'calculus' can mean one of five or six classes (I, II, III, advanced, diffyQ, probably more), basic differentiation is not hard, most of calc is some tricks a computer can learn or do better.
At least to my knowledge there is no 'theory of calculus' which allows one to solve generic integrals or differentials, calc is as much an unrelated art to comp sci as learning to play chess will help with accounting.
Again I complain on the irritating habit of non-descriptive naming of files.
book.pdf
Really? I'm saving the file for a later review and this name will guarantee I either completely lose the file or will spend more time than necessary to locate it. The other option is to rename the file on saving, which is some work which needs to be performed by each user, instead of being done just once on the source. I think this is just impolite.
Your observation is very valid, that's one of my pet peeves too, but renaming the file before saving it is also very easy. Your observation, while helpful, feels excessively negative.
The only negative adjectives I used were "irritating" and "impolite". The first one is strictly my feelings, and if you don't share them, I can only congratulate you on your self-restrain. The second one is just my opinion, I could explain why I think so, but I don't think it's that big of a deal to waste time on further discussions. Just please, keep file naming in mind if you can. No one will come to any harm if you won't, though, so I guess it's not that important.
On the second thought, one part of my job is preparing products for the final consumption, so this attention to minutiae is probably a professional deformation.
I don't know what standards you hold me to. I don't know how can you point at what you perceive as an issue and not be negative besides not speaking at all.
I know there are a ton of pitfalls with spaces in filenames, but surely this is not one of the cases where it would apply? Unless your file save dialog replaces spaces for some reason, in which case even %20 is about as readable as _ to me.
Myself going from "web designer" to a "zero-knowledge proof cryptographer" (I'm a coauthor of Dalek Bulletproofs implementation, the cleanest API, documentation and fastest ZKP system ever) I'm sad to see crypto textbooks spending most of the time on symmetric ciphers, going through the same nonsense like ECB and then touching asymmetric only with RSA and annoying GCD stuff when everyone moved onto elliptic curves already.
I'd like to see material around modern stuff (Keccak and ECC) with only a cursory discussion of pitfalls of the past standards in contrast with the current stuff. And more time on asymmetric, discrete log and fiat-shamir protocols that are so much fun and require really a 7th grade algebra to grasp.
EDIT: what i mean, is focus first on how to use primitives and what they promise, rather than how they actually work. Once you feel like you can play with these legos, it's more fun to dig deeper and see why ECDLP is hard and how ciphers are made. But please don't start from within - it's an annoying boring mess that distracts from the true beauty: the cool things you can build with these tools.
> what i mean, is focus first on how to use primitives and what they promise
These sorts of "provable security" textbooks (like the one in the article) try and do that. They emphasise less the exact inner workings of algorithms, and more on what they supposedly achieve. Various kinds of security proofs are possible. This school of thought in cryptography does also allow you to argue fairly convincingly that some concrete ciphers are secure. So it can talk about both the inner workings of ciphers as well as what security properties they achieve.
The "provable security" approach seems to be in line with your suggestion. In fact, it was the framework under which Zero Knowledge Proofs were developed. Not every cryptographer knows about that though, so I'm not sure if you were aware...
Mike was my cryptography professor and an advisor on an independent cryptography course I took a an undergraduate at OSU. We used an early copy of this book and it was wonderful. This is a great, free, resource for anybody interested in crypto maths.
For those looking for a more exercise-oriented approach to cryptography, the Cryptopals challenge (https://cryptopals.com/) is an excellent way to discover cryptography step by step with 64 exercises with an increasing difficulty level
This is exactly the kind of book on cryptography I need.
The book uses math & pseudocode to describe algorithms instead of using popular languages like Python or C/C++.
This makes it language agnostic. Code can be very opinionated in my opinion.
This looks excellent. I tried to go through the Introduction to Cryptography by Christof Paar [1] and that material was not really suitable for me, personally. I found it too dense, and written in not very interesting manner.
Thank you for posting this book, I'll give it a go! Crypto is one of those things I have on my shameful "how-come-you-don't-know-it-yet" to-learn list :).
Really? The book might not be suitable to read it from cover to cover in bed, but still approachable. The videos are also really nice and easy going (I took the lecture at his uni, nice foundation).
Compared to this book you seem right, the glimps I had was easily digestable and still concise.
Looks like an excellent book. Would the author prefer the solutions to the exercises not be public? Or would it be welcome for someone to publish them? If the latter, I would quite happily work on it.
Is it available in a format other than PDF? Like HTML or MOBI for example? PDF is very inconvenient to read on small screens like phones or kindle. Thanks!
In case you haven't tried, I suggest using koreader. It has a good combination of a reliable crop and a usable reflow feature. Not ideal, but it's much better than constant zooming and moving the view around.
i'd guess that bots these days harvest domain names and then human names associated with them and then spam tools try common schemes for constructing email addresses. don't know for sure though.
i always thought the hotness would be to encrypt the address in source text and render/decrypt it client side in javascript, but i suppose scrapers these days use full blown headless browsers complete with javascript runtimes.
RFC 5321 requires compliant domains to have both a "postmaster" and an "abuse" email address (e.g. postmaster@joyofcryptography.com). This technique will not stop a flood of spam emails to x@joyofcryptography.com, which will end up in the postmaster inbox.
Mail to the literal 'x@joyofcryptography.com' address would just be dropped. There's no expectation that mail to a nonexistent address would be placed in the postmaster mailbox.
Signal message keys seem to include an IV. [1] Message keys are generated by passing a chain key through a key derivation function. [2] So you can also expand a shared secret established through Diffie-Hellman.
It can also be implied by some other system parameter. As long as the same IV is never reused for the same key, it's safe. For example, if you're encrypting data blobs in some kind of CDN and you never mutate old blobs, your IV can be the blob ID.
That said, depending on the construction you use, nonrandom IVs can be a problem (they can leak that there's a relationship between different plaintexts if they line up with a difference in the first block), so you're better off at least making it a random function of the key and said blob/etc ID.
The IV isn't secret, but in CBC mode it does need to be securely random, otherwise you open yourself up to a class of attacks based on IV prediction. In other modes (CTR, for instance) you can use any nonce you like, so long as you're not reusing it.
The IV (or the nonce, in an AEAD) is usually public, but authenticated; often just appended to the message; with CBC mode, you'd normally prefix the message with the IV, and include the IV in your HMAC.
I read this book as an introduction to cryptography. Very well written. More understandable and easier to follow than any other texts I've read on cryptography.
I've often thought the use of prime numbers to be the weakness in cryptography.
Whilst theory is different in practice due to machine limitations, there are only so many prime numbers a machine can present in a limited timespan restricting the range of primes available to use.
With this in mind, and then knowing what a webserver will typically use by simply browsing the website with different encryption algo's disabled in the browser if its not possible to work out the underlying webserver software and version from a variety of methods like a simple 404 message, decoding the URL or using DPI, further limits the encryption algorithms to spend time reversing when using the replay attack method making it somewhat more targeted. Its still a sort of brute force but a more targeted brute force.
So should I see primes as a weakness in cryptography?
First of all, primes are only used to arrive at a session key, and once you have a session key you're in the land of symmetric algorithms, which provide security by permutations rather than vectoring into prime spaces. The content of a web page does not matter at all in terms of the security being provided. A 404 is just as secure as a valid home page, in terms of cryptography. (Not in terms of application security, but that's a whole different thing.)
Second, the supply of prime numbers is countable but also infinite, and the relation between a number space and the number of primes within it is well established within the workable sizes. We have upper and lower bounds on the number of primes within certain ranges. This is partly why we end up with certain key sizes as being secure and other key sizes as being insufficient. Secure key sizes (in asymmetric algorithms) partly are secure because there are so many primes that can be fodder for key generation.
Not very easy to guess just because the space in question is limited to "1024-bit primes"!
And indeed, that's part of why we use primes that are this big when we still use RSA. They're big enough that the number of possibilities makes them not easy to guess (even given the extra hint of "the secret number you're looking for is one of the factors of this 2048-bit semiprime").
Exponentials are always hard for human intuition. We start with some kind of pattern and it feels like there just aren't that many numbers that would satisfy it. But when you get out to large numbers (even numbers as large as the one above), there just are that many numbers that would satisfy the pattern!
So in a replay attack all the communication is recorded including the session keys, and whilst primes are infinite dont dispute this, where the theory (human) and the practical (cpu's) differs is the limitations in the machine hardware, in much the same way Spectre and Meltdown exploited the hardware to obtain secrets.
The point about the webserver was using methods to work out what the underlying webserver and version was and potentially the OS, after all not much point trying to decrypt an algo that is not used on the webserver, but metadata is given out by many webservers to suggest what type of webserver it is.
And then there is education, we see people believe many things because this is what's taught to them, we see this cognitive dissonance with religious people but also in "educated" people like some GP's or academics because its their life's work and I'm sure cryptography has consumed many hours of people thinking things through, without much time spent on other areas of weakness like the limitations of a cpu and memory registers.
After all how many crypto experts are fully conversant with the inner workings of a CPU design? If they were conversant with CPU designs would Spectre and Meltdown have been found years earlier?
Admittedly side channel attacks but I did say in my original post "Whilst theory is different in practice due to machine limitations" and this is the thing is the current implementation of cryptography on devices suitable when considering the design of CPU's and devices?
Cryptanalysis is a pretty robust area of research. There are in fact a lot of cryptanalysts that know CPUs quite deeply. Certain organizations spend a great deal of money to make it so. If vulnerabilities are found we learn from them and advance the understanding of cryptography design. It has been that way for 76 years now, at least.
So using the basic definition of quantum computing being "a device which can compute all permutations at once", in theory yes it is a 0x73 0x68 0x69 0x74 0x20 0x73 0x74 0x6f 0x72 0x6d but thats because these algo's do not encompass time and a place.
Build time into an algo and then you need a quantum computer which can also roll back or forward time.
You see the problem is, how do you build time into an algorithm like a particle decay? And if you could, could you then make it like a time delayed bank vault which only opens for certain windows during the course of the particle decay?
I also wonder if a place, some co-ordinates could also be built into an algo if that is even possible. There are medieval examples, namely trusted couriers but then you are not looking so much at an algo but more a self contained device to defeat quantum computing.
Anyway for now, you've got the NSA data centre at Bluffdale Utah which can sort of replicate quantum computing by running multiple instances with each using one pwd from all the possible pwd combination to look for the magic bytes in a file https://en.wikipedia.org/wiki/List_of_file_signatures in order to decrypt files. A variation of that Could be used for replay attacks on network communication.
At least thats how I would do it quickly and easily.
That’s not how quantum computers work, but you’re on the right track. And yes, time is almost always an essential part of crypto algorithms (usually proving the order of things, e.g. when using nonces, or temporary keys to block re-generation as with ephemeral keys.)
Not an expert, but I don't think it's necessarily a weakness in the current applications of cryptography. In something like RSA a typical key is >=1024 bits, which is 1.7e308 possible number so even though primes are more "limited" the actual reduction in security (if non-primes could have been used by magic) the primes that are left are still plentiful. One reason RSA is not really used anymore is because other algorithms such as elliptic curve cryptography provides much better efficiency (partially due to it not requiring primes keys). So it's less efficient than more modern technologies, but not necessarily weaker.
RSA is still in wide use in both old and new deployments (e.g. it is still the majority of TLS handshakes).
Elliptic curves are faster and use smaller keys, but they're also a more fragmented ecosystem; plus, in particular for the older NIST curves, there is the unshakable fear that they're backdoored by the NSA (since they use magic unexplainable numbers, which RSA does not).
Thankfully ed25519 gave us an alternative free of magic numbers, and it's seeing a lot of adoption (in particular in open source software), but it's nowhere near taking over RSA.
No cryptography engineer seriously believes the NIST P-curves are backdoored, and they are in widespread use. Ed25519 is a signing scheme; it isn't a replacement for the RSA in classic TLS --- you'd be thinking of Curve25519, its sibling. The benefit of the 25519s isn't "no magic numbers", it's a structure that makes it easy to implement relatively safely. And all these curves work over prime subfields.
This is all 100% pedantry. But the belief that RSA is risky because "prime numbers" is false, and worth pushing back on. There are reasons not to use RSA, but they're not as simple as "we don't trust prime field cryptography".
No magic numbers is certainly one of the advantages of curve25519 and its siblings. The NSA already gave us one backdoored elliptic curve algorithm (Dual EC DRBG); there is no reason to trust them with magic numbers. They may be backdoored or they may not be, but every serious cryptography engineer knows there's no good reason for algorithm constants not to be generated according to public criteria if you aren't hiding anything. Sometimes they're hiding that they picked numbers that made the algorithm stronger against secret attacks they discovered (DES). Sometimes they're hiding a backdoor (Dual EC DRBG). We'd all rather they not hide anything.
Ed25519 is a replacement for RSA in the x.509 WebPKI, which is what I was trying to refer to when I said TLS. Classic TLS (as in non-PFS, the one that also used RSA to encrypt the session secret) is dead and nobody cares about replacing it with anything. There is no public key encryption involved in modern TLS; instead all you need is a signature scheme (for the certificate and for the final server to authenticate itself) and a key exchange scheme. The former can be ed25519. The latter can be curve25519 (specifically, the retroactively named X25519 ECDH key exchange).
My point is precisely that there's no inherent distrust in RSA (and some concerns with NIST EC, both the magic numbers and secure implementation difficulty), which is why we haven't abandoned it yet. There is certainly no inherent issue with prime field cryptography.
Curves in the Web PKI are overwhelmingly NIST P-curves, which, again, are only deeply mistrusted on message boards, and when needed to get the BADA55 paper accepted.
New designs shouldn't use the P-curves, because it's too easy to implement them vulnerably (for all intents and purposes any random string is a workable Curve25519 point, and that's not the case for the P-curves --- you have to do fussy input validation). But that has nothing to do with conspiracy theories about how the curve was generated.
You don't have to take my word for this; you can just read the Koblitz and Menezes paper, which takes this question on in detail.
That the NSA picked the magic DES S-boxes in order to defend against differential cryptanalysis (which they'd discovered first), and that the NSA picked the Dual EC DRBG constants to backdoor it in a NOBUS type operation are both established facts.
Yes, curves in the Web PKI are largely P curves right now, but ed25519 is also standardized.
You're certainly free to trust the NIST curves, but my point still stands: there is no good reason not to pick such constants using a nothing up my sleeve algorithm, and the fact they didn't do that means they either know something we don't, they backdoored it, they wanted people to suspect they backdoored it, or they're idiots and decided to ignore established best practice for no reason. It's entirely possible the answer is the latter, of course :)
No, the logic in your last paragraph doesn't hold at all. You should read the Menezes paper rather than trying to derive this stuff from faulty axioms.
I wouldn't contest that no crypto engineer "seriously believes NIST P-curves are backdoored", but I know some high profile crypto engineers who seriously think and demonstrate how they might be flawed and could have been backdoored since day one. [1] [2]
It's almost impossible to prove they were backdoored, but considering the sensitivity of the subject, I understand why many consider this unknown a reason to distrust NIST P-curves.
Dan Bernstein wrote a paper everyone refers to as BADA55 which essentially suggests that virtually every curve in common use other than his are potentially backdoored, even if they're derived from fundamental mathematical constants (in fact, demonstrating that possibility is the point of the paper). So I'd be careful about using Bernstein as a load-bearing citation for this argument.
Again: Koblitz and Menezes take up this topic in detail.
Backdoored or not, the P-curves (or more specifically the standard algorithm we use for them) are hard to use and easy to misuse.
djb dedicated an entire page listing all the theoretical issues with the P-curves and other elliptic curves[1], but their main weakness in practice is that they are just too prone to bad implementation and misuse.
The most well-known failure has to be the PS3 jailbreak [2]. Sony just failed to implement their RNG (or alternatively copied their RNG code from xkcd #221), which rendered their ECDSA-based crypto completely worthless.
Another famous case is the long list of JWT/JWE libraries which were vulnerable to invalid curve attacks, again completely destroying the security of their NIST p-curves (when used for encryption) [3].
Really, I don't think nobody should be using NIST P-curves if they have any choice, unless you verified your implementation yourself. And I don't even want to claim to be able to do it.
(I don't think tptacek ever said you should use the NIST curves[4], so there's no controversy there)
By cryptography being deployed today I meant new protocols. Like if the people who are actually in the position of picking cryptographic primitives, virtually no one reaches for RSA. Sorry if that wasn’t clear.
To be clear, there is no rational reason not to reach for RSA unless you need the smaller keys of elliptic curves or you are afraid of quantum cryptography and have to avoid elliptic curves as well.
There is a massive performance difference—not just size, but computation as well. Especially if you want a constant-time implementation. There is also an order of magnitude more ways to screw up and not do input validation correctly, thereby introducing vulnerabilities. For all these reasons there aren’t really good libraries for RSA outside of TLS implementations, and that’s a lot of baggage to inherit if you are not doing TLS.
Your own link [2] shows RSA being 10-50x slower than the EC algo they used, which itself is many factors slower than state of the art. Are you reading the table right?
Table 3 (decryption) from link [2] shows that it took 1.265 seconds for 233 bit ECC and 0.031 seconds for 2240 bit RSA to encrypt something. The associated comment was:
>We noted that the encryption by the RSA algorithm is faster than the one with ECC algorithm.
While we are here, I will point out that the performance was almost the same for the same key lengths in the encryption case.
So ECC seems to be faster for key generation but overall slower for everything else.
Those numbers are clearly nonsense; notice how the smaller ones don't even scale with key size, while they obviously should to some extent. That paper is flawed. They obviously made a mistake.
71000 signature verifications per second and 109000 signature generations per second on a quad-core machine. That is much, much faster than RSA.
It's well established that ECC is much faster than RSA in general. I suspect I know what happened. DSA signature schemes require randomness to generate signatures, unlike RSA. They probably were using an entropy-constrained random number generator. That bottlenecked ECDSA through no fault of the algorithm. Ed25519 does not suffer from this issue, since it is constructed in a way that requires no external source of randomness (the random part is substituted with a deterministic hash of the key and input).
If you want to prove that ed25519 is generally faster than RSA for signatures I think you would have to find a benchmark that supports your contention.
That one has ed25519 signing 60x faster than RSA-2048. Verification is a bit slower with ed25519 (RSA is very asymmetric there due to the low hamming weight public exponents), but considering the difference in security level, it basically works out to about the same.
So signature verification is about the same for EC vs RSA, while EC is much faster than RSA for signature generation, and ridiculously faster for key generation. EC certainly isn't significantly slower than RSA for any equivalent operation, and definitely not by 40x like the paper you linked claims. That is just wrong.
It is fairly well accepted that RSA is faster than most curve based things for signature verification and your sources support that?
This is all in response to another user who said:
> There is a massive performance difference—not just size, but computation as well.
... and we don't know exactly what they meant by that, but I was pointing out that it was not that simple...
Asymmetrical performance is normally not very important in most systems due to the use of symmetric stuff to do all the heavy lifting. The exception would be if you were doing some sort of forward secrecy scheme that involved forgetting the private key for each and every message. There the speed of ECDHE is helpful vs the slow key generation of RSA. However, the Signal Protocol shows us we can use a hash ratchet for such short cycle forward secrecy schemes so that key generation speed does not have to be a significant issue.
This is comically false. But for the weird appeal to quantum computers, it's exactly what a Slashdot commenter would have said about elliptic curves in 1998.
Like a lot of people (I imagine) I made it through a CS bachelors program not really ‘getting’ the discrete math combinatorics part. Crypto is an area where those concepts really really matter. It’s great to see this resource available!