I just go with "some things are easy to do but hard to undo - like breaking a glass of water. Easy to do, hard to undo." At least as a first explanation of what we're trying to achieve in the first place.
Also, I love the "Waldo" example of zero-knowledge proofs: I want to prove to someone that I found Waldo in a "Where's Waldo" puzzle. How do I do this without showing them where Waldo is?
.
.
I can simply cut a "Waldo-sized" hole in a big blanket, then position the "Where's Waldo" puzzle under the blanket with Waldo showing through the cut. Easy to do since I know where Waldo is, but impossible to get information from. (At least, theoretically).
My first eye-opener about 1-way functions is this algorithm:
Take a word.
Remove every second letter.
This is a one-way function, because it's basically impossible to recover the original, even for two letter words. It's not a very good as a one way hash, because you can trivially construct collisions. It does, however, nicely capture the idea that you are pretty much always going to be discarding information to get a hash (if the hash is shorter than the input), without really resorting to the pigeonhole principle.
Although this does give an interesting intuition for a one way function, the application is wrong. Per the article, Alice doesn't want to leak any information about the solution. Simply using a one way function allows Bob to check if one of a possible set of solutions is valid. ( He can just evaluate f on the finite set of answers). It also can leak a bunch of other information, provided it doesn't give the solution in its entirety. For example, f(x) =x^2 mod N is a one way function if factoring is hard. However, it's only hard to learn some log(log(N)) bits of X, the rest is believed to be easy ( and know to be for certain bits).
What Alice wants is a commitment scheme[0] which has some secret randomized input. She would commit to her solution, send it to Bob, wait until Bob has solved the puzzle or gives up, and then prove that she had the solution first by providing both the solution and the randomness. Now you can construct a commitment scheme from a one way function, but a one way function is not a commitment scheme.
Now if they both just want to check if they have the same solution, you have to do far more complex than either of these involving Secure Multi Party Computation.
[0] http://en.wikipedia.org/wiki/Commitment_scheme
Nice example. Of course the challenge is to verify that the one-way function actually yields information about the starting word. As an example, a conceptually similar procedure with Wikipedia usually winds up at philosophy.
Given http://en.wikipedia.org/wiki/Wikipedia:Getting_to_Philosophy, I am not sure I would accept THINGS as proof that Alice knew the answer. I know dictionaries probably are structured differently. But they might suffer from the same problem.
You could easily turn a bug into a feature: add a couple of paragraphs to your post pointing out exactly this potential problem, and how you you could test to see whether or not it really is a problem. Your post (which is already terrific) then becomes a great explanation of why not all functions work as one-way functions, and what it means for something to be a good one-way function.
Makes sense. I was aiming for a single 'a ha!' moment for the reader. What I actually want to do is so how well this mapping works with a real dictionary by writing some code to follow this algorithm with real words.
Here's a rough draft that would, I think, only reinforce the 'aha' moment. It could be inserted as the third last paragraph:
"There's a potential problem with this procedure. What if Alice had started with some other word - say 'FILES' - and following this procedure also led to the word 'THINGS'. In fact, it might be that many, many English words all lead to the word 'THINGS'. If that were so then Bob would be justified in feeling skeptical that Alice had solved the crossword. She'd be using a "bad" one-way function. Both Alice and Bob need to be confident that this procedure doesn't lead to too many collision between English words. This is a challenge that needs to be addressed if you want to prove that a function is truly a good one-way function."
I think it is highly likely that many words clash that way. Worse, I expect many clashes between words that fit the crossword description. Consider, for example, a description of 'animal'. Now, look up a few animal:
So, there is some variation, but it isn't hard to find immediate collisions, especially for related words. That makes sense; a dictionary should not attempt to vary the way it describes similar words.
Even if it is illustrative, you should aim to give a good example.
Now, you could make this a good example by continuing, in lesson 2, to show weaknesses with this approach.
For an analogous case, read Knuth. In 'the art of computer programming', chapter ?2? on random numbers, he gives a convoluted random number generator that, in an abstract sense, is not unlike your one-way function, then shows how bad it is, and makes the case that you shouldn't let ordinary programmers design (or even tweak) these algorithms for you.
Knuth's point is that the random number generator that he came up with isn't good and so people shouldn't write such functions themselves. This is different from my blog post which is trying to explain to people what a one way function is without resorting to mathematics.
I do agree that a follow up could be written (e.g. Bob could compute a 'rainbow table' of the dictionary for the next time Alice uses the same trick; and the Alice could introduce some salt; Alice could introduce multiple rounds as well with a 'work factor' to make Bob's life harder).
It was pretty much the perfect explanation for me.
If you're trying to non-mathematically explain a mathematical process, I think, more often than not, you're going to end up with an incomplete example. If you try, I'd wager that you'd probably end up with something fairly long and possibly convoluted.
Which isn't to say that more blog posts about the topic would be wasted. I think the posts so far are great ideas for expansion. I just think it accomplished what it wanted to accomplish.
Hopefully the blog post won't spur too many novice developers into writing home-grown password hashing functions based on the outlined technique.
Good point, although "eventually reaching philosophy" is a lot easier than reaching it after exactly five steps. Also that requires clicking the first link. Still you are right it's a phenomenon that should be ruled out.
That would just be a hash collision. They all have 'em. You not only need to be able to get to THINGS, you need to get to THINGS in an arithmetic sequence of non-article words (first from the first, second from the second, third from the third, and so on). Given that even in an exhaustive dictionary, the definitions are finite in length (and that the hash space is only three words short of the message space), the collision frequency is probably pretty low. In much the same way, I don't need to know your password, I just need to stumble across something that hashes to the same value. (It would likely be useless elsewhere even if you use the same password everywhere, unless the same salt, peppper and hash algo are being used there as well, but it'll still work on that one account.)
If you want to take a closer (and, yes, mathematical) look at this idea, check out the lecture notes [0] from the MIT course Great Ideas in Theoretical Computer Science.
Sections 4 and 5 of the introductory notes [1] give the classic example of using prime factorization to verify bets in a game of "flipping a coin over the telephone". Much of the rest of the course develops the major ideas underlying this example (computational complexity, cryptography) and shows you other things you can do with them (machine learning).
Slightly off-topic, but I've been intrigued by one way functions for quite a while now. How does someone go about making a one way function like MD5 and SHA1? How do you know it can't be reversed easily?
But more commonly (because provably secure crypto functions are extremely hard to design) you think hard about them, then unleash fellow and opposing crypto specialists to try and break them[0], as was done by NIST[1]
It should be noted that we still don't "know": even "provably" secure functions depend on an assumption that may or may not be true, such as that factoring integers is much harder than multiplying them.
Of course, almost everyone thinks these problems are in fact hard (at least for classical computers), but there's no proof (such a proof would imply P != NP).
The assumption needed is even stronger than the complexity-class question: even if you proved integer factorization was not in P, that wouldn't be enough, since you need integer factorization to be almost always hard in every instance of the problem. Or at least you must have some way to identify and exclude the easy-to-solve instances.
Example: SAT-solving is NP-complete, but nonetheless heuristic SAT solvers are very good in practice, which makes its hardness too weak for cryptographic use. That can be ameliorated by trying to identify a more specific subset of "actually hard to solve SAT" (some of the research on the SAT "phase transition" aims at this), but it's pretty difficult. A few problems like integer factorization seem to have just arrived with this apparent always-hard property, but attempts to engineer it have been less successful, hence to my knowledge no used-in-practice cryptosystem is based on taking an NP-hard problem and turning it into a cryptographically useful one-way function (even though Diffie & Hellman suggested that as a research agenda way back in 1976).
There were some attempts to make the knapsack problem into an encryption system. But the early attempts were broken, then patched up and re-broken etc, and nobody has invested enough time into breaking the latest efforts, so cryptographers don't trust it.
"to my knowledge no used-in-practice cryptosystem is based on taking an NP-hard problem and turning it into a cryptographically useful one-way function"
Not used in practice, but such a result was presented by Atjai and Dwork:
An interesting thing about this is that you can transform the integer factorization problem into SAT quite easily (I have code that does it). The resulting SAT problems seem to be hard to solve for heuristic SAT solvers once you get beyond a very small number of bits for the integer sizes.
I'm not very familiar with the literature, but there are some people using SAT solvers in that manner to do a kind of automated cryptanalysis. Here's one paper that encoded the DES procedure into a SAT problem, and looked at how SAT solvers fare on it: http://www.ing.unitn.it/~massacci/papers/mass-marr-00-JAR.pd...
In the case of MD5 and SHA1 the length (of the output) is fixed to something like 160 bits. So the length of the output will always be the same regardless of input length.
There's no way you could take the complete works of shakespeare , pass it through SHA1 and then take the output and somehow reverse it (the 160bits) to get the complete works of shakespeare back out because too much information has been destroyed. It's effectively an extreme form of lossy compression.
What a good hash function should do though is ensure that small changes in the source guarantee a completely different output hash.
This is misleading. With a one-way function, we are interested in making it hard to find any preimage, not just the one you originally thought of. There are lots of functions whose range is small, but where it's easy to find a preimage (eg: take the first 160 bits of input; XOR all 160-bit chunks together; always return 0).
The avalanche effect (referred to in the last sentence) is important as a heuristic. Among other things, it makes it harder to go from an "approximate" preimage to an exact one. If we had f(x) = y, where y is similar to our target y', we shouldn't be able to find x' with f(x') = y' just by looking at the neighborhood of x. But this doesn't rule out more "clever" ways of tweaking x, and it doesn't obviously stop an attacker from deducing some property of x'. So for one-way functions, what we really want to assert is the nonexistence of an algorithm for finding (properties of) preimages, that is any better than just trying lots of new values of x.
> What a good hash function should do though is ensure that small changes in the source guarantee a completely different output hash.
Not always true. For example, see locality sensitive hashing [1] which relies on similar inputs being hashed to similar outputs to quickly look up similar items.
That's where I think the really interesting aspect of hashing algorithms comes from - what the different characteristics are and what applications that has (speed for checksums, slow for passwords, similar inputs giving similar outputs for similarity searching)
[Edit]
The key characteristic of all hashing functions is it produces a fixed size output. The fact this makes a one way function is incidental; though crucial for many applications like password storage, it's not really that important for things like checksums [2] or hash tables.
About your edit: That's not actually a good key characteristic. E.g. in the context of purely functional data structure, the key property is that hash functions are fast to compute and provide keys that are easy to look up, but may have collisions.
I meant the characteristic that makes it a hash function is producing a fixed size output.
It may not be that important for some contexts, but without that characteristic it's not a hashing function.
Conversely, I can have a function that isn't one way, but produces a fixed size output. Granted, it's not going to be that useful, but it's still a hashing function. If I have a one-way function that doesn't produce a fixed size output, it's not a hashing function.
How likely is a collision on two works that match the description "the complete works of shakespeare"? [this question left as an exercise for the reader]
"How does someone go about making a one way function"
It's hard -- cryptographers have yet to even prove that one way functions actually exist! We have lots of theoretical candidates -- discrete logarithms for certain groups, integer factorization, problems related to hidden linear codes, and so forth. Some day, we will either prove that OWFs do not exist, or that OWFs do exist (and hopefully one of the candidates is actually an OWF), or that the existence or non-existence of OWFs is independent of the mathematical systems we use right now (i.e. that it is an axiom).
Having said that, you might be interested in the work of Atjai and Dwork on creating an OWF (a trapdoor OWF, actually, and a corresponding public key cryptosystem) from an NP-Hard problem:
On the question of "can't be reversed easily", the only real way is to try getting lots of clever people to try to break it :)
There are also standard attacking techniques, so you can check (and maybe prove) these techniques do not work, but that still does not show there is a trivial crack you have missed.
It's a very good explanation, but I think it could benefit from a simpler base case, and then the explicit addition of "rounds".
So, the base case is "FOLIO is on page 655(?)" and Alice gives Bob the number 655. Now, there are only 12 five-letter words on page 655, so she's actually giving a pretty good hint away. Instead, she picks the second word of the first line, and give the page of that. Depending on how secure she wants the verification to be, she can repeat this process, with each round making the one-way-function 1000(?) times harder to reverse.
she picks the second word of the first line, and give the page of that
You've lost me. I think OP does it in a way that people don't have to guess at what he's talking about, which is important (very important) when introducing an idea).
There are certain tweaks we can argue about to be sure, but I think this serves as an intro that everybody can "get." She's not securing a $100,000 account. If someone were to get interested in this and pursue more knowledge she would quickly realize that she couldn't use this algorithm to secure important accounts.
Note: I assume you mean that she picks the second word of the first line, looks that word up in the dictionary and gives the page number of that word.
JGC is a better writer than me - I did not suggest the exact language, just the structure of an alternative approach to the explanation that I think would make it simpler.
He is also a better writer than I am. What I was trying to suggest is that picking an example that is easy to clearly write to is also very important for introductory vignettes. Otherwise, you're left trying to explain things that you weren't even trying to explain. (I've done this in class many times and it's a pain in the ass. You end up spending a fair amount of the class off topic.)
I will note here, however, that in this thread OP agrees with you.
While you're taking editing suggestions, there are 2 points that are missing in the narrative:
- Alice and Bob must have identical copies of the dictionary. Different editions could have slightly different definitions. The risk is not large, but it throws off all confidence.
- Alice tells Bob her procedure. Either they do it in real-time together or she tells him the procedure along with the encrypted word. But nowhere in the text as written does Alice communicate the "algorithm" to Bob.
Also, Bob does not know if Alice completed the puzzle until he completes the puzzle (or the solution is published) [actually that one clue, but Alice could've chosen any clue, possibly the last one that Bob finds]. The story made me believe Bob would know if Alice completed the puzzle immediately, so by definition of the problem, before he completed it. Is there a protocol that would allow this? [I think that's where we need 3rd party trusted sources]
* "The existence of such one-way functions is still an open conjecture. In fact, their existence would prove that the complexity classes P and NP are not equal"
* "It is not sufficient to make a function 'lossy' (not one-to-one) to have a one-way function."
* "The existence of a one-way function implies the existence of many other useful concepts, including: [PRNGs, MACs, etc.]"
This talk, http://pyvideo.org/video/1778/crypto-101, given by Laurens Van Houtven at PyCon US 2013, is a fantastic intro to crpytography and he uses the mixing color analogy as one of his examples.
This particular function is easy to reverse though. Using appropriate data structures you can start with the end result and find all possible candidates for the word in polynomial time.
All security can be broken/beaten. The trick is making the evasion process more costly than "proper" acquisition of the contents. In this case, it's easier to just look up all the 5-letter words matching known letters (or otherwise corresponding to other fixed-length overlapping words), to wit just write a crossword solver.
Bob can always just fall back on rubber hose cryptography.
The point of the article was to explain the concept of one-way functions. The objection was to how easy it would be to crack that example. My objection to the objection was that the example is still harder than finding the secret thru "legitimate" means.
I see crosswords as a data problem: given a dictionary of valid words, generate all possible solutions for the given grid. As the number of overlapping words is quite limited, shouldn't take long to run. The only natural-language factor is deciding which solution matches the clues provided. Dang, now I'm gonna have to go write one...
Probably a hash map from words to definitions along with a list of positions within each definition record. I may post an implementation later if nobody beats me to it but I don't have time right now.
This is an example to show people what a one way function looks like. I even say at the bottom of the blog post that you wouldn't do this on a computer. This specific one way function is hard for a human to reverse. Writing a computer program to automate Bob's work is completely missing the point.
An algorithm that takes polynomial time would not be feasible given the size of a dictionary, assuming you're running it on poor old Bob (who is not a computer).
I think I can come up with a simpler explanation of one way function to start with. Lets say the word I want to apply one way function is 'america'. Lets say I assign a number to each character in the password and the number that I assign is equal to the position of that character in English alphabet.
For eg: 'a' would get the value 1, 'b' would get 2, 'c' would get '3' and so on.
My one way function adds the values of each character, in case I apply it to 'america', it would be 1+13+5+18+9+3+1 => 50.
So, if somebody had to find the word from only from resultant sum or value, it would take time to come up with all the possible words.
Some of the possible candidates also include aameric, cmeriaa etc. Clearly in this case there are many possible words that map to a particular value or sum, in the good one way function there are not going to be as many words that map to one value, also a good one way function is more complicated(for a computer too or computer takes non polynomial time) than the one I chose!
The following papers created one-way functions on words for resilient information hiding in text, first one through synonym substitutions, second one through typos:
No, it's not. According to my dictionary search there are two other words (piece and world) with the same word.
And, as you point out, this means that the function is not bijective. However, I am not claiming that it is, I am illustrating a process of going from one word to another that is easy in one direction and hard in the other.
This blog post is not about collision resistance, but it would make a good follow up topic. Especially since some previously thought to be resistant one way functions (MD5 and SHA1) are now known not to be.
I think it's actually a good analogy because of that. When an astute listener asks that question, you have the perfect segue into discussing collisions and why designing cryptographic hash functions is difficult and should be left to experts.
It doesn't matter. The exact point of a one-way function is to go forward from a known answer. It should be very hard (impossible) to go backwards.
As the article mentioned, he could have stopped at 1 round but did 5 just to show the concept.
It matters (in the case of the example), because if I tell you "things", you have no way to know if I got it by starting from folio, or from another word X that gave me the steps (for example):
I don't want to tell you 'folio'. I want to prove to you that I know the answer is 'folio' without telling you the word 'folio'.
If you thought the answer was 'bored' and you followed the steps (which are not a secret) and you didn't come up with the same end confirmation word as I did. (i.e. me = 'things' you = 'hamster') one of us is wrong and neither of us had to provide our starting word to the other. We don't know who is right yet, we just know we don't agree.
One round leaves some room for error because we could both accidentally choose a word that resolves to the same word in the target position. But since we are doing 5 rounds and moving the target word (1st, 2nd, 3rd etc) for each successive round the odds become much better that we won't have an accidental collision.
I'm not quite sure your last point is true (ie "each successive round the odds become much better") although it's more of a gut feeling than anything else. The universe words in a dictionary is the same so every iteration just chooses another random word from the set. Am I missing something?
Agreed. I think it's clear that looking at the backwards tree, it increases the longer the chain gets. Specifically, the preimage of "things" is at least as large as singleton set "alternatives"; the preimage of the preimage of "things" is larger than the preimage of "alternatives", etc.
I'd argue that you might even run into some convergence since some words will be more common than others. I suppose that's why you're always taking the "Nth" word you find but at some point you just run out words in the definition.
That's actually backwards. Since each round is independent of the last, collisions on an intermediate step will propagate forward. Since each round has the same chance of creating a collision, more rounds means higher total odds.
Not only is this procedure very slow (Bob would do better to solve the crossword) it will result in a number of possibilities for the solution to 21D as paths through the dictionary from different words will overlap.
When used in the context of password encryption though, all of these "overlapping" words are also the password.
The question of "Why does it change things if you hash a passwd in the DB?" always lingered in my mind and now it totally makes sense! Awesome and incredibly simple explanation, thanks!
terrible explanation imo since the algorithm is still not clear to me. how about: 3+7 is 10 and 2+8 is also 10. Given the expression you can find the sum, but given the sum it's very hard to find the expression. This becomes harder for bigger number of course.
i know nothing about cryptography, but come on one way function can be explained way better than that. haven't seen such bad writing with so many likes.
Also, I love the "Waldo" example of zero-knowledge proofs: I want to prove to someone that I found Waldo in a "Where's Waldo" puzzle. How do I do this without showing them where Waldo is?
.
.
I can simply cut a "Waldo-sized" hole in a big blanket, then position the "Where's Waldo" puzzle under the blanket with Waldo showing through the cut. Easy to do since I know where Waldo is, but impossible to get information from. (At least, theoretically).