Pronounceable IPv6 addresses, WPA2-PSK and hashes

jws · on July 27, 2011

The vocabulary tester that came by here in the last week suggests that native English speakers on Hacker News have a vocabulary of about 2^15 words. An IPv6 address could be encoded in 9 words.

If we restrict to a common subset of words known by most people, 2^13 words[1], then we need 10 words.

If we force the stream of words to make grammatical sense then we get something like 12 words plus another 4 low entropy glue words for a 16 word sentence[2].
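The 9 and 10 above are just "how many words until the combinations cover 2^128". A minimal sketch of that arithmetic in Python (the function name is made up for illustration; the vocabulary sizes are the ones quoted above):

```python
def words_needed(bits: int, vocabulary_size: int) -> int:
    """Smallest word count whose combinations cover 2**bits values."""
    count = 0
    capacity = 1  # number of distinct sequences of `count` words
    while capacity < 2**bits:
        capacity *= vocabulary_size
        count += 1
    return count

print(words_needed(128, 2**15))  # 9
print(words_needed(128, 2**13))  # 10
```

Integer arithmetic is used instead of logarithms so the result stays exact for vocabulary sizes that are not powers of two.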

It is likely that there are more IPv6 addresses than English-language haikus, so that is out.[3] Limericks might cover a large enough space.

So, why?

1) I think reading a sentence to someone over a voice link would be less error prone and take better advantage of people's prewired wetware. Think of diagnosing a problem where someone has a stale DNS lookup from his dopey ISP.

2) Roughly verifying a hash would be easier. You might not remember "Pampas grass wets the friendly carbon against the bound chemical."[4] word perfectly, but you'd probably recognize it as your SSH host signature.

[1] I made up this number.

[2] I made up these numbers as well.

[3] There isn't room in the margin for this calculation.

[4] Only 10 words though; it would need to get to 16 for a 128-bit key. This came from a random sentence generator site, which didn't want to make longer sentences.

jimktrains2 · on July 27, 2011

You'd have to remove words that sound alike.

beagle3 · on July 27, 2011

A decade ago, Oren Tirosh published "mnemonicode", a selection of about 1600 words optimized to be (i) internationally recognizable and (ii) distinctive in pronunciation, so that they survive a low-quality phone medium and/or a non-native speaker.

I can only find a copy in the wayback machine now:

http://web.archive.org/web/20051109233255/http://www.tothink...

http://web.archive.org/web/20051109230247/http://www.tothink...

adulau · on July 27, 2011

Nice idea, but I don't see this as an easy "disambiguation" tool like the NATO phonetic alphabet[1]. There is even the PGP word list[2], which is not really used: it's usually easier to say your fingerprint directly than to use an alternate word list that you need to look up and then disambiguate with a phonetic alphabet one more time.

[1] http://en.wikipedia.org/wiki/NATO_phonetic_alphabet

[2] http://en.wikipedia.org/wiki/PGP_word_list

pataprogramming · on July 27, 2011

This looks similar to Bubble Babble (look at ssh-keygen -B, or the Wikipedia page). This method's advantage would seem to be that it's feasible to do by hand, but I'm not sure that the chosen set of words would actually reduce errors when read aloud compared to just reading off hex digits.

Bubble Babble has its own set of pronunciation issues, but it does have checksumming as part of the spec...a big advantage for the suggested use-case.

This method: dem bag:bip nog:kep lip:bep nig:bot dad:kip dug:bap him:hod fum

Bubble Babble: xemab-cifor-mycup-fydet-fugic-nadid-vabel-bisog-maxox
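For anyone curious how the checksumming works, Bubble Babble is small enough to sketch in Python. This follows my reading of the published encoding (alternating vowels and consonants, with a running checksum seed folded into the vowels), so treat it as an illustration rather than a substitute for ssh-keygen -B. The function name is my own.

```python
VOWELS = "aeiouy"
CONSONANTS = "bcdfghklmnprstvzx"

def bubble_babble(data: bytes) -> str:
    """Encode bytes as a Bubble Babble string such as 'xexax'."""
    seed = 1  # running checksum, mixed into the vowels of each group
    out = ["x"]
    rounds = len(data) // 2 + 1
    for i in range(rounds):
        last = i + 1 == rounds
        if not last or len(data) % 2 == 1:
            b1 = data[2 * i]
            out.append(VOWELS[(((b1 >> 6) & 3) + seed) % 6])
            out.append(CONSONANTS[(b1 >> 2) & 15])
            out.append(VOWELS[((b1 & 3) + seed // 6) % 6])
            if not last:
                b2 = data[2 * i + 1]
                out.append(CONSONANTS[(b2 >> 4) & 15])
                out.append("-")
                out.append(CONSONANTS[b2 & 15])
                seed = (seed * 5 + b1 * 7 + b2) % 36
        else:
            # Even-length input: the final group carries only the checksum.
            out.append(VOWELS[seed % 6])
            out.append(CONSONANTS[16])
            out.append(VOWELS[seed // 6])
    out.append("x")
    return "".join(out)

print(bubble_babble(b""))           # xexax
print(bubble_babble(b"Pineapple"))  # xigak-nyryk-humil-bosek-sonax
```

Because each group's vowels depend on the seed carried over from the previous bytes, a misheard syllable tends to produce a string that no longer decodes consistently, which is the advantage mentioned above.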

rickette · on July 27, 2011

Didn't we invent DNS for this issue?

But seriously, I can see the need for something like this. But the proposed three-letter words don't really help. They're still too hard to pronounce.

joejohnson · on July 27, 2011

Here is a more popular list of words for the same purpose: http://en.wikipedia.org/wiki/PGP_word_list

thorax · on July 27, 2011

This reminds me of all the people using vanity IPv6 address spaces. Facebook, BBC, Cisco, Department of Commerce, F5, etc... they all use "hexspeak" to put words or initials in their public IPv6 addresses.

gary4gar · on July 27, 2011

Unnecessary! IP addresses are meant for machines. For human-readable names, we have DNS. Further, IPv6 autoconfiguration will remove the need for humans to manually configure networks.

A native IPv6 network "just works"; no DHCP hacks needed.

zbisch · on July 27, 2011

I think most people don't realize that a lot of what IPv6 fixes means you don't need to deal with the IP address directly so often.

Also, any program that converts an IP address to an integer (or goes the other way) would require a new look-up table. Which I guess wouldn't be that bad but it seems unnecessary to me.

It's a neat hack, but I think it creates almost as much trouble as it solves, especially since reading off an unabbreviated address would require 16 words (yes, it is easier, but not drastically so). Do people really want to read off "dem bag:bip nog:kep lip:bep nig:bot dad:kip dug:bap him:hod fum"? It's also very difficult to audibly differentiate between some of these words (e.g. "mom" and "nom", or "mug" and "nug"). The only use I see for this is reading off an IP address, so I think that would need to be fixed. It's also not easy to memorize (I, personally, don't think I could memorize a nonsensical 16-word sequence). It'd take me less time to get DNS working. I think a 128-bit number is rather "unfriendly" no matter how you dress it up.
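To make the "16 words" and the look-up table concrete, here is a rough Python sketch of a byte-per-word mapping. The word list is a hypothetical one generated mechanically from consonant-vowel-consonant syllables; it is not the list from the proposal, and the function names are invented for illustration.

```python
import ipaddress
from itertools import product

# Hypothetical 256-entry table (8 onsets x 4 vowels x 8 codas).
# NOT the proposal's word list; it only has the same size.
WORDS = ["".join(p) for p in product("bdfhklmt", "aeou", "bdlmnrsv")]
INDEX = {word: value for value, word in enumerate(WORDS)}

def to_words(address: str) -> str:
    """Spell an IPv6 address as 16 words, two per 16-bit group."""
    packed = ipaddress.IPv6Address(address).packed  # always 16 bytes
    words = [WORDS[byte] for byte in packed]
    pairs = [f"{words[i]} {words[i + 1]}" for i in range(0, 16, 2)]
    return ":".join(pairs)

def from_words(text: str) -> str:
    """Invert to_words and return the compressed address string."""
    words = text.replace(":", " ").split()
    packed = bytes(INDEX[word] for word in words)
    return str(ipaddress.IPv6Address(packed))

spoken = to_words("2001:db8::1")
print(spoken)
print(from_words(spoken))
```

Note that the "::" abbreviation disappears in the spoken form: every one of the 16 bytes gets a word, including the runs of zeros.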

CrLf · on July 27, 2011

Yes, because machines fix themselves...

andrewcooke · on July 27, 2011

this is interesting. i'm in the middle of developing a service that will provide graphical hashes for information like this. the idea is sketched out here - http://www.acooke.org/hash-icons.html - and i am about 75% of the way through coding an appengine service. if anyone is interested, please email me at andrew@acooke.org (i'm thinking of charging a dollar a month to cover basic costs; first month free to try things out).

cpeterso · on July 27, 2011

Pretty icons. I like the favicon-like colors and aesthetic better than the geometric identicon/gravatar badges.

Do you mind elaborating on this comment from your hash icon site?

- Using the hash to seed a PRNG decouples pattern generation from precise details about the amount of available state.

Are you trying to avoid embedding the hash (steganographically) within the hash icon's pixels?

andrewcooke · on July 27, 2011

no, no - nothing that clever.

i just meant that rather than using the hash as a direct source of randomness, and generating the image from that, it is easier to use the hash to seed a random number generator and use the random numbers from the generator.

maybe it's obvious, but at first i was trying to use the hash directly, and kept being constrained by the amount of available data.

but if you do that in a security-aware context (originally this was for hashing user names, so security was not an issue) you need to worry about how much state the random number generator has. it has to be large enough to cover the minimum of (all distinguishable images, original hash). so if you had a random number generator with a byte of state, then that would be no good (unless the image was a single grey-scale pixel).
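a minimal sketch of the seeding idea in python, not the actual service code: the function name, grid size and colour count are invented for illustration. python's random.Random is a mersenne twister with 19937 bits of state, so it is larger than a 256-bit hash, but it is not a cryptographic generator, so this only illustrates the decoupling.

```python
import hashlib
import random

def icon_cells(data: bytes, size: int = 8, colors: int = 4) -> list[list[int]]:
    """Return a size x size grid of colour indices derived from data."""
    digest = hashlib.sha256(data).digest()
    # Seed with the whole digest as one integer, so no hash bits are dropped.
    rng = random.Random(int.from_bytes(digest, "big"))
    # Draw as many values as the pattern needs, independent of digest length.
    return [[rng.randrange(colors) for _ in range(size)] for _ in range(size)]

for row in icon_cells(b"example.org"):
    print("".join(str(cell) for cell in row))
```

the grid can be any size, because the generator keeps producing numbers after the 32 bytes of the hash would have run out.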

cpeterso · on July 27, 2011

> i was trying to use the hash directly, and kept being constrained by the amount of available data.

Too much data from the hash? It seems like a 256-bit SHA-2 hash could map 1:1 onto a 16x16 pixel icon, giving you 24 bits of color depth per pixel to encode each hash bit in a non-ugly way.
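A rough Python sketch of that 1:1 mapping, with one hash bit per pixel (the function name is made up, and choosing a color for each bit is left out):

```python
import hashlib

def bit_grid(data: bytes) -> list[list[int]]:
    """Lay out the 256 bits of a SHA-256 digest as a 16x16 grid."""
    digest = hashlib.sha256(data).digest()  # 32 bytes = 256 bits
    # Most significant bit of each byte first.
    bits = [(byte >> (7 - i)) & 1 for byte in digest for i in range(8)]
    return [bits[row * 16:(row + 1) * 16] for row in range(16)]

for row in bit_grid(b""):
    print("".join("#" if bit else "." for bit in row))
```

Each row holds two consecutive digest bytes, so the grid can be read back into the original hash.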

andrewcooke · on July 28, 2011

the approach i use gives the "same" image, even when pixel sizes and numbers change.

devmach · on July 27, 2011

I'm normally not that rude, but in this case:

My dear friend, in case you still haven't noticed: there are people in the world who can't speak English, or whose mother tongue isn't English, and, to your surprise, they are the majority. Calling this list "pronounceable" is not only nonsense, it's also reckless. I can't imagine a Chinese speaker trying to pronounce "rad" or "lad": most of them can't pronounce these correctly because of their mother tongue (and that's perfectly OK).

If we use this list instead of numeric addresses, we're all f*cked...

tptacek · on July 27, 2011

This isn't exactly a new idea; S/Key used the same approach to creating memorable hashes:

http://en.wikipedia.org/wiki/Skey

max2grand · on July 27, 2011

I think I'm weird. The first thing I thought of when I saw this was, "Hey, that's what the Judoon enforcers on Doctor Who were speaking." It was digits they were transferring back and forth when they say Toe-Noe-Fro-Joe-Low-Toe. And the stuff at the end is just a CRC check to assure validity of the previous conversation.

kbutler · on July 27, 2011

With 256 short but non-mnemonic words mapping to the numbers 0x00-0xff

> a slightly tongue-in-cheek proposal

the tongue-in-cheek quotient is "slightly" understated.

jgrahamc · on July 27, 2011

I am British.

erikb · on July 27, 2011

I think the idea is great. There are way too many things about IT that are just not really usable for 'normal' people out there.

sgentle · on July 28, 2011

One problem worth thinking about is that unless you pick your words very carefully, there's a lot of potential for accidental offensiveness.

I'd hate to have to explain things to a user with 0b2f:3354 in their address.