The vocabulary tester that came by here in the last week suggests that native English speakers on Hacker News have a vocabulary of about 2^15 words. At 15 bits per word, a 128-bit IPv6 address could be encoded in 9 words.
If we restrict to a common subset of words known by most people, 2^13 words[1], then we need 10 words.
If we force the stream of words to make grammatical sense then we get something like 12 words plus another 4 low entropy glue words for a 16 word sentence[2].
It is likely that there are more IPv6 addresses than English-language haiku, so that is out.[3] Limericks might cover a large enough space.
So, why?
1) I think reading a sentence to someone over a voice link would be less error prone and take better advantage of people's prewired wetware. Think of diagnosing a problem where someone has a stale DNS lookup from his dopey ISP.
2) Roughly verifying a hash would be easier. You might not remember "Pampas grass wets the friendly carbon against the bound chemical."[4] word perfectly, but you'd probably recognize it as your SSH host signature.
[1] I made up this number.
[2] I made up these numbers as well.
[3] There isn't room in the margin for this calculation.
[4] Only 10 words though; it would need to get to 16 for a 128-bit key. This came from a random sentence generator site, which didn't want to make longer sentences.
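For what it's worth, the 9-word and 10-word figures above are just ceil(128 / bits per word). A quick check in Python, as a sketch only: the function name `words_needed` is made up for illustration, and it assumes every word carries the full log2(vocabulary size) bits, which a grammatical sentence would not.

```python
import math

ADDRESS_BITS = 128  # an IPv6 address is 128 bits


def words_needed(bits_per_word: int, payload_bits: int = ADDRESS_BITS) -> int:
    """Words required if every word carries bits_per_word bits of information."""
    return math.ceil(payload_bits / bits_per_word)


if __name__ == "__main__":
    for bits in (15, 13):
        # 2**bits is the assumed vocabulary size
        print(f"vocabulary 2^{bits}: {words_needed(bits)} words")
```

The grammatical-sentence estimate (12 content words plus 4 glue words) can't be derived this way, since it depends on how much entropy each slot in the sentence really has.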
A decade ago, Oren Tirosh published "mnemonicode", a selection of roughly 1600 words chosen to be (i) internationally recognizable and (ii) distinctive in pronunciation, so that they would survive a low-quality phone medium and/or a non-native speaker.
I can only find a copy in the Wayback Machine now:
Nice idea, but I don't see this as an easy "disambiguation" tool like the NATO phonetic alphabet[1]. There is even the PGP word list[2], which is not really used: it's usually easier to say your fingerprint directly than to use an alternate word list that you need to look up and then disambiguate with a phonetic alphabet one more time.
This looks similar to Bubble Babble (look at ssh-keygen -B, or the Wikipedia page). This method's advantage would seem to be that it's feasible to do by hand, but I'm not sure that the chosen set of words would actually reduce errors when read aloud compared to just reading off hex digits.
Bubble Babble has its own set of pronunciation issues, but it does have checksumming as part of the spec... a big advantage for the suggested use-case.
This method: dem bag:bip nog:kep lip:bep nig:bot dad:kip dug:bap him:hod fum
This reminds me of all the people using vanity IPv6 address spaces. Facebook, BBC, Cisco, Department of Commerce, F5, etc... they all use "hexspeak" to put words or initials in their public IPv6 addresses.
Unnecessary!
IP addresses are meant for machines. For human-readable names, we have DNS. Further, IPv6 autoconfiguration will remove the need for humans to manually configure networks.
A native IPv6 network "just works", no DHCP hacks needed.
I think most people don't realize that a lot of what IPv6 fixes means you don't need to deal with the IP address directly so often.
Also, any program that converts an IP address to an integer (or goes the other way) would require a new look-up table. Which I guess wouldn't be that bad but it seems unnecessary to me.
It's a neat hack, but I think it creates almost as much trouble as it solves, especially since reading off an unabbreviated address would require 16 words (yes, it is easier, but not drastically so). Do people really want to read off "dem bag:bip nog:kep lip:bep nig:bot dad:kip dug:bap him:hod fum"? It's also very difficult to audibly differentiate between some of these words (e.g. "mom" and "nom", or "mug" and "nug"). The only use I see for this is reading off an IP address, so I think that would need to be fixed. It's also not easy to memorize (I, personally, don't think I could memorize a nonsensical 16-word sequence). It'd take me less time to get DNS working. I think a 128-bit number is rather "unfriendly" no matter how you dress it up.
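To make the "16 words" point concrete, here is a rough sketch of a one-word-per-byte encoding. The word list is a stand-in I generated from consonant-vowel-consonant syllables (8 x 4 x 8 = 256 words); it is not the list from the linked proposal, and the names `WORDS`, `encode` and `decode` are hypothetical.

```python
import ipaddress

# Hypothetical word list: 256 CVC syllables, one per byte value.
# NOT the proposal's actual list, just a placeholder of the same size.
ONSETS = "bdfghklm"  # 8 initial consonants
VOWELS = "aiou"      # 4 vowels
CODAS = "bdgkmnpt"   # 8 final consonants
WORDS = [o + v + c for o in ONSETS for v in VOWELS for c in CODAS]
INDEX = {word: value for value, word in enumerate(WORDS)}


def encode(address: str) -> str:
    """Turn an IPv6 address into 8 colon-separated pairs of words."""
    packed = ipaddress.IPv6Address(address).packed  # always 16 bytes
    words = [WORDS[byte] for byte in packed]
    pairs = [" ".join(words[i:i + 2]) for i in range(0, len(words), 2)]
    return ":".join(pairs)


def decode(text: str) -> str:
    """Inverse of encode(); raises KeyError on an unknown word."""
    words = text.replace(":", " ").split()
    return str(ipaddress.IPv6Address(bytes(INDEX[word] for word in words)))


if __name__ == "__main__":
    spoken = encode("2001:db8::1")
    print(spoken)
    print(decode(spoken))
```

Note that zero compression is lost: "::" still comes out as 16 words, which is the unabbreviated case complained about above.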
this is interesting. i'm in the middle of developing a service that will provide graphical hashes for information like this. the idea is sketched out here - http://www.acooke.org/hash-icons.html - and i am about 75% of the way through coding an appengine service. if anyone is interested, please email me at andrew@acooke.org (i'm thinking of charging a dollar a month to cover basic costs; first month free to try things out).
i just meant that rather than using the hash as a direct source of randomness, and generating the image from that, it is easier to use the hash to seed a random number generator and use the random numbers from the generator.
maybe it's obvious, but at first i was trying to use the hash directly, and kept being constrained by the amount of available data.
but if you do that in a security-aware context (originally this was for hashing user names, so security was not an issue) you need to worry about how much state the random number generator has. it has to be large enough to cover the minimum of (all distinguishable images, original hash). so if you had a random number generator with a byte of state, then that would be no good (unless the image was a single grey-scale pixel).
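a minimal sketch of the seeding idea, assuming SHA-256 as the hash and python's built-in `random.Random` as the generator (neither is necessarily what the service uses; `icon_pixels` is a made-up name). the Mersenne Twister behind `random.Random` has far more than 256 bits of state, so it satisfies the size condition above, but it is not a cryptographic generator, so treat this as illustration only.

```python
import hashlib
import random


def icon_pixels(data: bytes, size: int = 8):
    """Return a size x size grid of (r, g, b) tuples derived from data."""
    digest = hashlib.sha256(data).digest()
    # Seed with the whole 256-bit digest as one integer, rather than
    # consuming the digest bytes directly as pixel values.
    rng = random.Random(int.from_bytes(digest, "big"))
    return [
        [(rng.randrange(256), rng.randrange(256), rng.randrange(256))
         for _ in range(size)]
        for _ in range(size)
    ]


if __name__ == "__main__":
    grid = icon_pixels(b"example user name")
    print(grid[0][:3])  # first three pixels of the top row
```

the point is that the grid can be any size: an 8x8 RGB image already needs 1536 bits, more than the hash provides directly.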
> i was trying to use the hash directly, and kept being constrained by the amount of available data.
Too much data from the hash? It seems like a 256-bit SHA-2 hash could map 1:1 onto a 16x16 pixel icon, giving you 24 bits of color depth per pixel to encode each hash bit in a non-ugly way.
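Something like this, as a rough sketch: one hash bit per pixel, most significant bit first, rendered as text instead of color to keep it short. The names `hash_to_grid` and `render` are made up, and how to color each bit "non-ugly" is left open.

```python
import hashlib


def hash_to_grid(data: bytes):
    """Map the 256 bits of SHA-256(data) onto a 16x16 grid of 0/1 values."""
    digest = hashlib.sha256(data).digest()  # 32 bytes = 256 bits
    bits = [(byte >> (7 - i)) & 1 for byte in digest for i in range(8)]
    return [bits[row * 16:(row + 1) * 16] for row in range(16)]


def render(grid) -> str:
    """Crude text rendering: '#' for a set bit, '.' for a clear one."""
    return "\n".join("".join("#" if bit else "." for bit in row) for row in grid)


if __name__ == "__main__":
    print(render(hash_to_grid(b"example host key")))
```

Each row holds two bytes of the digest, so the mapping is reversible and no hash bits are thrown away.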
My dear friend, in case you still haven't noticed: there are people in the world who can't speak English, or whose mother tongue isn't English, and to your surprise they are the majority. Calling this list "pronounceable" is not only nonsense, it's also reckless. I can't imagine a Chinese speaker trying to pronounce "rad" or "lad": most of them can't pronounce these correctly because of their mother tongue (and that's perfectly OK).
If we use this list instead of numeric addresses, we're all f*cked...
I think I'm weird. The first thing I thought of when I saw this was, "Hey, that's what the Judoon enforcers on Doctor Who were speaking." It was digits they were transferring back and forth when they said Toe-Noe-Fro-Joe-Low-Toe. And the stuff at the end is just a CRC to assure the validity of the previous conversation.