Can anyone explain how this works? Seems like total black magic to me... the dude is typing random gibberish and it still figures out the word? Then he's just tapping randomly on the screen, gibberish comes out, then turns into the word he wanted?
Don't mean this to sound unintelligent, the video was just crazy. What kind of key proximity algorithms are behind this?
Patent pending kind? I doubt they're sharing it quite yet, but the one beauty of patents is that we'll get the entire answer once the patent isn't pending anymore. If I had to guess, though, the algorithm is accounting for the most likely word to follow the previous one that could have been formed by hitting keys next to the ones you actually meant to hit. I'd like to see how it handles my wanton throwing of swear words into messages to friends. And the use of l33t speak and text talk.
Example: at 0:45 he swipes to the right and "kryuwt" turns into "letter".
k -> l (k is next to l)
r -> e (r is next to e)
y -> t (y is next to t)
u -> t (u is next to y, which is next to t)
w -> e (w is next to e)
t -> r (t is next to r)
The options are letter, proper, merger and keeper (plus something else further down). Given the previous three words "you miss ever", the algorithm uses some prior dictionaries and knows that after "you miss every" there are probabilities of likelihood for each candidate word. Letter ends up at 35%, as opposed to the 7% chance you'd see any of those other words. (That last bit of percentage math is an estimate for the sake of example; in a real text-prediction dictionary the percentages are usually tiny, given the incredibly large number of words that can and do follow "you miss every". Chance, shot, etc. are probably more likely overall, but nowhere near the letters we kept missing earlier.)
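As a rough illustration (not Fleksy's actual algorithm), here is a minimal sketch of that two-part idea: score dictionary words by how far each intended key is from the key actually hit, then rank the survivors by invented "how likely is this word to follow the previous one" numbers. The layout model, dictionary, and probabilities are all toy stand-ins.

    # Toy sketch: adjacent-key slips plus previous-word context.
    ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

    def key_pos(c):
        for r, row in enumerate(ROWS):
            if c in row:
                return (r, row.index(c))
        raise ValueError(c)

    def slip_cost(hit, meant):
        """Crude 'how far off was the finger' measure between two keys."""
        (r1, c1), (r2, c2) = key_pos(hit), key_pos(meant)
        return abs(r1 - r2) + abs(c1 - c2)

    def word_cost(typed, word):
        if len(typed) != len(word):
            return float("inf")
        return sum(slip_cost(a, b) for a, b in zip(typed, word))

    DICTIONARY = ["letter", "keeper", "proper", "merger", "better"]
    # P(word | "every"), invented numbers for the sake of the example.
    AFTER_EVERY = {"letter": 0.35, "keeper": 0.07, "proper": 0.07, "merger": 0.07}

    def best_guess(typed, prev_word_probs, max_cost=8):
        scored = [(word_cost(typed, w), -prev_word_probs.get(w, 0.001), w)
                  for w in DICTIONARY]
        scored = [s for s in scored if s[0] <= max_cost]
        return min(scored)[2] if scored else None

    print(best_guess("kryuwt", AFTER_EVERY))  # -> 'letter'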
It would seem they've made improvements on those algorithms. I wonder how novel it really is, though.
This is pretty much how the stock Android keyboard works. It's been able to spot missed keys since at least Ice Cream Sandwich - e.g. I just typed "kettrt" and it's suggesting "letter" as a correction. Since Jelly Bean it's also had some bigram support (it guesses the next word based on the previous one without you typing anything).
Aside from the "typing without a keyboard" part, it's not obvious from the video how Fleksy differs from this, but the focus on blind users probably makes a difference.
But does Android use context clues? iOS can't quite handle the two-letter distance, but it accurately corrects "kryywt" (kryuwt is too far away) without any context.
Ah, I forgot about bigrams - that's a really great idea. I wonder if it's packing a huge machine-learning library, though... wouldn't that take a ton of space? I saw in another comment that it takes a while to start up - maybe that's why.
The Levenshtein distance, the standard metric for determining how different words are from each other, can easily be modified to change the penalty for different letter replacements. For example, you can say that the penalty for substituting an 'a' for 's' is 1, and 'a' for 'p' is 10 (or whatever).
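For instance, a weighted Levenshtein distance might look like the sketch below: the classic dynamic-programming edit distance, but with a substitution penalty that depends on how far apart the two keys sit on the keyboard. The specific costs (1 for neighbours, 10 otherwise, 5 for insert/delete) are arbitrary, just like the 1-vs-10 example above.

    # Sketch of Levenshtein distance with keyboard-proximity substitution costs.
    ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

    def key_distance(a, b):
        pa = next((r, row.index(a)) for r, row in enumerate(ROWS) if a in row)
        pb = next((r, row.index(b)) for r, row in enumerate(ROWS) if b in row)
        return abs(pa[0] - pb[0]) + abs(pa[1] - pb[1])

    def sub_cost(a, b):
        if a == b:
            return 0
        return 1 if key_distance(a, b) == 1 else 10  # cheap for neighbours like a/s

    def weighted_levenshtein(s, t, insert_cost=5, delete_cost=5):
        # Standard DP table, with per-pair substitution costs.
        m, n = len(s), len(t)
        d = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):
            d[i][0] = i * delete_cost
        for j in range(1, n + 1):
            d[0][j] = j * insert_cost
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                d[i][j] = min(d[i - 1][j] + delete_cost,
                              d[i][j - 1] + insert_cost,
                              d[i - 1][j - 1] + sub_cost(s[i - 1], t[j - 1]))
        return d[m][n]

    print(weighted_levenshtein("kettrt", "letter"))  # small: only near-miss substitutions
    print(weighted_levenshtein("kettrt", "kettle"))  # large: needs far-away letters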
It might be an application of forward error correction, using the redundancy inherent in written English. Instead of interpreting pressing the 'F' key as the letter 'F' with 100% probability, one could interpret pressing 'F' as meaning a 60% probability of 'F' and 5% probability for each of 'E', 'D', 'C', 'V', 'B', 'G', 'T', 'R' (the surrounding letters on a QWERTY keyboard). Concatenate successive key presses to form codewords, then use FEC to find the dictionary word with the maximum likelihood.
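A hedged sketch of that noisy-channel reading (not necessarily what Fleksy does): treat each key press as the pressed letter with 60% probability, spread the rest over its neighbours, and pick the dictionary word whose letters maximise the combined likelihood. The probabilities and word list are invented.

    # Maximum-likelihood word decoding from noisy key presses.
    import math

    ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

    def neighbours(c):
        r, i = next((r, row.index(c)) for r, row in enumerate(ROWS) if c in row)
        out = set()
        for rr in (r - 1, r, r + 1):
            if 0 <= rr < len(ROWS):
                for ii in (i - 1, i, i + 1):
                    if 0 <= ii < len(ROWS[rr]) and ROWS[rr][ii] != c:
                        out.add(ROWS[rr][ii])
        return out

    def p_meant_given_pressed(meant, pressed):
        """Pressing a key means that key 60% of the time, a neighbour otherwise."""
        if meant == pressed:
            return 0.60
        nbrs = neighbours(pressed)
        return 0.40 / len(nbrs) if meant in nbrs else 1e-6  # tiny floor for farther keys

    def log_likelihood(word, typed):
        if len(word) != len(typed):
            return float("-inf")
        return sum(math.log(p_meant_given_pressed(m, p)) for m, p in zip(word, typed))

    DICTIONARY = ["letter", "better", "keeper", "ladder"]

    def decode(typed):
        return max(DICTIONARY, key=lambda w: log_likelihood(w, typed))

    print(decode("kryuwt"))  # -> 'letter', the closest dictionary word to the keys hit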
Getting fancy, one could use the results to train the system, fine-tuning the probabilities as it goes. In this way it doesn't matter where you type (or even the spatial relationship of the "keys"), as long as the location you use for each letter averages out to be reasonably consistent. In time the system will train itself so that English words come out, meaning the key mappings are probably correct.
For example, one could start typing "QWERTY" style, change to typing "Dvorak" and the system would put out gibberish for a while, then start putting out English again as the probabilities converged.
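A toy version of that adaptive part might work like this: keep a running estimate of where the user actually taps for each letter and nudge it whenever a word is accepted, so the "keys" follow the fingers over time. The coordinates and learning rate here are made up for illustration.

    # Running-average key positions that drift toward where the user actually taps.
    class AdaptiveKeyMap:
        def __init__(self, initial_centers, learning_rate=0.2):
            self.centers = dict(initial_centers)  # letter -> estimated (x, y)
            self.lr = learning_rate

        def nearest_letter(self, x, y):
            return min(self.centers,
                       key=lambda c: (self.centers[c][0] - x) ** 2 +
                                     (self.centers[c][1] - y) ** 2)

        def accept(self, word, taps):
            """Once a word is accepted, nudge each letter's centre toward its tap."""
            for letter, (x, y) in zip(word, taps):
                cx, cy = self.centers[letter]
                self.centers[letter] = (cx + self.lr * (x - cx),
                                        cy + self.lr * (y - cy))

    # Rough QWERTY positions for three letters (x = column, y = row).
    keys = AdaptiveKeyMap({"t": (4, 0), "h": (5, 1), "e": (2, 0)})

    # The user consistently taps 'e' a little to the right of where we expect;
    # after a few accepted words, the estimate follows them.
    for _ in range(10):
        keys.accept("the", [(4, 0), (5, 1), (3.0, 0.3)])

    print(keys.centers["e"])            # has moved toward (3.0, 0.3)
    print(keys.nearest_letter(3, 0.3))  # now resolves to 'e'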
This is real thinking-outside-the-box! I'm amazed by that technology. Thanks for sharing it here. If this takes off (it probably should, but I don't know if it will), imagine people in a few years being able to write and read directly with the kind of symbols that appear at 4:45 in the "Idea"[1] video!
Just a guess, but if I were to write it, I would create a table of points that map to a QWERTY keyboard and then to a word, and prerender as many words as I could to seed this table. The algorithm would be about the probability of the shape entered. For example, if I meant to write the word 'hacker' but instead wrote 'jsvlrt', the shape I made would map to a similar shape in my database for 'hacker'. Add some creative handling of tolerances on the shapes entered and you would probably have a product impressively close to this.
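A small sketch of that shape-table guess, with an invented mini-dictionary: pre-render each word as the sequence of key positions it traces, then pick the stored shape closest to the shape entered.

    # Shape matching: compare the path traced by the touches against pre-rendered paths.
    ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

    def key_xy(c):
        for r, row in enumerate(ROWS):
            if c in row:
                return (row.index(c) + 0.5 * r, r)  # crude stagger between rows
        raise ValueError(c)

    def word_shape(word):
        return [key_xy(c) for c in word]

    def shape_distance(shape_a, shape_b):
        if len(shape_a) != len(shape_b):
            return float("inf")
        return sum(((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
                   for (ax, ay), (bx, by) in zip(shape_a, shape_b))

    DICTIONARY = ["hacker", "hammer", "racket", "packet"]
    SHAPES = {w: word_shape(w) for w in DICTIONARY}  # the pre-rendered table

    def closest_word(typed):
        # Here we use the centres of the keys hit; raw touch coordinates work the same way.
        entered = word_shape(typed)
        return min(DICTIONARY, key=lambda w: shape_distance(SHAPES[w], entered))

    print(closest_word("jsvlrt"))  # -> 'hacker': the two shapes nearly overlap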
My guess is that they use a probabilistic model that, for every tap at a point X=(x, y), incorporates the probability distributions P('a' | X), ..., P('z' | X), probably modeled as a normal distribution around each key. Then, assuming they have data on how likely each specific word is, you'd estimate the posterior probability over all possible words with that many characters and pick the most probable ones as autocorrect options.
I'm sure there's more to it than that, but I assume that's the basic idea.
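Something like the following toy sketch, assuming an isotropic Gaussian tap model and invented word priors:

    # Posterior over candidate words given a sequence of tap coordinates.
    import math

    ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
    CENTERS = {c: (i + 0.5 * r, r) for r, row in enumerate(ROWS) for i, c in enumerate(row)}

    def tap_likelihood(letter, tap, sigma=0.6):
        """Unnormalised P(tap position | intended letter): Gaussian around the key centre."""
        cx, cy = CENTERS[letter]
        x, y = tap
        return math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))

    # Word priors (e.g. unigram frequencies); invented values.
    PRIORS = {"letter": 0.004, "better": 0.003, "ladder": 0.001, "keeper": 0.0005}

    def posterior_ranking(taps):
        scores = {}
        for word, prior in PRIORS.items():
            if len(word) != len(taps):
                continue
            log_score = math.log(prior)
            for letter, tap in zip(word, taps):
                log_score += math.log(tap_likelihood(letter, tap) + 1e-12)
            scores[word] = log_score
        # Highest first: these become the autocorrect suggestions.
        return sorted(scores, key=scores.get, reverse=True)

    # Six sloppy taps roughly where l, e, t, t, e, r live on this toy layout.
    taps = [(8.1, 1.2), (2.3, 0.1), (4.4, -0.2), (3.9, 0.3), (1.8, 0.2), (3.2, 0.1)]
    print(posterior_ranking(taps))  # 'letter' comes out on top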
I think there's a bit of a show here. I'm guessing he went through this demo quite a few times before they filmed it. The prediction algorithm probably starts "learning" which words follow other words based on use (on top of the proximity algorithms), so if you go through it a few times it's getting "smarter" and more predictive.
I'm gonna guess that it does a couple of things, but the main one is the shape you create with the sequence of touches, since you can type without touching the keyboard itself.