FixPhrase – open-source, patent-free what3words alternative (fixphrase.com)
90 points by nodoodles on June 21, 2022 | 68 comments



This is a bit like my http://wherewords.id that I made as a fun holiday project. I used the S2 cells (same as the one used by pokemon go), and my own wordlist. Using this approach, I believe I got accuracy down to squares of approximately 2x2 metres with 4 words (from a wordlist of 4096 words). FixPhrase only claims accuracy of 11m, which isn't terrible, but won't locate a single parking space or front door particularly well.

The wordlist is surprisingly hard work. The first location I clicked on fixphrase had as one of its words 'french'. That's potentially pretty confusing. It's super hard to get a good wordlist, and it's not just negative words, words that are particularly unusual or words likely to create bad combinations, it's also removing homonyms, words likely to be confused (capital/capitol, carless/careless), geographic words, or words that are combinations of other words in the word list.


Why not a much shorter list with adjectives? I think the last time I looked at this, a few disjoint lists of 256 words, plus a few 16 word adjective lists gets down to ~1m accuracy. Dropping adjectives just reduces accuracy.

> big red plate, small fuzzy ball, wavy green brick

With three disjoint short lists, and two disjoint adjective lists you get permutation robustness.


One word from a list of 4096 words is 12 bits. Four such words would be 48 bits.

"A few disjoint lists" are the same thing as one big list if they are used identically in your generation process. There is no difference between "flip a coin, and pick from this 256-word list or that 256-word list according to the result" and "pick from this 512-word list".

If your system is to pick from a 16-word list of adjectives, then another 16-word list of adjectives, and then from one of three 256-word lists of nouns, you are generating codes of 17.6 bits, which is a bit of a downgrade from 48 bits.
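The arithmetic above can be checked with a short sketch. It assumes each word is drawn independently and uniformly from its list; the function name `code_bits` is made up for illustration.

```python
import math

def code_bits(list_sizes):
    """Total bits when one word is drawn independently from each list."""
    return sum(math.log2(size) for size in list_sizes)

# Four words, each from a single 4096-word list.
four_words = code_bits([4096] * 4)

# Two 16-word adjective lists, then a noun from any of three 256-word lists
# (choosing among three lists is the same as one 768-word list).
adjective_noun = code_bits([16, 16, 3 * 256])

print(f"{four_words:.1f} bits")      # 48.0 bits
print(f"{adjective_noun:.1f} bits")  # 17.6 bits
```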


This is a really interesting idea. The big question is whether needing to use so many more words is worth it. If your adjectives are simple and memorable enough, then maybe it is.


If their accuracy is 11m, can we calculate their wordlist size?
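A rough back-of-the-envelope estimate is possible, under assumptions that are not confirmed anywhere in this thread: that "11m" means a latitude/longitude grid with 0.0001° steps (about 11 m of latitude), and that four words are each drawn from an equally sized list. It gives a lower bound per word position, not the size of any real wordlist.

```python
import math

GRID_STEP_DEG = 0.0001  # assumption: one grid step is ~11 m of latitude

lat_steps = round(180 / GRID_STEP_DEG)   # steps from pole to pole
lon_steps = round(360 / GRID_STEP_DEG)   # steps around the globe
cells = lat_steps * lon_steps            # number of distinct grid cells

bits = math.log2(cells)                      # information needed per location
words_per_list = math.ceil(cells ** (1 / 4)) # smallest equal list size for 4 words

print(f"{cells:,} cells, {bits:.1f} bits, at least {words_per_list} words per position")
```

A real scheme can split the bits unevenly between positions, so its lists need not all be this size.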


Their wordlist is 7610 words long and is here https://source.netsyms.com/Netsyms/fixphrase.com/src/branch/...

I talk a little about my own wordlist choice here http://wherewords.id/+about my word list is here https://github.com/kybernetikos/wherewords/blob/main/lib/wor...



In my opinion, neither this nor w3w is as useful as Google's already open-source plus codes: https://maps.google.com/pluscodes/

With plus codes you can both have a short, memorable address and gauge relative distance with other nearby addresses. I'm not sure I can think of a reason to ever use fixphrase or w3w as an alternative to this already existing open standard.


I had to do a double take on the "Powered by Plus Codes" section of that website as one of the pictures they used there shows a girl holding some kind of document in front of a plus code labelled house. Her shirt reads "It's a all a big lie". I thought that was hilarious.


A large version of that image: https://storage.googleapis.com/gweb-uniblog-publish-prod/ima...

The document appears to be a savings account "passbook" from a post office.


The point of using words is that it should be more reliable when roundtripping via voice or memory. I think it's much easier for people to remember 'reader giraffe suppose advance' than WF24+VMR


w3w fails at this, because it uses words that sound similar (recede/reseed, innocence/innocents, and many others: https://cybergibbons.com/security-2/why-what3words-is-not-su...). Good luck playing spelling bee with "clairvoyants" in an emergency.

Some words may also be difficult to pronounce/hear/spell by non-native speakers. Unlike regular sentences, there's no context to disambiguate.


Yeah, I agree. I think that what3words hasn't spent enough effort on this, or perhaps is suffering from trying to cram everywhere into 3 words, which means the wordlist needs to be unmanageably large.

Even for my attempt at the problem, I did various experiments on the word list, but an ideal attempt would check for similarity across common accents, etc., and I certainly wasn't able to do that.

Having said that, I think it's a valid and realistic goal for good word encoder systems to aim for good roundtripability via voice or memory.


The problem with any system like this is that for it to be useful you can't change it after it's launched. Any problems with the word list or location allocation are permanently baked in.


That's not necessarily true. You can make sure your decoding system understands both the new wordlist and the old wordlist but only gives you new encodings based on the new wordlist.
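One way this could work is sketched below. The wordlists are tiny made-up examples, and the rule that a replacement word keeps the index of the word it retires is an assumption of this sketch, not something any of the systems discussed here is known to do.

```python
# Hypothetical data: "stingy" was retired in v2 and replaced by "sturdy"
# at the same index, so old codes keep their meaning.
WORDS_V1 = ["apple", "stingy", "river", "candle"]
WORDS_V2 = ["apple", "sturdy", "river", "candle"]

ENCODE_LIST = WORDS_V2  # new codes only ever use the newest list

# The decoder accepts every word from every version.
DECODE_INDEX = {}
for wordlist in (WORDS_V1, WORDS_V2):
    for index, word in enumerate(wordlist):
        # A word must never move to a different index between versions.
        assert DECODE_INDEX.setdefault(word, index) == index

def encode(indices):
    return [ENCODE_LIST[i] for i in indices]

def decode(words):
    return [DECODE_INDEX[w] for w in words]

print(decode(["stingy", "river"]))  # old phrase still decodes: [1, 2]
print(encode([1, 2]))               # new phrase: ['sturdy', 'river']
```

The cost is that retired words can never be reused for a different index, so the pool of usable words shrinks with every revision.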


Well, easy: they should use German.

More seriously, English is such a terrible language for this, because it's so full of ambiguities.


> More seriously, English is such a terrible language for this, because it's so full of ambiguities.

English is no more prone to the problem of "some words sound exactly the same as other words" than any other language.


English has suffered the Great Vowel Shift, and has a wide variety of accents. It's a language that has the Spelling Bee, -teen/-ty numbers, -ough suffix, and "ghoti". There are many languages with much more regular, phonetic spelling and smaller variation of accents.


> It's a language that has the Spelling Bee, -teen/-ty numbers, -ough suffix, and "ghoti".

For the second point, -teen/-ty numbers sound different, are spelled differently, and mean different things. How are they supposed to support your point?

For the fourth, it's just false; "ghoti" in the pronunciation /fɪʃ/ does not come close to being valid written English. There is no such thing as syllable-initial "gh" /f/ or syllable-final "ti" /ʃ/.

The -ough suffix is a real case of one sound diverging into two sounds, but that is obviously not relevant to the problem of determining, from the sound of a word, which word you just heard. It comes up in the opposite problem of determining how to pronounce a word from the spelling, which we aren't talking about here.

The spelling bee is a cultural artifact; every language whose writing system is not extremely recent exhibits the phenomenon that the spelling of a word cannot be predicted from its sound. (In China, where spelling is much, much tougher, they don't have spelling bees. They do have traditional dictation exercises.)

You might find this wikipedia article interesting: https://es.wikipedia.org/wiki/Homofon%C3%ADa


welcheswort.com would be a good domain for the german version ;)


The problem with plus codes is they're neither open nor codes. Try taking the alleged decoding algorithm and decode this one straight from Google Maps: "7W87+RRX Odesan, Sør-Sudan"

Did you notice how you can't "decode" the code without looking it up on Google Maps? That's not a location code, that's just using Google Maps.


w3w considers the disambiguity an asset. Almost like a checksum. If you enter an address, and it's in the middle of the Pacific, you know you wrote it down wrong.


They claim disambiguity as an asset. It turns out it's not that hard to find ambiguous pairs close enough to be problematic.


Well, you can find CRC collisions too without much work. Doesn't mean they are useless though. As long as they are guarding against error and not malice.


That's an interesting point, but in practice I don't see how it's meaningful. If you write down a plus code and end up in a similar area, you contact the person with the address and figure it out without much issue. If you can't contact that person again, well, you're at least most likely to be in the area and can ask around for directions.

With w3w, if you can't contact the person with the address again, you've no idea where on the planet this place might be.


I use this for Uber in the third world. Really helpful.


It looks like the key difference between this and what3words is that squares near each other have mostly the same words, with only the last word changing for adjacent squares, and even then they are similar. I can see the motivation for this (you can abbreviate to fewer words for a general area), but also suspect it is partially about the w3w patent. However, it also increases the risk of being slightly wrong with a location; w3w is good for things like emergency rescue as you can’t be slightly wrong.


"However, security researcher Andrew Tierney demonstrated in 2021 that the What3words algorithm does not sufficiently protect against confusion between nearby locations because it may assign words that are similarly spelled or pronounced, which can limit the value of the system when a precise and unambiguous location is required, like safety-critical applications. Analysis by Tierney showed that close repetitions and the use of plurals occur in physically close locations. The company says that this has a one in 2.5 million chance of occurrence, but Tierney's analysis has highlighted areas where the odds are around 1 in 500."

https://en.m.wikipedia.org/wiki/What3words#Criticism


Why would it increase the risk of being slightly wrong? If anything, it helps to have consistent prefixes because you make it more likely that someone will be able to recognise areas, or know when places are near to other places.

If the system allows it, you can also use fewer words to target a bigger area. For example https://wherewords.id/juniper/detailed/ is an area of Paris, while https://wherewords.id/juniper/detailed/rate/thunder is a specific point in the Gare du Nord. Or if you're standing in Paris, talking to someone else in Paris, you can use context and drop the 'juniper'.

The only real reason I think it can be good to avoid a hierarchy is because having one makes the sensitivity of the word list much more significant. For example, if an entire country has a negative association word like 'stingy' or 'lying' in its first word, that could be a significant problem.

If you really need a checksum, https://wherewords.id/ supports an optional emoji checksum.


The risk comes from mishearing someone. Especially relevant in Europe with all the different accents.

With w3w, Gare Du Nord is sunshine.frame.acted while sunshine.frames.acted is Abu Dhabi and sunshine.frame.actor is in Malaysia.

https://what3words.com/sunshine.frame.acted https://what3words.com/sunshine.frames.acted https://what3words.com/sunshine.frames.actor

While in wherewords.id, changing rate to fate or late or gate still produces Paris, but wrong location.

https://wherewords.id/juniper/detailed/rate/thunder https://wherewords.id/juniper/detailed/fate/thunder https://wherewords.id/juniper/detailed/late/thunder

Having an accurate location is important for emergency services. If you’re on a phone call trying to get an ambulance for someone having a seizure, or reporting a fire, shooting, whatever, it’s important to get the accurate location straight away.

If the call centre person misheard your location, but the code is still in Paris, they will think it’s correct and dispatch to the wrong location. It would take too much time after realising the mistake to get the correct location. So with w3w it is far more obvious when these issues happen as suddenly the map is showing as Middle East or Asia, not Paris!

I don’t think an emoji checksum would help here either. Wink. Was that a tongue wink or smirk or etc.

(Full disclaimer: I don’t see the point in w3w either. It assumes people are prepared in advance to have the app on their phone, otherwise if they need to download it/visit its site they have internet so there are better ways of getting the location)


I still prefer the benefits that come with the hierarchical approach. You can learn where things are, if you're wrong and didn't know it in advance, you're still close to where you need to be, you can get a sense of how far away things are from each other even without looking them up or being online, and you can use context to reduce the amount of information you need to remember / communicate (e.g. say you're communicating 4 locations close to each other, most likely you can do that with 3+4 words with a hierarchical system).

The benefits of a random allocation vanish if you are building an application where Abu Dhabi, Malaysia and Paris are all reasonable answers. The w3w case is particularly bad because their wordlist is enormous and has so many ways you can confuse things.

I still think that a hierarchical system with an optional checkword/digit/emoji is the best solution to situations where you want to be 100% sure you got it right first time, but I accept that maybe the emoji is a bit too cute, and a normal word or number might be better.

I think the endgame for these types of systems doesn't need to assume online usage. It'd be fantastically useful in vehicle GPS systems for example, especially in countries with poor addressing.


There are many word combination switches in close proximity in What3Words too. And if you consider sound-alikes it is worse.

What3Words is very much unsuitable for emergency situations and has been in the media for that several times.


Hah, the location of my house is interesting - that'll be fun asking for a pizza or a fire truck sent to "corrosive filth ....."


Would be helpful if it accepted UK versions of the words - my current location includes "stylized" but replacing that by "stylised" (as you might well after hearing it on the phone) fails and just returns "London, vaguely".

(cf https://news.ycombinator.com/item?id=31830437 )


Or flour, flower, floor...

GPS coordinates are better.


All the wordlists I know for these kinds of applications remove trivial homophones.

Some wordlists use words that are uniquely identifiable after some set prefix length (e.g. 4 characters) or use metaphone codes so you can type anything that sounds roughly right (e.g. keewee is the same as kiwi).
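The prefix idea can be sketched in a few lines. The sample words and function names are made up for illustration, and this covers only the prefix variant, not metaphone matching.

```python
PREFIX_LEN = 4
WORDS = ["giraffe", "reader", "suppose", "advance", "kiwi"]  # hypothetical sample

def build_prefix_index(words, prefix_len=PREFIX_LEN):
    """Map each prefix to its word, rejecting lists where two words collide."""
    index = {}
    for word in words:
        prefix = word[:prefix_len]
        if prefix in index:
            raise ValueError(f"{word!r} and {index[prefix]!r} share prefix {prefix!r}")
        index[prefix] = word
    return index

def resolve(typed, index, prefix_len=PREFIX_LEN):
    """Return the list word matching what was typed, or None if nothing matches."""
    return index.get(typed.lower()[:prefix_len])

index = build_prefix_index(WORDS)
print(resolve("girafe", index))     # misspelled after the prefix: giraffe
print(resolve("Advancing", index))  # wrong ending: advance
```

This only forgives mistakes after the first few characters, which is why a sound-based code is attractive for input heard over the phone.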


Do they? Flower, flour, ants, hence, its, hits, tits, lead, led, lead, let, lit. It's especially hard in a language where spelling doesn't always suggest the wright pronunciation: read and read, lead and lead.

Sometimes, the combination of words can also be confusing, because you can't tell where the words end. before.head, bee.forehead.

Another issue I remember was that the plural version pointed to a different place altogether.

I used w3w (the triwords are great drawing prompts), and I had to repeat them multiple times to my friend drawing across the table. I didn't remember them after drawing them for a few minutes.


Ah yes, you're right w3w has an enormous wordlist which isn't great.

The metaphone wordlist I was talking about is verbal-id https://github.com/bandrews/verbal-id#readme which shouldn't suffer from the problems you mention.

    > verbalid.parse("vacant brand orchestra kiwi")
    '8aab9b999'
    > verbalid.parse("vaycant brahnd orchistra keewee")
    '8aab9b999'

My own wordlist has 'flower' but not 'flour', neither 'ants' nor 'hence', none of 'its', 'hits' or 'tits', neither 'lead' nor 'led', and 'let' but not 'lit', precisely because of the problems you mentioned. I went through my wordlist automatically first of all (with soundex filtering), then manually afterwards, trying to spot all of these problems and removing them.
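For readers unfamiliar with Soundex filtering, here is a minimal sketch of the general technique: compute the standard American Soundex code for each candidate and flag groups that share a code. This is not the commenter's actual script; the function names and sample words are assumptions for illustration.

```python
# Standard Soundex digit groups.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)

def soundex(word):
    """Return the 4-character Soundex code, e.g. 'Robert' -> 'R163'."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    result = word[0].upper()
    prev = CODES.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            continue  # h and w do not separate repeated codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # vowels reset prev, so a repeat after a vowel is kept
    return (result + "000")[:4]

def soundalike_groups(words):
    """Group words by Soundex code and keep only groups with a clash."""
    groups = {}
    for word in words:
        groups.setdefault(soundex(word), []).append(word)
    return {code: ws for code, ws in groups.items() if len(ws) > 1}

print(soundalike_groups(["flower", "flour", "floor", "giraffe", "lead", "led"]))
```

Soundex is coarse, so it also flags pairs that most speakers would not confuse, which is one reason a manual pass afterwards is still needed.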


> My own wordlist has 'flower' but not 'flour',

But does it accept 'flour' in lieu of 'flower' (or indeed 'floor' since I could well understand someone getting that from a slightly garbled / heavily accented phone call...)?


There's no reason why it couldn't, but the UI as I have it at the moment is a drop down so when you type flo, you'd see that flour isn't an option but flower is.


That's really cool! You put more work into this than w3w for sure


  0, 0         catatonic magnetism sandworm   swimsuit  "Null Island"
  90, 0        abacus    magnetism sandworm   swimsuit  "South Pole"
  -90, 0                                                "North Pole"
  85.0511, 0   detonator magnetism snowboard  theft     "South Limit"
  -85.0511, 0  activity  magnetism smudge     vicinity  "North Limit"
  0, -90       catatonic gloater   sandworm   swimsuit  "Easter Island"
  0, 90        catatonic pogo      sandworm   swimsuit  "Indian Ocean"
  0, 180       catatonic sandblast sandworm   swimsuit  "East Limit"
  0, -180      catatonic driver    sandworm   swimsuit  "West Limit"

Latitude is limited to ±85.0511 in OSM and Google Maps. Could someone request the words for "90, 0" and "-90, 0" manually? Second line above is glitchy but seems to be off-map South, at the pole, as expected.


You can use "FixPhrase.encode(lat, lon)" in the browser console.

    0,    0    catatonic magnetism sandworm  swimsuit
   90,    0    dimmed    magnetism sandworm  swimsuit
  -90,    0    abacus    magnetism sandworm  swimsuit
    0,  -90    catatonic gloater   sandworm  swimsuit
    0,   90    catatonic pogo      sandworm  swimsuit
    0,  180    catatonic sandblast sandworm  swimsuit
    0, -180    catatonic driver    sandworm  swimsuit


Related:

Cybergibbons: Why What3Words is not suitable for safety critical applications (2021)[1]

HN discussion here[2].

[1] https://cybergibbons.com/security-2/why-what3words-is-not-su...

[2] https://news.ycombinator.com/item?id=27058271


I picked a random location and got "daringly kleenex sloppily very". Are there any issues with the fact that "kleenex" is a registered trademark?


Kleenex is widely viewed as a genericized trademark and appears as an English word in the Merriam-Webster and Oxford dictionaries, which this may use as a source.


So, What3Words is marketing and advertising heavily in India, and I have to kinda chuckle at the complexity of its words, which are not as common as the ones we use in everyday English. Try saying "interacts.scrapped.evoked" to an Uber driver and he would cancel on you instantly.


What stops W3W's legal team shutting down this one like they did to the other open source implementation?


Hi, I'm the guy who made FixPhrase. Their (100% invalid anyways) patent has a convoluted process to convert coordinates to/from words. FixPhrase just takes the coordinates, chops them up into four smaller numbers, then uses those numbers to look up the words in the corresponding (numbered) list.

W3W doesn't have a patent on looking up words by array index.
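The general shape of "chop the coordinates into four numbers and look each one up in its own list" can be sketched as follows. This is an illustration only, not FixPhrase's actual code: the split points, the list sizes, and the placeholder words (`w0_900` and so on) are all assumptions.

```python
SCALE = 10_000  # assumption: coordinates kept to 4 decimal places

# Placeholder lists, one per position, sized to fit each chunk's range.
LIST_SIZES = (1801, 3601, 1000, 1000)
WORDLISTS = [[f"w{n}_{i}" for i in range(size)] for n, size in enumerate(LIST_SIZES)]
LOOKUP = [{word: i for i, word in enumerate(wl)} for wl in WORDLISTS]

def encode(lat, lon):
    """Shift to non-negative integers, split into four chunks, index the lists."""
    lat_i = round((lat + 90) * SCALE)    # 0 .. 1,800,000
    lon_i = round((lon + 180) * SCALE)   # 0 .. 3,600,000
    lat_hi, lat_lo = divmod(lat_i, 1000)
    lon_hi, lon_lo = divmod(lon_i, 1000)
    chunks = (lat_hi, lon_hi, lat_lo, lon_lo)
    return [WORDLISTS[n][c] for n, c in enumerate(chunks)]

def decode(words):
    """Reverse the lookup and reassemble the two coordinates."""
    lat_hi, lon_hi, lat_lo, lon_lo = (LOOKUP[n][w] for n, w in enumerate(words))
    lat = (lat_hi * 1000 + lat_lo) / SCALE - 90
    lon = (lon_hi * 1000 + lon_lo) / SCALE - 180
    return lat, lon

print(encode(0, 0))
print(encode(0, 90))  # only the second word differs from (0, 0)
```

In this sketch the first two words carry the coarse position and the last two the fine position, so nearby points share a prefix.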


I would also like to know why this isn't considered patent infringement.

https://patents.justia.com/assignee/what3words-limited

---

I've poked around the website. They claim an "open-source, patent-free algorithm" is used, and therefore the entire concept doesn't infringe on W3W's patents. Yikes. They can expect letters and lawyers.

https://source.netsyms.com/Netsyms/fixphrase.com/wiki/How-It...


The fact that W3W managed to get their patent issued is pretty hilarious in light of this: https://patents.stackexchange.com/questions/13629/i-had-inve...


The work behind what3words isn't the algorithm. It's the word list used. They have linguists who check the sounds of words: are they too similar to another word, are they rude, etc.


It'd be interesting to apply a Hilbert curve [0] to the Earth with the PGP Word List, with the even/odd list being used for each hemisphere, or used normally as a check. [1]

[0] https://en.wikipedia.org/wiki/Hilbert_curve

[1] https://en.wikipedia.org/wiki/PGP_word_list
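A minimal sketch of that idea, assuming a plain equirectangular grid over latitude/longitude and the standard iterative Hilbert index computation. The function names and the `order` parameter are made up for illustration. The PGP word list itself is not included; each output byte would index one of its two 256-word lists, alternating by byte position.

```python
def hilbert_index(order, x, y):
    """Distance along a Hilbert curve covering a 2**order x 2**order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d

def cell_bytes(lat, lon, order=16):
    """Map a coordinate to grid cell bytes; each byte would select one word."""
    n = 1 << order
    x = min(n - 1, int((lon + 180) / 360 * n))
    y = min(n - 1, int((lat + 90) / 180 * n))
    d = hilbert_index(order, x, y)
    return d.to_bytes((2 * order + 7) // 8, "big")

print(cell_bytes(48.8809, 2.3553).hex())
```

With `order=16` this yields four bytes, so four words, for cells roughly 0.0055° of longitude wide; finer resolution costs one more word per extra byte.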


This is basically what S2, the library I use in https://wherewords.id, does, except I don't use the PGP word list because it is too short and would lead to requiring too many words per place.


Well, the majority of the work behind w3w is the marketing.



I think it's a notably funny coincidence that "contact finance" is the prefix for some pretty expensive real estate.


At least for me, when I put in my address, it finds a similarly named, but incorrect address.

Makes it a little hard to use :-/


I'm trying to think of a use case for such a service. Any ideas?


I agree that this is a solution in search of a problem. Don't get me wrong. It's very clever and fun. But it also suffers from a lot of problems traditional solutions like street addresses and the use of landmarks don't. Those issues have been well-documented elsewhere.

The biggest issue this type of solution faces is a lack of standardization. There's this, What3Words, other people's hobbyist versions, Google has something like this built into their map product, etc. Every additional implementation is yet another nail in the coffin of the very concept.

The only way it would gain traction is if it were a government-mandated system. But every government already has one, and the benefits of adopting such a system don't outweigh the costs yet.


This is true, but only because none of the systems have critical mass. It'd only need a couple of car manufacturers to agree on a system like this for their gps and it'd take over pretty quick.


https://www.notion.so/what3wordsnotion/Auto-Partners-using-w...

Mercedes, Ford, Jaguar, Land Rover, Lamborghini, Mitsubishi, Subaru, Lotus, Triumph & Tata Motors all accept what3words.


The GPS coordinates system has a lot of buy in already


And it's rubbish for memorizing or entering quickly based on what you've heard over the phone. So much so that almost nobody I know uses it for gps in cars, they all use rather inaccurate zip codes or addresses that require a huge database on device, are fiddly to enter, are awkward to keep up to date and are very inaccurate.


For places that don't have an address or where the address is ambiguous.


Telling a location over the phone when you are in an emergency comes to mind. Easier to remember than a series of numbers, as is the case with latitude/longitude.


> Easier to remember

Non-rhetorical question: who has to remember it and why?

Is this primarily a work-around for the problem of it not being possible for a mobile phone handset to display the location during a voice call? If so, maybe someone should fix that problem because there might be other things that the caller might want to refer to on the screen while calling.

It's slightly hilarious, in a way. I can imagine a conversation with the inventor: How much memory does a typical phone have? And you're telling me that we should use this proprietary system for encoding coordinates so that the user can more easily memorise the coordinates? Tell me, do you use a similar system for memorising your friends' telephone numbers?

Of course in the case of emergency calls it really should not be beyond the wit of man to implement a system so that the owner of a phone can configure it to automatically send its location to the other party when an emergency call is initiated. I'm fairly privacy-conscious but I'd probably enable that one.


> tell me, do you use a similar system for memorising your friends' telephone numbers?

I have used the major system for memorizing numbers including phone numbers. It's a very similar system.

Long streams of numbers are not great for memorizability or accurate entry or human communication. Did you ever try https://file.pizza/ ?

This is also why private keys are typically described as a sequence of words from a wordlist. Used in blockchains, PGP, keybase, etc. It does solve a real problem.


Have you ever seen the "ignore gps" roadsign?



