There are many positive developments, such as good quality satellite imagery and government geographic data being offered to OpenStreetMap contributors for integration. An up-to-date aerial image of a city does wonders for the mapping experience. Mapillary's service complements OSM nicely and provides an invaluable on-the-ground data source. There are so many positive things happening in the community that I can't possibly list them all.
Agreed, Transport for London recently built an enormous database of cycling infrastructure to be integrated into OSM. It's an exciting project/collaboration.
Projects like these highlight that as ML becomes more and more complex, an ever-larger gap opens up that must be filled by manual or semi-manual labor. If it isn't a team of volunteers combing the rendered results for errors, it's users who have succumbed to those errors and leave feedback in the hopes the system improves.
That gets me wondering - is the future of AI really just a semi-autonomous twilight zone where cheap / free labor augments an already faulty system? If not, what possible application is there for an expensive, closed automated system that works 100% of the time and requires no human input, when other options are cheaper and leave clear directions for improvement?
Yes and no. There is a huge explosion of 'Ops Plus' startups that take an existing manual process and then build some basic tooling around it (with or without any substantial ML component). There are mild to moderate efficiency gains coming from this, but a lot of their valuation is coming from a bet that in the future they'll be able to fully automate the system and reap efficiency gains.
In practice, almost nobody is even thinking about building a fully automated process for every case. The reason is simple: automating the first 60% of the work takes x effort, automating the next 30% takes 10x, automating the next 9% takes 100x, and automating the final 1% is essentially impossible. So if you came to the table with the goal of 100% automation right out of the gate, you'd spend 10 years developing something with little to show for it.
I think full automation of some systems is possible, but is actually blocked by generational norms. By and large, the systems that "Ops Plus" startups are attempting to automate were designed by people who are not digital natives. They're not illiterate, but things like instant messaging, async communication and structured data are not natural primitives for them. I'm not saying everyone in the Fortnite generation is a master data modeller, but I think that when they join the workforce they'll set up systems that are much more feasible to automate.
What you describe is pretty much exactly parallel to the automation of physical work. Having a steam engine gets you a long way, then the first few robots a bit further... there's no particular reason to want a factory with zero employees (except the watchman and his dog...) unless you want it to work on Mars or something.
I actually just made it up. Companies that come to mind are Scale.ai (doing data labelling stuff for ML that historically would have been done by outsourcing companies, etc), Flexport (freight forwarding, traditionally done via spreadsheets and emails), Checkr (background checks), Atrium (legal services), Oscar/Clover (health insurance), Cadre (real estate investing). There are tons and tons of them in the recruiting (really sourcing) space too: I'd say Triplebyte is an Ops Plus company, as is Sourceress and a couple more.
Surely there's a spectrum here? Conventional, manually-made maps are full of errors too, and there's no army of low-paid workers combing over those— nope, they're published as they are, and it's up to end users to report problems. The better the map maker is (whether it's an ML black box or a team of human experts), the fewer errors there are, and either way, one expects an upward quality trend over time.
> If it isn't a team of volunteers combing the rendered results for errors, it's users who have succumbed to those errors and leave feedback in the hopes the system improves.
Very cool that they are integrating it with HOT tasking manager and making it easy for anyone to use the editor with the ML-generated proposed objects in several countries. I think the ML has been there for a couple years but currently it's not easy to take advantage of it.
Hopefully they eventually release the ML pipeline itself as well.
I guess it's nice in those areas with a low contribution percentage. But I suspect in many cases, and definitely in Estonia, there's a national high quality map database that could be used to augment existing maps instead. I'm wondering, though, whether anyone has attempted to do so; a quick Google search reveals nothing.
It is possible to "import" other data into OSM. However it's very tricky. Licencing/legal issues aside, it can be very technologically challenging to merge another database into OSM.
I suggest posting to the OSM talk@ mailing list, or to the local Estonian list if there is one.
Depends on whether the owner of said high quality map database will license it for inclusion in OSM.
My understanding, from following the Australian OSM mailing list, is that it takes an individual to pursue this with a government agency, which is a ton of work, and often you'll just get a 'no'.
This is semantic segmentation, so it has to be a convnet. I've never heard of trying to do semantic segmentation any other way, and a prototype of another approach wouldn't go into production at scale.
I don't disagree, but I'd be careful with that statement.
A huge amount of landcover segmentation in remote sensing still relies on simple models - either linear regression (thresholds) or classical machine learning like random forests or SVMs. For a lot of cases, these techniques will get you 90% of the way and it's very rare to have ground truth data that is accurate enough that you can measure the difference with any real degree of confidence.
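To make that concrete, here's a minimal sketch of that kind of classical per-pixel baseline (my own toy example with random stand-in data, nothing from the article): a random forest classifying each pixel from its band values.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)

    # Stand-in for a multispectral tile: (H, W, bands) reflectances plus a
    # sparse hand-labelled class map (0 = unlabelled, 1..K = landcover class).
    H, W, B = 128, 128, 4
    image = rng.random((H, W, B)).astype(np.float32)
    labels = np.zeros((H, W), dtype=np.int64)
    labels[::8, ::8] = rng.integers(1, 4, size=labels[::8, ::8].shape)

    # One feature row of band values per pixel.
    X = image.reshape(-1, B)
    y = labels.reshape(-1)
    train = y > 0                      # fit only on the labelled pixels

    clf = RandomForestClassifier(n_estimators=100, n_jobs=-1)
    clf.fit(X[train], y[train])

    # Classify every pixel and fold the predictions back into a map.
    landcover = clf.predict(X).reshape(H, W)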
A big problem in the field is the lack of good (public) ground truth. There's so little hand labelled data to work with that without humans in the loop it's extremely difficult to validate the results meaningfully (unless you have an army of staff to do it). With something like roads you could also have heuristics about what a road looks like and where it goes (e.g. it's a continuous thin line), which can help condition things.
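For the road case, something like this is what I mean by shape heuristics (again my own sketch, with random data standing in for a real score map): simple morphology already encodes "roads are thin, continuous lines".

    import numpy as np
    from skimage.morphology import remove_small_objects, skeletonize

    rng = np.random.default_rng(0)

    # Stand-in for a raw per-pixel road score map from any classifier.
    road_prob = rng.random((256, 256))
    road_mask = road_prob > 0.95                  # threshold the scores

    # Heuristic 1: real roads are continuous, so drop tiny isolated blobs.
    road_mask = remove_small_objects(road_mask, min_size=50)

    # Heuristic 2: roads are thin lines, so reduce what's left to a
    # one-pixel-wide centerline before vectorising it into a network.
    centerline = skeletonize(road_mask)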
I've seen a lot of papers which are applying deep learning for semantic segmentation for satellite mapping, but they evaluate on very limited datasets, they attempt to regress to simpler models without realising it (e.g. trying to predict a linear model), or they leak train and test data and report amazing results because they randomly split data from the same region.
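Avoiding that leak is cheap, for what it's worth. A rough sketch (made-up data) that holds out whole regions with scikit-learn's GroupShuffleSplit, so tiles from the same area never end up in both train and test:

    import numpy as np
    from sklearn.model_selection import GroupShuffleSplit

    rng = np.random.default_rng(0)

    # Stand-in tile table: one feature vector per image tile, plus the
    # region each tile was cut from.
    n_tiles = 1000
    X = rng.random((n_tiles, 32))
    y = rng.integers(0, 2, size=n_tiles)
    region = rng.integers(0, 10, size=n_tiles)    # 10 source regions

    # Hold out entire regions instead of randomly shuffling tiles, so the
    # test set never shares imagery with the training set.
    splitter = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
    train_idx, test_idx = next(splitter.split(X, y, groups=region))
    assert set(region[train_idx]).isdisjoint(region[test_idx])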
I'm not saying that convnets aren't better than simpler models, but particularly for satellite imaging I'd take them with a pinch of salt and see what the improvement from a baseline method is. If you look at a random sampling of papers from the DeepGlobe competition, almost none of them provide the results from e.g. a cheap linear SVM.
Fun side note - several existing "famous" datasets generalise poorly to the developing world because most of the imagery is from the developed world (and even more specifically the West) and infrastructure looks totally different.
How do you give a class to each pixel in an image using a linear classifier in a way that uses the surrounding pixels as context/input? I'm genuinely curious! You are right about the data: it's expensive to make, and startups based on satellite imagery tend to keep it to themselves since it's their main advantage.
In most cases, you can reshape an NxN region into a 1xN^2 vector. This was how object detection worked long before convolutional inputs were popular.
Have a look at mnist classification using a linear SVM, for example.
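Roughly like this, if it helps (my own toy sketch on random data): flatten the NxN window around each pixel, train a linear SVM to predict the centre pixel's class, and "segmentation" is just per-pixel classification over the grid.

    import numpy as np
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)

    # Stand-in data: a small (H, W, 3) image and a per-pixel label map.
    H, W, N = 64, 64, 9                 # N x N patch around each pixel
    image = rng.random((H, W, 3)).astype(np.float32)
    labels = rng.integers(0, 2, size=(H, W))

    pad = N // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")

    # Flatten the N x N x 3 window around each pixel into one feature row;
    # the target is the class of the window's centre pixel.
    feats, targets = [], []
    for r in range(H):
        for c in range(W):
            feats.append(padded[r:r + N, c:c + N, :].reshape(-1))
            targets.append(labels[r, c])
    X = np.asarray(feats)
    y = np.asarray(targets)

    clf = LinearSVC(max_iter=10000)
    clf.fit(X, y)

    # Predicting every window and reshaping gives a full-resolution mask.
    mask = clf.predict(X).reshape(H, W)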
Classification makes sense, because you do a linear (or kernel) combination of the input and squash it with a sigmoid to get a probability of a class. For segmentation you output a pixel mask, so you would have an NxNx3 vector to predict 1 class for 1 pixel, and then you would have to do it for all pixels, so you'd have to encode the position as well. Alternatively, if you take a unique weight for each position, you end up with a single FC layer with NxNx3 inputs and NxN outputs (on the order of N^4 parameters). I guess for me it's hard to imagine doing segmentation "back then" and I find it very fascinating.
FB is far from the only big tech company which has sponsored OSM as a competitor to Google Maps (Microsoft/Facebook/DigitalGlobe/Telenav/FourSquare/Craigslist have all sponsored OSM to some degree; Apple, of course, went its own way and created Apple Maps).
It's a reaction to Google Maps: a monopoly on high-quality up-to-date global maps with business location is dangerous to everyone else, as a chokepoint on mobile applications. It's less about 'acquiring data' and more about not being extorted by GM. Classic 'commoditize your complement' dynamics: https://www.gwern.net/Complement
Apple Maps is based on OpenStreetMap, and they use OSM in many countries. I believe they use OSM for turn-by-turn routing in Denmark (or was it the Netherlands?). (Source: Apple gave a talk at the OSM conference (SotM) in 2018, but required that it not be recorded.)
Their motivation is self-serving but I don't think it's so nefarious. They use OpenStreetMap in check-in posts to display map data around locations that Facebook users have visited, so improving OpenStreetMap in turn improves the quality of this feature.
Facebook wasn't the only company adding low-quality data: https://forum.openstreetmap.org/viewtopic.php?id=64430 (Previous discussion on HN: https://news.ycombinator.com/item?id=18723138 )
This article's quotes of what appear to be OpenStreetMap representatives are generally positive, so maybe that means they fixed all the problems they caused.