It uses your browser location to build the data, bootstrapped with Mechanical Turk workers around the world. It has one feedback loop: if you use the JS library, it gives you the user's location when they deny browser location (if we have the data...), and when they do share their location, we update the database (but throw away the last part of their IP address).
I'm probably going to do a small kickstarter to pay for some cleanup and expansion work or just kill it if that fails to get funded. Any ideas appreciated on things to do with it.
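For anyone curious what that feedback loop might look like, here is a rough sketch, not the project's actual code. It assumes "the last part" means the final octet of an IPv4 address (a /24 prefix); the /48 cut for IPv6 and the names `truncate_ip` and `GeoStore` are made up for illustration.

```python
import ipaddress
from typing import Dict, Optional, Tuple

def truncate_ip(ip: str) -> str:
    """Drop the host part of an address before storing it.

    Assumption: /24 for IPv4 (last octet zeroed), /48 for IPv6.
    """
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)

class GeoStore:
    """Hypothetical in-memory store keyed by truncated IP."""

    def __init__(self) -> None:
        self._by_prefix: Dict[str, Tuple[float, float]] = {}

    def update(self, ip: str, lat: float, lon: float) -> None:
        # Called when the user agrees to share their browser location.
        self._by_prefix[truncate_ip(ip)] = (lat, lon)

    def lookup(self, ip: str) -> Optional[Tuple[float, float]]:
        # Called when the user denies browser location; may have no data.
        return self._by_prefix.get(truncate_ip(ip))
```

With this shape, one shared location covers every address in the same prefix, and the full address is never kept.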
Cool project! I've been thinking about how one could build a crowdsourced geoIP database. Bootstrapping with Mechanical Turk is an interesting approach, especially if you can request workers in different cities and countries. One idea I had was a browser extension that volunteers could install to automatically submit their current location and IP address on browser startup.
It would be helpful if the home page demonstrated the passive geolocation results. The page asked for my browser location but didn't show the result.
btw the map zooming is very slow in my mobile browser for some reason.
Well right now a) we most likely have no idea where an ip address is and b) even if we do, it's nearest-city accuracy.... which is exactly what everyone else does.
Denying browser location doesn't stop the app or website from using your ip address against some third party service to figure out where you are.
>Denying browser location doesn't stop the app or website from using your ip address against some third party service to figure out where you are.
I didn't state that it does. However, just because you can do something doesn't mean you should. Did it ever occur to you that if someone explicitly tells their browser that they don't want to be located that you should respect that?
I hope you are never critical of Facebook's or Google's privacy standards, because there's a full-length mirror with your name on it.
I feel it’s a bit of a stretch to compare public information (what city does best guess say an IP is located in) and private information (where is my device right now).
The lookup against what IP is in what city is published in numerous public databases that anyone can look up against in small doses at no cost, or at scale commercially.
This service shines a light on that, doing away with illusions of privacy and showing you where you potentially have none.
Comparing to Google and Facebook feels disingenuous. You can’t look up any person in their DB and see their records. It’s not public precisely because they’re hoarding what (in aggregate) is legitimately private data.
It’s still an illusion of privacy mind, as all the recent fuss finally coming about demonstrates.
The Google/Facebook comparison is made because those companies also deliberately ignore the user's privacy wishes.
Also, this isn't about locating someone to their city. According to the blog post, it returns latitude and longitude pairs.
Yes, these may not be precise in some circumstances, but as detailed in many recent HN posts, Google has many many ways to narrow that down to a very close approximation of where you actually are. And it's only going to get better at it over time.
Whatever location information is being presented is being displayed from a lookup from a publicly accessible database. It’s not a private DB that only one org has access to. This is the public library. The only cost of admission is putting in the effort to look. Google’s knowledge of you is private. It’s not comparable.
Should there be efforts to collect more information to make that public information more specific? Maybe. Can you actually delete already public information from the internet? Not really.
“Would you like to share what colour the sky is?”
You can choose not to. The webpage can still tell you. That information is public regardless of whether you provide more specific information or not.
Every web request is like a letter with a to: and a from: address. Your anger is equivalent to being mad that someone looked at the from: address on an envelope when you told them not to.
Also, your argument is that because the user doesn't want it, it is unethical? I don't want to sit in traffic but that doesn't make traffic unethical.
Similarly, you accuse them of 'violating trust'. It's public knowledge that any IP address can be looked up. Just because you weren't aware of it doesn't mean your trust is violated. In the same way, just because you didn't know something was against the law doesn't make it not illegal.
I am for privacy, don't get me wrong, but your comments represent one of the biggest challenges with privacy right now: the assumptions of privacy and trust. It's hard to have rational and productive arguments about privacy when people get emotional about the inner workings of the system. If you don't agree with the system, work to change it, but don't blame others for what is, at the end of the day, just a feature of how it all works. Instead, try to understand the feature and think about how we can implement future systems with similar functionality but more privacy.
>Your anger is equivalent to being mad that someone looked at the from: address on an envelope when you told them not to.
No, it's the equivalent of asking a woman in a bar if you can call her and when she says "no," you look up her number in the phone book and call her anyway.
Except she’s not forced to wear her full name on display at the bar (unlike IP addressing) and is able to opt out of being in the phone book (unlike GeoIP DBs).
I get what you’re saying and I see where you’re coming from, but to try and use this phone number analogy, it’s like telling someone what city/state they’re in based on their area code when they’ve opted to provide you no location information beyond their phone number.
The phone number itself contains location information. It’s not necessarily accurate information, as I could easily (and do) use a 212 number wherever I am in the USA, not just in New York.
Finally, we’ve had rulings about phone numbers and IP addresses. Phone numbers “belong” to the end user, not the operator, and move with the user if they want to. IP addresses “belong” to the carrier, and are non portable. In a number of cases, carriers actively provide city-level accuracy for where they’re using their IP space as it actively improves performance for end users.
One thing I will say for Maxmind is that their API was exceptionally good at picking up anonymous proxies like Tor.
I dealt with a ton of fraud a couple of jobs ago and being able to eliminate credit card transactions for people on Tor was huge in cutting down on charge backs from fake card use. The minFraud API was really beneficial as well.
I don't doubt that there are a lot of baddies using Tor (and I wouldn't be surprised if that was the majority of Tor users in some cases). It's a shame though that cutting off Tor access is the solution. A very reasonable one from a business standpoint, but unfortunate.
Doesn't have to be a majority of Tor users, if the tiny minority of actors are flooding tons of stolen cc transactions. A normal person may make an online purchase a day or so, while someone trying to get as much value as possible out of a bunch of stolen cc may just spam a lot of transactions.
For the server seeing the incoming cc purchase requests, it's still majority fraud...
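A quick back-of-the-envelope version of that argument, with made-up numbers purely for illustration (1% of users being fraudsters and 200 attempts a day each are assumptions, not measurements):

```python
def fraud_share(fraudster_fraction: float,
                fraud_tx_per_day: float,
                normal_tx_per_day: float) -> float:
    """Fraction of incoming transactions that are fraudulent."""
    fraud_tx = fraudster_fraction * fraud_tx_per_day
    normal_tx = (1.0 - fraudster_fraction) * normal_tx_per_day
    return fraud_tx / (fraud_tx + normal_tx)

# Illustrative only: 1% fraudsters at 200 tx/day vs. 99% normal users at 1 tx/day.
share = fraud_share(0.01, 200, 1)
print(f"{share:.1%} of transactions are fraudulent")
```

Under those assumed numbers roughly two thirds of the transactions the server sees are fraud, even though 99% of the users are legitimate.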
Well, I am sure MaxMind provides a lot of value for many use-cases. However, I just need something very basic (country, city, lat and long) and getting this "out of the box" from Google's LB is a blessing for me.
The author does specify that for his own needs, which was basically adding 3 headers to requests, the offering was not worth it. He also does explicitly say that MaxMind is very good. So I don't think he was unfair at all in the post.
For those not running on GCE, I run a free service at https://blip.runway7.net/ that piggybacks on the Google App Engine headers. Suitable for calling from the client browser / device, it allows you to ask for location specific resources straight from the client without having to do anything on the server at all.
That's great. Do you know if this also works in all cases when I use Cloudflare only as DNS? Edit: After some thinking, this cannot work when Cloudflare is used as DNS only.
The load balancer expands variables to empty strings when it cannot determine their values, for example for geographic location variables when the IP address’s location is unknown, or for TLS parameters when TLS is not in use.
Geographic values (regions, subdivisions, and cities) are estimates based on the client’s IP address. From time to time, we update the data that provides these values in order to improve accuracy and to reflect geographic and political changes.
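Since the values can arrive as empty strings, backend code should probably treat them as optional. A minimal sketch, assuming the load balancer has been configured to forward the city coordinates in a custom request header as a comma-separated "lat,long" string; the function name `parse_lat_long` is hypothetical:

```python
from typing import Optional, Tuple

def parse_lat_long(value: str) -> Optional[Tuple[float, float]]:
    """Parse a "lat,long" header value; return None if empty or malformed."""
    value = value.strip()
    if not value:
        # The load balancer could not determine the location.
        return None
    parts = value.split(",")
    if len(parts) != 2:
        return None
    try:
        lat, lon = float(parts[0]), float(parts[1])
    except ValueError:
        return None
    # Reject values outside the valid coordinate ranges (this also rejects NaN).
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    return lat, lon
```

Returning `None` instead of raising keeps the "location unknown" case a normal code path, which matches how the quoted docs describe it.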
Exactly! I have been using GAE for 9 years now and it's so easy to have these headers automagically attached to your requests. Finally, Google Compute Engine has the same convenient way of geolocating requests.
I have always said that the GCP HTTP(S) load balancer is the hidden gem in cloud. I'm glad it's becoming more and more popular (also feature rich).
One feature that must be mentioned is that it allows you to cache content (could be anything: HTML, JPEG and more) for as little as 1 second! Cloudflare requires an enterprise plan and even then you cannot set a TTL lower than 30 seconds.
Does anybody know how good the data is? Every one of the geolocation lookup providers has faults, so taking the majority answer from several different sources will probably give the best results.
But maybe Google has much more accurate data?
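The "take the majority from several sources" idea could be sketched like this. It picks the most common non-empty answer (a plurality vote, strictly speaking) and gives up on ties; the function name and the example country codes are illustrative, not tied to any real provider's API.

```python
from collections import Counter
from typing import Optional, Sequence

def most_common_answer(answers: Sequence[Optional[str]]) -> Optional[str]:
    """Return the answer most providers agree on, or None if there is no clear winner."""
    # Ignore providers that returned nothing (None or empty string).
    votes = Counter(a for a in answers if a)
    if not votes:
        return None
    ranked = votes.most_common(2)
    # A tie between the top two answers means the sources disagree.
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]

# Example: three hypothetical providers, two of which agree.
print(most_common_answer(["DE", "DE", "FR"]))
```

This only helps if the providers' errors are reasonably independent; if they all resell the same underlying data, voting adds little.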
This probably will depend on CDN. Google's CDN is part of the Load Balancer itself and therefore it won't affect geolocation. With "3rd party" CDNs it's different, of course.
Finally, the free version of MaxMind is great. Thanks for mentioning it.
If you are hosting outside of the Google eco-system and want to be able to do geo-IP for a store locator etc., or just to set a cookie on a page with the lat/lon, how does one build a minimal app in the Google cloud to do just this bit of the task?
If this is easily doable then that would be an easy migration route.
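One possible shape for such a minimal app is a tiny WSGI service that echoes the App Engine geo headers back as JSON. This is only a sketch: it assumes deployment on App Engine (where the `X-AppEngine-*` request headers are added for you), and the wide-open CORS header is an assumption so that pages hosted elsewhere can call it.

```python
import json

# WSGI exposes request headers as HTTP_* keys in the environ dict.
GEO_HEADERS = {
    "country": "HTTP_X_APPENGINE_COUNTRY",
    "region": "HTTP_X_APPENGINE_REGION",
    "city": "HTTP_X_APPENGINE_CITY",
    "lat_long": "HTTP_X_APPENGINE_CITYLATLONG",
}

def geo_from_environ(environ):
    """Collect the geo headers, using "" for any that are missing."""
    return {name: environ.get(key, "") for name, key in GEO_HEADERS.items()}

def app(environ, start_response):
    """WSGI entry point: respond with the caller's geo data as JSON."""
    body = json.dumps(geo_from_environ(environ)).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
        # Assumption: allow any origin so a store locator on another host can call this.
        ("Access-Control-Allow-Origin", "*"),
    ])
    return [body]
```

The page on your existing host could then fetch this endpoint from the browser and set its cookie from the returned `lat_long`.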
Has anyone tried to compare the geolocation accuracy of this solution vs MaxMind? I would assume country-level accuracy will be very comparable (it's easy to do), but city-level accuracy is not trivial.
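If someone does run that comparison, one simple way to score city-level accuracy is the great-circle distance between each provider's guess and a known true location. A sketch using the haversine formula; the data format (pairs of true and guessed coordinates) is an assumption.

```python
import math
import statistics

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def median_error_km(pairs):
    """Median distance error over ((true_lat, true_lon), (guess_lat, guess_lon)) pairs."""
    return statistics.median(haversine_km(*truth, *guess) for truth, guess in pairs)
```

Running `median_error_km` once per provider over the same set of test IPs with known locations gives a single number to compare; the median is less sensitive than the mean to a few wildly wrong lookups.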
Cool!! Location-based traffic insights have significant value in many use-cases. Show the results on a heat-map to see where you should be targeting customers.
We've done some testing and found many inaccuracies in the Google LB Geo-IP results. Does anyone know who is in charge, and which channel we can use to send feedback to Google?
There are two or three other loops where we could build more data (which you can get here https://www.open-geo-ip.com/data/download ).