Google Public DNS: 70 billion requests a day and counting (googleblog.blogspot.com)
105 points by abraham on Feb 14, 2012 | hide | past | favorite | 52 comments



Just to add a couple data points:

  1) Tons of random indie web crawlers, mail cannons, etc. use Google Public DNS.
  2) We actively discourage large machine-based usage of our service.
Even with some aggressive handling of large machine-based users, we still grow at a crazy clip.

Our growth: http://i.imgur.com/znfu9.png


Took me a while to figure out who the "we" is (OpenDNS). Might be worth stating it in your comment.


Any idea on the limit for "large" machine based usage?

For example, I use it with a PHP script that hits it a few hundred times and have not yet had a problem. Should I be expecting one?


We had a 500 cluster web crawler using us. Probably doing about a billion queries a day. They had the option to pay us or switch to Google Public DNS. Guess which they did? :-)

A few hundred, even if a few hundred per minute, is not a problem.


I have operated a rather large crawl cluster on EC2 (several hundred nodes), and I don't see why anyone would point one at a third party DNS service.

we operated for about 6 months using the default resolvers that come with the AMIs, then Amazon contacted us, telling us they wanted us to stop using their recursive nameservers... apt-get install bind, point at 127.0.0.1, redeploy AMI, done (in under 5 minutes).

it isn't as if a crawler cares about an extra 250ms from non-cached entries, and the bandwidth from DNS is trivial* compared to that of downloading pages.

... so why would anyone ever offer to pay you/anyone else for a recursive DNS service? it's a trivial problem...

(* say 32 bytes for the query, 64 bytes for the response... 1B lookups is ~64 GB, or $3.50 with Amazon's very expensive bandwidth costs)
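
To make the back-of-the-envelope estimate above easy to re-run with other numbers, here is a minimal Python sketch. The 32-byte and 64-byte message sizes are the figures assumed in the footnote (the reply below argues they should be doubled), and `dns_traffic_gb` is a hypothetical helper name, not part of any library.

```python
def dns_traffic_gb(lookups: int, bytes_per_message: int) -> float:
    """Return the traffic in decimal gigabytes for `lookups` messages."""
    return lookups * bytes_per_message / 1e9

LOOKUPS = 1_000_000_000   # one billion lookups, as in the comment
QUERY_BYTES = 32          # assumed query size from the comment
RESPONSE_BYTES = 64       # assumed response size from the comment

query_gb = dns_traffic_gb(LOOKUPS, QUERY_BYTES)
response_gb = dns_traffic_gb(LOOKUPS, RESPONSE_BYTES)
print(f"queries: {query_gb:.0f} GB, responses: {response_gb:.0f} GB")
```

Multiplying either figure by a per-GB price gives the cost; the price itself is left out here because it depends on the tier and direction of the traffic.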


250ms matters. That's 1/4 of a second. It matters for a crawler tremendously.

DNS query sizes are wrong. I'd double each. But still, inexpensive from a bandwidth standpoint. I get it.


why would it matter at all, given a crawler hits many pages on the same hostname? the fact the initial request takes 250ms more is meaningless.

you also have to take into account that crawlers are either bound on sleep() if they're friendly, bound on CPU if they do processing and you have a lot of money for bandwidth, or bound on bandwidth if you don't have much money.

and given that crawlers tend to be massively parallelised, the DNS query could take minutes and you really still wouldn't care...

(go and read up on Amdahl's law)
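
The amortisation argument can be put in rough numbers. This is only a sketch: the 100 pages per hostname and 500 ms per page fetch are hypothetical values chosen for illustration, and the model assumes a single sequential worker per host with one uncached lookup.

```python
def dns_share_of_crawl_time(dns_ms: float, pages_per_host: int, fetch_ms: float) -> float:
    """Fraction of per-host wall time spent on one uncached DNS lookup."""
    total_ms = dns_ms + pages_per_host * fetch_ms
    return dns_ms / total_ms

# Hypothetical crawl: 100 pages per hostname, 500 ms per page fetch,
# and the 250 ms uncached lookup discussed in this thread.
share = dns_share_of_crawl_time(dns_ms=250, pages_per_host=100, fetch_ms=500)
print(f"DNS share of time per host: {share:.2%}")
```

Under these assumptions the lookup is well under 1% of the time spent on a host, and with many hosts crawled in parallel the lookup latency overlaps with other work, so its effect on throughput shrinks further.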


Inbound bandwidth to Amazon has been free for a while. Anywhere but Asia and South America, your posited 32 byte query adds up to about $0.36 per billion at their most expensive $0.12/GB tier.


> They had the option to pay us or switch to Google Public DNS.

Do you mean to switch away from Google Public DNS?

> Guess which they did?

How much did it cost them?


No, he's talking about a different DNS - OpenDNS. It's a confusing thread - the top comment is by a guy running OpenDNS. OpenDNS is a DNS service which resolves typos, blocks typo phishers, and redirects names that don't resolve to an OpenDNS ad page. They also have a bunch of options and controls, letting you block some sites (useful for employers, schools, and parents).

It sounds kind of sleazy (redirecting "no record" to an ad seems a little off to me), but the guys running it are apparently not. The prejudice against redirects to ad pages is more a result of ISPs who take your money and still give you ads, unlike OpenDNS which is free.


> It sounds kind of sleazy (redirecting "no record" to an ad seems a little off to me)

It doesn't just seem off. It is off: it's a major violation and a large part of why I don't use OpenDNS.

When a name doesn't resolve, I want to see my browser's page for a DNS failure. I don't want to see ads.


It's a free service. Unless your company forces you to use it, OpenDNS is opt-in.

As I said, it makes people mad because ISPs effectively force you to use their DNS, and some use it to serve ads.

Nobody complains about Google serving ads on their pages, but if your ISP inserted ads through some kind of MITM, it would make people pretty angry.


Further, on OpenDNS, you can disable these ads if you so choose.


No, he meant they had the option to start paying OpenDNS or to switch from it to Google Public DNS.


This is all on PowerDNS?


Nope... but we like Bert (the author).


Being in the information security world, I have to wonder what percentage of these requests are based in malware? I know at least the latest version of ZeroAccess/Max++/Sirefef (which we managed to get before the AV vendors released definitions for it) uses it quite heavily. That's one of the symptoms we used to diagnose computers from a strictly network-level standpoint. No one on our network should be using Google DNS, so any computers who were making requests to 8.8.8.8 were likely infected (confirmed using other signatures).

That amounted to about 100 requests every day per infected computer just from us, and ZeroAccess isn't the only one doing it (and isn't a rare trojan).
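
A network-level check like the one described could look roughly like this. It is a sketch under assumptions: the `(src_ip, dst_ip, dst_port)` record layout is hypothetical rather than any real flow-export format, and 8.8.4.4 (Google Public DNS's second address) is added here although the comment only mentions 8.8.8.8.

```python
from collections import Counter

# Google Public DNS addresses; 8.8.4.4 is an addition to the comment's 8.8.8.8.
GOOGLE_DNS = {"8.8.8.8", "8.8.4.4"}

def hosts_querying_google_dns(flows):
    """Count DNS flows (port 53) to Google Public DNS per source host.

    `flows` is an iterable of (src_ip, dst_ip, dst_port) tuples; this
    record layout is hypothetical.
    """
    counts = Counter()
    for src_ip, dst_ip, dst_port in flows:
        if dst_ip in GOOGLE_DNS and dst_port == 53:
            counts[src_ip] += 1
    return counts

sample = [
    ("10.0.0.5", "8.8.8.8", 53),
    ("10.0.0.5", "8.8.4.4", 53),
    ("10.0.0.7", "10.0.0.1", 53),   # internal resolver, ignored
    ("10.0.0.9", "8.8.8.8", 443),   # not DNS, ignored
]
print(hosts_querying_google_dns(sample))
```

As in the comment, a hit is only a hint: it is meaningful on a network where nobody should be using Google DNS, and it still needs confirming with other signatures.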


For those thinking about changing their DNS servers, you might want to take a look at http://code.google.com/p/namebench/

It benchmarks global (like Google Public DNS and OpenDNS) and regional DNS providers to show which DNS servers would be fastest for you.


For comparison, OpenDNS does ~37bn http://www.opendns.com/technology/traffic-stats/


That's a little over 800,000 requests a second on average. That's some serious traffic.
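
For anyone who wants to check the arithmetic, a minimal sketch (the 70 billion per day figure is from the article title; `average_per_second` is just an illustrative helper):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def average_per_second(requests_per_day: float) -> float:
    """Average request rate, ignoring daily peaks and troughs."""
    return requests_per_day / SECONDS_PER_DAY

google_rps = average_per_second(70e9)
print(f"{google_rps:,.0f} requests/second")  # prints "810,185 requests/second"
```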


I'm wondering if there's a detectable peak in people switching to Google DNS because of censorship in countries like .be & .nl


I just switched to Google DNS. Before I switched I got a 38ms response time when I ping google.com, now I get 285ms? I'm in the UK. Is this why it's slow?


It's possible that you are suffering from mistaken redirection to a different CDN.

You should read up on how CDNs work (http://en.wikipedia.org/wiki/Content_delivery_network) but the gist is essentially that your DNS server determines which content node you're routed to. If your DNS server is your ISP's DNS server it's very likely that you're geographically close.

However, public DNS is usually anycasted to more general regions. In this case, you may be then routed to a content node which is close to the DNS node, but farther from you.

Google actually does support the EDNS client-subnet extension in order to help solve this problem, but I find it unlikely that ping supports the extension.

This is a big problem with public DNS services. My research group will be releasing a project soon which seeks to alleviate the problem and dynamically choose DNS based on what provides you the best performance.

Edit: Actually I think I may be wrong about one point: IIRC the actual DNS clients don't need to support EDNS-client-subnet in order for it to be used. Only the DNS and authoritative DNS/CDN need to support it. Usually that's a problem because most of the major CDNs don't support it yet, but Google actually does support it on both their CDN and public DNS. Therefore you should be routed based on your prefix when talking to google.com, not the anycasted 8.8.8.8 node. Thus, I have no idea what's going wrong.
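
To show what "routed based on your prefix" means on the wire, here is a sketch of how a resolver could encode a client prefix in an EDNS client-subnet option. It follows the layout later standardised in RFC 7871 (option code 8); this thread predates that RFC, so the draft in use at the time may have differed, and `build_ecs_option` is a hypothetical helper, not Google's implementation.

```python
import ipaddress
import struct

ECS_OPTION_CODE = 8  # EDNS Client Subnet option code per RFC 7871

def build_ecs_option(client_ip: str, prefix_len: int) -> bytes:
    """Encode an EDNS Client Subnet option for the client's prefix."""
    network = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    family = 1 if network.version == 4 else 2  # 1 = IPv4, 2 = IPv6
    # Only the bytes covered by the prefix are sent; host bits are zeroed.
    addr_bytes = network.network_address.packed[: (prefix_len + 7) // 8]
    # FAMILY, SOURCE PREFIX-LENGTH, SCOPE PREFIX-LENGTH (0 in queries), ADDRESS
    payload = struct.pack("!HBB", family, prefix_len, 0) + addr_bytes
    return struct.pack("!HH", ECS_OPTION_CODE, len(payload)) + payload

# 203.0.113.77 is a documentation address used as a stand-in client.
option = build_ecs_option("203.0.113.77", 24)
print(option.hex())
```

The point of the design is that the authoritative server sees only a truncated prefix such as 203.0.113.0/24, which is enough to pick a nearby content node without exposing the full client address.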


The DNS is probably resolving google.com to an IP that is further away network-wise. You could check to which IP it resolves depending on your DNS.


DNS has nothing to do with ping response times. A ping requests the IP address from the DNS server and then does the ping. On most systems the DNS is cached locally, so multiple requests will use the same information.


> DNS has nothing to do with ping response times.

Not true, the feature is called GeoDNS and gives you different IP addresses for the same domain based on your DNS server. See Locke1689's reply.


I don't think that DNS lookup time is included in ping's latency numbers. It probably does the lookup once per run, and then caches the IP. It wouldn't make sense to do the DNS lookup once per ping.


Seconding zacgarrett. Try using nslookup or dig to get a more accurate measurement on how the change in DNS has affected response times.
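
If dig isn't handy, the same kind of measurement can be approximated in a few lines of Python using only the standard library. This is a rough sketch, not a replacement for dig: it sends one UDP query for an A record, does not parse or validate the reply, and `build_query` and `time_lookup` are illustrative names.

```python
import socket
import struct
import time

def build_query(name: str, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query for the A record of `name`."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    # QNAME terminator, then QTYPE=A (1) and QCLASS=IN (1).
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)

def time_lookup(server: str, name: str, timeout: float = 2.0) -> float:
    """Return the round-trip time in milliseconds for one UDP query.

    The reply is read but not checked against the query.
    """
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        sock.sendto(query, (server, 53))
        sock.recvfrom(512)
        return (time.perf_counter() - start) * 1000.0
```

Calling `time_lookup("8.8.8.8", "google.com")` and then the same with your ISP's resolver address gives a comparison of resolver response times. Repeat each a few times, since the first lookup may be uncached and later ones served from the resolver's cache.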


Glad they're doing something about the CDN issue. This is the main thing that keeps me from switching to Google DNS.


I used to use this but switched off over security concerns (Mainly because I didn't want Google everywhere)

I haven't noticed any difference in load times so I guess it didn't do any harm.


It would be funny if they suddenly dropped the service because it wasn't generating a good "revenue stream". It would be chaos for a while.


I would be interested if statistical data gleaned from DNS makes its way into any other service areas. DNS would seem like a useful way to rank the popularity of web sites, and I am sure there are some interesting enhancements that could be made using that data.


I think their privacy policy disallows it: http://code.google.com/speed/public-dns/privacy.html

> We don't correlate or combine your information from the temporary or permanent logs with any other data that Google might have about your use of other services, such as data from Web Search and data from advertising on the Google content network.

And they say that the logs are only used for debugging, DoS protection and abuse.


The quote from the policy says nothing about using or not using the information about what DNS queries are made for data mining or ranking.


Data mining and ranking is neither debugging, nor DoS protection or abuse protection, so I don't see how it could be allowed.

The FAQ makes it even more clear:

> Is information about my queries to Google Public DNS shared with other Google properties, such as Search, Gmail, ads networks, etc.?

> No.


I did not mean "my queries" being "shared" but the query collection/archive as a whole. It would be another source to know what domains the people visit.

> And they say that the logs are only used for debugging, DoS protection and abuse.

Source?


Rereading it, I see it's less clear how it applies to aggregated data.


I for one am very thankful for this DNS. Ever since moving to China, 8.8.8.8 has been useful for getting around various flaws of the internet experience here. Even with a paid VPN, it's nice to have an always working DNS server.


It seems like Google's DNS servers are always MANY hops away from me, no matter where I am. The ping times range from 8ms to 45ms.

How is that faster than using a local DNS server?


Obviously it cannot be faster than a local DNS cache/server (for the RTT). What can be significantly faster is forwarding queries to Google Public DNS instead of your ISP. At least in my experience, ISPs often have slow and overloaded servers which are not well maintained. Recursively resolving completely on your own is often even slower than either of those two choices.


ISPs often also have abusive DNSes that redirect unresolved domains to ad-ridden landing pages, which is why I switched.


I have been running unbound for a month, and there is no noticeable difference in resolving time. This article provides a good read: http://linuxmafia.com/~rick/googledns.html


45ms beats the hell out of the hundreds, or thousands, that crappy Irish ISPs deliver from their own 'local' name servers. When they're working at all.


What an enormous data mine.


Down votes for pointing out that Google DNS is an enormous data mine? Why?


[deleted]


> especially one with such uniformly poor latency when compared to ISP-cobbled together crap it's supposed to replace, anywhere I've ever seen it deployed.

That seems overly negative: Google DNS is consistently better performing for me than other public DNS services except for Level 3's 4.2.2.1 &c because I'm on their network.

Google's public DNS uses anycast, so the performance should be good in most places. Perhaps your experience is the result of your geographical location or your ISP's network?

Care to elaborate on the poisoned reddit records?


> other public DNS services except for Level 3's 4.2.2.1

Level 3 doesn't actually run a public DNS service, they run DNS servers that happen to permit requests from non-customers. You'd be well advised not to use it outside L3's network.

http://www.tummy.com/Community/Articles/famous-dns-server/


What do you mean by poisoned? For me it's the contrary, I switch to Google Public DNS whenever I'm using an ISP with "lying" DNS (returning a search page instead of NXDOMAIN).


See: http://en.wikipedia.org/wiki/DNS_spoofing

The basic technique takes advantage of the fact that DNS allows you to provide additional information in a response so the response for ev1l.hax0rs.org can return a reply which says "This is handled by ns.reddit.com. Oh, by the way, ns.reddit.com is 1.2.3.4"; any server which doesn't properly validate that last part would add the incorrect ns.reddit.com record to its local cache and potentially use it to handle requests for other clients.
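
The "properly validate that last part" step is often described as a bailiwick check: only accept extra records for names inside the zone you actually asked about. A much simplified Python sketch follows. Real resolvers apply more rules than this, and the function names and the 192.0.2.53 address are made up for illustration.

```python
def in_bailiwick(record_name: str, zone: str) -> bool:
    """True if `record_name` equals `zone` or is a subdomain of it."""
    name = record_name.lower().rstrip(".")
    zone = zone.lower().rstrip(".")
    return name == zone or name.endswith("." + zone)

def filter_additional(records, zone):
    """Keep only additional-section (name, address) pairs inside `zone`."""
    return [(name, addr) for name, addr in records if in_bailiwick(name, zone)]

# Response to a query for ev1l.hax0rs.org, with one out-of-zone record.
additional = [("ns.reddit.com", "1.2.3.4"), ("ns1.hax0rs.org", "192.0.2.53")]
print(filter_additional(additional, "hax0rs.org"))
```

With this filter the ns.reddit.com record from the example above is dropped rather than cached, because a server answering for hax0rs.org has no business supplying addresses under reddit.com.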


But why would Google Public DNS be more affected by cache poisoning?

They describe their protections in detail at: http://code.google.com/speed/public-dns/docs/security.html


That's a better question for the original poster - I described the generic technique but haven't heard of anyone successfully applying it to Google Public DNS.


I recommend using the Google DNS to all my friends because all the local ISP DNS servers are censoring a bunch of sites.



