
Could you expand a bit more on your comment? I feel I'm missing some context. Specifically, what do you mean by gold plated? And why would it be tempting to ignore some aspects of distributed computing? There's a lot of context you're implying rather than stating, so could you elaborate?



It's gold plated because they basically built their own ISP by acquiring either:

a: dark fiber IRUs between cities/metro areas

b: N x 10 and 100 Gbps wavelengths as L2 transport services from city to city, from a major carrier such as level3 or zayo

c: some combination of A and B

and they use that to build backbone links between their own network equipment, which they have full control over. Google is its own AS and operates its own transport network across the lower 48 US states and around the world.

The exact design of what they're doing within their own AS at layers 1 and 2 is pretty opaque unless you happen to be a carrier partner willing to violate a whole raft of NDAs. But basically they've built their own backbone to massive scale, yet without the huge capital expense of actually laying their own fiber between cities.

Their network has incredibly low jitter because they don't run their links to saturation, and they know EXACTLY what the latency is supposed to be from router interface to router interface between the pairs of core routers installed in each major city. Down to five decimal places, most likely. When you have your own dark fiber IRUs and operate your own WDM transport platforms, you are in possession of things like OTDR traces for your dark fiber that tell you, down to four decimal places, the length in km of your fiber path.
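
Back-of-envelope, for anyone curious how a measured fiber length turns into an expected latency figure (my own illustration, using the usual ~1.468 group index for standard single-mode fiber, i.e. roughly 4.9 microseconds per km one way; the span length is hypothetical):

    # Rough sketch: expected one-way propagation delay from an
    # OTDR-measured fiber length. Illustrative numbers only; 1.468
    # is the typical group index for standard single-mode fiber.
    C_VACUUM_KM_PER_S = 299_792.458   # speed of light in vacuum
    GROUP_INDEX = 1.468               # typical for single-mode fiber

    def one_way_delay_ms(fiber_km: float) -> float:
        return fiber_km / (C_VACUUM_KM_PER_S / GROUP_INDEX) * 1000.0

    # e.g. a hypothetical 742.1843 km span measured by OTDR:
    print(round(one_way_delay_ms(742.1843), 5))   # ~3.63 ms one way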

It also helps that the sort of people who have 'enable' on the AS15169 routers and core network gear are recruited from the top tier of network engineers and appropriately compensated. If they weren't working for Google they would be working for another major global player like NTT, DT, France Telecom/Orange, SingTel or Softbank.


Where do you get the crazy idea that Google doesn't run its links to saturation? It's crazy because leaving that much capacity idle would cost an enormous amount of money.

The B4 paper states multiple times that Google runs links at close to 100% utilization, versus the industry standard of 30-40%. That's accomplished through the use of SDN technology and, even before that, through strict application of QoS.

https://web.stanford.edu/class/cs244/papers/b4-sigcomm2013.p...

A few more details about strategies here:

https://research.google.com/pubs/archive/45385.pdf

Then there's a whole bunch of other host-side optimizations, including the use of new congestion control algorithms.

http://queue.acm.org/detail.cfm?id=3022184

You might recognize the name of the last author...
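
(If I'm reading that last link right, it's the BBR work.) For what it's worth, the congestion control algorithm is a per-socket knob on stock Linux too; a minimal sketch, assuming a kernel with the tcp_bbr module available:

    # Minimal sketch: opt a single TCP socket into the "bbr" congestion
    # control algorithm on Linux. Requires kernel >= 4.9 with tcp_bbr
    # loaded, and either CAP_NET_ADMIN or "bbr" listed in
    # net.ipv4.tcp_allowed_congestion_control.
    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"bbr")
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16))
    # -> b'bbr\x00...' on success; OSError if the algorithm isn't available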


No, it would be crazy for them to run things at saturation under normal circumstances, as that leaves no headroom at all for abnormal circumstances. The opportunity cost of not using something 100% of the time is offset by the value of increased stability/predictability in the face of changing conditions.

Though you do need to define "saturation". Are you referring to bulk bandwidth or some other measure of throughput/goodput? Saturating in terms of raw bandwidth can reduce useful throughput due to latency issues.


What I mean is that they do not run their links to saturation in the same way as an ordinary ISP. And because their traffic patterns are very different from an ordinary ISP's, and much, much more geographically distributed, they can do all sorts of fun software tricks. The end result is the same: low/no jitter and no packet loss.

As contrasted with what would happen if you had a theoretical hosting operation behind 2 x 10 Gbps transit connections to two upstreams, and tried to run both circuits at 8 to 9 Gbps outbound 24x7.


For clarity, do you mean that Google can, for example, run at 99% saturation all the time, whereas a typical ISP might average 30-40%, with peaks to full saturation that cause high latency/packet loss when they occur?


Yes, that's about right. Since they control both sides of the link, they can manage the flow from higher up the [software] stack. Basically, if the link is getting saturated, the distributed system simply throttles some requests upstream by diverting traffic away from the places that would send it over that link. (And of course this requires a very complex control plane, but it's doable, and with proper [secondary] controls it probably stays understandable and manageable, and doesn't go haywire when the shit hits the fan.)
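
As a toy illustration of that kind of control loop (the names and thresholds are entirely made up, and the real control plane is obviously far more involved): watch a link's utilization and, past a high-water mark, steer just enough demand onto alternate paths to pull it back down.

    # Toy sketch of "divert traffic before the link congests".
    # All thresholds and names are invented for illustration.
    HIGH_WATER = 0.90   # start diverting above 90% utilization
    LOW_WATER = 0.75    # target utilization after diversion

    def gbps_to_divert(offered_gbps: float, capacity_gbps: float) -> float:
        """How much demand (Gbps) to push onto alternate paths."""
        if offered_gbps / capacity_gbps <= HIGH_WATER:
            return 0.0
        return offered_gbps - LOW_WATER * capacity_gbps

    # 97 Gbps offered on a 100 Gbps link -> shift 22 Gbps elsewhere
    print(gbps_to_divert(97.0, 100.0))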


So I wonder if that means they can do TCP flow control without dropping packets.


I guess they do drop packets (it's the best - easiest/cheapest/cleanest - way to backpropagate pressure, aka backpressure), but they watch for it a lot more closely. Also, as I understand it, they try to separate long-lived connections (between DCs) from internal short-lived traffic. Different teams, different patterns, different control structures.
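
A tiny host-side analogy for "drop to backpropagate pressure" (illustration only, nothing to do with actual network gear): a bounded queue that refuses new work when full, so the producer feels the pressure immediately instead of everything piling up in an unbounded buffer.

    # Bounded queue as a backpressure analogy: reject ("drop") when full
    # so the sender has to slow down or retry, rather than buffering
    # without limit.
    from collections import deque

    class BoundedQueue:
        def __init__(self, limit: int):
            self.limit = limit
            self.items = deque()

        def offer(self, item) -> bool:
            """Enqueue item, or drop it (return False) if we're full."""
            if len(self.items) >= self.limit:
                return False      # caller must back off
            self.items.append(item)
            return True

    q = BoundedQueue(limit=3)
    print([q.offer(n) for n in range(5)])   # [True, True, True, False, False]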


@puzzle: while you're not wrong, do note that B4 is not (and is not designed to be) a low-latency, low-jitter network. It's designed for massive bandwidth for inter-datacenter data transfer.


Running your own internal links to near saturation (such as a theoretical 100 Gbps DWDM or MPLS circuit between two Google datacenters in two different states) is a very different thing from running a BGP edge connection to saturation, such as a theoretical 100 Gbps short-reach, intra-building cross-connect from a huge CDN such as Limelight to a content-sink ISP such as Charter/TWTC or Comcast.


Very much so. B4 can be run near 100% because of strict admission control and optimized routing to maximize the use of all paths. It's much harder to do that on peering links where the traffic is bursty and you don't have control over the end-to-end latency and jitter. SDN isn't a magic pill for this, but it can most definitely lead to better performance and higher utilization than Ye Olde BGP Traffic Engineering.
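
For intuition on what "strict admission control" buys you, here's a deliberately simplified sketch, loosely in the spirit of centralized bandwidth allocation (my own toy version with made-up flow names, not the actual B4 algorithm): give every admitted flow its fair share of a link and redistribute whatever it doesn't need, so the link runs close to full without starving anyone.

    # Toy max-min fair allocation of one link among admitted flows
    # (demands and capacity in Gbps). Not the real B4 algorithm.
    def max_min_fair(demands: dict, capacity: float) -> dict:
        alloc = {flow: 0.0 for flow in demands}
        unsatisfied = dict(demands)
        remaining = capacity
        while unsatisfied and remaining > 1e-9:
            share = remaining / len(unsatisfied)
            for flow, want in list(unsatisfied.items()):
                grant = min(share, want)
                alloc[flow] += grant
                remaining -= grant
                if grant >= want - 1e-9:
                    del unsatisfied[flow]
                else:
                    unsatisfied[flow] = want - grant
        return alloc

    # Three flows competing for a 100 Gbps link:
    print(max_min_fair({"copy-job": 80, "index-push": 30, "logs": 10}, 100))
    # -> roughly {copy-job: 60, index-push: 30, logs: 10}; link ~100% used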


Building distributed systems makes you aware of how unreliable things are at large scale, e.g. the network. The parent comment implies that Google's network is so fast and reliable that it becomes tempting to ignore best practices and work as if it were a non-distributed system.


He's saying that it's so reliable and the throughput is so high that sometimes you have to convince yourself that your computers are halfway across the planet.


I'm not sure they're saying that; they're just claiming Google has really good, well-run networks. But even Google hasn't solved the speed-of-light issue: packets can only travel so fast. If your computer is halfway across the planet, you'll notice, no matter how fancy your network is.
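
To put a rough number on the speed-of-light point (a standard back-of-envelope figure, nothing specific to Google): light in fiber propagates at roughly 200,000 km/s, so halfway around the planet is a very noticeable round trip before any queuing or processing at all.

    # Back-of-envelope RTT over fiber halfway around the Earth.
    HALFWAY_KM = 20_000        # half of Earth's ~40,000 km circumference
    FIBER_KM_PER_S = 200_000   # rough speed of light in fiber

    rtt_ms = 2 * HALFWAY_KM / FIBER_KM_PER_S * 1000
    print(rtt_ms)   # 200.0 ms of pure propagation delay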


I assume they are talking about things like the TrueTime clocks used in Spanner, which are not available on commodity hardware.


Depends on what counts as commodity. You can just buy GPS-slaved rubidium clocks with PTP output.



