AWS inter-region latency chart (cloudping.co)
93 points by mooreds on April 30, 2021 | 57 comments



Here's the equivalent for GCP, FYI: https://datastudio.google.com/u/0/reporting/fc733b10-9744-4a...

Ironically less colorful. :)


Hmm, I would have expected the co-located source/destination numbers to be (1) slightly lower and (2) more consistent between regions.

I.e., I would not have expected eu-central-1/eu-central-1 to be 1.68ms while us-east-2/us-east-2 is 8.26ms (almost 5x).


AWS's availability zones are unlike Google Cloud's AZs in that they are physically a meaningful distance apart.

A geological or technical event near one AWS AZ is highly unlikely to also affect the others of that region, unless the event is regional (heh) in scale. For Google Cloud, the AZs are effectively co-located buildings (or at least, Eemshaven has 3 GC AZ DCs at < 1km distance from each other), making even local disruptions reasonably likely to impact availability for the whole region.


Yes. That being said, I would expect the same precautions to be taken in all regions, and so I would not expect a meaningful difference from one region to the next (if that makes sense).


Different regions have different types of worries. Distance and precautions required to deal with redundancy for an earthquake may not be the same as required for a tornado, or hurricane, or flood, etc. In some cases, I imagine elsewhere in the same large city may suffice. In others, maybe it requires 50-100 miles.


You're correct that some Google Cloud zones are in the same physical building. However, the separate zones are designed with independent power, cooling, and networking. So, failure events usually affect only a single zone.


Isn't it interesting how little one usually knows about the exact specs of the cloud?


The us-east-1 AWS buildings are within a couple of miles of each other too.


us-east-1 is a very special region. All of Amazon.com runs/ran there for a long time. And a lot of global features of AWS itself run there too.


The chart doesn't show AZ data, which seems like an important piece of this puzzle.


Perhaps OP can answer. The AZ ID[0], as opposed to the name, does identify a specific AZ location, so certainly possible to measure AZ-to-AZ latencies.

[0] https://docs.aws.amazon.com/ram/latest/userguide/working-wit...


It would be interesting to also see what the lowest possible latency between the regions would be so you can see how much overhead the interchanges are adding.

For example, the distance between us-east-1 and us-west-1 is approximately 2,500 miles. Light travels at 186,000 miles/second, so that's 13 milliseconds one-way. So the fastest possible TCP handshake (SYN, SYN-ACK, ACK) takes 39 milliseconds between the two DCs just to establish a connection.


Light travels at 186,000 miles/second in a vacuum. A quick websearch says fiber's index of refraction is ~1.467. One round trip then takes 39 ms (coincidentally the same number you said for 3 legs), vs the 62.36 ms on this chart. In theory they could reduce latency by ~23 ms then with a truly direct path. To do better would require abandoning fiber.
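
The arithmetic in the two comments above can be sketched in a few lines of JavaScript. The 2,500-mile distance, the 1.467 refractive index, and the 62.36 ms chart value all come from this thread; the function names are made up for illustration, and the sketch assumes a straight run of fiber with no routing or equipment delay.

```javascript
// Speed of light in vacuum, in miles per second (rounded, as used in the thread).
const C_MILES_PER_SEC = 186000;

// One-way propagation delay in milliseconds over `miles` of a medium with the
// given refractive index (1.0 = vacuum). Ignores routers, queuing, and detours.
function oneWayMs(miles, refractiveIndex = 1.0) {
  return (miles * refractiveIndex / C_MILES_PER_SEC) * 1000;
}

// A round trip is two one-way legs.
function roundTripMs(miles, refractiveIndex = 1.0) {
  return 2 * oneWayMs(miles, refractiveIndex);
}

const miles = 2500;        // approximate us-east-1 to us-west-1 distance (from the thread)
const fiberIndex = 1.467;  // fiber refractive index quoted in the thread
const measuredMs = 62.36;  // chart value quoted in the thread

const vacuumOneWay = oneWayMs(miles);             // about 13.4 ms
const fiberRtt = roundTripMs(miles, fiberIndex);  // about 39.4 ms
const overheadMs = measuredMs - fiberRtt;         // about 22.9 ms

console.log(vacuumOneWay.toFixed(1), fiberRtt.toFixed(1), overheadMs.toFixed(1));
```

The roughly 23 ms left over is whatever the real path adds on top of pure propagation: longer cable routes, switching, and queuing.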



Neat! I'd never heard of that. I look forward to the day when it's generally used rather than yet another cool technology wasted on stupid high-frequency trading.


It's actually being deployed for HFT in London right now by euNetworks.


I spent about 15 minutes trying to create a custom Distance function in Google Sheets:

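    // Apps Script is JavaScript, so use Math.* and ** rather than sheet
    // functions (ASIN, SQRT, ...) and ^, which is bitwise XOR in JavaScript.
    function DISTANCE(a, b) {
      // a and b are "lat,lon" strings in decimal degrees
      const toRad = (deg) => parseFloat(deg) * Math.PI / 180;
      const [latA, lonA] = a.split(",").map(toRad);
      const [latB, lonB] = b.split(",").map(toRad);
      // Haversine great-circle distance in meters (Earth radius 6,371 km)
      const h = Math.sin((latB - latA) / 2) ** 2 +
                Math.cos(latA) * Math.cos(latB) * Math.sin((lonB - lonA) / 2) ** 2;
      return 2 * 6371000 * Math.asin(Math.sqrt(h));
    }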
So that I could just paste concatenated "lat,lon" coords in place of the AWS region names and just compute e.g.

    =DISTANCE(A2,B1)
Into a second sheet, and then just divide the first sheet by the second sheet as a third sheet.

but I kept getting some "#NAME" error and I don't have time to figure it out.

Bad UX, Google. This should have been easy.

Better yet, you're freaking Google, why isn't there a DISTANCE("Hong Kong", "Singapore", "straightline") function?

I have to get back to work but if someone can figure the rest of this out please comment back.


If you’re going to calculate distances between two geographical locations, then you want to calculate Great Circle distances, as that will be the shortest path at the surface of the planet.

A straight line distance might require tunneling down into the planet a surprising amount of distance, and is an absolute lower bound on the possible distance between the two points, but is not a realistic distance that could actually be achieved in most cases.


The formula above should be great circle distance, if I didn't screw it up.
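
For anyone who wants to check the difference between the two distances discussed here, this is a minimal sketch in plain JavaScript. It computes the great-circle distance with the haversine formula and the straight-line chord through the planet from the same central angle. The function names and the two coordinate pairs are assumptions for illustration (rough positions for Northern Virginia and Northern California, not official data center locations), and the Earth is treated as a sphere.

```javascript
const EARTH_RADIUS_M = 6371000; // mean Earth radius in meters (spherical approximation)

// Parse a "lat,lon" string in decimal degrees into [lat, lon] in radians.
function parseLatLon(s) {
  return s.split(",").map((v) => parseFloat(v) * Math.PI / 180);
}

// Central angle between two points in radians, via the haversine formula.
function centralAngle(a, b) {
  const [latA, lonA] = parseLatLon(a);
  const [latB, lonB] = parseLatLon(b);
  const h = Math.sin((latB - latA) / 2) ** 2 +
            Math.cos(latA) * Math.cos(latB) * Math.sin((lonB - lonA) / 2) ** 2;
  // Clamp to 1 to guard against floating-point overshoot near antipodal points.
  return 2 * Math.asin(Math.min(1, Math.sqrt(h)));
}

// Shortest path along the surface.
function greatCircleMeters(a, b) {
  return EARTH_RADIUS_M * centralAngle(a, b);
}

// Straight line through the planet (the chord): the absolute lower bound.
function chordMeters(a, b) {
  return 2 * EARTH_RADIUS_M * Math.sin(centralAngle(a, b) / 2);
}

// Rough, assumed coordinates; replace with whatever locations you trust.
const east = "38.9,-77.4";
const west = "37.4,-122.0";
console.log(Math.round(greatCircleMeters(east, west) / 1000), "km along the surface");
console.log(Math.round(chordMeters(east, west) / 1000), "km through the planet");
```

At continental distances the chord is only slightly shorter than the surface path, so the great-circle figure is the more useful baseline for fiber.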


I used to read ThousandEyes's annual free reports on networking capabilities and measurements across the Big 3 cloud providers, which were quite comprehensive in terms of covering various scenarios [0]. I don't think they publish those anymore.

With cloudping.co, it isn't clear from the website, but it seems like the inter-region latency they measure is over the public Internet. The Big 3 run their own backbone across all their DCs around the globe, and so I reckon, those numbers would look vastly different if traffic was instead relayed through those uncongested backbones.

With AWS, the cheapest way I know (in terms of development time and cost) to accomplish inter-region over their backbone is via Global Accelerator. The clients connect to Global Accelerator at the nearest anycast IP location advertised across 150+ AWS PoPs. You can then play with endpoint-groups, connection-affinities, source/port tuples to control routing traffic to different backends in various regions.

We used this technique to prototype a VPN with exit nodes in multiple countries but entry nodes closest to clients at every AWS PoP. It worked quite nicely for a toy: https://news.ycombinator.com/item?id=21071593

[0] https://www.thousandeyes.com/blog/top-takeaways-cloud-perfor... (2019)


> With cloudping.co, it isn't clear from the website, but it seems like the inter-region latency they measure is over the public Internet

If they are running from AWS to AWS, it's going over the AWS backbone.


Is that true? I thought you had to do VPC peering for regions to use those backbones


Inter region traffic always goes over the backbone (this includes EIP to EIP). This also includes going from EC2 to any service like S3 in another region.

Except China. China to rest of world is not via backbone.


I don't see this in the docs


China is super special. Stay out of that region unless you have a special China reason to go in there.


It should be. Also, this was covered in a re:Invent presentation about the AWS network.


I doubt that user-generated traffic between regions (for example, between EC2 instances in Dublin and Sydney) is automatically routed through their backbone unless you're using VPC peering, Transit Gateway, or PrivateLink. Can you point to the re:Invent presentation? Genuinely curious.


It really is, for all AWS-AWS traffic barring Beijing and Ningxia. src: I worked in AWS networking for 2.5 years

here's DUB->SYD:

  HOST: amazon02.ring.nlnog.net     Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS16509  ec2-3-248-240-73.eu  0.0%    10    7.1  22.8   1.0 125.7  38.1
  2. AS???    ???                 100.0    10    0.0   0.0   0.0   0.0   0.0
  3. AS???    ???                 100.0    10    0.0   0.0   0.0   0.0   0.0
  4. AS???    ???                 100.0    10    0.0   0.0   0.0   0.0   0.0
  5. AS???    ???                 100.0    10    0.0   0.0   0.0   0.0   0.0
  6. AS???    ???                 100.0    10    0.0   0.0   0.0   0.0   0.0
  7. AS???    100.65.15.3          0.0%    10    0.8   1.1   0.2   6.7   2.0
  8. AS???    100.95.19.145        0.0%    10    0.3   1.4   0.2   5.5   1.9
  9. AS???    100.100.4.12         0.0%    10    0.3   1.1   0.3   7.0   2.1
 10. AS???    150.222.242.237      0.0%    10  254.0 254.0 253.9 254.4   0.2
 11. AS???    52.95.36.164         0.0%    10  257.2 255.7 254.0 257.3   1.3
 12. AS???    150.222.112.139      0.0%    10  253.4 253.9 253.4 255.1   0.5
 13. AS???    150.222.112.142      0.0%    10  259.6 257.8 255.5 266.8   3.5
 14. AS???    52.95.36.143         0.0%    10  255.3 256.1 255.2 261.4   1.9
 15. AS???    52.95.38.17          0.0%    10  255.0 255.6 254.9 259.9   1.6
 16. AS???    ???                 100.0    10    0.0   0.0   0.0   0.0   0.0
 17. AS???    ???                 100.0    10    0.0   0.0   0.0   0.0   0.0
 18. AS???    ???                 100.0    10    0.0   0.0   0.0   0.0   0.0
 19. AS???    ???                 100.0    10    0.0   0.0   0.0   0.0   0.0
 20. AS???    ???                 100.0    10    0.0   0.0   0.0   0.0   0.0
 21. AS???    100.65.16.65         0.0%    10  256.7 279.8 256.7 334.4  26.1
 22. AS16509  amazon08.ring.nlnog  0.0%    10  254.4 254.4 254.3 254.4   0.0


Thanks. To confirm: You're pinging between the EC2s using their public DNS, right?

If AWS backbone is used automagically, I wonder why would anyone pay for Transit Gateways or VPC Peering rather than do mTLS between their cross-region instances or tunnel via Wireguard-esque transports like tailscale or defined.net, for example. Also, since when has this been the case, if you'd know?

I'm curious what the bandwidth charges are for EC2 to EC2 cross-region when using their public IPs / DNS? Same as VPC Peering?


Yep, public IPs. I’m sure people do that, or use VPC peering if they want to use private IPs.

Expensive. VPC peering serves a different purpose, but pricing is the same.


Thanks a lot.

> Expensive.

VPC Peering bandwidth rates are $0.01 / GB. EC2 (public Internet?) bandwidth rates are $0.09 / GB. For xfers between EC2 to EC2 via AWS backbone, I assume I'd still be charged the public Internet bandwidth rates, right?


thank you. TIL this


Yeah, you need one of VPC Peering, PrivateLink, or Transit Gateway for inter-region (global) backbone connections.


Great information!

My only suggestion would be that a column and row hover would make the data easier to inspect and navigate.


This might be a rare case that actually benefits from being an interactive map with lines colored differently/dotted differently to indicate interconnect speed (of course, keeping the table for non-js compatibility).


I don't know how they are accounting for the different data centers within a region (i.e., availability zones). It could explain self latency if a packet is leaving a DC.


> various different data centers within a region (IE. Availability zones).

A single availability zone is often (always?) composed of multiple datacenters which are spread relatively far apart from one another.


Some AZs have multiple DCs btw


AZs usually have 2-5 DCs, if I recall correctly.


In most regions, an AZ is only a single DC; a few have more than one per AZ.


I might have remembered it wrong.

Do multiple buildings at a single site count as multiple DCs? Or must it be multiple physically separate locations?


In my use of DC, I’m referring to a single physical building.


Is there a list of the number of DCs per AZ? I couldn't find it anywhere.


No, it's only been mentioned publicly at re:Invent, or via the WikiLeaks AWS leak.


anecdotally the connections between AZs are so low latency you often can't tell whether a server is in the same or a different AZ


Definitely not true. Cross-AZ latency is incredibly high compared to latency within the same AZ. We run a bespoke multi-master database setup that must be colocated in the same AZ due to unacceptable latencies when spread across 3 AZs. It's a difference of a few ms.


(i work at aws)

yes, while in my opinion most services won't care about the distinction between cross-az/same-az (you're definitely not most services :) ), you can definitely tell the difference between

* in the same region vs not

* in the same az vs not

* in the same placement group or not

which shouldn't be surprising. you should get a latency benefit from increased physical proximity!


1) <10ms should be gray since those are effectively in-region numbers (and it would make the chart easier to understand quickly)

2) Times should be tagged with a theoretical minimum time and how far off the real number is from that.

3) (Bonus points) Times should also be tagged with a 'real' minimum time that is based on the lengths of the actual fiber lines and number of interchanges.


This is quite useful. Would love to see it extended to different providers (not just the latency internal to each provider but between them).

For example, I'd like to find a set of 3 locations across 3 providers that have the lowest latencies between them.
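
Given a latency table that spans providers, that search is small enough to brute-force. The sketch below assumes "lowest latencies between them" means minimizing the slowest of the three pairwise latencies, which is only one possible reading. The site names, providers, and latency values are hypothetical placeholders, not measurements.

```javascript
// Hypothetical sites: names, providers, and latencies are placeholders.
const sites = [
  { name: "a1", provider: "A" },
  { name: "a2", provider: "A" },
  { name: "b1", provider: "B" },
  { name: "b2", provider: "B" },
  { name: "c1", provider: "C" },
];

// Symmetric pairwise latency in ms, keyed as "x|y" with the names sorted.
const latencyMs = {
  "a1|b1": 12, "a1|b2": 40, "a1|c1": 15,
  "a2|b1": 30, "a2|b2": 9,  "a2|c1": 35,
  "b1|c1": 11, "b2|c1": 38,
};

function latency(x, y) {
  return latencyMs[[x, y].sort().join("|")];
}

// Try every triple drawn from three distinct providers and keep the one
// whose slowest pair is fastest.
function bestTriple(candidates) {
  let best = null;
  for (let i = 0; i < candidates.length; i++) {
    for (let j = i + 1; j < candidates.length; j++) {
      for (let k = j + 1; k < candidates.length; k++) {
        const trio = [candidates[i], candidates[j], candidates[k]];
        if (new Set(trio.map((s) => s.provider)).size < 3) continue;
        const [p, q, r] = trio.map((s) => s.name);
        const worst = Math.max(latency(p, q), latency(p, r), latency(q, r));
        if (best === null || worst < best.worstMs) {
          best = { names: [p, q, r], worstMs: worst };
        }
      }
    }
  }
  return best;
}

console.log(bestTriple(sites));
```

The triple loop is cubic in the number of sites, which is fine for a few dozen regions per provider.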


This looks great, but I wish that this was more colorblind friendly.

Note: For anyone else that is on a Windows 10 machine and is struggling to see the difference, there is a colorblind mode that you can toggle with Winkey + Ctrl + C.


Might be neat to see a digraph of this. Would it look like the globe?


Is there supposed to be a way to hide regions? The "Enabled Regions" menu doesn't appear to do anything for me.


South Africa must have crazy latency considering no other region is even under 50 ms from it.


Have you looked at a map recently?


I was wondering where this could be useful.


Ah, I found this when we were discussing internally building out a multi region installation of one of our products. We were discussing the tradeoffs between a single master database with calls from other regions to that database, vs replicating a database between regions. The latter is more complicated to run but has far less latency.

That's when I found this chart through the magic of google to bring some real numbers (tm) to the latency discussion.


thank you



