Google Kubernetes Engine adds support for Arm nodes (cloud.google.com)
119 points by crb on July 13, 2022 | 49 comments



People would probably be more interested in the underlying Google Cloud ARM announcement: https://news.ycombinator.com/item?id=32084887


Yup, my team did a little copy-paste lab if folks want to kick the tires: https://github.com/sadasystems/gke-multiarch-guide


Also, the PMs are going to do an AMA about this on Reddit tomorrow:

https://www.reddit.com/r/googlecloud/comments/vy8hx3/ama_wit...

Come join us! Also, hi Moles :)


Great. Hopefully Azure is next, so we can get ARM GitHub Actions runners on which I can build ARM binaries. I can already do it with qemu-user, but it's slow.
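
For what it's worth, the qemu-user route usually looks something like the sketch below (binfmt + buildx; the image tag is just a placeholder). It works, it's just painfully slow for anything non-trivial:

  # register qemu-user binfmt handlers on the x86 host (one-time, privileged)
  docker run --privileged --rm tonistiigi/binfmt --install arm64

  # cross-build an arm64 image via user-mode emulation
  docker buildx build --platform linux/arm64 -t myimage:arm64 --load .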


You can already self-host GH Runners on ARM.


What kind of ARM machine should I buy if I want to self-host an ARM server? My not-that-recent look at the market shows everyone kind of doing their own thing; Amazon makes Amazon's ARM servers, Apple makes Apple's ARM chips, etc. As some random guy who wants to test on ARM, it's annoying. Maybe things have improved recently, though?

(I'm guessing self-host realistically means "get a VM on AWS", which is probably fine for CI if you already use AWS. A little annoying to have another monthly bill to pay if you don't, though.)


The ARM nodes which are exactly the topic of this submission could be a place where you can self-host your GH runners :)

https://github.com/actions-runner-controller/actions-runner-... works quite well with its autoscaling. You could create a zonal GKE cluster, which is covered by the free tier, and add a small spot VM node pool with ARM nodes. It wouldn't be entirely free, but it would cost very little.
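
A rough sketch of what that could look like, going from memory, so treat the flags and field names as approximate and check the ARC and GKE docs (cluster name, zone, and repo are placeholders, and this assumes an arm64-capable runner image):

  # small ARM spot node pool on an existing zonal cluster
  gcloud container node-pools create arm-spot-pool \
      --cluster=my-cluster --zone=us-central1-a \
      --machine-type=t2a-standard-1 --spot --num-nodes=1

  # runner-deployment.yaml for actions-runner-controller,
  # pinned to the arm64 spot nodes
  apiVersion: actions.summerwind.dev/v1alpha1
  kind: RunnerDeployment
  metadata:
    name: arm64-runners
  spec:
    replicas: 1
    template:
      spec:
        repository: my-org/my-repo   # placeholder
        nodeSelector:
          kubernetes.io/arch: arm64
        tolerations:
          - key: cloud.google.com/gke-spot   # spot taint
            operator: Equal
            value: "true"
            effect: NoSchedule
          - key: kubernetes.io/arch          # GKE taints Arm nodes by default, IIRC
            operator: Equal
            value: arm64
            effect: NoSchedule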


Your best options IMO are either 1) a Mac mini with 16GB of RAM running a Linux VM or Asahi Linux, or 2) an Nvidia Jetson AGX Xavier; you can get the 32GB version for around $700 USD, I think. (The newer Orin is a fair bit more expensive.)

You realistically want something with ARMv8.2 or better. Both of these are relatively easy to acquire with the current supply chains, they're beefy, they can be equipped with fast storage, and they're small units you can put on your desk. Note that the Xavier will require you to fiddle with the usual Nvidia bullshit through their SDK, but it's otherwise a standard Ubuntu machine, and it should be possible to get another distro on there too. The M1 will almost certainly perform better watt-for-watt overall, though.


Buy a Mac mini and install Linux on it? That feels like it is (or will soon be) the best option.


The point is to not have to self-host. It costs money, is a pain because maintenance is required, and there are security issues with running CI for public pull requests.


With garm (https://github.com/cloudbase/garm) you can spawn ephemeral runner VMs, be they Azure/GCE ARM instances or LXC VMs (or LXC containers).


Azure already has ARM VMs (as well as AKS)?

The equivalent would be Cloud Build, not GKE <-> GH Actions, right? Does Cloud Build support ARM runners?


Looks like it's only available in NA central for now, boooo: https://cloud.google.com/compute/docs/regions-zones#availabl...


Where did you want it?


In every zone, obviously. AWS has had widespread ARM support for many years.


I mean, first-gen Graviton was announced in late 2018 at re:Invent, and it wasn't even generally available until mid-2019. It was underwhelming, but it laid some of the required software foundation. Graviton2 came in late 2019 and wasn't GA and available in many regions until 2021.

Hardly widespread support for many years.


It's quite light on details: What properties does the (virtualized) ARM CPU that one gets have? Can it do KVM itself (meaning the underlying real CPU has support for nested KVM)? Can it do ARMv8.3 pointer authentication?


It's an Ampere Altra for those first instances. So no on both.
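
If you'd rather check from inside an instance than take anyone's word for it, something like this should tell you (paca/pacg are the Linux arm64 hwcap flags for pointer authentication; no output means no pointer auth):

  # pointer authentication shows up as the "paca"/"pacg" CPU features
  grep -o -w -E 'paca|pacg' /proc/cpuinfo | sort -u

  # nested virt: /dev/kvm only exists if the guest kernel can actually use KVM
  ls -l /dev/kvm 2>/dev/null || echo "no KVM inside this instance"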


They could have named Ampere and given them credit. I think it's common for cloud instance offerings to name AMD or Intel as the underlying physical CPU, even down to the generation.


They did, in the docs, in the 2nd paragraph of the blog post, etc.:

https://cloud.google.com/blog/products/compute/tau-t2a-is-fi...


Thanks, yes, makes sense to look at the GCE announcement instead of the GKE announcement for this info.


I'd be curious to hear what the use case is for nested KVM. What hardware support is required for that as well?


The instance exposed by GCE is virtualized. If you want to run any hardware-virtualized workload inside it, you need nested virtualization.


I'd be curious to hear more about your Kubernetes workloads. What virtualized hardware do your pods require?


Any untrusted workloads (say, CI runners running your clients' arbitrary code) had better be run inside Kata Containers, so you can't use T2A VMs for that.


In GKE you can just enable GKE Sandbox/gVisor on a node pool to run your untrusted workloads. gVisor serves the same purpose as Kata containers.
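
For reference, enabling it is roughly the following (cluster and pool names are placeholders; I haven't checked whether Sandbox is supported on the Arm pools yet, so verify against the docs):

  # node pool with GKE Sandbox (gVisor) enabled
  gcloud container node-pools create sandboxed-pool \
      --cluster=my-cluster --sandbox type=gvisor

  # pod.yaml: opt a workload into it via the RuntimeClass GKE creates
  apiVersion: v1
  kind: Pod
  metadata:
    name: untrusted-job
  spec:
    runtimeClassName: gvisor
    containers:
      - name: job
        image: busybox
        command: ["sh", "-c", "echo hello from gVisor"]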


Yes, except for the slow I/O.


Can you elaborate? What type of I/O, network, disk? What is the issue exactly?


You can refer to the gVisor performance docs: https://gvisor.dev/docs/architecture_guide/performance/#file... Throughput is really terrible, same deal with networking, and it also hurts if your userland issues a lot of syscalls.


Thanks for the link. I'm curious, how is the I/O performance with Kata? Does it use VirtIO?


It does, and I believe with DAX (if your kernel has it) it's basically the same speed as regular runc. 9p is still the default, though, I think.


Pricing is in line with the cloud: expensive. That's it, they made it: now Arm servers are mainstream.


I always found the CPU pricing of cloud to be relatively reasonable. It's everything else that's expensive, and egress bandwidth just ridiculous.


I've done some cost analyses between our AWS and DC infrastructure.

To come up with our on-prem compute costs, we baked in the cost of power, real estate, staff, taxes, network infrastructure, servers (both in use and in reserve), etc. On the AWS side, we used 3-year RIs and Savings Plans. After all that, there was around a 30% cost advantage on-prem. That's non-trivial, but not as big as one might think.

Outbound networking, however, is ludicrously cheaper on-prem. It's about 85% cheaper on-prem than in AWS. Bandwidth is not expensive outside the public cloud.

In fact, egress volume is the #1 cost driver for us moving a service on-prem or building it there to begin with. Some of the AWS managed services are also very pricey, but nowhere near the egregious markup of egress bandwidth.
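
To put rough numbers on that, assuming AWS's ~$0.09/GB list rate for the first egress tiers (your effective rate will differ with volume discounts and private pricing):

  AWS list egress:  ~$0.09/GB    -> ~$90k per PB/month
  ~85% cheaper:     ~$0.0135/GB  -> ~$13.5k per PB/month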


85% cheaper seems low. In my case (colocation), bandwidth is 95% cheaper than AWS (i.e., AWS is 20x as expensive).


A quick question.

Have you also included:

  - storage costs (equivalent of EBS, S3 and Glacier) and
  - cost of analytics pipelines (equivalent of EMR, Athena, SageMaker, ...)
in the above price comparison?

Would you have some insights there? Thanks.


Typical cloud bandwidth pricing is known as "roach motel pricing" after the old roach motel pest control slogan of "roaches check in but they never check out." The idea is to make ingress free but egress expensive to make it easy to move all your data in but costly and hard to move it out.


Meh. CPUs/RAM most likely have lighter margins for cloud vendors compared to storage and bandwidth (which is just ridiculous), but they have to keep a lot of spare capacity for scaling, so if you don't scale by much you can do way better on metal even on CPUs.


I am very ignorant of the current ARM cloud offerings. Is it similar prices but less power usage? Or have they had to ramp up the watts to compete?


Not sure about pricing here, but Graviton on AWS has generally offered more performance at a lower price point, which is likely linked to lower power usage and perhaps the lower cost of custom silicon vs. Intel.

The notion that "the cloud is expensive" ignores the fact that the cloud is not just rented hardware, but staff, facilities, planning, management, etc.

There are businesses where it makes more sense to own hardware and employ your own staff, but if you just want generic compute and storage, you're unlikely to do it as well for less.

Also, you cannot easily source ARM hardware commercially; there is the HoneyComb LX2, and its lead time is months for a single unit. If you want hundreds of nodes, you're going to use a cloud provider who manufactures their own silicon.


> you cannot easily source ARM hardware commercially

Buy Mac minis and run Linux VMs inside?


You can (in theory) run Linux straight on Apple Silicon. It's not locked down like the iPad and iPhone. The M1 Ultra would be a pretty solid high-volume server if you can manage to plug a 10-40 Gbps LAN adapter into it. I believe USB-C and/or Thunderbolt adapters like that exist.

I say "in theory" because I'm not sure if there are mature installers and such yet.


The Asahi Linux installer worked flawlessly on my MacBook Pro M1 Max, though it asks you to do some low level things like repartitioning the drive.


I've been passively curious about this for a homelab: does Linux virtualize on Apple Silicon today? I was under the impression this didn't work when they announced the M1 in 2020, but I could be misremembering.


Absolutely. It worked from day one. There are issues with emulating x86 Linux, but that's a different story. For ARM Linux, just ordinary qemu works fine.
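
For anyone who wants to try it, a minimal invocation looks roughly like this with Homebrew's qemu on macOS (the firmware path and disk image name are whatever your setup uses):

  qemu-system-aarch64 \
    -machine virt -accel hvf -cpu host \
    -smp 4 -m 4096 \
    -bios /opt/homebrew/share/qemu/edk2-aarch64-code.fd \
    -drive file=ubuntu-22.04-arm64.qcow2,if=virtio \
    -nic user,model=virtio-net-pci \
    -nographic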


Oh, interesting. That's great to hear!


> The notion that "the cloud is expensive" ignores the fact that the cloud is not just rented hardware, but staff, facilities, planning, management, etc.

Not at all. Any organization that runs its own datacenters can calculate a TCO.

It's really business 101.


Dell and HP are still nowhere to be seen in the on-prem Arm server market. Is there not enough of a cost advantage for there to be demand?




