So basically, the latency from "RunInstances" to "I can SSH in" for the fastest AMI is roughly 3 seconds, with the max being 8 seconds.
That's actually pretty solid. If it were 5-10x faster than that you could probably fit that into a lot of interesting use-cases for on-demand workloads. The bottleneck is the EC2 allocation itself, and I'd be interested in seeing what warm EC2 instances can do for you there.
That said, I think for the majority of use cases boot performance is not particularly important. If you want really fast 'boot' you might just want VMs and containers - that would cut out the 3-8 seconds of instance allocation time, as well as most of the rest of the work.
Curious to see a follow up on what's going on with FreeBSD - seems like it takes ages to get the network up.
>If it were 5-10x faster than that you could probably fit that into a lot of interesting use-cases for on-demand workloads
That's basically Lambda, although you lose control of the kernel and some of the userspace (though you can use Docker containers and the HTTP interface on Lambda to get some flexibility back).
Under the hood, Lambda uses optimized Firecracker VMs for < 1 second boots.
>I think for the majority of use cases boot performance is not particularly important
Anything with autoscaling. I think CI is probably a big use case (those environments get spun up and torn down pretty quickly), and otherwise you introduce hairy things like unprivileged Docker-in-Docker builds trying to run nested inside a container.
Yeah, CI is a good use case. Even for autoscaling, though, I kinda feel like you need to be a lot faster to make a huge difference, tbh.
And yeah, Firecracker is pretty sick, but it's also something you can just use yourself on ec2 metal instances, and then you get full control over the kernel and networking too, which is neat.
There's no fundamental reason Lambda needs to make you lose control of the kernel; I'd love to have Lambda with a custom kernel, and that doesn't seem like it'd be too difficult to support.
At one point, Lambda didn't expose the ability to write custom runtimes, and only provided runtimes for specific languages. People reverse-engineered those runtimes and figured out how to build custom ones. Eventually, Amazon provided a documented, well-supported way to build and run custom runtimes, but that required documenting the interfaces provided to those runtimes (e.g. environment variables and HTTP APIs instead of language-runtime-specific APIs).
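(For reference, a custom runtime is essentially a loop over that documented HTTP interface. Here's a minimal, illustrative Python sketch of that loop; the handler is a placeholder, and it skips the error-reporting endpoints a real runtime would also need.)

```python
# Minimal sketch of a Lambda custom runtime loop, using only the Python stdlib.
# It relies on the documented Runtime API: the AWS_LAMBDA_RUNTIME_API environment
# variable plus the /runtime/invocation/next and /runtime/invocation/<id>/response
# endpoints. The handler below is a placeholder for real user code.
import json
import os
import urllib.request

API = f"http://{os.environ['AWS_LAMBDA_RUNTIME_API']}/2018-06-01/runtime"

def handler(event):
    # Placeholder business logic; a real runtime would dispatch to user code here.
    return {"echo": event}

while True:
    # Long-poll for the next invocation; the request ID arrives as a response header.
    with urllib.request.urlopen(f"{API}/invocation/next") as resp:
        request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
        event = json.loads(resp.read())

    # POST the handler's result back for this invocation.
    body = json.dumps(handler(event)).encode()
    req = urllib.request.Request(f"{API}/invocation/{request_id}/response", data=body)
    urllib.request.urlopen(req).close()
```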
I'd love to see Lambda support custom kernels. That would require a similar approach: document the actual VM interface provided to the kernel, including the virtual hardware and the mechanism used to expose the Lambda API. I'd guess that they haven't yet due to a combination of 1) not enough demand for Lambda with custom kernels and 2) the freedom to modify the kernel/hardware interface arbitrarily because they control both sides of it.
I've seen that, but I wonder to what extent they've done custom work and to what extent they've just used established Kconfig options to compile out surface area they're not using.
In any case, Firecracker+Nitro seem like they'd be a sufficient security boundary.
Do you know if there are any plans for FreeBSD to create a super minimal server version that can be used as a Docker host ... similar in size to Alpine Linux?
I know lots of people have made stripped down versions of FreeBSD, e.g. nanobsd. Beyond that, I'm not aware of anything specific... but I probably wouldn't be since I don't pay much attention to the container space. Try asking on one of the mailing lists maybe?
> The first two values — the time taken for a RunInstances API call to successfully return, and the time taken after RunInstances returns before a DescribeInstances call says that the instance is "running" — are consistent across all the AMIs I tested, at roughly 1.5 and 6.9 seconds respectively
“Running to available” is what’s in the table, ranging from 1.23 s to 70 s or so.
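(Not the article's benchmark itself, but a rough boto3 sketch of how those first two phases could be timed; the region, AMI ID, and instance type below are placeholders.)

```python
# Rough boto3 sketch of timing the two phases quoted above: how long RunInstances
# takes to return, then how long until DescribeInstances reports "running".
# The region, AMI ID, and instance type are placeholders.
import time
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

t0 = time.monotonic()
resp = ec2.run_instances(ImageId="ami-0123456789abcdef0",
                         InstanceType="t3.micro", MinCount=1, MaxCount=1)
t_api = time.monotonic() - t0  # time for the RunInstances call to return

instance_id = resp["Instances"][0]["InstanceId"]
while True:
    state = ec2.describe_instances(InstanceIds=[instance_id])[
        "Reservations"][0]["Instances"][0]["State"]["Name"]
    if state == "running":
        break
    time.sleep(0.5)
t_running = time.monotonic() - t0 - t_api  # pending -> running, as seen via polling

print(f"RunInstances returned in {t_api:.2f}s; running after another {t_running:.2f}s")
```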
>use-cases for on-demand workloads.
Yeah, true! Maybe, depending on availability, they could "spin up" some amount of "spare servers" to get the median time even lower.
> Curious to see a follow up on what's going on with FreeBSD - seems like it takes ages to get the network up.
FreeBSD rc executes all rc.d scripts sequentially, in one thread. OpenRC, AFAIK, can start daemons in parallel, but unfortunately the switch to OpenRC was abandoned: https://reviews.freebsd.org/D18578
Oh, that explains it! I thought the driver installation had somehow broken, despite my having made an AMI of the same instance type with the drivers already installed, so I just run the install shell script every time and then it works. I figured maybe some hardware address was wrong or it was subtly different hardware, but it being time-based makes a lot more sense.
Absolutely, same here. Additionally, there seems to be very little consistency in GPU instance start-up times: I've had 30 seconds one moment and 5 minutes another. Luckily, I can't say I've experienced 10 minutes.
Yeah, less than a few seconds with spot consistently would be nice, but I've never seen it. When I was handling autoscaling via the EC2 API + Python + nginx, the daemon I wrote pretty much had to have a while loop to continuously check connectivity via SSH after a t3a.medium (with Ubuntu) was kicked off from `ec2.request_spot_instances`.
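(A rough sketch of that kind of polling loop, not the original daemon; the region, AMI ID, and key name are placeholders, and it assumes the instance comes up with a public IP.)

```python
# Rough sketch of the "request a spot instance, then poll until sshd answers" loop.
# Not the original daemon: the region, AMI ID, and key name are placeholders.
import socket
import time
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

req = ec2.request_spot_instances(
    InstanceCount=1,
    LaunchSpecification={
        "ImageId": "ami-0123456789abcdef0",  # placeholder Ubuntu AMI
        "InstanceType": "t3a.medium",
        "KeyName": "my-key",                 # placeholder key pair
    },
)
request_id = req["SpotInstanceRequests"][0]["SpotInstanceRequestId"]

# Wait for the spot request to be fulfilled so an instance ID exists.
instance_id = None
while not instance_id:
    time.sleep(2)
    desc = ec2.describe_spot_instance_requests(SpotInstanceRequestIds=[request_id])
    instance_id = desc["SpotInstanceRequests"][0].get("InstanceId")

# In practice the public IP may not be populated until the instance is "running".
ip = ec2.describe_instances(InstanceIds=[instance_id])[
    "Reservations"][0]["Instances"][0]["PublicIpAddress"]

# Poll TCP port 22 until sshd actually accepts connections.
while True:
    try:
        socket.create_connection((ip, 22), timeout=2).close()
        break
    except OSError:
        time.sleep(1)
print(f"{instance_id} ({ip}) is reachable over SSH")
```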
Perhaps I should have been using "Clear Linux 34640"
Probably depends on your spot price. We always set the maximum spot price to the "on-demand" price, but due to how priorities work it may also sometimes fail when the data center in a given region is out of capacity for a given instance type.
TBH, the bigger issue than boot time for AWS is the lack of physical resources to fulfill demand - this is what most big players are struggling with.
No no, the spot price matches, the instance boots, and you can SSH to it and do everything except use the GPU. E.g. you try to run your PyTorch NN training, but it freezes for 5-10 minutes, then it runs fine. If you start your training again, it runs immediately.
I wonder why there isn't a well-known firm running these kinds of benchmarks and "true" uptime reporting for the major clouds -- knowing perf stats for S3 or SQS or Kinesis etc. in various geos and AZs seems like it would be useful.
It's not a big edge, but knowing about downtime before others do could certainly be one.
Wow, up until now I only associated Gartner with the Magic Quadrant; I had no idea they had this.
It doesn't seem like they'll give you up-to-the-hour information, and you do have to pay for "Gartner for Professionals", which I'm not familiar with, but it's definitely out there.
The key to this has to be supporting it cross-cloud, IMO -- people have the AWS vs GCP vs Azure conversation so often, and I think most people know that GCP is the best performance for the buck, but the other reasons to pick the other two are numerous. It might be nice to have some numbers on just how much performance differs between them, since that's one of the more quantifiable things to weigh.
I've been doing a lot of work optimizing EC2 boot time lately. I can get an instance up and connecting outbound to a network port in a few hundred milliseconds, with plenty of room left to optimize there. But all instances do indeed seem to take at least 8 seconds to come up on the AWS end, as this article points out. This has been why I haven't put much effort into optimizing further, because that 8 seconds is entirely out of my hands.
I've tried changing the various parameters under my control: assigning a specific network address rather than making AWS pick one from the VPC, disabling metadata, using a launch template, not using a launch template, using the smallest possible AMI (just a kernel and static binary), using Fast Snapshot Restore on the snapshot in the AMI, etc. None of those makes the slightest difference in instance start time; still ~8 seconds.
I would be interested to know the 95th percentile as well as the median (calculated over more than 10 runs, I suppose). It's often the worst-case times that are important, rather than the average or the median.
This. I'm surprised that even people who are very well-versed in math omit statistics other than an average or, at best, a median. Is it really just me who wonders about, and in this case really expects, variability in the value? Especially given the huge differences, you might have had a few slow or a few lucky starts that shifted the middle value. Presumably they saw the raw data, didn't notice any huge spikes, and would have changed the method if they had, but it's not even mentioned, so this is pure speculation on my part.
That's for the automotive embedded version of Linux they created. The AMI isn't quite the same, but they have a kernel specifically patched and optimized to run under a KVM hypervisor, and they're using systemd-boot, which is a bit faster than GRUB, the bootloader these other distros are likely using.
Figuring out what else they did would probably require booting into Clear Linux myself and checking it out, but disabling services you don't need so systemd isn't wasting time, building the kernel modules you know you'll need directly into the kernel instead of loading them as modules, and skipping the initramfs if you don't need it is probably the "general" answer for how to get a faster boot.
If you really want ultra-fast, you can turn Linux into a unikernel of sorts by building a kernel that runs your application as init, not using any other services at all, and booting via EFISTUB instead of using a bootloader at all. I don't know if the AWS EC2 service provides any way of doing that, though. On your own machine, you can either run efibootmgr from a different booted system on the same firmware, set the boot order in the firmware setup, or name the EFISTUB image whatever that magic name is that EFI automatically loads when no boot order is configured in NVRAM.
They restructured the boot sequence and probably stripped out everything that didn't fit a server/cloud context; perhaps they also removed everything that wasn't necessary from the base. Not unlike OpenBSD.
It would also be interesting to compare regions. us-east-1 is by far the largest, so it would be interesting to see what, say, us-west-1 or us-east-2 look like.
Median is an odd metric to use. I think I might truly be more interested in the mean; maybe the geometric mean.
I am also very interested in worst-cases with this sort of thing, so the 80th and 95th and 100th percentiles would be helpful.
It would be interesting to see plots of distributions, as well: are boot times really unimodal? They seem like they could easily have multiple modes, which could really mess with all of these measures (but especially the median!).
Because the script is open source (thank you!) I may try this myself.
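(If anyone does rerun it, producing those extra statistics from the raw measurements is only a few lines; here's a sketch assuming the times end up in a plain text file, one boot time in seconds per line. The filename is a placeholder.)

```python
# Sketch of the extra summary statistics asked for above (mean, geometric mean,
# median, high percentiles). Assumes the raw measurements were saved to a plain
# text file, one boot time in seconds per line; "boot_times.txt" is a placeholder.
import statistics
from collections import Counter

with open("boot_times.txt") as f:
    times = sorted(float(line) for line in f if line.strip())

n = len(times)
summary = {
    "mean": statistics.mean(times),
    "geometric mean": statistics.geometric_mean(times),
    "median": statistics.median(times),
    "p80": times[min(n - 1, int(0.80 * n))],  # rough nearest-rank percentiles
    "p95": times[min(n - 1, int(0.95 * n))],
    "min": times[0],
    "max": times[-1],
}
for name, value in summary.items():
    print(f"{name:>14}: {value:6.2f} s")

# Quick look at modality: bucket to whole seconds and count occurrences.
print(Counter(round(t) for t in times))
```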
Network glitches will have a one-sided effect: they will make you slower, but never faster. Trimming out the slowest 25% of measurements, or even just using the minimum, can be more effective. Those approaches are common when profiling programs for exactly that reason - the median execution time of a loop might be influenced by OS weirdness like some background job, but the minimum will hopefully really just measure the loop.
Anyway - probably not a big deal, probably mostly academic, although multimodal behavior would be really interesting.
Yes and no. If there's a network glitch while we're polling to see if the instance is "running", it will result in the "pending-to-running" time being high but the "running-to-port-closed" time being low.
I wonder if EFI boot with something like EFISTUB would be even faster. AFAIK EC2 boots via BIOS, which has some fixed overhead (although Clear Linux still manages to boot very quickly).
Also, on boot I /think/ EC2 pulls the AMI down to EBS from S3, so theoretically smaller AMIs might be faster, but I'm not sure.
My experience with desktop hardware is that a traditional BIOS boots faster than EFI, because the former simply does much less. (This is not the difference between the "BIOS boot" and "EFI boot" options in an EFI firmware, but rather "pure native" BIOS vs "pure native" EFI on the same motherboard.)
Was that with or without a bootloader? My understanding was that EFI was marginally faster to initialize (which might not be true, based on what you're saying), but you could also skip the bootloader and load the kernel directly (EFISTUB), which would also eliminate the initramfs (which seems fine, assuming you're using vanilla VMs on popular hypervisors/providers).
I've tested both EFI and BIOS boots on EC2, and they have about the same overhead.
And I've tested AMIs that are no bigger than a Linux kernel and a single static binary; times are still on par with what's listed here. Still takes 7-8 seconds before the first user-controlled instruction gets to run.