So basically, the latency from "RunInstances" to "I can SSH in" for the fastest AMI is roughly 3 seconds, with the max being 8 seconds.
That's actually pretty solid. If it were 5-10x faster than that you could probably fit that into a lot of interesting use-cases for on-demand workloads. The bottleneck is the EC2 allocation itself, and I'd be interested in seeing what warm EC2 instances can do for you there.
That said, I think for the majority of use cases boot performance is not particularly important. If you want really fast 'boot' you might just want VMs and containers - that would cut out the 3-8 seconds of instance allocation time, as well as most of the rest of the work.
Curious to see a follow up on what's going on with FreeBSD - seems like it takes ages to get the network up.
>If it were 5-10x faster than that you could probably fit that into a lot of interesting use-cases for on-demand workloads
That's basically Lambda, although you lose control of the kernel and some of the userspace (though you can use Docker containers and the HTTP interface on Lambda to get some flexibility back).
Under the hood, Lambda uses optimized Firecracker VMs for < 1 second boots.
>I think for the majority of use cases boot performance is not particularly important
Anything with autoscaling. I think CI is probably a big use case (those environments get spun up and torn down pretty quickly), and otherwise you introduce hairy things like unprivileged Docker-in-Docker builds trying to run nested inside a container.
Yeah, CI is a good use case. Even for autoscaling, though, I kinda feel like you need to be a lot faster to make a huge difference, tbh.
And yeah, Firecracker is pretty sick, but it's also something you can just use yourself on ec2 metal instances, and then you get full control over the kernel and networking too, which is neat.
There's no fundamental reason Lambda needs to make you lose control of the kernel; I'd love to have Lambda with a custom kernel, and that doesn't seem like it'd be too difficult to support.
At one point, Lambda didn't expose the ability to write custom runtimes, and only provided runtimes for specific languages. People reverse-engineered those runtimes and figured out how to build custom ones. Eventually, Amazon provided a documented, well-supported way to build and run custom runtimes, but that required documenting the interfaces provided to those runtimes (e.g. environment variables and HTTP APIs instead of language-runtime-specific APIs).
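(For reference, a custom runtime is essentially a loop over that documented HTTP interface. Here's a minimal, illustrative Python sketch of that loop; the handler is a placeholder, and it skips the error-reporting endpoints a real runtime would also need.)

```python
# Minimal sketch of a Lambda custom runtime loop, using only the Python stdlib.
# It relies on the documented Runtime API: the AWS_LAMBDA_RUNTIME_API environment
# variable plus the /runtime/invocation/next and /runtime/invocation/<id>/response
# endpoints. The handler below is a placeholder for real user code.
import json
import os
import urllib.request

API = f"http://{os.environ['AWS_LAMBDA_RUNTIME_API']}/2018-06-01/runtime"

def handler(event):
    # Placeholder business logic; a real runtime would dispatch to user code here.
    return {"echo": event}

while True:
    # Long-poll for the next invocation; the request ID arrives as a response header.
    with urllib.request.urlopen(f"{API}/invocation/next") as resp:
        request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
        event = json.loads(resp.read())

    # POST the handler's result back for this invocation.
    body = json.dumps(handler(event)).encode()
    req = urllib.request.Request(f"{API}/invocation/{request_id}/response", data=body)
    urllib.request.urlopen(req).close()
```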
I'd love to see Lambda support custom kernels. That would require a similar approach: document the actual VM interface provided to the kernel, including the virtual hardware and the mechanism used to expose the Lambda API. I'd guess that they haven't yet due to a combination of 1) not enough demand for Lambda with custom kernels and 2) the freedom to modify the kernel/hardware interface arbitrarily because they control both sides of it.
I've seen that, but I wonder to what extent they've done custom work and to what extent they've just used established Kconfig options to compile out surface area they're not using.
In any case, Firecracker+Nitro seem like they'd be a sufficient security boundary.
Do you know if there are any plans for FreeBSD to create a super minimal server version that can be used as a Docker host ... similar in size to Alpine Linux?
I know lots of people have made stripped down versions of FreeBSD, e.g. nanobsd. Beyond that, I'm not aware of anything specific... but I probably wouldn't be since I don't pay much attention to the container space. Try asking on one of the mailing lists maybe?
> The first two values — the time taken for a RunInstances API call to successfully return, and the time taken after RunInstances returns before a DescribeInstances call says that the instance is "running" — are consistent across all the AMIs I tested, at roughly 1.5 and 6.9 seconds respectively
“Running to available” is what’s in the table, ranging from 1.23 s to 70 s or so.
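(Not the article's benchmark itself, but a rough boto3 sketch of how those first two phases could be timed; the region, AMI ID, and instance type below are placeholders.)

```python
# Rough boto3 sketch of timing the two phases quoted above: how long RunInstances
# takes to return, then how long until DescribeInstances reports "running".
# The region, AMI ID, and instance type are placeholders.
import time
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

t0 = time.monotonic()
resp = ec2.run_instances(ImageId="ami-0123456789abcdef0",
                         InstanceType="t3.micro", MinCount=1, MaxCount=1)
t_api = time.monotonic() - t0  # time for the RunInstances call to return

instance_id = resp["Instances"][0]["InstanceId"]
while True:
    state = ec2.describe_instances(InstanceIds=[instance_id])[
        "Reservations"][0]["Instances"][0]["State"]["Name"]
    if state == "running":
        break
    time.sleep(0.5)
t_running = time.monotonic() - t0 - t_api  # pending -> running, as seen via polling

print(f"RunInstances returned in {t_api:.2f}s; running after another {t_running:.2f}s")
```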
>use-cases for on-demand workloads.
Yeah, true! Maybe, depending on availability, they could "spin up" some amount of "spare servers" to get the median time even lower.
> Curious to see a follow up on what's going on with FreeBSD - seems like it takes ages to get the network up.
FreeBSD rc executes all rc.d scripts sequentially, in one thread. OpenRC, AFAIK, can start daemons in parallel, but unfortunately the switch to OpenRC was abandoned: https://reviews.freebsd.org/D18578
Oh, that explains it! I thought the driver installation had somehow broken, despite my having made an AMI of the same instance type with the drivers already installed, so I just run the install shell script every time and then it works. I figured maybe some hardware address was wrong or it was subtly different hardware, but it being time-based makes a lot more sense.
Absolutely, same here. Additionally, there seems to be very little consistency in GPU instance start-up times: I've had 30 seconds one moment and 5 minutes another. Luckily, I can't say I've experienced 10 minutes.
Yeah, less than a few seconds with spot consistently would be nice, but I've never seen it. When I was handling autoscaling via the EC2 API + Python + nginx, the daemon I wrote pretty much had to have a while loop to continuously check connectivity via SSH after a t3a.medium (with Ubuntu) was kicked off from `ec2.request_spot_instances`.
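(A rough sketch of that kind of polling loop, not the original daemon; the region, AMI ID, and key name are placeholders, and it assumes the instance comes up with a public IP.)

```python
# Rough sketch of the "request a spot instance, then poll until sshd answers" loop.
# Not the original daemon: the region, AMI ID, and key name are placeholders.
import socket
import time
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

req = ec2.request_spot_instances(
    InstanceCount=1,
    LaunchSpecification={
        "ImageId": "ami-0123456789abcdef0",  # placeholder Ubuntu AMI
        "InstanceType": "t3a.medium",
        "KeyName": "my-key",                 # placeholder key pair
    },
)
request_id = req["SpotInstanceRequests"][0]["SpotInstanceRequestId"]

# Wait for the spot request to be fulfilled so an instance ID exists.
instance_id = None
while not instance_id:
    time.sleep(2)
    desc = ec2.describe_spot_instance_requests(SpotInstanceRequestIds=[request_id])
    instance_id = desc["SpotInstanceRequests"][0].get("InstanceId")

# In practice the public IP may not be populated until the instance is "running".
ip = ec2.describe_instances(InstanceIds=[instance_id])[
    "Reservations"][0]["Instances"][0]["PublicIpAddress"]

# Poll TCP port 22 until sshd actually accepts connections.
while True:
    try:
        socket.create_connection((ip, 22), timeout=2).close()
        break
    except OSError:
        time.sleep(1)
print(f"{instance_id} ({ip}) is reachable over SSH")
```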
Perhaps I should have been using "Clear Linux 34640"
Probably depends on your spot price. We always set the maximum spot price to the "on-demand" price, but due to how priorities work it may also sometimes fail when the data center in a given region is out of capacity for a given instance type.
TBH, the bigger issue than boot time for AWS is the lack of physical resources to fulfill demand - this is what most big players are struggling with.
No no, the spot price matches, the instance boots, and you can SSH to it and do everything except use the GPU. E.g. you try to run your PyTorch NN training, but it freezes for 5-10 minutes, then it runs fine. If you start your training again, it runs immediately.
I wonder why there isn't a well-known firm running these kinds of benchmarks and "true" uptime reporting for the major clouds -- knowing perf stats for S3 or SQS or Kinesis etc. in various geos and AZs seems like it would be useful.
It's not a big edge, but knowing about downtime before others do could certainly be one.
Wow, up until now I only associated Gartner with the Magic Quadrant; I had no idea they had this.
It doesn't seem like they'll give you up-to-the-hour information, and you do have to pay for "Gartner for Professionals", which I'm not familiar with, but it's definitely out there.
The key to this has to be supporting it cross-cloud, IMO -- people have the AWS vs GCP vs Azure conversation so often, and I think most people know that GCP is the best performance for the buck, but the other reasons to pick the other two are numerous. It might be nice to have some numbers on just how much performance differs between them, since that's one of the more quantifiable things to weigh.
I've been doing a lot of work optimizing EC2 boot time lately. I can get an instance up and connecting outbound to a network port in a few hundred milliseconds, with plenty of room left to optimize there. But all instances do indeed seem to take at least 8 seconds to come up on the AWS end, as this article points out. This has been why I haven't put much effort into optimizing further, because that 8 seconds is entirely out of my hands.
I've tried changing the various parameters under my control: assigning a specific network address rather than making AWS pick one from the VPC, disabling metadata, using a launch template, not using a launch template, using the smallest possible AMI (just a kernel and static binary), using Fast Snapshot Restore on the snapshot in the AMI, etc. None of those makes the slightest difference in instance start time; still ~8 seconds.
I would be interested to know the 95th percentile as well as the median (calculated over more than 10 runs, I suppose). It's often the worst-case times that are important, rather than the average or the median.
This. I'm surprised that even people who are very well-versed in math omit statistics other than an average or, at best, a median. Is it really just me who wonders about, and in this case really expects, variability in the value? Especially given the huge differences, you might have had a few slow or a few lucky starts that shifted the middle value. Presumably they saw the raw data, didn't notice any huge spikes, and would have changed the method if they had, but it's not even mentioned, so this is pure speculation on my part.
That's for the automotive embedded version of Linux they created. The AMI isn't quite the same, but they have a kernel specifically patched and optimized to run under a KVM hypervisor, and they're using systemd-boot, which is a bit faster than GRUB, the bootloader these other distros are likely using.
Figuring out what else they did would probably require booting into Clear Linux myself and checking it out, but disabling services you don't need so systemd isn't wasting time, building the kernel modules you know you'll need directly into the kernel instead of loading them as modules, and skipping the initramfs if you don't need it is probably the "general" answer for how to get a faster boot.
If you really want ultra-fast, you can turn Linux into a unikernel of sorts by building a kernel that runs your application as init, not using any other services at all, and booting via EFISTUB instead of using a bootloader at all. I don't know if the AWS EC2 service provides any way of doing that, though. On your own machine, you can either run efibootmgr from a different booted system on the same firmware, set the boot order in the firmware setup, or name the EFISTUB image whatever that magic name is that EFI automatically loads when no boot order is configured in NVRAM.
They restructured the boot sequence and probably stripped out everything that didn't fit a server/cloud context; perhaps they also removed everything that wasn't necessary from the base. Not unlike OpenBSD.
It would also be interesting to compare regions. us-east-1 is by far the largest, so it would be interesting to see what, say, us-west-1 or us-east-2 look like.
Median is an odd metric to use. I think I might truly be more interested in the mean; maybe the geometric mean.
I am also very interested in worst-cases with this sort of thing, so the 80th and 95th and 100th percentiles would be helpful.
It would be interesting to see plots of distributions, as well: are boot times really unimodal? They seem like they could easily have multiple modes, which could really mess with all of these measures (but especially the median!).
Because the script is open source (thank you!) I may try this myself.
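(If anyone does rerun it, producing those extra statistics from the raw measurements is only a few lines; here's a sketch assuming the times end up in a plain text file, one boot time in seconds per line. The filename is a placeholder.)

```python
# Sketch of the extra summary statistics asked for above (mean, geometric mean,
# median, high percentiles). Assumes the raw measurements were saved to a plain
# text file, one boot time in seconds per line; "boot_times.txt" is a placeholder.
import statistics
from collections import Counter

with open("boot_times.txt") as f:
    times = sorted(float(line) for line in f if line.strip())

n = len(times)
summary = {
    "mean": statistics.mean(times),
    "geometric mean": statistics.geometric_mean(times),
    "median": statistics.median(times),
    "p80": times[min(n - 1, int(0.80 * n))],  # rough nearest-rank percentiles
    "p95": times[min(n - 1, int(0.95 * n))],
    "min": times[0],
    "max": times[-1],
}
for name, value in summary.items():
    print(f"{name:>14}: {value:6.2f} s")

# Quick look at modality: bucket to whole seconds and count occurrences.
print(Counter(round(t) for t in times))
```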
Network glitches will have a one-sided effect: they will make you slower, but never faster. Trimming out the slowest 25% of measurements, or even just using the minimum, can be more effective. Those approaches are common when profiling programs for exactly that reason - the median execution time of a loop might be influenced by OS weirdness like some background job, but the minimum will hopefully really just measure the loop.
Anyway - probably not a big deal, probably mostly academic, although multimodal behavior would be really interesting.
Yes and no. If there's a network glitch while we're polling to see if the instance is "running", it will result in the "pending-to-running" time being high but the "running-to-port-closed" time being low.
I wonder if EFI boot with something like EFISTUB would be even faster. AFAIK EC2 boots via BIOS, which has some fixed overhead (although Clear Linux still manages to boot very quickly).
Also, on boot I /think/ EC2 pulls the AMI down to EBS from S3, so theoretically smaller AMIs might be faster, but I'm not sure.
My experience with desktop hardware is that a traditional BIOS boots faster than EFI, because the former simply does much less. (This is not the difference between the "BIOS boot" and "EFI boot" options in an EFI firmware, but rather "pure native" BIOS vs "pure native" EFI on the same motherboard.)
Was that with or without a bootloader? My understanding was that EFI was marginally faster to initialize (which might not be true, based on what you're saying), but you could also skip the bootloader and load the kernel directly (EFISTUB), which would also eliminate the initramfs (which seems fine, assuming you're using vanilla VMs on popular hypervisors/providers).
I've tested both EFI and BIOS boots on EC2, and they have about the same overhead.
And I've tested AMIs that are no bigger than a Linux kernel and a single static binary; times are still on par with what's listed here. Still takes 7-8 seconds before the first user-controlled instruction gets to run.