Thanks for pointing out rust-vmm, Steve. Andreea (one of my co-authors on this paper) wrote an article about it a while back, which explains how it fits into the bigger picture: https://opensource.com/article/19/3/rust-virtual-machine
The comparison with QEMU is a bit disingenuous. The number of lines of code in a given binary is much less than 1.4 million, which is the total for all the architectures that QEMU supports (actually it's closer to 2 million). It is also possible to configure out a lot of the code and libraries. A default build is around 600,000-700,000 lines of code, and it is possible to build reduced versions that tally less than 300,000 lines just by tweaking a configuration file with the list of included devices (which was done for the experiments in the paper). The NEMU project mentioned in the paper is inactive exactly because it was a dead end: it was possible to achieve all their goals directly in QEMU without the need for a fork.
Likewise, the number of syscalls in QEMU (270) is quoted from another paper but likely refers not to QEMU used as a VMM, but rather to the so-called "user-mode emulation" that runs foreign Linux binaries (and which, by its very nature, invokes pretty much all Linux syscalls, including quite a few obsolete ones).
It would be interesting to have more information on the configuration they use for QEMU, especially whether they are enabling PCI and ACPI. Version 4.2 of QEMU (the version they used) has a trimmed virtual machine type that was admittedly inspired by Firecracker and actually boots even faster than Firecracker, or at least in the noise. [1] The code specific to this machine type is only 500 lines. Also, since that release, QEMU and qboot _are_ actually able to boot uncompressed kernels using the PVH entry point, contrary to what the paper states. This could explain the difference in boot time performance.
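For readers who want to try the trimmed machine type, here is a minimal sketch of a command line that selects it. The flags are standard QEMU options, but the helper name `microvm_argv`, the kernel path, and the memory and CPU values are placeholders chosen for illustration; this is not the configuration used in the paper, which the thread does not spell out.

```python
import shlex


def microvm_argv(kernel, cmdline="console=ttyS0", mem="512m", cpus=1):
    """Build an argv list for QEMU's microvm machine type (illustrative only)."""
    return [
        "qemu-system-x86_64",
        "-M", "microvm",              # the trimmed machine type added in QEMU 4.2
        "-enable-kvm", "-cpu", "host",
        "-m", mem, "-smp", str(cpus),
        "-kernel", kernel,            # hypothetical path to an uncompressed kernel
        "-append", cmdline,
        "-nodefaults", "-no-user-config", "-nographic",
        "-serial", "stdio",
    ]


# Print the command so it can be inspected or pasted into a shell.
print(shlex.join(microvm_argv("vmlinux")))
```

Building the list in code rather than as one long string makes it easy to vary a single knob (memory size, kernel image) when repeating a boot-time measurement.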
Memory consumption is tricky: QEMU uses green threads and has to allocate stacks for them. These can show up as large mmaps, but they do not correspond to actual memory usage. But again, without redoing the experiments, it's hard to say whether that is the cause. I have no doubt Firecracker is leaner in this respect, anyway, and it's an important metric for Amazon.
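To illustrate the general point about large mmaps (this is a sketch of the operating-system behaviour, not of how QEMU allocates its stacks), the snippet below reserves a big anonymous mapping and then touches only part of it. It assumes Linux, where `ru_maxrss` is reported in KiB; the helper names are made up for the example.

```python
import mmap
import resource

PAGE = mmap.PAGESIZE


def peak_rss_bytes():
    """Peak resident set size of this process (assumes Linux: ru_maxrss in KiB)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss * 1024


def reserve(size):
    """Create an anonymous mapping; pages are not backed by RAM until touched."""
    return mmap.mmap(-1, size)


def touch(mapping, nbytes):
    """Write one byte per page so the kernel has to back those pages."""
    for offset in range(0, nbytes, PAGE):
        mapping[offset] = 1


size = 256 * 1024 * 1024
before = peak_rss_bytes()
region = reserve(size)
after_map = peak_rss_bytes()
touch(region, 16 * 1024 * 1024)
after_touch = peak_rss_bytes()

print("mapped bytes:         ", size)
print("peak RSS growth (map):", after_map - before)
print("peak RSS growth (use):", after_touch - before)
region.close()
```

A tool that sums mapping sizes would charge this process the full 256 MiB, while resident memory only grows with the pages actually written, which is why the choice of metric matters when comparing VMMs.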
That said, I am all for competition, and some of the measurements in the paper are certainly worth a look to see if there is more low-hanging fruit to pick, especially with respect to memory consumption. QEMU is a complex program, as demonstrated by the complexity of measuring it accurately, and Firecracker is a very interesting project. I am very happy to collaborate with the authors of the paper on rust-vmm.
> QEMU is a complex program, as demonstrated by the complexity of measuring it accurately
For sure. Our goal was a fair comparison, but any comparison by one set of criteria is going to paint an incomplete picture, especially of something like QEMU that does a lot and has a lot of buttons and knobs. This is further complicated by doing the comparison over a time span where both are moving targets (which is a good thing, both groups have been building cool stuff). Comparing software qualitatively is hard.
I'm very excited about Firecracker and rust-vmm. I'm mostly excited about the amount of energy and excitement in this space right now. That's only going to lead to good things down the road, and there's plenty of room for multiple threads of innovation.
> The NEMU project mentioned in the paper is inactive exactly because it was a dead end: it was possible to achieve all their goals directly in QEMU without the need for a fork.
That's very interesting, I didn't know that background. I saw the loss of velocity in the project, but couldn't find any details of why.
I am and this is awesome! Thanks for sharing it. And thanks for using the latest version of QEMU and the microvm machine type, that did make the comparison much fairer already.
> I saw the loss of velocity in the project, but couldn't find any details of why.
The Intel people moved on to work on Cloud-Hypervisor, and a lot of the stuff done on NEMU was reworked (leading e.g. to the microvm machine type) and added to QEMU (such as disabling TCG on ARM). Before that, however, we collaborated to complete the configuration mechanism for disabling unwanted devices. [1] And we are still collaborating on rust-vmm, which is taking inspiration from QEMU for the more advanced parts of the design!
I'm really excited to see where this goes in the community. There are some interesting projects using Firecracker such as Weave Ignite and firekube that I think could improve the security of Kubernetes. Also lightweight VMs are exciting in their own right.
They're almost sort of opposites? Firekube is the Kube control plane running on lightweight Firecracker-powered VMs. On the other hand, Kata Containers is a CRI runtime (like containerd or CRI-O) that allows Kubernetes to schedule containers to start through Kata (which then allows the container to run via Firecracker or QEMU, IIUC). But also, I think in practice everyone uses containerd/CRI-O and then configures them to pass untrusted workloads on to Kata.
I'm not sure what to think of how all of this ended up.
We have been using it to run e2e tests by creating multiple VMs that need to talk to each other and load some kernel modules. It works so much faster than relying on cloud providers or alternatives like Docker in Docker. Very good impressions so far :)
It's covered in the paper, but the origins are the "crosvm" project at Google, which is used in ChromeOS. Firecracker started as a fork of crosvm, removed a bunch of code they didn't need, and of course added their own.
But rather than just fork, they've consolidated the shared parts into the "rust-vmm" project: https://github.com/rust-vmm/community
> The rust-vmm project is organized as a shared effort, shared ownership open-source project that includes (so far) contributors from Alibaba, AWS, Cloud Base, Crowdstrike, Intel, Google, Red Hat as well as individual contributors.
This has allowed other similar tech to grow, like https://github.com/cloud-hypervisor/cloud-hypervisor, which seems to be the Intel project mentioned in the above sentence.