Hyperlight: Virtual machine-based security for functions at scale

yoshuaw · 2024-11-07T17:01:31 1730998891

The Azure Upstream team has been working on a really fast hypervisor library written in Rust for the past three years. It does less than you'd conventionally do with hypervisors, but in turn it can start VMs around 2 orders of magnitude faster (around 1-2ms/VM).

I think this is really cool, and the library was just released on GitHub for anyone to try. I’m happy I got to help them write their announcement post — and I figured this might be interesting for folks here!

dangoodmanUT · 2024-11-07T22:20:50 1731018050

Do you think requiring people to use their packages is too limiting for widespread usage? Seems like you're forced to use Rust or C atm.

This seems like a limitation that sits in a somewhat unusable place: for something simple and platform-specific (e.g. an HTTP transform) we can just use JS, where the boot time perf makes up for the execution perf, and for something more serious like a full-fledged API, 120ms should be more than enough time (and we can preemptively scale as long as we're not already at 0).

yoshuaw · 2024-11-07T23:54:11 1731023651

The way to think about Hyperlight is as a security substrate intended to host application runtimes. You’re right that the Hyperlight API only supports C and Rust today, but you can use that to, for example, load Python or JS runtimes, which can then execute those languages natively.

But we might be able to do even better than that by leveraging Wasm Components [1] and WASI 0.2 [2]. Using a VM guest based on Wasmtime, suddenly it becomes possible to run functions written in any language that can compile to Wasm Components — all using standard tooling and interfaces.

I believe the team has a prototype VM guest based on Wasmtime working, but they still need a little more time before it’s ready to be published. Stay tuned for future announcements?

[1]: https://component-model.bytecodealliance.org/introduction.ht...

[2]: https://wasi.dev

fwsgonzo · 2024-11-08T17:33:51 1731087231

Looks like my TinyKVM project, except it runs specialized programs instead of regular ELFs? TinyKVM also runs functions, with a fast execution timeout. I proved that without I/O you can essentially run KVM programs at native performance, and sometimes better due to automatic hugepages. I measured LLMs running at 99.7% of native speed using e.g. Mistral 7B. As another example, the STREAM memory benchmark doesn't use hugepages by default, so the version run from the terminal is slower than the TinyKVM version, which gets hugepage-backed page tables. Of course, it runs at the same speed once you modify the benchmark to use the same advantage, but that does require modifying the program.

See: https://ieeexplore.ieee.org/document/10475832

I also implemented VM resets using page-table rewrites and CoW memory sharing, so that no memory is shared across different requests. This can be implemented as tail-latency in a cache.
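To make the reset idea concrete, here is a minimal userspace sketch of copy-on-write memory sharing. It is not TinyKVM's code and does not touch real page tables: `Snapshot`, `GuestMemory`, and `demo` are hypothetical names, and a `HashMap` of private page copies stands in for the page-table rewrite. The point is only that every request reads from one shared image, writes land in private copies, and a reset just discards those copies.

```rust
use std::collections::HashMap;
use std::sync::Arc;

const PAGE_SIZE: usize = 4096;
type Page = [u8; PAGE_SIZE];

/// Immutable memory image shared by every request (hypothetical type).
struct Snapshot {
    pages: Vec<Page>,
}

/// Per-request view of memory: reads fall through to the shared snapshot,
/// writes first copy the affected page into a private overlay.
struct GuestMemory {
    base: Arc<Snapshot>,
    dirty: HashMap<usize, Box<Page>>,
}

impl GuestMemory {
    fn new(base: Arc<Snapshot>) -> Self {
        GuestMemory { base, dirty: HashMap::new() }
    }

    /// Read one byte. Panics if `addr` lies outside the snapshot.
    fn read(&self, addr: usize) -> u8 {
        let (page, offset) = (addr / PAGE_SIZE, addr % PAGE_SIZE);
        match self.dirty.get(&page) {
            Some(private) => private[offset],
            None => self.base.pages[page][offset],
        }
    }

    /// Write one byte, copying the page on first write (the "CoW" step).
    fn write(&mut self, addr: usize, value: u8) {
        let (page, offset) = (addr / PAGE_SIZE, addr % PAGE_SIZE);
        let base = &self.base;
        let private = self
            .dirty
            .entry(page)
            .or_insert_with(|| Box::new(base.pages[page]));
        private[offset] = value;
    }

    /// Reset between requests: drop all private pages, keep the shared image.
    fn reset(&mut self) {
        self.dirty.clear();
    }

    fn dirty_pages(&self) -> usize {
        self.dirty.len()
    }
}

/// Returns (value seen during the request, dirty page count, value after reset).
fn demo() -> (u8, usize, u8) {
    let snapshot = Arc::new(Snapshot { pages: vec![[0u8; PAGE_SIZE]; 4] });
    let mut vm = GuestMemory::new(Arc::clone(&snapshot));
    vm.write(PAGE_SIZE + 8, 0xAB);
    let during = vm.read(PAGE_SIZE + 8);
    let dirty = vm.dirty_pages();
    vm.reset();
    (during, dirty, vm.read(PAGE_SIZE + 8))
}
```

In this model the cost of a reset scales with the number of pages a request dirtied rather than with total memory size, which is presumably why it suits short-lived function calls.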

I ended up adding support for most languages. All the systems languages, Go, v8, LuaJit etc. Go was by far the most annoying to support as it uses signals.

generalizations · 2024-11-10T15:48:31 1731253711

I don't have access to that paper - and when I looked for TinyKVM all I found was the rpi-based project that uses the other definition of KVM. Is your project online somewhere? Or is it proprietary?

fwsgonzo · 2024-11-10T18:10:52 1731262252

I can't publish/open-source it, sadly. But the paper I can share: https://www.dropbox.com/scl/fi/38e0la5m6zkc04tlm03w8/Introdu...

bjconlan · 2024-11-13T07:05:25 1731481525

Also appreciate the reference. I just realized you're the libriscv author (and as pointed out includeOS contributor). Love all your work!

generalizations · 2024-11-11T16:13:41 1731341621

That's cool. Thanks dude.

zekrioca · 2024-11-09T05:57:06 1731131826

Wouldn’t containers provide the same effect as TinyKVM?

fwsgonzo · 2024-11-09T08:19:56 1731140396

Yes, if you don't need sandboxing then containers are just way easier to deal with. Although I wish they didn't use so much space.

zekrioca · 2024-11-09T09:14:12 1731143652

Why couldn’t one mathematically recreate the limitations of a VM through a namespace by means of SELinux?

kevincox · 2024-11-10T11:34:11 1731238451

Because the Linux kernel is incredibly complicated and shouldn't be trusted as a strong security boundary. A simple hypervisor like this has far fewer vulnerabilities, so it is an easier boundary to trust. They are in very different security tiers.

I would summarize it as: containers are good for mostly trusted isolation (teams within a company, purchased software), VMs are good for general untrusted software (different tenants in a cloud provider), and separate physical hardware is for scenarios where attacks are likely (military, known malicious code). Of course use cases are very fuzzy, but I wouldn't run fully untrusted code in the same kernel as anything of value.

intelVISA · 2024-11-09T02:45:48 1731120348

Nice project, yeah this looks like a hobbled (in true MS fashion) version of TinyKVM!

Were you inspired by includeOS, Mirage, or similar?

fwsgonzo · 2024-11-09T08:19:16 1731140356

I wrote the IncludeOS kernel bits

generalizations · 2024-11-07T19:01:02 1731006062

> These micro VMs operate without a kernel or operating system, keeping overhead low. Instead, guests are built specifically for Hyperlight using the Hyperlight Guest library, which provides a controlled set of APIs that facilitate interaction between host and guest

Sounds like this is closer to a chroot/unikernel than a "micro VM" - a slightly more firewalled chroot without most of the os libs, or a unikernel without the kernel. Pretty sure it's not a "virtual machine" though.

Only pointing this out because these sorts of containers/unikernels/vms exist on a spectrum, and each type carries its own strengths and limitations; calling this by the wrong name associates it with the wrong set of tradeoffs.

wmf · 2024-11-07T19:30:41 1731007841

I guess if it uses CR3 it's a "process" and if it uses VMLAUNCH it's a "VM".

generalizations · 2024-11-07T19:48:02 1731008882

Heh. Going by that delineation we end up with very VM-ish containers and (now) very container-ish VMs. Though this seems like it's even more stripped down than a unikernel - which would also be a "VM" here.

0cf8612b2e1e · 2024-11-07T19:37:51 1731008271

I thought a chroot was not considered a real security boundary?

ronsor · 2024-11-07T22:53:15 1731019995

Chroot is a real security boundary as long as you use it properly. That said, namespaces on Linux are much superior at this point, so I can only recommend using `chroot` for POSIX compliance.

derefr · 2024-11-08T01:54:16 1731030856

chroot is great for all sorts of things, but they're not security-related.

A lot of tools expect to do things to "your system" at absolute paths. With chroot, those tools operate against an explicitly wired-up, semi-virtualized simulacrum of your system, designed to pass through just the parts of those operations you want to your real host, while routing the rest of the effects into a "rootfs in a can" that you're either building up or will immediately throw away.

Think: debootstrap; or pivot-root; or mounting your rootfs to fix your GRUB config and re-run update-grub from your initramfs rescue shell.

kevincox · 2024-11-10T11:36:06 1731238566

Yes. Anything that shares a kernel is a very weak security boundary as the kernel is complex and vulnerabilities are regularly discovered.

oneplane · 2024-11-08T00:51:54 1731027114

So in essence, this is somewhere between a unikernel+firecracker combo and a WASM module, but using VT.

apitman · 2024-11-07T19:31:55 1731007915

Don't see any mention of firecracker, which is the first thing I think of in this space. Anyone have a TL;DR comparison?

eyberg · 2024-11-07T22:04:29 1731017069

Firecracker can run ordinary linux/GPOS vms and unikernels.

Unikernels can run inside of firecracker.

Unikernels are focused on single applications whereas general purpose operating systems are focused on multiple applications.

This is focused on running functions embedded inside a host program. So it is fairly different than other things out there and in a class of its own.

ATechGuy · 2024-11-07T22:29:33 1731018573

> each function request to have its own hypervisor for protection.

They are talking about isolating serverless functions, not host program functions. In that sense, it is exactly what Firecracker does for lambda functions

eyberg · 2024-11-07T22:59:55 1731020395

Firecracker boots up a runtime that has a full-blown operating system in it; Lambda just happens to call a known program with a known function. In that sense, sure, it provides similar functionality, but it's really quite different. That's not what Fly uses Firecracker for, for instance.

Qemu/firecracker are in the same space - this is different.

These are most definitely in a different boat as you embed the guest functions inside the host program and then you register those functions. Taken from the readme:

> The host can call functions implemented and exposed by the guest (known as guest functions).

> Once running, the guest can call functions implemented and exposed by the host (known as host functions).

This is more in the 'safe plugin' type of space. As with most things in this space - the best way to learn about them is to simply try it out.
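As a rough illustration of that two-way relationship, here is a toy model in plain Rust. It does not use the Hyperlight API and provides no isolation at all: `Sandbox`, `HostFunctions`, `register_host_fn`, `expose_guest_fn`, and `call_guest` are hypothetical names invented for this sketch. It only shows the shape of the contract, where the host registers functions the guest may call and then invokes functions the guest exposes.

```rust
use std::collections::HashMap;

type HostFn = Box<dyn Fn(&str) -> String>;
type GuestFn = Box<dyn Fn(&HostFunctions, &str) -> Result<String, String>>;

/// The only host capabilities a guest function can reach (hypothetical type).
struct HostFunctions {
    table: HashMap<String, HostFn>,
}

impl HostFunctions {
    /// Called from guest code; fails if the host never registered `name`.
    fn call(&self, name: &str, arg: &str) -> Result<String, String> {
        match self.table.get(name) {
            Some(f) => Ok(f(arg)),
            None => Err(format!("host function `{}` is not registered", name)),
        }
    }
}

/// Stand-in for a sandbox: a table of host functions plus a table of guest functions.
struct Sandbox {
    host: HostFunctions,
    guest: HashMap<String, GuestFn>,
}

impl Sandbox {
    fn new() -> Self {
        Sandbox {
            host: HostFunctions { table: HashMap::new() },
            guest: HashMap::new(),
        }
    }

    /// Host side: make a function available to the guest.
    fn register_host_fn<F>(&mut self, name: &str, f: F)
    where
        F: Fn(&str) -> String + 'static,
    {
        self.host.table.insert(name.to_string(), Box::new(f));
    }

    /// Guest side: expose a function the host can call later.
    fn expose_guest_fn<F>(&mut self, name: &str, f: F)
    where
        F: Fn(&HostFunctions, &str) -> Result<String, String> + 'static,
    {
        self.guest.insert(name.to_string(), Box::new(f));
    }

    /// Host calls into the guest; the guest only sees the registered host functions.
    fn call_guest(&self, name: &str, arg: &str) -> Result<String, String> {
        let f = self
            .guest
            .get(name)
            .ok_or_else(|| format!("guest function `{}` is not exposed", name))?;
        f(&self.host, arg)
    }
}

/// Example round trip: host -> guest `greet` -> host `to_upper` -> back to host.
fn demo() -> Result<String, String> {
    let mut sandbox = Sandbox::new();
    sandbox.register_host_fn("to_upper", |s| s.to_uppercase());
    sandbox.expose_guest_fn("greet", |host, name| {
        let shout = host.call("to_upper", name)?;
        Ok(format!("hello, {}", shout))
    });
    sandbox.call_guest("greet", "hyperlight")
}
```

In the real project the guest side would run inside the micro VM rather than as a closure in the host's address space; the closure here is just the simplest way to show the call direction.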

rwmj · 2024-11-08T12:03:19 1731067399

libkrun (on Linux) is probably a closer comparison (though still not quite the same). https://github.com/containers/libkrun

stogot · 2024-11-08T05:09:21 1731042561

> The host can call functions implemented and exposed by the guest (known as guest functions).

Can you explain this a bit more? Why/when would a developer want to do this? What’s the advantage over firecracker?

dboreham · 2024-11-08T12:50:11 1731070211

It's faster (shorter start time).

spai2 · 2024-11-08T22:57:14 1731106634

How does the micro VM's guest API talk to the host process? Does the communication between the two have to go through the hypervisor?

spankalee · 2024-11-08T00:41:51 1731026511

They mention that most guests are expected to run code in a VM/interpreter... I wonder if they have a build of V8 or JSC for their environment?

yoshuaw · 2024-11-08T13:11:58 1731071518

I believe the team has a working build of JerryScript [1] to test out the C bindings, but I’m not sure that will be released.

My understanding is that work on the Wasmtime VM guest is ongoing, which will enable Hyperlight to run the StarlingMonkey engine [2]. This is a WebAssembly build of Firefox’s SpiderMonkey engine which was donated by Fastly to the Bytecode Alliance.

That said though, I agree it would be great to see runtimes like V8 and JSC run directly on Hyperlight. There are good reasons why people might prefer those over StarlingMonkey (compat comes to mind), and it would be neat to see how much faster they could start compared to conventional VM deployments.

[1]: https://jerryscript.net/

[2]: https://github.com/bytecodealliance/StarlingMonkey

u8080 · 2024-11-08T22:27:45 1731104865

So in general this is a kludge to implement app isolation via a "VM", because existing CPU architectures suck at isolating code?

sim7c00 · 2024-11-08T09:53:43 1731059623

I wondered how it worked in Rust, but the guest entrypoint > init > main path is wrapped in an unsafe block, as are a lot of the other low-level operations it does. Interesting stuff.

broknbottle · 2024-11-09T02:17:52 1731118672

Cool to see them using `just`

7e · 2024-11-08T04:12:44 1731039164

Use CHERI for this?