Unikernels: No Longer an Academic Exercise? (250bpm.com)
178 points by rumcajz on Oct 23, 2018 | 108 comments



I'd like to point out that there is a unikernel, IBM's TPF, that nearly everybody interacts with daily. It's used in the payment card network by Visa, as well as by the airline and hotel industries for reservations. It typically runs under the z/VM or PR/SM hypervisors. https://en.wikipedia.org/wiki/Transaction_Processing_Facilit...

A few years ago, someone trying to spin container technology did a lot of damage to other attempts at unikernels with marketing dogma, not reality: claims of non-debuggability and other FUD. It has long been standard operating procedure to trace and debug systems from outside the system's own view; this is how people do bringup of new chips as well as OS ports. On hardware there is usually a dedicated debug and trace facility as part of the CPU, the board support package, or a firmware monitor. In a virtualized environment like a unikernel this is far easier, because you can run code above the guest's ring 0 supervisor privilege against its RAM/pagetable root. Modern systems like POWER even allow debugging from the BMC, with a plan to allow full gdb sessions over that out-of-band interface: https://github.com/open-power/pdbg.

There's nothing inherently wrong with unikernels, or with other systems-software ideas like microkernels, just because they are less popular technology at the moment. I'd encourage people to continue exploring this space.


That FUD was from Bryan Cantrill (https://news.ycombinator.com/item?id=10953766) making aggressively resolute, but also easily falsifiable claims about the production-worthiness of unikernels.

From the perspective of someone who'd debugged and traced unikernels from both inside the runtime (LING) and outside the runtime (xentrace), and who worked in the domain of the very z/TPF system you referenced above, the indictment seemed at-best strangely misguided and at-worst intentionally duplicitous.

As you can imagine the indictment wasn't at all persuasive to me, and thus I keep exploring the space in bits and pieces where applicable.


While acknowledging that your claims about his claims may well be 100% correct, I find it hard to believe that Bryan Cantrill would be "intentionally duplicitous". People, even very smart people, can occasionally be very wrong (btw, I'm not making a claim either way here), and that doesn't mean there is an agenda/intent that is malicious in nature.


Totally unbiased for sure, https://www.joyent.com/smartos


Cantrill is a smart guy. But he has always had a big mouth. And the fact is his post was written with ZERO experience with Unikernels. Zero. He was just assuming they act the way he imagined and attacked that image.

Which is fine if you're upfront about it. He wasn't.


I appreciate the kind words, but let's give me a tad more credit in the experience department, please: I have been doing production OS kernel development for over two decades, and have done non-trivial work in essentially every privileged subsystem across several different microprocessor and OS architectures. If you want to say that I have gobs of experience in kernel development (and more generally at the hardware/software interface), but no experience with unikernels per se, then fine, I guess -- but at the same time, let's acknowledge that you are the CEO of a unikernel company who very much has a dog in the fight?


Absolutely. I would rank you at the very top of developers with experience in the development of traditional operating systems. In particular, DTrace stands out as an excellent piece of work; it's one of those fundamental advances that serves as inspiration for others.

Now, I'm not sure I agree that I have a horse in the race; I don't necessarily believe there is a race. I've never really been a proponent of the schism between unikernels and containers. I struggle to see how unikernels can offer the same flexibility and ease of deployment as containers, and we likely won't be able to support the vast number of runtimes and the infrastructure needed to replace something like Docker. Perhaps there could be very specific uses where something like what the paper describes could be used, but I'm not betting on it.

As a software project, IncludeOS has a much narrower target than what people have traditionally imagined when thinking of unikernels. As a result, we're not in the business of replacing either containers or general-purpose operating systems (GPOS). We're aiming to carve out a few niches where we are confident that a GPOS isn't the answer, and we're only going to address those needs where we're pretty certain we can actually add some value. Basically, we think we can improve on security and add real-time capability while still remaining source-code compatible with Linux (mostly thanks to musl).

My grief is singularly with the myths you helped create: that unikernels force you to work with stone-age tools, with hardly anything beyond printf for debugging. We've had to spend a lot of time dispelling these. There are a few other things I believe you were wrong about at the time, but I'll spare you the details -- better suited to a discussion over a beer or coffee.


You never disappoint, whether in your recorded presentations or, now apparently, in timely HN comments. I am starting to like you just as much as tptacek here, because if someone invokes your name I can expect a witty, not merely informative, rebuttal!


Which, we shouldn't forget, isn't unbiased, given that they have a VM/container business to sell.


Is LING still alive? Haven't seen much project activity in a while.


Not really, insofar as I can tell, but it was slightly less inactive back then.

It's not the corner of the sandbox that I play in any longer.


What about your view on ultra-restrictive OS frameworks like Genode? These take much of the useful optimization inside unikernels and give you a bigger framework to work from.


The Genode project is wonderful. The write-ups they did about integrating their framework with seL4 were unbelievably helpful in my earlier explorations of how to build interesting things atop seL4.

That said, I never really considered Genode in contrast to unikernels to be honest. Something to chew on certainly.


It's a nice exercise, but I don't see it going anywhere.

1st. I don't see that many bugs in TCP/IP, filesystems, etc., especially ones exploitable by black-hat hackers, and especially remote exploits. We usually see widespread vulnerabilities in middleware like OpenSSH, which is in userspace. Putting everything in the app is not going to solve that; quite the contrary, we'd need to wait for the library maintainer to fix the bug, then for the applications to integrate the newer version of the library, and then update all the applications.

2nd. Side-channel attacks. I'm not really confident in letting applications have that low a level of hardware access. I mean, it's really easy for one application to steal resources from another; we need an authority if we are sharing peripherals. We already have one peripheral that is shared by processes -- memory -- and look at the mess that is a modern MMU, and all the attacks on it that we have seen throughout the decades.

3rd. If we are going to implement that as a shared library, the result is exactly the same as today's micro and hybrid kernels; if we are not going to save memory and instead implement it as a static library, it's exactly the same as virtualization.

Correct me if I'm wrong.


Co-author of the paper referenced in the blog post here.

1. Regarding bugs: There is a lot of code in your monolithic OS kernel. There have been, and will be, a lot of vulnerabilities in that code. Various sandboxing mechanisms notwithstanding, the vast majority of processes running on your system can potentially use all of that code as an attack vector. Unikernels let you switch off access to most of that "host" code, and, through the use of a library OS, contain only the minimal set of libraries needed to run your application in the "guest". Regarding updates: Valid point, but no different from any modern application of substantial complexity, except that now you also have to update (e.g.) the library providing its TCP stack.

2. Unikernels don't make that problem any worse. In fact, if you run on something like Muen (https://muen.sk/), it'll mitigate a bunch of these attacks by giving you a less precise RDTSC at the subject ("VM") level.

3. I don't follow. Implement what?


If the unikernel is running as a normal process, all that potentially vulnerable code is still on the box, and still potentially executable in the event of a vulnerability. The security selling point of a standard unikernel is that all the potentially vulnerable code doesn't even exist: only the libraries that are actually used get compiled and linked into the app, and it has no code relevant to functionality it isn't actually using.

The paper suggests using system-call filtering via seccomp or similar to block off this attack surface, but if you trust seccomp to have zero bugs, you might as well do traditional process sandboxing a la Google Chrome.

The problem unikernels solve is that traditionally, a vulnerability in C code means "game over": the box is completely pwned, with all memory and storage directly accessible. With a unikernel, a vulnerability means only that an attacker can access other functionality within the app, and can't rely on, e.g., a shell or a debugger being available to arbitrarily move bytes around.


> If the unikernel is running as a normal process, all that potentially vulnerable code is still on the box [...]

Correct. Conceptually the same applies to all Type 2 hypervisors. Type 1 less so, but you could still potentially exploit Xen and you have all of the dom0 to play around in.

> The paper suggests using system-call filtering via seccomp or similar to block off this attack surface, but if you trust seccomp to have zero bugs, you might as well do traditional process sandboxing a la Google Chrome.

If you do that, you will have to do a line-by-line analysis of the code you want to sandbox in order to determine exactly which syscalls it uses. With the approach presented in our paper, the developer does not need to care about this. For example, she can develop her MirageOS unikernel as a normal UNIX process and switch to a guaranteed-to-work, minimal seccomp sandbox with a simple change of target (a build-time configuration option).

But yes, you are now trusting seccomp instead of KVM. I believe in giving people the ability to easily make that choice.
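
To make that concrete, here is a minimal sketch of what such an allowlist looks like with libseccomp (illustrative only -- not the actual filter from the paper, and a real one needs whatever extra syscalls your libc touches):

    /* Minimal seccomp-BPF allowlist sketch using libseccomp
     * (link with -lseccomp). Any syscall not explicitly allowed
     * kills the process. */
    #include <seccomp.h>
    #include <unistd.h>

    int main(void)
    {
        scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_KILL);
        if (ctx == NULL)
            return 1;

        /* Hypothetical minimal set for a unikernel-as-process:
         * I/O on pre-opened fds, a clock, memory release, exit. */
        int allowed[] = { SCMP_SYS(read), SCMP_SYS(write),
                          SCMP_SYS(clock_gettime), SCMP_SYS(munmap),
                          SCMP_SYS(exit_group) };
        for (unsigned i = 0; i < sizeof(allowed) / sizeof(allowed[0]); i++)
            if (seccomp_rule_add(ctx, SCMP_ACT_ALLOW, allowed[i], 0) != 0)
                return 1;

        if (seccomp_load(ctx) != 0)
            return 1;
        seccomp_release(ctx);

        /* From here on, only the allowlisted syscalls succeed. */
        write(1, "sandboxed\n", 10);
        return 0;
    }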

> The problem unikernels solve is that traditionally [...] can't rely on eg. a shell or a debugger being available to arbitrarily move bytes around.

That part does not change with the sandboxing mechanism changing. The stuff that's inside is still only your unikernel.


> you are now trusting seccomp instead of KVM.

Which do you trust more?


That's a very good question. The answer is, "it depends".

For the 80% case, on x86_64, I consider them more or less equivalent. KVM is used daily in anger to provide isolation (e.g. GCE, and now ChromeOS) and has been around much longer, but you need to trust hardware virtualization, which is a large attack surface on the CPU itself. Given what we've learned about CPU vulnerabilities over the last year, I wouldn't be surprised to find some lurking in the VT-x/SVM implementations.

Seccomp, OTOH, is difficult to use correctly for arbitrary/existing applications, but exposes less of the kernel (depending on your metric; see our paper) and does not need hardware virtualization.

For the 20% case, where the stakes are higher (e.g. high assurance), I would use something like Muen or seL4 and run a disaggregated system on top of that.


I feel like (1) might be easier to explain as akin to the JS bundling+tree-shaking process.

I'm aware it's not an entirely accurate metaphor, but it might turn out to be a more accessible one for opening a conversation.


>There is a lot of code in your monolithic OS kernel.

This code is usually there for a reason and a unikernel will have to implement large swaths of this code anyway.

Networking isn't trivial, and I don't think anyone will be reimplementing the TCP/IP, UDP/IP, ARP, DHCP and other stacks without introducing a few bugs of their own.

For example, modern TCP congestion control requires very precise packet pacing and timing.

Another example: TCP retransmission will be required at some point, and the code doing it will have to run side by side with the app. That, too, is some amount of code.

And all this just piles up and up.

The only real advantage I see for unikernels is that, because all the hard work is done by the hypervisor, they don't have to bother implementing device drivers and task scheduling -- but they end up either replicating the hypervisor's own networking stack and a bunch of other subsystems, or just using them.


> This code is usually there for a reason and a unikernel will have to implement large swaths of this code anyway.

The largest part of a monolithic kernel today is device drivers, by far.

> Networking isn't trivial and I don't think anyone will be reimplementing the TCP/IP, UDP/IP, ARP, DHCP and more stacks without breaking in a few bugs themselves.

Sure. And then more people will use those stacks, and they will get better. The more the merrier, we have too much of a software monoculture anyway.

> The only real advantage I see for unikernels is that because all the hard work is done by the hypervisor they don't have to bother implementing device drivers

This. People continually underestimate the amount of work required to support the hardware ecosystem. This is also why rump kernels (note: not the same thing as the unikernel known as Rumprun) are such an achievement, and very much underappreciated.


Well, yes, a lot of it is device drivers, but not most. You can build the Linux kernel, for example, with almost all drivers removed; that usually slims the kernel image down by a few megabytes, and more when you drop various other drivers and subsystems you technically no longer need. On most modern systems, though, almost 80% of the drivers are modules and won't be loaded if not needed, and the disk space they consume is irrelevant for most intents and purposes (below 100MB on my distro).

> The more the merrier, we have too much of a software monoculture anyway.

Linux and some other kernels allow applications to have their own network stack in userspace, and the latest kernels allow even larger sections of the networking subsystem to live entirely in userspace.

I think this approach should be favored over a unikernel, since it uses the natural x86 privilege separation between userspace and kernel.


How does access control work? There are plenty of APIs that need to validate access (e.g. ping or raw IP sockets can require sudo). If it's all running in-process, it seems like there's no security you can implement, or the security mechanism ends up super-complex, like inspecting the TCP packets (which may solve networking use cases but doesn't do so well for all the APIs).

EDIT: Unikernels are neat as replacements for VMs, since the abstraction layer can be much less costly / higher performance. They cannot be used as regular applications running on the OS, due to how security is implemented.


A few thoughts:

1) (Putting on my black hat:) Attackers don't care about bugs or exploits; they care about running their code on your system. Whether that is as simple as running mysqldump or wget'ing a Monero cryptominer onto the box, it all rests on the premise that the monolithic operating system (whose design is ~50 years old; Linux itself is over 27 years old) is explicitly designed to run multiple programs by multiple users. Keep in mind this design pre-dates commercialized virtualization (e.g. VMware) and pre-dates "cloud" (e.g. AWS). If we assume you are already using VMs (and you are if you are on any public cloud, and in most private on-prem deployments), the VM becomes your isolation model. Can you still attack the underlying infrastructure? Sure -- but if you can root GCE or AWS, I'd say we all have some serious thinking to do about the current state of cloud infrastructure. Contrast that with all the ridiculous headlines you see every single day, and the fact that every RCE worth doing entails forking/execing a new process. It's one thing to have the instruction pointer -- it's quite another to launch a shell that doesn't exist, link your program against libraries that don't exist, run as a user that doesn't exist, or download new code when you can't... etc.

2) Not to belabor this point but side-channel attacks affect everyone and Intel has been taking the hard (in terms of market) approach of simply disabling hyper-threading on some of their hardware.

Security is the number one selling point for unikernels imo.


> We usually see wide-spread vulnerabilities in the middleware like OpenSSH

There has been exactly one remotely exploitable bug and exactly one information-leak bug in OpenSSH in more than a decade (I believe the exploit is actually more than a decade old; I'd need to check, though). It is by far the most secure daemon ever, and it is hard to overstate just how much of an accomplishment that is. NOTHING matches it in security. This is even more impressive given how widespread its usage is.

> We need authority if we are sharing peripherals.

I think you'll find that large websites ... just don't do this. The only real purpose of multiple processes on such webservers is administration.

> If we are not going to save memory and implement that as a static library ...

The reason you want to do this is that the slowest thing in modern processors is context switching -- going from userspace to kernel space. If you have, say, a networking stack in userspace, you can avoid such context switching entirely. The speed gains are enormous, especially for webservers.

Ridiculous performance in networking also enables many applications that just won't work without it. What do you think about cluster-wide disks, for instance? SQLite-like databases that run against petabyte-sized files, concurrently and safely. You're just not going to do that efficiently by having it done in the OS.
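
To make the cost concrete, here's a toy microbenchmark (illustrative; exact numbers vary by CPU and by Spectre/Meltdown mitigations) timing a forced do-nothing syscall. Expect it to be one to two orders of magnitude slower than a plain function call:

    /* Time a minimal syscall to show the user/kernel transition cost.
     * syscall(2) is used directly so libc can't cache the result. */
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <time.h>
    #include <unistd.h>

    static long long now_ns(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec * 1000000000LL + ts.tv_nsec;
    }

    int main(void)
    {
        enum { N = 1000000 };
        long long t0 = now_ns();
        for (int i = 0; i < N; i++)
            syscall(SYS_getpid);            /* forced kernel entry */
        long long t1 = now_ns();
        printf("%.1f ns per syscall\n", (double)(t1 - t0) / N);
        return 0;
    }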


Just to touch on this point - you hit the nail on the head. Microsoft and others have measured this cost to be north of 30%.


Re: 1 -- that might be true, but you still probably wouldn't want to co-tenant a sandboxed process with a total stranger. The overhead of a full Linux kernel is still larger than that of a unikernel-native application, so if you're a cloud provider, encouraging and using unikernel technology makes a lot of sense.

Re: 2 -- this is managed by the hypervisor. seL4 can be run as a hypervisor and has been formally verified to be correct [1]. Trusting a proven system like seL4 is a far cry from trusting Linux's isolation primitives, because we can make hard guarantees about its behavior.

Unikernels have the advantage that they are basically backwards-compatible (you can run any VM on the infrastructure you develop, even if that infrastructure is tuned for unikernel-native applications). With unikernels, you can achieve VMs that are lighter than containers [2], thereby increasing your customer-per-server ratio.

[1] https://sel4.systems/Info/FAQ/proof.pml

[2] http://cnp.neclab.eu/projects/lightvm/lightvm.pdf


I'm not an expert in unikernels by any means, but it's not the case that a unikernel is just the equivalent of a lightweight Linux distro plus some neat debug tools. I don't think the linked article does the idea justice.

Here [0] is a good discussion of unikernels (TheRaven knows what he's talking about). ctrl-f for "You absolutely can not do the same thing with a conventional *NIX OS"

https://soylentnews.org/comments.pl?sid=11888&cid=296337


This thread was very helpful:

> Explain this to me then: What's the difference, conceptually, between (A) a VM host running a unikernel capable of solving one particular kind of problem, and (B) an OS running on bare metal running a process capable of solving one particular kind of problem?

They are very similar. The main difference is the shape of the OS APIs. In the former scenario, the host provides things that look like CPUs, NICs, and block devices, with no high-level abstractions (e.g. threads, sockets, and filesystems). If those are needed, the unikernel links in device libraries that provide them. In contrast, in the latter situation the OS provides them even when they are not needed, and they add overhead to those code paths.

Additionally, the amount of process state owned by the OS is far greater in the latter case. Things like file descriptors, thread priorities and so on make it far harder to snapshot or migrate an OS process. Hypervisor interfaces tend to be as close to stateless as possible.


> In the former scenario, the host provides things that look like CPUs, NICs and things that look like block devices, with no high-level abstractions (e.g. threads, sockets, and filesystems).

To add to this, with the approach we're taking with Solo5 [1], hardware virtualization is only used as one possible sandbox/isolation mechanism. If you look into the code, you'll find that the abstractions presented to the unikernel by the hvt tender are much thinner than those presented to a guest OS by a traditional VMM (e.g. KVM/QEMU).

[1] https://github.com/Solo5/solo5
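
To give a feel for how thin such an interface can be, here's a hypothetical sketch of its shape (illustrative names only -- not Solo5's actual API):

    /* Hypothetical unikernel/host interface. Note what's absent:
     * no sockets, files, threads, or processes -- just a clock,
     * a console, one NIC, and one block device. Everything else
     * (TCP, filesystems) lives in libraries inside the unikernel. */
    #include <stddef.h>
    #include <stdint.h>

    uint64_t hv_clock_monotonic(void);              /* nanoseconds */
    void     hv_console_write(const char *buf, size_t len);
    int      hv_net_read(uint8_t *buf, size_t size, size_t *read_len);
    int      hv_net_write(const uint8_t *buf, size_t size);
    int      hv_block_read(uint64_t offset, uint8_t *buf, size_t size);
    int      hv_block_write(uint64_t offset, const uint8_t *buf, size_t size);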


I agree. One of the appeals of unikernels in my opinion is that (I think) they will eventually give us the things that Docker currently gives us, but with one less layer of abstraction (unikernels running on a VM orchestrator instead of containers running on a container orchestrator running on a VM orchestrator). I imagine that this is inevitable as a matter of efficiency, but it's probably such a small practical gain that we won't see much traction in the next decade.


> I imagine that this is inevitable as a matter of efficiency, but it's probably such a small practical gain that we won't see much traction in the next decade.

I agree that if the advantages boil down to saving a hundred MB of RAM or two, and enabling additional compiler optimisations, it won't be compelling to that many folks.

Taken to the extreme though, we can start spinning up instances 'just in time' to deal with incoming requests. Doing this, you could run (and hopefully pay for) your instances for only a tiny fraction of the time your service is in a working/available state.

This has apparently already been demonstrated [0], but their demo has been broken for as long as I can remember. I presume it worked at one point. I don't know if any commercial Xen-based offerings can spin up an instance as quickly as their configuration did, but the idea is there.

[0] http://erlangonxen.org/zerg


My VM is lighter and safer than your container:

https://news.ycombinator.com/item?id=15610155


Unfortunately, the tooling story is poor for VMs. There isn't a good analog to the Docker toolchain (build, pull, push, etc.) or to container orchestrators (e.g. Kubernetes). These stories are at least as important as "lighter and safer".


I'd disagree here. The entire public cloud model is built on top of VMs. While I have a strong distaste for most of the configuration management tooling out there, that is precisely what you're talking about: Chef, Puppet, SaltStack, Terraform, etc. all fulfill this, and all of them are popular in their own communities.


I don't doubt that there are analogs for individual components, but I don't think there is anything comparable to Kubernetes for VMs. I also doubt the quality of those individual components or their integrations (for example, how do I build a VM image in <1m, what image repo technology should I use, what orchestration technologies exist and how do I tell it how to pull from my image repo, can I run that orchestration tech locally or on prem, etc?). I don't think these are insurmountable problems; there just isn't the same amount of investment (or rather, the investment is comparatively poorly coordinated).


LinuxKit gives you build/push/run for VMs. It was kind of designed to be a halfway stage to a full unikernel model.


I've played around with LinuxKit; it's quite difficult and limited. I couldn't get anything off the ground. Promising for sure, but needs more investment before it's ready for general purpose use.


I think container orchestration offerings are pretty much at the pricing model you describe. You can spin them up just in time and only pay for what you use. The advantage to unikernels is that cloud operators don't have to operate two layers of orchestrators, so they could pass those savings onto unikernel users; however, I don't think that will be the lowest hanging fruit for a long time.


I'd argue the opposite, which is that the future will have even more layers of abstraction, not less, since that's been the trend for quite some time.


My prediction is absolutely not "fewer total layers of abstraction", but only that this one will be collapsed in time. You can collapse one layer of abstraction and add two more, which is sufficient for my prediction to be compatible with the "more total layers of abstraction" prediction. :)


Or you could just run containers without the abhorrent mess that is hardware virtualization...


The post discusses running unikernels without "the abhorrent mess that is hardware virtualization".


1st. It’s not going to solve that. But, I don’t think solving that is a goal of this approach.

2nd. In a pure-unikernel approach, applications can’t steal from each other because there is only one application. Isolation is left to the VM. The goal is to get rid of the redundant isolation of running a multiplexing OS over a multiplexing VM. In the as-processes model, the Linux kernel still provides sandboxing. The goal there is to be a bridge between traditional and unikernel development.

3rd. The goal is to boil down to just virtualization and only the libraries specifically needed by a single specific application.


We are already seeing exokernel concepts. On a Linux machine, most 3D graphics is handled in the userspace GL library; the kernel driver handles modesetting, manages memory, and arranges for the userspace library to submit command buffers directly to the GPU via memory-mapped areas. DPDK works the same way: the NIC ring buffer is mapped directly into the userspace process, which can then perform packet I/O without calling into the kernel.
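
A rough sketch of that DPDK pattern (illustrative; it assumes the NIC is already bound to DPDK and the port configured -- real code needs a mempool plus rte_eth_dev_configure and RX queue setup first):

    /* Shape of a DPDK userspace RX loop: poll the NIC's RX ring
     * directly -- no syscall, no interrupt, no kernel stack. */
    #include <rte_eal.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define BURST 32

    int main(int argc, char **argv)
    {
        if (rte_eal_init(argc, argv) < 0)
            return 1;

        uint16_t port = 0;                  /* first DPDK-bound NIC */
        struct rte_mbuf *pkts[BURST];

        for (;;) {
            uint16_t n = rte_eth_rx_burst(port, 0, pkts, BURST);
            for (uint16_t i = 0; i < n; i++)
                rte_pktmbuf_free(pkts[i]);  /* "process", then free */
        }
    }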


Well, that's roughly how it's supposed to work in theory for graphics, but in practice the kernel ends up having to do fairly deep parsing of GPU command buffers for security.

See, for example: https://www.kernel.org/doc/html/v4.11/gpu/i915.html#batchbuf...


Intel's GPUs aren't exactly known for stellar performance, but I'm pretty sure that's old info even for Intel. Certainly this kind of parsing hasn't been required on AMD and Nvidia GPUs' 3D engines for almost ten years now.


There's still a good bit of it necessary even for new GPUs like Radeon R600.

See r600_packet0_check, r600_packet3_check: https://github.com/torvalds/linux/blob/master/drivers/gpu/dr...


Thanks for the very interesting references and discussion. GPUs are hard, huh? AFAIK, this sort of validation isn't necessary for NICs. (Question: why is parsing necessary to ensure security, instead of say relying on the IO-MMU?)


The IO-MMU only provides device-level isolation for accesses to RAM, but you really need process isolation, and not just for accesses to RAM but also for VRAM where the IO-MMU is not in play at all.

NICs are an unfair comparison because the commands for NICs are all generated by the kernel, which is trusted. For GPUs, commands are generated by an untrusted user space driver that translates OpenGL or Vulkan commands into command packets for the GPU. Those command packets contain memory addresses, so in order to achieve isolation between different user space processes in the face of a potentially malicious user space driver, the kernel used to have to validate those addresses in the absence of GPU page tables.
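
Conceptually, that old-style validation looked something like the following (hypothetical packet format, far simpler than the real r600 parser): walk the command stream and reject any packet whose addresses fall outside buffers the submitting process owns.

    /* Toy command-stream validator: commands are (opcode, addr, len)
     * triples; reject any that reference memory outside the caller's
     * own buffer objects. */
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    struct buf_range { uint64_t start, len; };  /* process-owned buffer */

    static bool addr_ok(const struct buf_range *owned, size_t n,
                        uint64_t addr, uint64_t len)
    {
        for (size_t i = 0; i < n; i++)
            if (addr >= owned[i].start &&
                addr + len <= owned[i].start + owned[i].len)
                return true;
        return false;
    }

    static bool validate_stream(const uint64_t *cmds, size_t n_cmds,
                                const struct buf_range *owned, size_t n)
    {
        for (size_t i = 0; i + 2 < n_cmds; i += 3)
            if (!addr_ok(owned, n, cmds[i + 1], cmds[i + 2]))
                return false;               /* reject the submission */
        return true;
    }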

Anyway, this kind of validation really hasn't been necessary for a long time now, despite pcwalton's outdated information, because GPU page tables were phased in around 2010. (The first AMD chip to have them was Cayman in 2010. Since we're citing kernel drivers here, take a look at https://github.com/torvalds/linux/blob/44786880df196a4200c17... and note how the function skips any parsing when virtual memory is used.)


R600 is not a new GPU, don't know how you got that idea :) It was released in 2007.


Except that on Windows the kernel part is quite tiny. Still waiting for X to survive driver crashes.


I would (in theory) love a unikernel for the work I do. Our bare-metal boxes run about 20 Unix processes -- cron, syslog, sshd, etc. -- plus one server daemon. It's nice to have a familiar environment, but we could run on fewer boxes without a general-purpose kernel and general-purpose memory protection that we don't strictly need, especially for nodes that could easily be tweaked to forgo local disk access (because filesystems are tricky, I'd rather leave non-ephemeral storage to something proven).

That said, we're not moving in the direction of unikernels. But it has a clear application in my mind; it's a fair bit of work to migrate, and I've yet to see a practical comparison, so it's unclear what the actual benefits, beyond the conceptual ones, would be.


What is the problem domain?


Chat service, in Erlang. I'd love to just boot to Erlang, and have as much of the system as possible be hotloadable, including the tcp stack. But I may be suffering from too much exposure to hot loading. :)


Sounds like a great problem to work on! When you have such a well defined problem it can be a great ticket to do good work since in some ways you can control the stack top to bottom. Have fun!


I'm not actually working on it, unfortunately. There's too much risk, not enough time, and we're in the middle of migrating to a different environment where it would be an even weirder thing. But I think it's a good example of a place it could work, and it would be a lot of fun.


That's a shame, because Erlang as a unikernel already exists (as https://github.com/cloudozer/ling, unfortunately unmaintained). I'd be really keen on seeing an Erlang port to the Solo5 API, which would give you KVM/bhyve/OpenBSD vmm/Muen/others to run on.


There's some evidence of it running with rumprun (which appeals more to me than Ling -- I want to run the standard BEAM, and I want to run it on bare metal, because I can fill multiple machines and don't want to exchange an OS for a hypervisor).

https://github.com/rumpkernel/rumprun-packages/tree/master/e...

http://www.erlang-factory.com/static/upload/media/1474729921...

I'm not sure if there's any news on it since then.


> We usually see wide-spread vulnerabilities in the middleware like OpenSSH

> Correct me if I'm wrong.

OpenSSH is probably the single most secure piece of C network software. It isn't exactly a good example for "wide-spread vulnerabilities in middleware".


Unikernels are about handing control to the application writer.

Once developers get into the workflow of writing applications that have fuller control of their environment, few would return to the world of opaque infrastructure and frameworks.


I don't quite understand the title of this article. Where is the part that explains why it's no longer an academic exercise? Nothing has been shown to now exist at a scale that makes it any more than another toy (in the academic computer science meaning of the word).


I was personally at a unikernel conference in Beijing in May. Baidu, Alibaba, VMware, ARM and many other multi-billion-dollar companies were talking about how they are rolling support for unikernels into things like their serverless infrastructure.

Also - there are production deployments out there.


VM support for unikernels is literally a prerequisite for unikernel uptake, so that's hardly surprising. But coming from academia: a highly niche market using a thing still keeps it a mostly academic exercise, and this article does not contain anything about how unikernels have finally escaped being mostly interesting to academics, plus the handful of people who use them in highly specialized systems -- or in plain old "someone once set it up and it's too critical to change now" systems.


Microsoft used their unikernel research (Drawbridge) to bring SQL Server to Linux.

https://arstechnica.com/information-technology/2016/12/how-a...

It is also, in a way, how they implement the secure kernel in recent versions of Windows 10.


> Baidu, Alibaba, VMware, ARM

So in the serverless future, Linux and FreeBSD will no longer be at the heart of these "functions", but will instead all be migrated to unikernels? At the scale they are operating, I am pretty sure it makes sense if it brings a 5-10% performance improvement along with the other benefits.

Not sure if I like the way things are moving in that direction, though.


In a way, POSIX is the C language runtime, and other languages have their own runtimes as well, so in the limit, whatever kernel your application is running on doesn't matter at all.

Then there are those few use cases where one actually needs to dive into OS specific syscalls or Assembly.

So for a large class of applications it doesn't matter if the application is running bare metal, on a VM, container or plain old OS process.


A unikernel conference? I didn't know there was such a thing.


No, we didn't see you there. :-)

It was more of a workshop, in relation to ICS2018.


An example would be MS SQL Server on Linux; it basically makes use of a library OS to run Win32 on Linux.

https://arstechnica.com/information-technology/2016/12/how-a...


I don't see the value in unikernel-as-a-process. Most syscalls save time by, e.g., sending a bulk payload and having the kernel-space TCP/IP implementation break it up into fragments and handle sending them to the interface. Moving the barrier halfway down now means you did all of the work and got negative performance benefits -- seemingly under the guise of security... but isn't the whole point of unikernels to run hardware-assisted VMs as your segmentation barrier, instead of relying on software ferrying calls and data between hardware trust levels?


The value, in this context, lies in security. You also get a bit of performance, as the "unikernels" are self-contained: you can do most things, including the IP stack, without switching into kernel mode.

I'm guessing this is mostly interesting for FaaS platforms or similar. You get isolation similar to hardware-assisted VMs, but with a lot less overhead and with phenomenal boot times.


While it is possible to do, there is an important cost-benefit analysis to make. All of the benefits the article lists can be drawbacks as well. Say your specific network stack needs tweaking to get good performance: with a unikernel you must do this for every application, while with a "traditional" monokernel you can do it once and it works for all applications.

That is not to say the approach is not interesting, but it'll probably be for some kind of virtualization where you abstract away the OS entirely and run applications directly on the hypervisor.


I am not sure the example makes sense at all.

Default settings never work for all apps.

If you need to tune the network, and the tuning applies to all apps, the change can be made inside the network library and all apps rebuilt and released; that's no more expensive than tuning a kernel config.

If you need to tune on a per-app basis, then a unikernel is wildly safer because of the isolation.


Not everything is open source.


Doesn't that make my statement more true?


Single Root I/O Virtualization (SR-IOV) has the potential to enable this. If the number of virtual functions climbs high enough for the number of processes, or at least the number of critical processes, then you can just map a virtual function into each of the necessary processes. DPDK and SPDK already partially enable this, but the number of virtual functions -- or the complete lack thereof for NVMe devices -- is still a limiting factor.


What are the best tools for experimenting with unikernels?


I believe it comes down to your programming language preference. If you are into OCaml, MirageOS is rather complete. If C or C++ is your thing, there is IncludeOS (potentially adding other language runtimes next year), and for Haskell there is HaLVM.

There are others as well, but AFAIK these are the ones with active development happening. Please correct me if there are active ones I've forgotten.

All three support compiling your application as a Linux binary, meaning you can do most of the development and debugging using the tools you're used to, IDE with visual debuggers and the whole shebang.

Once you want to run in a separate VM you can still debug, but it becomes a tad harder. I know how to debug IncludeOS applications when they are running under QEMU or KVM: QEMU can act as a gdb remote; you just need to get it working, which is a bit of a pain.

And now you have the option of running it the way it is described in the paper. I've never touched this, so I don't have a feel for how hard it is to get running.


Rumprun [0] is another big one in the C camp, though I think you can use basically any software that conforms to POSIX.

[0] https://github.com/rumpkernel/rumprun/


No commits since April. I think Antti has left the project and started a brewery or something.


Shame https://github.com/cloudozer/ling seems to have stalled.


The Xen source tree contains an example of a minimal unikernel. Probably worth looking at the source to it even if you end up using something with more features.


You can already.

Just use Linux or something similar and use the lower-level APIs.

E.g. don't mount a filesystem; use the block device directly in your app.
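
A minimal sketch of that idea (illustrative; /dev/sdb is a placeholder for a device you actually own):

    /* Read a raw 4 KiB block straight from a block device -- no
     * filesystem mounted, no page cache (O_DIRECT needs an aligned
     * buffer and, on glibc, _GNU_SOURCE). */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define BLOCK_SIZE 4096

    int main(void)
    {
        int fd = open("/dev/sdb", O_RDONLY | O_DIRECT);
        if (fd < 0) { perror("open"); return 1; }

        void *buf;
        if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0)
            return 1;

        /* Block 42, addressed directly by byte offset. */
        if (pread(fd, buf, BLOCK_SIZE, 42ULL * BLOCK_SIZE) < 0)
            perror("pread");

        free(buf);
        close(fd);
        return 0;
    }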


Wouldn't that add a ton of overhead, if nothing is done in kernelspace and everything gets squeezed through syscalls?


For the counter argument, see "Unikernels are unfit for production" January 22, 2016 - by Bryan Cantrill

https://www.joyent.com/blog/unikernels-are-unfit-for-product...


I like this essay, but I think it's really pointing to a narrow space, rather than a lack of space.

Bryan points out many weaknesses of unikernels, but assumes everybody needs multi-tenancy and the ability to spawn processes. And debuggability is weak, but that probably depends on the application and environment you run. A lot of people want to run unikernels in a VM environment, but to me the right application of unikernels seems to be where you have one application that you want to expand to fill a single machine -- bare metal, boot to the application, save all the layers. There's no multi-tenancy, but if I'm running on hundreds or thousands of machines, I don't need multi-tenancy.

Debuggability is important, but lots of people run without a kernel debugger, so it might not be that important to everyone. If you want it, you'll have to build it -- but DTrace and friends had to be built too, and it's easier to build something a second time, since you know it's possible and what it should look like.


And indeed, we have built some tooling for debugging, since we wanted it: https://github.com/Solo5/solo5/blob/master/docs/debugging.md. It wasn't that hard.


I would take the opinion of someone who has a stake in the game of selling OSes with a grain of salt.


Can an in-kernel webserver be understood as a unikernel?

I.e. https://en.m.wikipedia.org/wiki/TUX_web_server


Sure. As long as you disable init on the server you'll have a unikernel setup. Round hole, square peg, though. Linux isn't meant to work like that.


I still think it is. There's still been no push in the open-source community to build tooling around unikernels that would make them usable by a broader audience and for a broader set of use cases.



What advantages do unikernels have over microkernels?


Unikernels are older than microkernels, in terms of design philosophy, and owe more to VM (the IBM project from the 1960s) than later OS design.

Microkernels are about splitting a single OS into what are sometimes called 'servers', whereas unikernels are individual systems running on a hypervisor. Ultimately, the biggest advantage is that in a microkernel, if one 'server' goes down (the disk server, for example) the whole system is down, whereas a single unikernel can fail without impacting any other unikernel.

https://jdebp.eu/FGA/microkernel-conceptual-problems.html

https://utcc.utoronto.ca/~cks/space/blog/tech/HypervisorVsMi...


> the biggest advantage is that, in a microkernel, if one 'server' goes down (the disk server, for example) the whole system is down

Rather, if the disk server goes down, the whole system might go down; it won't necessarily go down. You could, for example, simply restart it. If your graphics server goes down, you might still be able to SSH into the machine, and in the example of yours which I quote, you could run a remote emergency SSH server entirely in RAM.

Unikernel and microkernel are good examples of (fine-grained) principle of least privilege. Doing that right is difficult. Just ask the NSA or Intel.


Realistically, if a disk server goes down, there are two possible reasons:

1. The disk hardware is bad. By all means, take everything down until you replace the disk so you don't get garbage in important files!

2. The disk driver software is bad. By all means, take everything down until you replace the software so you don't get garbage in important files!

In neither case is simply bouncing the disk server an acceptable answer. Debugging and fixing the underlying problem is essential, and that requires taking the whole system down.


The disk server could conceivably fail because a disk took an absurdly long time to respond to a command -- by all means, update the firmware, but in the meantime, restarting the disk server (and replaying any pending writes, somehow) will get you back on track. At the very least, restarting the disk server in read-only mode allows reads to continue to function -- maybe even an orderly evacuation of the data.


Good thinking. Read-only mode is an interesting suggestion. Filesystems allow this as well, and it can allow for live extraction of data without having to resort to professional recovery means (i.e. physical and expensive).

I've had times with bad sectors (on a Deathstar, at the very least) where reads would result in a complete OS lock-up or a very laggy system. That can be solved by only retrying a read so many times (though that was on Windows 9x with PATA).


The question is rather: would a unikernel or monolithic kernel have behaved differently?

Probably #1 (or a nefarious person being root) -- but you do have backups and redundant servers, right? I'd read the SMART logs first. Back when my Death Star died some 20 years ago, I remember it getting more difficult by the day to spin up. But once it worked, it'd work fine... until the machine tried to read from a bad sector (a good OS wouldn't have frozen on that, but this was Windows 9x).

Here are some alternative possibilities:

3. Physical compromise; e.g. a cable got yanked.

4. Disable write caching and continue despite the driver or hardware being bad.

5. There is no quick way to get physical access.

6. A disk server isn't necessary to keep the system up. Goes well with #4.

If you look at how it all went, the massive redundancy of cheap commodity hardware won. Containers build further on that, and no doubt more bloat can be removed via a unikernel or microkernel (leading to a smaller attack surface but also more complexity). But that doesn't mean every aspect of such a machine needs to keep running, and it does imply a FOSS solution.


The operating system universe is a lot bigger than most people have been led to believe. They think it's Windows vs. Linux vs. macOS; if you're 'wild' you go *BSD.

There's a lot more out there. Unikernels in particular are similar in nature to RTOSes (like the system on the Mars rover), and they definitely descend from the microkernel branch of the tree.

Having said that, many microkernels tend to be multi-process. There are now over 10 different unikernel implementations out there. They all have different aims and goals, but I'd say one defining characteristic among all of them is that they are single-process by design. There are many other considerations, but that's the most important one that comes to mind.


Ok, so IIUC, that could be especially useful if I were to go quasi-embedded way, meaning at least a "fully dedicated server" (for the one and only Process ...princess?), as a unikernel might then let me get closer to the metal with little to no effort on my side. Amirite? Thanks!


We're talking about pretty significant improvements in network I/O performance (among other things) with a unikernel, correct? My understanding is that there have been several projects revolving around putting Linux network drivers into userspace so as to improve throughput and avoid the kernel-mode switch.


Here are some presentations from Google touching on reducing kernel overhead for high performance networking: https://netdevconf.org/1.2/slides/oct6/03_jerry_chu_usTCP_LK... https://www.usenix.org/sites/default/files/conference/protec...


Beautiful, thanks for the links.


It might be interesting, at least as an exercise, to combine unikernels with the split kernels from the distributed "LegoOS". Having a slimmed-down, only-what-is-used footprint spread out across the network could lead to multiplied advantages.


After a presentation I saw at ToorCon earlier this year, I don't trust unikernels in general. Several of them are beyond bad from a security standpoint, with little to no protection from malicious software.



