The Helios microkernel (drewdevault.com)
249 points by goranmoomin on June 16, 2022 | 141 comments



The post mentions seL4 as an inspiration:

> Again, much of the design comes from seL4, but unlike seL4, we intend to build upon this kernel and develop a userspace as well.

I'm wondering why they didn't decide to build the rest of the system on top of that.

I'd totally respect answers like "because building kernels is fun" or "because I wanted to do that in my programming language". I'm just wondering if there was another reason seL4 was unsuitable.


seL4 has an interesting API design but I'm not fond of the internals or implementation. It's largely an academic exercise, in my opinion. They also have no ambitions in userspace, and limit themselves to working on the kernel alone, which is not a great idea - other people have criticized seL4 on the basis that not building a system with your kernel is a good way to be blind to the design flaws of your kernel.

In short, seL4 is cool, but not very practical, and I think it can be done better.


A few thoughts:

- the seL4 team definitely works on user space stuff (see the rest of the repos in the seL4 organisation on GitHub), but it's very basic thus far. there is some design work going on to standardise some user space bits and pieces, but I don't think they're ever going to build a top-to-bottom opinionated operating system stack – that's for other people to do.

- it's not a solely academic exercise, it is being used in industry, but most of what everyone sees is the output of the research group, which gives it an academic sheen.

- the unfamiliar internals and implementation are driven by a fundamental difference between seL4 and almost all other kernels... it is verified.


> it's not a solely academic exercise, it is being used in industry

Yeah, e.g. General Dynamics bought OK Labs. *waggles eyebrows*


The "academic exercise" you are talking about has been used and is being used in space, defense, and consumer mobile devices. Very few micro kernels have had that amount of practical real-world usage.


AFAIK seL4 focused on verifiability and simultaneously on performance (with actual customers in defense, I guess). L4Re seems more focused on applications (one being the SiMKo 3 "Merkel phone"). So I wonder which part of the inspiration comes from seL4 and which part from L4 in general. In particular, can you elaborate on the shortcomings of the general design in terms of userland? (I only had to build a toy OS on L4 in a practical course more than 15 years ago, in Liedtke's OS group just after he passed away so suddenly. We obviously never got to the nasty parts.)


The actual API differences between Helios and seL4 are somewhat subtle, rather than a broader difference of design philosophy. For instance, the layout of the bootinfo structure is different - PML4/PDPT/PD/PT objects are stored in separate capability regions so that userspace can more reliably distinguish them without architecture-aware heuristics. Little things like that. And of course, the implementation is wildly different from seL4.
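To illustrate the shape of the idea (invented names; this is not the actual Helios or seL4 ABI, just a sketch):

  // Hypothetical sketch of a bootinfo layout that reports each
  // paging-structure type as its own capability slot range.
  // Invented names; not the actual Helios or seL4 ABI.
  #include <cstdint>

  struct cap_range {
      uint32_t first; // first capability slot in this range
      uint32_t count; // number of slots in the range
  };

  struct bootinfo {
      cap_range untyped; // untyped memory capabilities
      cap_range pml4s;   // top-level page tables (x86_64)
      cap_range pdpts;
      cap_range pds;
      cap_range pts;     // leaf page tables
  };

  // Userspace can tell a PDPT capability from a PT capability by
  // which range its slot falls in; no architecture-aware guessing.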

To be clear, I am specific about the "seL4" inspiration rather than the "L4". Naturally L4 plays a role, but Helios's API design has a pretty obvious influence from seL4 specifically.


There is https://genode.org/ which builds on top of it, as one of a few similar options. Look for Sculpt OS there.


As someone who worked with seL4, it's because the development experience is literally the worst you can possibly have. It's zero fun, and anyone who has to deal with it inevitably ends up hating seL4. It exists to brag about verification; developing real applications on it is a fool's errand.


> It exists to brag about verification

That is pretty much my own experience with seL4. Not very useful for the real world.


That's interesting, do you think the poor DX is due to not enough effort being made to make it more useful for outside devs, or because making it possible to do verification makes other things ugly and inconvenient?

I was considering seL4 for a hobby project (having worked with L4Ka:Pistachio a long time ago), but didn't dig deep enough to spot these problems.


The latter, in the sense that it has been scaled down significantly to make verification possible.

As soon as you start adding stuff or running it on different hardware, the verification no longer applies, so why bother?


That's interesting, could you provide more details? What are the un-fun bits? Too restricted? Too slow? Too hard to debug the application?


I had the same concerns when Fuchsia was first announced. It seems... almost reckless... to build a new microkernel from scratch in this day and age when seL4 (or just L4 generally) exists and is so heavily tested and developed.

All of these people are frankly better engineers than me, of course, so my opinion doesn't matter here, really. But I do have a worry about people in our industry seeking novelty instead of building on the proven.

(Except of course for research or hobby projects. Go for it.)


The thing is, seL4 had (has?) many limitations which made it not very interesting for the real world: at some point it wasn't able to use multiple cores, or use low power states... I think that only people who have closely looked at seL4 can tell whether it's really usable for their use case.


Linux started out as a hobby project.


Sure, and I used it back then. There were kind of no other realistic options with momentum behind them. But also we just accepted that it was a bit cowboy and that was part of the fun. I guess it could have been FreeBSD or even Minix that got the momentum maybe, but Linux caught on with enthusiasts, it made the right compromises and then big names like IBM put weight behind it.


I think that the 'big names' that threw their weight behind it mostly did so because they realized that (1) they'd never manage to get a community like that going and (2) because it did not seem to compete with their core business.

SUN, DEC, SGI and many others besides that aimed to dominate the server market were left by the wayside, and even Microsoft was running (very) scared. If not for their illegal behavior they too would have lost the server market.


Yeah it was a crazy time to watch the whole bottom of the workstation market fall out. Solaris managed to hold on in datacentres for a while, but the others not as much. And a 486 or Pentium running Linux was just as good as most proprietary Unix at the time, or was well on its way to getting there.

I did a summer contract job at an IBM subsidiary in summer 98 or so, and I had a manager there fight me on installing Linux on some spare PCs, because he thought they should be running, y'know, SCO. A Real Unix. It already looked preposterous by then.


One of those Sun pizza boxes held out so long that the bearings on the drives were completely worn out. Everything was moved onto a different machine, and when it was powered down you could hear the hard drives suffer a most horrible crash. The drives would not even spin when powered up again.


I have some concerns about the future of the OS. It seems that the author is involved in many other ambitious projects (sway, hare, sourcehut...). I am not sure how sustainable it is to keep up with all of them. Anyway, congrats on the milestone and best of luck moving forward!!


It can be sustainable when you hand over projects if you feel you accomplished everything you set out to do: https://drewdevault.com/2020/10/23/Im-handing-wlroots-and-sw...


From what I've experienced through (very limited) involvement in one of those projects, he puts a lot of effort into onboarding and investing in new contributors.



Not to be confused with the other Helios OS - https://github.com/axelmuhr/Helios-NG


Good point. I wrote about that recently:

https://www.theregister.com/2021/12/06/heliosng/

Very interesting OS that is FOSS now. By v3 it ran on multiple CPU architectures. With some of the manycore CPUs appearing in recent years, this is crying out for a revival.


Yes. It'd be a shame if this effort were to continue to use the same name, especially given they're both operating systems.

Drew?


The complete system is called Ares, and Helios is a small component of that. Essentially, to most people, it's an implementation detail/internal codename that will not really see much use in marketing efforts.

It does not look like the other Helios project is going anywhere, so I'm not too concerned with the naming conflict right now.


The other "Helios" project was a commercial operating system from Perihelion Software, which was subsequently kinda open sourced after the company went bust (which is the GitHub repo posted).

There are a couple of books about it from Prentice-Hall: ISBN 0-13-381237-5 and ISBN 0-13-386004-3, plus various academic papers, commercial software for the OS (mostly development tools), open source software (X11, gcc, other stuff), etc, etc.

It was the default operating system of the Atari Abaq/ATW (https://en.wikipedia.org/wiki/Atari_Transputer_Workstation), and also ran on various other Transputer, i860, and various other early parallel computer systems.

It too was a micro-kernel, with a very Plan9-like global name system implemented by applications plugging into the name server protocol.

If nothing else, it might be of interest to read up on it, but given the similarities between the projects, it seems to be an unfortunate naming collision.


Yeah, pretty rude to stomp the name.


> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.

> Please don't use Hacker News for political or ideological battle. It tramples curiosity.

"What, this comment doesn't do either of those?". Taken in the context of your other comments in this section, it does.


Seems like a lot of alternative OSs these days are microkernels. Fuchsia/Zircon, Redox, and now this. Seems pretty interesting. I wonder if the Linux kernel will take cues from these developments and start shoving auxiliary device drivers into userspace? A pleb can dream.


The monolithic kernel has lost; only the FOSS UNIX clones haven't caught up to that fact yet.

OS X and Windows have had a hybrid architecture since the beginning, and with the push for userspace drivers for third parties, even more so.

Most commercial RTOS are microkernel based, like QNX and INTEGRITY.

Even Android has imposed a fake microkernel-like architecture since Project Treble, where, as of Android 8, classical Linux drivers are considered "legacy".

Then there is the whole irony that those who strongly argue for monolithic kernels nowadays run Linux on top of a type 1 hypervisor, full of containers.


> OS X and Windows always had a hybrid architecture since the beginning

"Hybrid architecture" is a PR exercise. These are monolithic designs with a better marketing team.

Imagine if somebody tells you their new programming language has a "hybrid" numeric type. It has all the advantages of floating point numbers, yet none of the disadvantages of the integer types. Wow!

Wait, hang on, let's look a bit more closely, those are floats. They've just described floats, but used some semantic sleight of hand to claim they're a "hybrid" numeric type.


Some like to call it a PR exercise.

Others enjoy that there is some effort in place to have RPC communication between kernel modules instead of direct function calls, a hard push for third-party drivers to exist only in userspace, and a standard ABI for extensions.

I'd rather see the progress side, even if partial, than hold on to legacy designs.

See, I could also call safe languages that depend on using unsafe everywhere in their lower layers a PR exercise; maybe they aren't safe after all.

And yet we know that isn't the case.


> See I could also call a PR exercise to safe languages that depend on using unsafe everywhere on their lower layers, maybe they aren't safe after all.

Although it's easier to look at what's technically different in a language like Rust, probably more important is a philosophical difference and the resulting community.

Let's use an example I gave recently to my line manager (who is technical but not an excellent programmer): sort stability. C++ says, here's a sort function, and by the way, the documentation notes it's unstable; if you need a stable sort, that's named stable_sort(). Rust says, here's a sort function; if you know enough to know you don't need a stable sort, there's an unstable_sort() function which is faster [and exists on tiny embedded systems since it doesn't need an allocator]. An unstable sort isn't unsafe (well, it is in C++, but that's C++ for you; it's not a fact about unstable sorting), but philosophically it violates the principle of least surprise as your default sort() function, and that would be a foot gun.
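To make the stability point concrete, a small illustrative C++ sketch (using the std names):

  // Why stability matters: elements that compare equal on the sort key
  // keep their original relative order under std::stable_sort, but
  // std::sort makes no such promise.
  #include <algorithm>
  #include <iostream>
  #include <string>
  #include <vector>

  struct Employee {
      std::string dept;
      std::string name;
  };

  int main() {
      // Already ordered by name; now sort by department.
      std::vector<Employee> staff = {
          {"eng", "alice"}, {"ops", "bob"}, {"eng", "carol"},
      };
      std::stable_sort(staff.begin(), staff.end(),
                       [](const Employee& a, const Employee& b) {
                           return a.dept < b.dept;
                       });
      // Guaranteed: alice still precedes carol within "eng".
      // With std::sort, that ordering could silently flip.
      for (const auto& e : staff)
          std::cout << e.dept << " " << e.name << "\n";
  }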

Go has an "unsafe" library with the dangerous stuff in it. But unlike Rust it doesn't have the safe philosophy and so e.g. as we saw recently Go doesn't feel the need to even warn you that things aren't thread safe and some Go proponents think you're crazy if you expected anything to exhibit thread safety.

If I saw a clear RPC philosophy in an OS like Windows I'd mostly go along with what you're saying. If when I looked at Windows I saw a bunch of loosely coupled systems communicating using a well defined protocol, I'm on board that's really not just a monolithic kernel. But what I see instead is mostly layers smeared on top of a monolithic kernel, some of which pretend to be RPC but are really just doing a syscall to some kernel code where the work happens. The language of RPC is used in some places, but the philosophy is largely absent.


Unless you have mastered all editions of the Windows Internals, Mac OS X Internals: A Systems Approach, MacOS and iOS Internals, and Mac OS X and iOS Internals: To the Apple's Core books,

or have been a former Microsoft or Apple employee working on kernel stuff, I can't really value an opinion on how fake they happen to be.

Yet somehow I get the feeling, from both of us, only I have delved into such internals.


Seems to me that the notion is supposed to be "hybrid is basically an inferior monolithic design"; after all, if most functions (even ignoring drivers) are still in the kernel, wouldn't it just be another monolithic kernel?


Ah, I knew I would get one of them wrong, and then didn't go back to check, so

C++ indeed has std::ranges::sort() and std::ranges::stable_sort()

However Rust's [T].sort() is paralleled by [T].sort_unstable()

That is, I just got the Rust names in the wrong order. The orthography is different because how these work under the hood is very different, but that's not important here.


> using unsafe everywhere

But this is OK, as long as Unsafe is a monad!


> "Hybrid architecture" is a PR exercise. These are monolithic designs with a better marketing team.

This is similar to what I heard from other people about NT. IIUC, NT was called hybrid because it allowed something similar to loadable modules.


Nope, NT is hybrid because many drivers exist in userspace, and at the kernel level RPCs are used to call across kernel modules instead of direct calls.

https://en.m.wikipedia.org/wiki/Local_Inter-Process_Communic...

Additionally, since Windows 10, all critical modules are sandboxed and run under hypervisor control when hardware support is present.

https://www.microsoft.com/security/blog/2020/07/08/introduci...

This is something that Windows 11 requires; the secure kernel hardware requirements aren't optional.

Calling it a marketing exercise only reveals a lack of understanding about the subject, and possibly some bias as well.


> Calling it a marketing exercise only reveals a lack of understanding about the subject, and possibly some bias as well.

Definitely my case.


> I wonder if the Linux kernel will take cues from these developments and start shoving auxiliary device drivers into userspace?

So far, you can write drivers for USB devices (https://libusb.info/), simple I/O devices (https://www.kernel.org/doc/html/v5.14/driver-api/uio-howto.h...), and filesystems (FUSE) in userspace. There are probably more, but those are the ones I know about off the top of my head.
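For a taste of the userspace-driver experience, a minimal libusb sketch (the vendor/product IDs below are placeholders, not a real device):

  // Minimal libusb sketch: open a USB device by vendor/product ID and
  // read its descriptor, entirely from userspace. The 0x1234/0x5678
  // IDs are placeholders.
  #include <cstdio>
  #include <libusb-1.0/libusb.h>

  int main() {
      libusb_context* ctx = nullptr;
      if (libusb_init(&ctx) != 0)
          return 1;

      libusb_device_handle* handle =
          libusb_open_device_with_vid_pid(ctx, 0x1234, 0x5678);
      if (handle != nullptr) {
          libusb_device_descriptor desc;
          if (libusb_get_device_descriptor(libusb_get_device(handle),
                                           &desc) == 0)
              std::printf("USB %04x, %u configuration(s)\n",
                          desc.bcdUSB, desc.bNumConfigurations);
          libusb_close(handle);
      }
      libusb_exit(ctx);
      return 0;
  }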


You can also do most of networking in userspace using XDP/AF_XDP (https://www.kernel.org/doc/html/latest/networking/af_xdp.htm...). The driver will only forward raw IP packets to userspace, and everything else (e.g. a TCP stack) resides there.


Does that cause a ctx switch or any other overhead? What is the real world performance like relative to in-kernel networking?


It basically provides all packets in memory-mapped space directly to userspace, and the NIC can populate that memory. There are no context switches involved unless you want them (e.g. waiting for packets to arrive).

The performance really depends on your application. There's a lot in the kernel network stack (e.g. iptables/nftables) that not every application needs and that might be costly. But then again, for usual TCP-based applications with larger dataflows, the kernel path is already decently optimized with hardware offloads, and running userspace networking stacks comes with its own issues.

If networking functions like sendmsg make up > 30% of an application's profile, it can be worthwhile to investigate.
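A heavily abbreviated sketch of the receive side, using the xsk_* helpers from libxdp (UMEM/socket setup, fill-queue refilling, and error handling all omitted):

  // Abbreviated AF_XDP receive loop using the xsk_* helpers from
  // libxdp's <xdp/xsk.h> (formerly <bpf/xsk.h> in libbpf). UMEM and
  // socket setup, fill-queue refilling, and error handling are
  // omitted; this only sketches the "frames land in memory you
  // mapped" model described above.
  #include <cstdint>
  #include <xdp/xsk.h>

  void rx_loop(struct xsk_ring_cons* rx, void* umem_area) {
      for (;;) {
          uint32_t idx = 0;
          // How many descriptors has the kernel/NIC produced for us?
          uint32_t rcvd = xsk_ring_cons__peek(rx, 64, &idx);
          for (uint32_t i = 0; i < rcvd; i++) {
              const struct xdp_desc* d =
                  xsk_ring_cons__rx_desc(rx, idx + i);
              // The frame already sits in our mapped UMEM: no copy
              // and no context switch on this path.
              void* frame = xsk_umem__get_data(umem_area, d->addr);
              (void)frame; // hand the raw packet to a userspace stack
          }
          xsk_ring_cons__release(rx, rcvd);
      }
  }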


Linus is not a great believer in the microkernel architecture. Quite apart from that, while certain things like filesystems can be shifted to userspace through FUSE, a ground-up rearchitecting into a microkernel would be extremely difficult for Linux at this stage. It would be more or less a rewrite of most of the core kernel, I guess... and then there's the question of changes to the syscalls, which Linus also doesn't encourage.


Generally his skepticism was driven by a few things around the time the microkernel debate happened:

- Microkernels need more development effort for equal functionality, something that was a constraint when he was doing Linux on his own.

- The context switching penalties on 386s were ginormous - it's much more reasonable now, I'd even say it's not much more penalizing than thread switching.

- We live in a multi-master world with IOMMU - tons of devices have virtual address spaces on their own, with most of them doing DMA transfers.


What Linus believes in or does not believe is hardly relevant though, what matters is what is the best solution going forward. Sunk costs are always smaller than the future costs.


Linus was not a believer something like 20 years ago. I wouldn't be surprised if his stance has significantly softened.


FUSE for filesystems already exists in userspace, so it's not unheard of. Making it work safely and securely is the first step to having these kinds of features upstreamed.


You can already build auxiliary drivers / codecs / etc. as modules, then put together a device tree overlay to use the module. With things like the dt overlay fs, it's straightforward to insert new device trees.

If you really wanted, you could build a minimal kernel and move many devices to be inserted at runtime.

Anyway, I wouldn't mind if all that was made a lot smarter and more accessible to people.


That's not a microkernel architecture; that's a monolithic kernel with modules. (Just like Linux.)


I’m talking about Linux, responding to the comment about it.


Isn't that how Raspberry Pi does things?


It is.


Judging by the quality of the Fuchsia code posted on HN recently, it looks like a complete dead end.


Yeah I have mixed feelings. I'm glad somebody is out there trying to make a ukernel work for real end user use cases. But until multiple vendors adopt and contribute to the project, it will never gain the momentum of Linux.


It has worked out alright for Android; I bet you would have asserted the same about it.


> Fuchsia/Zircon, Redox, and now this.

Don't forget the original (for Linux anyway), seL4.

https://sel4.systems


seL4 has nothing to do with Linux, aside from the fact that the team was involved in another Linux-on-top-of-L4 project (L4Linux).

Also, seL4 comes from a long line of L4 microkernels (https://en.wikipedia.org/wiki/L4_microkernel_family#History) which actually predates Linux (L3 was developed back in '88).


Drew has said [0, 1] that it's not possible/practical to build a browser today (enough counterexamples exist for this to be not really true) but the way his projects are going I kind of expect him to announce one in the next few years.

[0] https://drewdevault.com/2020/08/13/Web-browsers-need-to-stop...

[1] https://drewdevault.com/2020/03/18/Reckless-limitless-scope....


NetSurf[1][2] still exists, and while it's not a small program, it's definitely not as large and unwieldy as Chrome or Firefox. And it's good enough to browse simple sites like Wikipedia or HN with, heh.

[1]: https://www.netsurf-browser.org/

[2]: https://en.wikipedia.org/wiki/NetSurf


I used Dillo for a time. NetSurf is now my browser of choice for constrained systems. I sometimes use it on Termux.


The SerenityOS team are building a browser from scratch, and it is even passing some of the ACID tests. They're pretty significantly along. So it is possible with enough dedication, but whether you can be competitive with Chrome is a different matter.


Yeah, and it is more work than the OS itself...

But hey having a browser and ext2 should be good enough ;)


Their browser needs to run on only a single platform (their own) with only one rendering backend, etc. Thus it is a lot less complex than Chrome or Firefox.


I think that supporting multiple platforms is one of the smallest challenges for a browser. CSS and performant JavaScript that manipulates DOM is far more complex a problem.


> expect him to announce one in the next few years.

9 months ago:

https://drewdevault.com/2021/09/11/visurf-announcement.html


Days.


https://drewdevault.com/2022/05/30/bleh.html

Must feel nice to have an article written about you!

Given your karma, I assume you already know this, but from the guidelines:

> Be kind. Don't be snarky. Have curious conversation; don't cross-examine. Please don't fulminate. Please don't sneer, including at the rest of the community.


There should be a guideline for following one username around a thread and quoting guidelines at them across multiple comments because you don’t like them. Your solution is quite noticeably worse than the problem and fouled up the thread a bit more than the user you’re chastising, which is quite ironic.


Kagi is building Orion on top of WebKit and based on my testing it is already very solid even though it's extremely early in the development

EDIT: Well, ignore, I didn’t realize we are discussing from scratch browsers :)


There are good counterexamples, but that is not a good one, because Drew was (and I am) talking about browsers built from scratch.


Would be nice to document your progress through YouTube the way Andreas Kling did for SerenityOS. It's probably part of the big success and the huge following of users and developers he gained along the way.

It's probably an additional huge workload for the production of the video though...


I am amazed how Andreas manages to do very solid work while also turning (at least some pieces of) it into impressive one-hour shows with clear outcomes.

Is he rehearsing his coding videos? Does he pick moments that went well? Is he a superhuman coder?


How does this guy find time to support his business, SourceHut, while also doing cool side projects like Hare and Helios? Does he have a life? Does he code extremely fast? Has he automated everything in his life, including lawn mowing, so that he can dedicate 25h a day to coding? What's the secret?


I know guys like this, and it's an amazing amount of focus on a task, punctuated by hiking and other outdoorsy/physical activities. They also have no responsibilities outside of their immediate life, so they remain uninterrupted. All they do is pay bills and live. They are quite solitary, yet have healthy social lives.


I wish all luck for Drew and all people contributing to Helios.

I wish it could be a successful OS, and maybe some day we can replace Linux with it, since Linux is now controlled by big corps.


Is this a serious project or just a demonstration of the language's capabilities? If serious, it would be nice to see some Plan 9 type stuff on the roadmap.


It's likely at least somewhat serious.

I know Drew is a fan of Plan 9, so that may not be out of the picture - though it looks like there's already a non-negligible amount of stuff planned that may or may not be gotten to.


Would love to contribute as long as it's going somewhere.


You know it is coded in Hare, right?

So, not. Amusement value only. (Amusement suffices, sometimes.)


Explaining my downvote: even if that's true, it's considered mean-spirited and naive to say so, because you really can't know the future. I don't need to remind you, as an example, that Linux was a student project no one would have thought would succeed as it has.


My assessment is based on the very scattered attention in evidence.

A new language is a full-time project, if you are at all serious. A new kernel is another full-time project, if you are at all serious. A new kernel coded in an absolutely immature language starts out at a major handicap, even if you were serious.

Nobody is obliged to be serious, of course, on a hobby project, but there is no hiding the fact. Wishing doesn't make up the difference. Taking the question at face value, it deserved an honest answer, even at risk of resentment in, well, the peanut gallery -- which delivered.


Then Jonathan Blow must be especially screwed, considering he's been developing his Sokoban game in tandem with his Jai language. Those projects are far more complex than Helios and Hare respectively, and are being worked on by a much smaller team.


I have already spent over 3 years on Hare and it has nearly 50 people working on it. It is also much simpler in design and smaller in scope than other new languages. Furthermore, building a kernel with it helps to validate the language design, so working on one improves the other.

You are the peanut gallery, by the way.


You reveal more than you probably intend.


The symbiosis of C and Unix is likely why we're not using Multics, VMS, TOPS, or something to this day.


All I know about you is from your comments here on HN, and based on that I respect you, and that's where I'm coming from when I say this: whatever your beef with Drew DeVault, you're embarrassing yourself in this thread. C'mon, you're better than this.


Thank you, I will stay out.


> Be kind. Don't be snarky. Have curious conversation; don't cross-examine. Please don't fulminate. Please don't sneer, including at the rest of the community.

> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.


Would you mind just downvoting them, and not keep spamming the list with this repetitive stuff? Once is enough.


You're correct of course. I apologize, I got https://xkcd.com/386/'d like an idiot.

Waited until this thread is hopefully quieter to post this.


The planned userspace on top of Helios is going to take a lot of inspiration from Plan 9.


C was created to write Unix, and now here we have Hare to write Helios. Good luck on this great journey, Mr Drew.


C was born around UNIX v3 and was used to make it portable.


The breadth & depth of things Drew creates is both inspiring and confusing.

Inspiring because he's clearly chasing his passion.

Confusing because it comes across as a lack of focus.


Why is that confusing? I don't get it. Drew has lots of ideas, and he's sharing them as fast as possible. Focusing on one to the exclusion of the others would doom a lot of those ideas to be pursued only by him.


When writing your OS from scratch, what order did you develop your components in?

Did you start with an assumption of UEFI, long mode only?


Helios currently uses legacy boot. The first thing is to get something to boot and print out "hello world" to serial or some such. I also wanted to get into long mode fast, because Hare does not support 32-bit targets. Then paging is generally wise. The rest is dependent on your kernel design.
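For the curious, that first milestone conventionally looks something like this (a generic freestanding sketch of the textbook 16550 UART approach, not Helios's actual code, which is written in Hare):

  // Generic freestanding sketch of the first milestone: poll-and-write
  // "hello world" to the COM1 serial port (16550 UART at I/O port
  // 0x3f8). Textbook technique, not Helios's actual Hare code.
  #include <stdint.h>

  static inline void outb(uint16_t port, uint8_t val) {
      __asm__ volatile("outb %0, %1" : : "a"(val), "Nd"(port));
  }

  static inline uint8_t inb(uint16_t port) {
      uint8_t val;
      __asm__ volatile("inb %1, %0" : "=a"(val) : "Nd"(port));
      return val;
  }

  static void serial_putc(char c) {
      // Wait until the transmit holding register is empty (LSR bit 5).
      while ((inb(0x3f8 + 5) & 0x20) == 0)
          ;
      outb(0x3f8, (uint8_t)c);
  }

  void kmain(void) {
      for (const char* s = "hello world\r\n"; *s != '\0'; s++)
          serial_putc(*s);
  }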


It’s really depressing to me that there are people on this site who spend their time on this earth sneering at other people’s passion projects in thread after thread because of the language being used. People notice that and it affects their decision to share their work.


What sneering are you referring to? I can't see any in this thread


Likely referring to this post: https://drewdevault.com/2022/05/30/bleh.html


I’m not sure how to link a comment on this site but it’s now downvoted in this thread.


If you click the timestamp you can get a direct link.

The comment in question is here: https://news.ycombinator.com/item?id=31764243


Another thing not specifically mentioned in the post is this project will be a good workout for the brand new language (Hare) as well. I remember a post on it not too long back on HN that garnered a lot of scathing remarks. Good luck to the authors.

However, I just want to add a personal comment. In my view, programs will come and go; they can be written and rewritten. But what is really important, if we want to ensure free computing for all is not gradually eroded, is to work on as many open standards for interoperability as possible, to serve as alternatives to proprietary protocols. Also open, community-developed languages. The hardware situation, however, is out of the hands of software developers, yet critically important. Open software and protocols on closed hardware is unsustainable. Let's hope someone like Musk takes the initiative in this area.


> Let's hope someone like Musk takes the initiative in [open standards for interoperability].

What?


I'd also just like to say "What?" in fact "WTAF?" Don't invoke the M*k word in a serious discussion.


Synonymous with grift.


as many open standards for interoperability as possible

I don't think that's a good idea. There's the xkcd 927 concern, since there are already long-established standards for a lot of things; and that brings me to what I think we really need, and that is stable standards. The longer a standard has been around unchanged, the more implementations will arise. On the other hand, the browser situation is a great example of what happens when some entity takes control of an "open" standard and then churns it endlessly. I think this is particularly true of programming languages, since they are foundational for everything else.


While stability is very important, openness is still no less important.

A number of standards are still patent-encumbered. Back in the day the algorithm to encode a GIF was patented, do you remember that? Do you remember how long Oracle and Google spent in court debating whether it's acceptable to describe and implement public interfaces?

A world where you can but may not interoperate is very inhospitable, no matter how stable the closed standards are.


How does this compare to Fuchsia?


Helios author here. I tried to think of a nice spin for this comment, but came up short. If there is a future of computing with Fuchsia in it, it will have everything to do with Google's industry weight and nothing to do with good systems engineering and design.

Fuchsia is an extraordinarily complex kernel design. Hell, it's a "microkernel" which is bigger than Linux! It's fairly typical of Google's inwardly-focused engineering culture, using questionable tools (with large shadows) such as Bazel and gRPC, which were likely chosen simply because they plug into the Googler engineering culture. At the same time, these decisions give way to a very complex system with heaps of moving parts, in this Rube Goldberg design which is endemic to Google engineering.

Helios is much, much simpler. The kernel itself will probably clock in at under, say, 20,000 lines of code (for x86_64, at least), and it has a very small syscall API: less than two dozen calls in the final design. Many of the things Fuchsia does in the kernel will be done in userspace on Helios, in the Mercury component, such as service discovery and capability allocation. Finally, Helios is written in Hare, which is a much simpler language than C++, and kernel hackers at the very least should be convinced of the argument that the complexity of your implementation language contributes to the complexity of your implementation.
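To give a flavour of what such a small capability-oriented syscall surface can look like, here is a purely hypothetical seL4-style sketch (invented names, not the actual Helios syscall table):

  // Purely hypothetical illustration of a small, seL4-style syscall
  // surface (invented names, NOT Helios's actual API). The point:
  // the kernel mostly just moves messages and capabilities around,
  // and everything else lives in userspace.
  #include <cstdint>

  using cap_t = uint64_t; // an unforgeable handle to a kernel object

  struct message {
      uint64_t label;   // what this message means, by convention
      uint64_t args[4]; // small inline payload
      cap_t    caps[2]; // capabilities transferred with the message
  };

  int sys_send(cap_t endpoint, const message* msg);  // async send
  int sys_recv(cap_t endpoint, message* msg);        // blocking receive
  int sys_call(cap_t endpoint, message* req,
               message* reply);                      // RPC round trip
  int sys_yield();                                   // give up the CPU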


> questionable tools (with large shadows) such as Bazel and gRPC,

Bazel and gRPC, whilst both saddled with problems stemming from Google's inwardly-focused engineering culture (and also just some sub-optimal historical decisions), are not "questionable tools" in the sense that they solve the wrong problems or that something else solves the same problems much better.

Blaze/Bazel (and its various clones like Buck) is basically the only general-purpose build system out there that even attempts to do the basic task of a build system, namely actually figuring out what has changed and needs to be rebuilt. So of course they are gonna use it; nothing else comes even close (and the main downsides don't apply to them).

Similarly, gRPC has a lot of warts in how its encoding, type system, API, and transport work. But anything in that space that doesn't completely suck (such as Cap'n Proto) is basically a clone of the core design. Again, what else even attempts to solve the core problem of having backwards/forwards compatible, reasonably efficient RPC, messaging, and data storage?


I don't have anything to say about Bazel, but the bit about gRPC assumes that the alternatives are as bad.

The closest alternative is Thrift. It's not as popular, alas, but its design is a lot better: you get to combine various encodings (where protobuf mostly has its binary encoding, JSON, and the underspecified text format for configuration), and you get to pick a transport instead of being forced to use HTTP/2 with questionable features. You could run Thrift RPC over TCP with basic framing, or over 0MQ, or HTTP, or anything else. I'd also argue that Thrift's data representation is better than Protobuf's (nesting), but I'll write about it later.

So protobuf is yet another "worse is better" from Google: a ton of warts, choices that might make sense for Google itself but not for a lot of other situations (HTTP/2!), and still winning over superior alternatives :/


>[Bazel and gRPC] are not "questionable tools" in the sense that they solve the wrong problems

I disagree, and there is a huge philosophical gulf of understanding between my view and those who feel differently.


It is indeed possible that there is a philosophical gulf. However, it is also possible that engineers who dislike Bazel (and love things like Make and CMake) haven't worked in large, multi-language organizations. I'm not aware of the author's history, so I don't want to imply they haven't; I'm speaking generally. Bazel is a tool that exists to solve specific problems. And it (and its clones) does that miles better than any other tool out there.

1. Caching, while maintaining hermeticity and correct dependency tracking.

2. Remote execution - yes, contrary to the author's general focus on small, specialized, C (or similar) tools, products out there have to deal with thousands of dependencies that take forever to build and are nice to offload to a beefy server. They also need to support several different devices and architectures.

3. Actually being able to run tests, use the same caching mechanism on tests, surface those results to CI.

4. Being able to plug arbitrary languages into the build system, with a sane DSL (a subset of Python) to write rules in. Because not every project is written in C/C++, or in a single language.

5. Having very clear separation of build phases so you can't shoot yourself in the foot.

6. A lot of hooks to provide better integration with CI, as well as to collect profiling info from your end users, so you can actually make life better for your engineering org as a member of the developer tools/infrastructure team.

In general, my gripe with Bazel complainers is that there is a loud community of them on the Internet whose primary software engineering experience is in languages that build fast, projects that are small and have few dependencies, or teams with a small number of people.

There is a world of software out there beyond the Web and C UNIX utilities. Try doing any high-end computer vision, robotics, or HPC work, and you have to deal with a bunch of dependencies trying to solve very complicated problems. One can discuss whether some of those dependencies are well designed or not, but at the end of the day they are very good at their job and don't have much competition.

Bazel solves a lot of problems for a lot of these people, so calling it "questionable" is quite ignorant.


I have worked with Bazel in a large, multi-language organization, but most of these points are solving the wrong problem. Often I argue for questioning the problems rather than taking them at face value and building an unnecessarily complex solution. Bazel is designed for managing complexity - and it adds tons more in the process - while the right solution is to reduce complexity.


> I disagree, and there is a huge philosophical gulf of understanding between my view and those who feel differently.

Let me ask you two simple questions:

1. I work on a project in a git repo and do a git pull to get the latest changes to the branch. I do the equivalent of `make` with my build system and encounter a problem (either a build error, or some unexpected bad behavior from the built artifact). Should I be allowed to conclude that someone has messed up, or do I first have to engage in some gyrations to make sure I have a "clean" build and the correct dependencies (git clean -dxf, issuing commands to manually check and update dependencies, etc.)?

2. I push some code after running the equivalent of `make test`, which passes successfully. A collaborator informs me that after pulling, their build is broken or there is some unexpected failure which does not appear to be a flake. Should I just expect this to happen every now and then and live with it?

If your answer to 1 is "No" and your answer to 2 is "Yes", then there may indeed be a gulf of understanding, but probably not a "philosophical" one. If your answers are "Yes" and "No" respectively, then I'd certainly love to hear which philosophically more aligned general-purpose build tools have this property, because I'd be interested in investigating them. The only other tools in this space that I'm currently aware of that even make an attempt in this direction are either of a different granularity (nix) or unreleased/more of an academic POC (redo/shake).


One tool (actually a build toolchain) that you may want to check out then is build2: https://build2.org


What would be a less questionable tool then for these use cases?


As I implied in my answer above, it's the use-cases themselves that are wrong. Helios does message passing completely differently.


> basically the only general purpose build systems out there that even attempt to do the basic tasks of a build system, namely actually figuring out what has changed and needs to be rebuilt.

Extraordinary claim. You will need to explain how Bazel does that and, say, CMake + Ninja don't.


> Extraordinary claim.

If you think this claim is extraordinary and CMake is a counterexample, I don't know what to say. Have you really never had to manually futz around with stale builds (by e.g. issuing "clean" commands, removing build/ directories, etc.) when using CMake, or had to figure out why something worked on your machine but not on a colleague's?

> You will need to explain how Bazel does that and, say, CMake + Ninja don't.

CMake makes no serious effort at all at "figuring out what has changed". Even trivial stuff doesn't work: if you have a wildcard pattern in a CMake file and you add a file that matches the wildcard, you will generally get a stale result (which is why it's "not recommended"). So not even explicitly specified inputs work correctly, and CMake makes no real attempt to detect unspecified implicit dependencies. Does your CMake build setup correctly detect when you have updated gcc or some system library? That some build artifact implicitly depends on another?

By contrast, Bazel goes through a fair amount of effort not only to detect changes to specified dependencies correctly, but also to sandbox build steps to check that they don't depend on stuff that has not been explicitly specified. As the Blaze docs say:

> When given the same input source code and product configuration, a hermetic build system always returns the same output by isolating the build from changes to the host system.

You can get pretty close with bazel; good luck with CMake.


I might have to make clean with CMake after a distro compiler upgrade where the new compiler is behind the same symlink, or something; dunno, I don't really remember doing that. Otherwise, no make clean, because I understand how CMake works. Really, I do. Including with generated code and all that. Wildcards in CMakeLists.txt are misuse.

System libraries are intentionally disregarded as possibly changed dependencies, that is a design decision that most build systems make. System libraries are not supposed to change in binary incompatible ways without a major version upgrade.

Your last point is about reproducible builds, which is a different topic.


Even [task](https://github.com/go-task/task) (a tiny task-runner tool) does that... I don't know any build system that doesn't.

Perhaps the OP means full incremental compilation, which really requires "cooperation" from the compilers (or the build tool actually parsing the language's AST, as I believe Gradle does). Or, in the case of Bazel, the build author explicitly telling the build system what the fine-grained dependencies are (I don't use Bazel, so I may be wrong; happy to be corrected).


> I don't know any build system that doesn't.

Sure you do. Any build system which has a "clean" command that you'd use for anything other than freeing up space.


And which one is that?


Make, CMake, Cargo, and npm are all examples of popular build tools that require you to manually "clean" to correct build problems. Make and CMake pretty much all the time, IME; Cargo less so, but it happens. Bazel/Blaze and derivatives basically don't.


Curious if you have looked at the Niklaus Wirth OSes, such as those used on the native Oberon and Lilith implementations. Small kernels that ran on what we would today call underpowered systems.


It seems that this project is going for a Unix like kernel with a planned POSIX compatibility layer.

Oberon was going for the "make it as simple as possible, but not simpler" mantra. Oberon is about 10,000 lines of code for the complete OS, including its UI, compiler, etc.

I love the architecture of Oberon, but today it would be more fitting to use within an E-ink e-reader device.

But Helios could eventually evolve into an OS that runs a mail server or web server efficiently. Or even run an existing UI setup like Wayland/Sway.


The high-level OS will be more like Plan 9 than Unix, but appreciably distinct from either.


Oberon was a great experience, for 1990's computers. Now it is only a source of inspiration.


And Fuchsia has BSD, MIT type licenses, while Helios is GPL3. Thank you, ddevault.


Fuchsia is by Google. This is a one-man effort (at this point). Also, it has just begun... Fuchsia is already used in some devices by Google. It is simply too early to make any kind of comparison, except on implementation language: Fuchsia uses C++, while Helios is using its author's own Hare language.


  $ git clone https://git.sr.ht/~sircmpwn/helios
  Cloning into 'helios'...
  remote: Enumerating objects: 1820, done.
  remote: Counting objects: 100% (493/493), done.
  remote: Compressing objects: 100% (444/444), done.
  remote: Total 1820 (delta 242), reused 0 (delta 0), pack-reused 1327
  Receiving objects: 100% (1820/1820), 415.45 KiB | 455.00 KiB/s, done.
  Resolving deltas: 100% (867/867), done.
  
  $ cd helios
  
  $ git shortlog -s -n --all --no-merges
     174  Drew DeVault
      30  Eyal Sawady
       1  Sebastian

Drew is pretty good at getting folks involved.


Fuchsia uses C++ for the microkernel itself and a mix of Rust, Go and Dart for the remainder OS services.


Fuchsia is having a mainstream web browser ported to it whereas this never will.

"Mainstream": has more than 5000 users on its best day ever.


Fuchsia doesn't really have a community outside of Google.



