rxv64 is a pedagogical operating system written in Rust that targets multiprocessor x86_64 machines. It is a reimplementation of the xv6 operating system from MIT.
octox is also different in that the kernel, userland, mkfs, and build system are all implemented in Rust. I think it also differs at the type level; for example, octox uses an MPMC channel for Pipe, and OnceLock and LazyLock for static variable initialization.
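For anyone unfamiliar with that last pattern, here is a minimal sketch of lazy static initialization with OnceLock (using std and an invented Console type purely for illustration; a kernel would use a core-compatible equivalent):

    use std::sync::OnceLock;

    struct Console { base_port: u16 }

    // Initialized exactly once, on first use; subsequent calls just read.
    static CONSOLE: OnceLock<Console> = OnceLock::new();

    fn console() -> &'static Console {
        CONSOLE.get_or_init(|| Console { base_port: 0x3f8 })
    }

    fn main() {
        assert_eq!(console().base_port, 0x3f8);
    }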
Thanks. The rxv64 kernel (the much harder part), syslib, and ulib are in Rust. Replacing the userland C with Rust would not be hard. Ideally a general-purpose OS should support programs written in any of the common programming languages, so it would make sense to have some programs in another language to keep the OS API "honest"!
Is it really a "legacy" ISA if it's actively developed by two different companies (and will be for a long time to come) and is utterly dominant in servers, desktops, and laptops today?
You're right in that it is still actively developed by two different companies, and that it is still utterly dominant in servers, desktops, and laptops... today.
Perhaps you would prefer "incumbent ISA"? But that would be a pointless exercise, because RISC-V is inevitable, and the new industry standard.
The intent was to express "not RISC-V" anyway. The parent already specifies which.
> But that would be a pointless exercise, because RISC-V is inevitable, and the new industry standard.
Er. So even if the second part of the sentence were true, calling out the current leader isn't pointless. And the second part of your claim is very much not a foregone conclusion; RISC-V is a standard, and used by the industry, but it's not in anything like a dominant enough position to call it "the industry standard", and its success is certainly not inevitable - RISC-V is today where MIPS was a decade ago (cheap, modestly popular in embedded, lacking in high-end parts, not popular outside of embedded). Now, its trajectory is upwards, it has enough going for it that it could become extremely popular, and certainly I'd like an Open Source option to win - but that's just one possibility, and history is littered with ISAs that were supposed to be the Next Big Thing.
It may very well be by now. RISC-V already has the momentum behind it.
The investment is there, and it's huge, both from governments and private. Major design wins. Multiple incoming very high performance implementations from multiple vendors.
On the software side, I have never seen a new architecture's software ecosystem grow as quickly as RISC-V's has.
>history is littered with ISAs that were supposed to be the Next Big Thing
You're well aware of how RISC-V is better suited for success, in both technical and non-technical aspects. It has the mojo.
I remember when PowerPC and ARM got their moments. This is an order of magnitude bigger. No ISA has ever managed to rally this much support, broad and wide. It's not even close.
Why does it seem like 80 to 90% of hobby OS projects that are started are "Unix-like?" Don't we already have a huge variety of Unix-like OSes out there? Why not explore new models?
Through various market forces, Unix (and its descendants/clones) and Windows killed most of the rest of the OS ecosystem over 20 years ago. There are generations of software engineers and computer scientists who’ve never studied operating systems that weren’t Unix- or Windows-based. Most leading OS textbooks (Tanenbaum’s books, the dinosaur book, Three Easy Pieces) have a Unix slant. Even the systems software research community is heavily Unix-centric; I say that as someone who used to be immersed in the research storage systems community. The only non-Unix or Windows operating systems many practitioners and even researchers may have used in their lives are MS-DOS and the classic Mac OS, and there’s a growing number of people who weren’t even born yet by the time these systems fell out of common use.
However, the history of computing contains examples of other operating systems that didn't survive the marketplace but offer lessons that can still apply to improving today's operating systems. The Unix-Haters Handbook is a nice tour of alternative worlds of computing that were still alive in the 1980s and early 1990s. VMS, IBM mainframe systems, Smalltalk, Symbolics Genera, Xerox Mesa and Cedar, Xerox Interlisp-D, and the Apple Newton were all real-world systems that demonstrate alternatives to the Unix and Windows ways of thinking. Project Oberon is an entire system developed by Wirth (of Pascal fame) whose design goal was a complete OS and development environment small enough to be understood for pedagogical purposes, similar to MINIX but without any Unix compatibility. Reading the history of failed Apple projects such as the planned Lisp OS for the Newton and the ill-fated Pink/Taligent project is also instructive. Microsoft Research did a lot of interesting research in the 2000s on operating systems implemented in memory-safe languages, notably Singularity and Midori.
From learning about these past projects, we can then imagine future directions for OS design.
Another key issue: the concept of open source barely existed. Sure, you got a copy of the source code used to SYSGEN your RT-11 or RSTS/E system, but it had comments removed and was not for redistribution. In the end, the closest in feel to the old DEC world today would be the traditional Windows command line (not PowerShell).
The rise of GNU and the general open source movement was a reaction to the rug-pull when access to Unix source was restricted, and that gave us Linux (MINIX wasn't initially open source, but could be put into a state that made Linux possible).
The Unix paradigm stuck with us, IMO, because Unix (and later Linux and BSDs) have been ported to so many vastly different architectures. So many other operating systems have never moved beyond (and often died with) their initial platforms.
True enough. CP/M and MS-DOS (which give us the old-school Windows command shell) predate DCL, though; it was a fairly late addition to the PDP-11 operating systems, after it was rolled out in VMS. RSTS/E up to version 8 logged users into BASIC by default, while version 9 made DCL the default run-time system.
So far, historically, no OS has been successful unless (in addition to whatever whiz-bang functionality it had to offer) it provided a process abstraction in which the application thinks it has the real hardware machine to itself. This is the big thing in Unix and others: not the shape of the file system, or that there are pipes and whatnot, but that you get an address space in which you run machine code for the Intel, SPARC, PPC or whatever. The machine isn't hidden. The OS is in the background; you tap into it via some privileged instruction or trap.
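To make "tap into it via some privileged instruction or trap" concrete, here is a minimal sketch, assuming x86_64 Linux, of invoking write(2) directly with the syscall instruction, no libc in between:

    use std::arch::asm;

    fn raw_write(fd: i32, buf: &[u8]) -> isize {
        let ret: isize;
        unsafe {
            asm!(
                "syscall",
                // write is syscall 1 on x86_64 Linux; the number goes in
                // rax, and the kernel's return value comes back in rax.
                inlateout("rax") 1isize => ret,
                in("rdi") fd,
                in("rsi") buf.as_ptr(),
                in("rdx") buf.len(),
                // rcx and r11 are clobbered by the syscall instruction
                out("rcx") _,
                out("r11") _,
            );
        }
        ret
    }

    fn main() {
        raw_write(1, b"hello from a trap\n");
    }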
It was not possible to develop just an OS which does this; it needs hardware support. There has to be virtual memory if there is a requirement to juggle multiple such applications each of which thinks it has a whole machine all to itself. There has to be protection: user/supervisor separation, and that protection cannot depend on a trusted compiler, because the application can execute arbitrary machine code produced by a compiler of the developer's choosing (or even by hand). It has to be hardware.
Okay, so successful OSs so far have been ones that provide decent multiprogramming, by following the hardware's model for virtual memory and privilege separation. What about the rest of the shape of the OS; the environment? Like files being just arrays of bytes?
Files being arrays of bytes is a form of democracy. Any sort of structured storage introduced on one OS will look indistinguishable from a proprietary file format which silos your data.
Files are exchanged between operating systems in bytes. No matter what you do, that's not going to go away.
Even some modern enhancements and complications in file representation cause interoperability headaches for the users, when they need to take their files to a different system.
The rich object features are provided by applications. For instance, an object-oriented database can just store its stuff in a block of bytes. Same with a graphical office suite, or a Lisp system saving its image.
Nobody wants an opinionated operating system in which files are fancy trees of objects done in One Way that everyone has to follow who codes for that operating system.
> Nobody wants an opinionated operating system in which files are fancy trees of objects done in One Way that everyone has to follow who codes for that operating system.
And yet there is ChromeOS. There is iOS. There is Android. These are highly opinionated platforms. So I think that yes, people do want these things. Developers in particular want lots of high-level APIs because they make software production easier, which is why so much software targets the web (effectively an "OS" these days). The web implements zero of the standard UNIX API primitives, yet is very successful.
There's a lot of stuff that can be done with operating systems research, I'm really pretty sure of it. That doesn't mean you can't layer things properly, or that you can't implement your "OS" also as an app, in the same way that Chrome is both an app and an OS. We're constrained by lack of imagination and the difficulties of getting market adoption for new approaches. Developers prefer being locked in to a multi-vendor API with open-source implementations over a single-vendor API with a proprietary implementation, even though they'll accept the latter when a market is there (mobile apps). But that preference kills the commercial incentives to innovate, and open-source devs just won't do it, which is why every new OS project you see is a clone of commercial operating systems from the past.
When I realized that the Linux operating system had to have hardware support (traps for syscalls and interrupts), it was eye-opening.
I guess your post highlights how operating systems and hardware developed together, and to innovate outside of what we currently have there is a need to somehow jump outside of this loop.
Which hardware features are modern operating systems reliant on, and which hardware features are they not reliant on?
How would one go about exploring "other" real-world systems? I've read some old OS textbooks, poked at Symbian and OS/2 books some, and have texts on Oberon and Symbolics in my queue, but the docs for RSX-11 and VMS seem to bury me in operational minutiae without really explaining the design choices, and the Multics docs look like a huge pile of research notes, which is going a bit too far in the other direction. The current bytecode on IBM i is apparently outright NDA'd, and the RPG docs are eager to presume I know how to operate the original punch-card tabulators. Any pointers?
Depending on the system this could range from an easy project that one can do in a few hours to a very long quest. For some research systems (particularly those implemented in the pre-FOSS era or those from companies) it may be impossible to try out the systems without attempting to implement them yourself. However, there are other systems that are available to try out. The ideal situation is a FOSS release, like the MIT CADR Lisp machine code and Interlisp-D. Barring that, the next best thing is some type of freeware or non-commercial license; I think this is the case for OpenVMS, though I could be mistaken. Some systems cannot be obtained easily without traveling the high seas, if you catch my drift, matey (cough Open Genera cough).
I think a lot of value can be gained from not only using software, but by reading papers and manuals, especially when the software is unavailable or unattainable. There’s a site called Bitsavers that is a treasure trove of old documents.
Come to think of it, a significant reason for Unix’s dominance in research and education is its availability, going all the way back to the 1970s when Bell Labs sold licenses to universities at very low prices. Even when licensing became more restrictive in the 1980s, this spurred the development of MINIX, GNU, later BSDs, and finally the Linux kernel, in order to recreate the environment students enjoyed with 1970s Unix. This openness is a far cry from the work of Xerox and Lisp machine vendors, where one needed to have been privileged enough to work for one of these vendors or their customers to use these environments, which were far more expensive than Unix workstations and especially personal computers. Thankfully there’s a wealth of documentation about these non-Unix systems, as well as testimony from their users. In addition, some systems were open sourced. But we must remember the times that these systems emerged, and we must remember why Unix became so dominant in the first place; its openness and availability set it apart from its competitors.
As an amateur OS nerd by 'exploring' do you mean use or read about or both? :)
You can get free access to play around on an IBM i (aka AS400/iSeries/i5) here [0].
Archive.org has (for the moment) Inside the AS/400 [1] by Frank Soltis. He was the father of the System/38 and AS/400, and also worked on the Cell processor. I haven't read this edition, but I did read the following edition, called 'Fortress Rochester', and it is a good and detailed overview of the system - history, design choices, and hardware/software.
As for RPG, it was made to be familiar to punch-card operators/programmers so they could switch to a 'real computer'.
bitsavers has a lot of info [2].
A tech report on the Tandem Nonstop Guardian OS / system [3] was interesting.
Your best bet is probably the old MVS that was open sourced back in the 70s or 80s and lives on via an emulator called Hercules. It's sufficiently different to be worth a look, it's got a living direct descendant (although that has become increasingly turned into a Unix over the years), and there is enough of a community around it, thanks to that very expensive direct descendant, that the docs and other tools necessary for working with it are relatively high quality and up to date in comparison to most long-dead operating systems.
The issue with it is that there is no way around having to do some pretty serious assembly language work for an instruction set that no longer exists, and whose modern descendants probably microcode to hell in order to run some of the old instructions.
"Introduction to Operating System Abstractions using Plan 9 from Bell Labs" is the best operating system book I've seen - and it works through Plan9: https://archive.org/details/plan9designintro It isn't quite real world as Plan9 was never used for serious production, but it is mind opening and doesn't wallow in minutiae.
Because we effectively live in a Unix monoculture world, in terms of operating systems that you can actually use and study the inner workings of.
It was a legal anomaly that resulted in early Unix (aka Research Unix) being distributed with its source code for free to universities, which was enough to get the ball rolling on stuff like BSD and Lions' annotated V6 source code; by the time AT&T decided that closed-source commercial Unix was the game it wanted to play, the cat was already out of the bag.
By the time the free software and open source movements had gained some ground, enough people had studied or worked on some kind of Unix kernel and Userland source code that projects like Linux, Minix, and Free/Net/Open BSD were feasible. The fact that Linux running on x86 subsequently ate the world was probably something few people saw coming.
The other lineages of operating systems, e.g. Windows NT, OpenVMS, IBM's various offerings, either never had their source released, or only had it released long after they were obsolete.
What is an OS good for if not to run programs? You either port existing programs to your API (lots of work), or design the OS with an existing API in mind. Or you forget about existing programs and write entirely new ones.
"Unix-like" describes the second approach: you provide a POSIX API but can have different internals.
Now, maybe you want two APIs: a native API for the apps you write, and a translator from another, more popular API to your native one. That sounds like a lot of work when almost all software that will ever run on your OS will go through the translator.
Often the user-space API limits the design of internals and available features. I for one would like to see the classical hierarchical filesystem gone, but POSIX (and UNIX) at its core is about the filesystem.
If you need to run the existing programs, feel free to develop a translation layer, at the same time allowing a new type of OS paradigm to emerge.
Yes please, something that doesn't have files, or processes. Both are too low level and require us to write or use enormous amounts of code to do very simple tasks that are performed by almost all applications. Very wasteful.
But let's stay away from adding another layer for our nice new abstraction. A new layer means the old ones are still there and will be used, because programmers are lazy. It's usually easier to reuse someone else's code (which accesses the old layers) or reuse old methods and algorithms. In both cases, we're moving back towards the old stuff, the stuff we want to leave behind.
Here's a few ideas I've been stockpiling for the day I retire and my wife will hopefully let me just fiddle with operating systems all day. None of these necessarily would work well or make sense, they're just stuff that would be satisfying to explore. NB: All this can be done in userspace with enough hacks.
1. Make strong file typing work well. Filesystems ignore the question of type, leaving it to ad-hoc conventions like extensions. Figuring out what a file really is takes a lot of effort, is often duplicated by apps and the OS and has a long history of introducing security vulnerabilities. There's also no first class notion of "type casting". Format conversion is delegated to an ad-hoc and inconsistent set of tools which you have to learn about and obtain yourself, even though it's a very common need.
2. Unify files and directories. Make open() and read() work on directories. You need (1) for this.
A lot of cruft and complexity in computing comes from the fact that most things only understand byte arrays (e.g. http, email attachments...), not directories. Directories have no native serialization in any OS, unless you want to stretch and call DMG a fundamental part of macOS. Instead it's delegated to an ancient set of utilities and "archive" formats. As a consequence a big part of app development has historically been about coming up with use-case specific file formats, many of which are ad hoc pseudo-filesystems. This task is hard and tedious! There are lots of badly designed file formats out there, and my experience is that these days very few devs actually know how to design file formats, which is part of why so much computing has migrated to cloud SaaS (where you ignore files and work only with databases). Many new file formats are just ZIPs.
An operating system (or userspace "operating environment") could fix this by redefining some cases that are errors in POSIX today to be well defined operations. For example given a file that has some sort of hierarchical inner structure like a .zip (.jar, .docx), .csv, .json, .java, .css and so on, allow it to be both read as a byte array but also allow you to list it like a directory. Plugins would implement a lossless two-way conversion, and the OS would convert back and forth behind the scenes. So for example you could write:
echo red > styles.css/.main-content/background-color
and the CSS file would be updated. Likewise, you could read a value back out, e.g.:

    cat styles.css/.main-content/background-color
The trick here is to try and retain compatibility with as much software as possible whilst meaningfully upgrading it with new functionality. Redefining error cases is one way to do this (because you have some confidence they won't be important to existing programs).
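One could imagine the plugin interface looking something like this (names entirely hypothetical, with a toy key=value format standing in for a real one like CSS or ZIP):

    // A two-way conversion plugin: expose a byte format as a tree of
    // (path, contents) entries, and reassemble it losslessly.
    trait DirView {
        fn list(&self, bytes: &[u8]) -> Vec<(String, Vec<u8>)>;
        fn assemble(&self, entries: &[(String, Vec<u8>)]) -> Vec<u8>;
    }

    struct KvFile; // toy format: one "key=value" per line

    impl DirView for KvFile {
        fn list(&self, bytes: &[u8]) -> Vec<(String, Vec<u8>)> {
            String::from_utf8_lossy(bytes)
                .lines()
                .filter_map(|line| line.split_once('='))
                .map(|(k, v)| (k.to_string(), v.as_bytes().to_vec()))
                .collect()
        }

        fn assemble(&self, entries: &[(String, Vec<u8>)]) -> Vec<u8> {
            let mut out = Vec::new();
            for (k, v) in entries {
                out.extend_from_slice(k.as_bytes());
                out.push(b'=');
                out.extend_from_slice(v);
                out.push(b'\n');
            }
            out
        }
    }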
There are UI issues to solve. You'd probably want some notion of an item having "bias", so explorers can decide whether it makes more sense to e.g. open a clicked icon in an app or explore inside it. You'd need to find ways to express the duality in a GUI nicely and so on.
But if you get it right it would allow app devs to think of their data in terms of tiny files and then let the OS handle the annoying aspects like what happens when you drag such a directory onto an email. Apple/NeXT tried to do this with their bundles but it never really worked because it wasn't integrated into the core OS in any way, it was just a UI hack in the Finder. For example early iLife apps represented documents as bundles, but eventually they gave up and did a file format because moving iLife documents around was just too difficult in a world of protocols that don't understand directories.
3. Ghost files.
There's an obvious question here of how to handle data that can be worked with in many formats. For example, images often need to be converted back and forth and the conversion may not be lossless. I want to find a syntax that lets you "cast" using just names. The obvious and most compatible approach is that every format the system understands is always present, and filled out on demand. Example:
$ take /tmp/foo
$ wget https://www.example.com/face.jpg
$ ls
face.jpg
$ imgcat face.webp
<now you see the face>
In this case, the OS has a notion of files that could exist but currently don't, and which don't appear in directory listings. Instead they are summoned on demand by the act of trying to open them for reading. You could imagine a few different UXs for handling divergence here, i.e. what if you edit the JPG after you accessed the WEBP version.
4. Unify KV stores and the filesystem
Transactional sorted KV stores are a very general and powerful primitive. A few things make filesystems not a perfect replacement for RocksDB. One is that a file is a heavyweight thing. Not only must you go to kernel mode to get it, but every file has things like permissions, mtimes, ctimes and so on, which for lightweight key/value pairs is overkill. So I'd like a way to mark a directory as "lite". Inside a lite directory files don't have stored metadata (attempting to query them always returns dummy values or errors), and you can't create subdirectories either. Instead it's basically a KV store, and the FS implementation uses an efficient KV store approach, like an LSM tree. Reading such a directory as a byte stream gives you an SSTable.
Also, contemporary kernels don't have any notion of range scans. If you write `ls foo*` then that's expanded by the shell, not filtered out by doing an efficient partial scan inside the kernel, so you can get nonsense like running out of command line space (especially easy on Windows). But to unify FS and KV stores you need efficient range scans.
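As a toy model of what a lite directory might feel like (API invented for illustration; a BTreeMap stands in for the LSM tree), with the prefix scan done by the store rather than by shell expansion:

    use std::collections::BTreeMap;

    struct LiteDir {
        entries: BTreeMap<String, Vec<u8>>, // sorted, like an LSM tree's view
    }

    impl LiteDir {
        fn put(&mut self, key: &str, value: &[u8]) {
            self.entries.insert(key.to_string(), value.to_vec());
        }

        // The efficient range scan: walk keys in order starting at the
        // prefix, stopping as soon as the prefix no longer matches.
        fn scan_prefix<'a>(&'a self, prefix: &'a str) -> impl Iterator<Item = (&'a str, &'a [u8])> + 'a {
            self.entries
                .range(prefix.to_string()..)
                .take_while(move |(k, _)| k.starts_with(prefix))
                .map(|(k, v)| (k.as_str(), v.as_slice()))
        }
    }

    fn main() {
        let mut dir = LiteDir { entries: BTreeMap::new() };
        dir.put("user/42", b"ada");
        dir.put("user/43", b"lin");
        dir.put("group/1", b"admins");
        for (k, v) in dir.scan_prefix("user/") {
            println!("{} = {} bytes", k, v.len());
        }
    }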
There have been attempts at transactional filing systems - NTFS does this. But it's never worked well and is deprecated. Part of the reason UNIX doesn't have this is because the filesystem is a composite of different underlying storage engines, so to do transactionality at the level of the whole FS view you'd need 2PC between very different pieces of code and maybe even machines, which is quite hard. Lite directories, being as they are fixed in one place, would support transactions within them.
5. Transient directories
Free disk space management is a constant PITA in most operating systems. Disk space fills up and then you're kicked to a variety of third party cleaner tools. It sucks, especially as most of your storage is probably just a local cache of stuff generated or obtained from elsewhere.
In my research OS/OE, "filectories" or whatever they're called can be tagged with expiry times, notions of priority, and URLs+hashes. The OS indexes these, and when free disk space runs out it will start deleting the lowest-priority files to free up space. Temporary files go first, then files that were downloaded but weren't used for a while (re-downloaded on demand), and so on.
In such an OS you wouldn't have a clear notion of free disk space. Instead as you ran out of disk space, rare operations would just get slower. You could also arrange for stuff to be evicted to remote caches or external drives instead of being deleted.
Objects are not serialised and stored in files; they exist (only) as the fundamental entity of the OS. These objects are not read/saved from disk; they exist until destroyed and are managed transparently by the OS - similar to the manner in which virtual memory pages are transparently managed by Unix/Linux.
Objects represent reasonably high level entities, perhaps images, sounds, etc, but probably not individual integers or strings. Objects may reference ("contain") other objects.
Objects are strongly typed and implement common sets of abilities. All picture objects can show themselves, duplicate themselves, etc.
I've moved away from this idea over time even though it's intuitively attractive:
1. OOP is closely tied to language semantics but languages disagree on exactly what objects, methods and type signatures are, and programmers disagree on what languages they'd like to use.
2. From the end user's perspective it's often useful to separate code and data.
This isn't an anti-OOP position. I use OOP for my own programming and it serves me well. And modern operating systems are all strongly OOP albeit mostly for communication between micro-services rather than working with data.
One reason the files+apps paradigm dominates and not objects (despite several attempts at changing that) is because it allows apps and formats to compete in a loosely coupled market. I don't want to be tied to a specific image editor because I have an image object that happens to be created by some primitive thing, I want to work with the pixels using Photoshop. The concept of files, apps and file associations lets me do that even though it's not well supported at the bottom layers of the OS stack.
But OOP is fundamentally about combining code and data. So an image object, in this context, would have to be something more like a codec implementation that implements an IPixels interface. But again, even then, why would the two be combined tightly? What if I want to swap in a faster codec implementation that I found?
Follow this reasoning to its conclusion and you decide that what's needed is actually an OS that can transparently convert data into various different static formats on the fly, and which has a much deeper and more sophisticated concept of file associations. PNG<->JPEG should be automatic obviously but also, one of those formats you convert to might for instance be dynamically loadable code or a serialized object graph. For example, imagine you have an image viewer written in Kotlin or Java. You also have a complex file that nonetheless meets some common standard, maybe a new proprietary image format, and you'd like your general image viewer to be able to load it directly without conversion to some intermediate format. Then you could write a converter that "converts" the image to a Java JAR which is then automatically dynamically loaded by the OS frameworks and asked to return an object graph that conforms to some interface:
var image = ImageRenderer.load("foo.compleximage")
if (image instanceof WidgetFactory) { /* put it into a gui */ }
Java already has a framework for abstracting images of course. The trick here is that the app itself doesn't have to actually have a plugin for this new format. Nor does the OS need to be specific to Java. Instead the OS just needs a framework for format conversion, and one of the formats can be dynamically synthesized code instead of what we conventionally think of as a file format.
- process management functions take PIDs and not capabilities, which can cause race conditions
- POSIX hasn't standardized anything better than poll(), yes it works fine in a hobby context but it's not 1987 anymore (and don't get me started on select(): https://github.com/SerenityOS/serenity/pull/11229)
- the C POSIX library has a lot of cruft while also missing stuff that programmers actually need (for example, POSIX took nearly 6 years to standardize strlcat()/strlcpy(), the process itself starting 17 years after OpenBSD introduced those functions: https://www.austingroupbugs.net/view.php?id=986)
- ...
Granted, modern production-grade Unix-like operating systems have extensions to deal with most of these issues (posix_spawn, kqueue, pidfd_open...), but they are often non-standard and can be quite janky at times (dnotify, SIGIO...). It also doesn't fix the huge amount of code out there using the legacy facilities like it's still the 1980s.
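To illustrate the pidfd escape hatch mentioned above, a sketch assuming Linux 5.3+ and the libc crate (the PID is a placeholder and error handling is elided):

    fn main() {
        let pid: libc::pid_t = 12345; // hypothetical target PID
        unsafe {
            // A pidfd keeps referring to this exact process even if the
            // numeric PID is later recycled, closing the classic kill(2)
            // race caused by addressing processes via raw PIDs.
            let pidfd = libc::syscall(libc::SYS_pidfd_open, pid, 0);
            if pidfd >= 0 {
                libc::syscall(
                    libc::SYS_pidfd_send_signal,
                    pidfd,
                    libc::SIGTERM,
                    std::ptr::null::<libc::c_void>(),
                    0,
                );
                libc::close(pidfd as libc::c_int);
            }
        }
    }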
There are other models out there (Plan 9, Windows, Fuchsia...), but what we really need is to stop putting Unix/POSIX on a pedestal like some holy scripture that shall not be questioned. It's the pinnacle of 1970s operating system designs and it has fossilized so much it's actively turning into oil.
Or at the very least, please stop teaching the next generation that fork() is the greatest thing ever. It's a 50 year old hack kept alive through gratuitous amounts of copy-on-write that should've been scrapped the day Unix was ported to computers with virtual memory and paging.
At this point UNIX == Linux for the overwhelming vast majority of users/systems out there. I'm really not a fan of this monoculture, but Linux is quite efficient and there are benefits to everyone using more or less the same platform.
POSIX isn't a great standard at all, but it doesn't really have to be. It's more or less been the "lowest common denominator" for a system for awhile, and it's "ok" at that.
Yes - putting UNIX on a pedestal means treating it as basically correct and just in need of additions, rather than looking at it, saying that UNIX does this badly or even incorrectly, and making incompatible changes to improve things.
I think part of the issue is that Unix does some things correctly ("everything is a file" comes to mind), and that implementing them without implementing the rest of Unix just means you've made a Unix-like that's not compatible with normal Unix, as opposed to making something new and reusing the good parts of things you encounter.
That's not my experience as a French student in the early 2010s. At the very least, I remember that the only process creation model I saw in class was fork()+exec() and there were no disclaimers about it.
I think it makes more sense when you consider that fork was originally the concurrency primitive in Unix. Threads came later. If you want to make some concurrent program, it makes sense to fork without exec. The fact that you can spawn an entirely distinct process with the fork+exec pattern is kind of a coincidence.
I guess spawn-like or invoke-like things are a bit more intuitive at first but they break down when you're out of the trivial case and you have to clean up some but not all resources in a certain way and keep some others to pass on before calling the actual child code.
Then fork starts to make sense, because you can do the preparation phase unrestricted by what a spawn API would allow, and be sure you're affecting only the child before exec'ing. Probably changing this would require a significantly different resource management model (which is a nice thing to explore).
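A minimal sketch of that preparation phase, assuming a Unix host and the libc crate (error handling elided):

    use std::ffi::CString;

    fn main() {
        let prog = CString::new("/bin/ls").unwrap();
        let argv = [prog.as_ptr(), std::ptr::null()];
        unsafe {
            let pid = libc::fork();
            if pid == 0 {
                // Child: arbitrary preparation happens here with the full
                // API surface, affecting only this process...
                libc::close(2); // e.g. drop stderr
                // ...before the image is replaced.
                libc::execv(prog.as_ptr(), argv.as_ptr());
                libc::_exit(127); // only reached if exec failed
            } else {
                // Parent: wait for the child to finish.
                let mut status = 0;
                libc::waitpid(pid, &mut status, 0);
            }
        }
    }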
In some ways, fork+exec sucks, but in others it's also exactly the right tool for some jobs. There's some conceptual simplicity and beauty to it that reminds me of lisp's homoiconicity, where things can be broken down to four fundamental elements (read eval print loop)
My OS class definitely didn't characterize fork() as maligned, just as a powerful building block (which it is). It took me professional experience to realize it sucks.
We created a funky little OS on top of seL4 & Rust that's most certainly not Unix-like, and is more akin to an RTOS-like approach to building & bundling software. More for purpose-built appliances than a general purpose OS.
Some time later we had another customer interested in using it and having us add some features to it (e.g. some device drivers and a persistence layer utilizing https://docs.rs/tickv/latest/tickv/). It was becoming a massive pain in the neck to work out source code sharing agreements with them, so we decided to just open source it.
There are quite a number of things that we would do differently if we had to build it again, and at some point will likely do that work to revise it. The biggest one of those is root task synthesis. The other is to build and bring in facilities for running tasks that are compiled to WASM.
Somewhat humorously, the fact that doing system & integration testing was irritatingly challenging with a combination of FerrOS (which locks down as much as possible at runtime), and black-box binaries that couldn't be changed, played a role in us leaning pretty hard into using trace-based testing & verification techniques for our distributed systems & robotics testing products.
Because of the availability of the source code: UNIX has been distributed under a permissive licence for a long time. Other operating systems' source code still remains unavailable, sometimes decades after the hardware they used to run on has disappeared.
The second reason is the simplicity of the abstractions and the ease of their implementation. In the original DEC documentation for RSX-11M (the predecessor of VAX/VMS), for example, there is a whole chapter describing how to create and fill in a file descriptor block (a complex record data structure) required just to open a file. The user has to decide beforehand where to locate the file, whether they want to access the data [blocks] randomly or sequentially, whether the file is being opened merely for updating the data but not extending the file size or for updating the data and extending the file, whether to allow other processes to access the same file whilst it is open, the number of in-memory buffers for the kernel to allocate for file access, etc. Many complex decisions have to be made before a file can be opened. In UNIX, on the other hand, it is a mere «int fd = open ("myfile", O_RDONLY);», the end.
Granted, not every OS has had such complexities (the opposite is also true, tho). Yet, the simplified (one can argue that it has been oversimplified) UNIX abstractions have been influential for a reason.
Because the only large, available, free ecosystem of software (both applications and hardware drivers) is completely built around the Unix abstractions.
If you don't want to do Unix, you have to reduplicate ALL of it. And that's like trying to boil the ocean.
If you're not that serious about it, it's a way to get a bit more for free?
Contrast Redox, which is quite a serious (not in the 'we compete with Windows' sense, but still) project, I gather, which... I don't know how they describe it, but it's sort of Unix-ish, Unix-rethought? I've never actually played with it, but I loved the idea of 'everything is a URL' (not file) when I heard it described and explained, on The Bike Shed podcast I think.
Unix is a lot more amenable than most older OSes to being implemented by a disparate group of people with limited communication, hence why GNU was originally set up as a reimplementation of unix.
I think the main selling point for a Unix-like OS, but in Rust, is the Rust part. It's to ensure that memory-related errors are less likely and, hopefully with enough work, that the system can be essentially what we have today, but with fewer CVEs.
It's honestly a decent goal and I'm in support of it.
I know that there will inevitably be many now that come to state the obvious "well it doesn't guarantee safety" and "there are other reasons for CVE", etc. Nonetheless, it's not a bad idea.
Why does it seem like 80 to 90% of hobby OS projects that are announced here as "Unix-like" invariably get someone who replies with "that thing you're doing as a hobby, for fun...you're doing it wrong and I don't approve". Zero content, zero insight posts about someones else's toy aren't useful. You want a 'new model', start coding.
All kinds of new OS ideas can be implemented on UNIX like Mach[1], FLASK[2] and 9P[3] while internally a UNIX-like system doesn't need to be anything like a UNIX[4]... So who cares? What are you worried about losing? What can't be implemented on a UNIX-like system?
1. see MacOS
2. see SELinux
3. see v9fs
4. see Windows and BeOS which both have POSIX APIs
The problem is that anything you build on top of Unix will always be a second class citizen in the Unix world.
For example, suppose you want a database-like filesystem. Either you implement it in the kernel, and now your special apps barely work on anyone's computers. Or you implement it in userspace - preferably as a library. And now your apps can run anywhere without special kernel features, but the terminal and all the other applications on the computer can't / won't understand your new abstraction. And you'll be fighting an uphill battle to get anyone to care about your new thing, let alone integrate it.
It’s like saying - why rust? Why not just add a borrow checker to C? Why didn’t C# just add a garbage collector to C++? Sometimes starting fresh and establishing a clear, clean environment with different norms is the most effective way to make something new. You don’t have to fight as many battles. You can remove obsolete things. You don’t have to fight with the platform conventions, or fight the old guard who like things as they are.
It’s a shame with operating systems that modern device drivers are so complicated. Hobbyist operating systems seem inordinately difficult to make these days as a result, and that’s a pity. There’s all sorts of good ideas out there that I’d love to see explored.
> The problem is that anything you build on top of Unix will always be a second class citizen in the Unix world.
Is first-class support required? Even things as fundamental as executing programs (the elf loader, for example) can be a second-class citizen without users noticing.
> For example, suppose you want a database-like filesystem. Either you implement it in the kernel, and now your special apps barely work on anyone's computers. Or you implement it in userspace - preferably as a library. And now your apps can run anywhere without special kernel features, but the terminal and all the other applications on the computer can't / won't understand your new abstraction.
So implement it in the kernel anyway (write the driver). How does having a whole new OS in which this particular filesystem is first-class help? It doesn't help your argument that all filesystem drivers, by your definitions, are second-class citizens, and no one cares.
I'm actually rather keen to know what downside there is for not trying out your new idea in an existing OS.
After all, if your new idea is any good the existing OSes will adopt it anyway making it pointless for newcomers to try your new OS.
This conversation is tricky because there are technical questions (what is in userland vs in kernel?). But the real question of whether something is a first-class citizen is whether it's part of the ecosystem, such that essentially all software can depend on it being available. The elf loader is clearly part of the Linux ecosystem, regardless of whether it's in userland or kernel space. All shipped Linux software just assumes elf is part of the system. ZFS is not, even though you can apparently get it to run as a kernel module.
> I'm actually rather keen to know what downside there is for not trying out your new idea in an existing OS.
It might be more fun. It might be easier to experiment, since you don’t need to read or change as much code. And it might be easier to make a new community than it is to convince people in the existing community to take your patches seriously. - Eg like what happens in programming languages.
Long running software projects like Linux are conservative - and for good reason. But the result is that there’s a lot of potentially good OS ideas that Linux will never adopt at this point its lifecycle. (Eg, “What if everything wasn’t a file?” / “what if we formally verified all the code in the kernel?” / etc.)
Like every I-Love-Coding-But-My-Work-Is-Boring developer, I've toyed with various technical ideas. Some, like playing around with language design, actually have a usable release. Others, like "hey, I want to write my own operating system" frequently boot up and then never go anywhere.
And I agree with you - what would the experience be when (for example) the SQL RDBMS is the operating system? Maybe I'd like to try it out and see.
If you want to modify an existing OS to make your own, Linux is not as easy to start with as FreeBSD (I've had experience only with those two, as far as modification of the OS goes).
Sure, there's some nice hooks into the kernel, but FreeBSD just seems so much more cohesive and easier to understand (might be due to how well it is documented, maybe).
With all that said, I am definitely going to have my next experiment done on NetBSD, which differs substantially in that its kernel components can be run as rump kernels (https://en.wikipedia.org/wiki/Rump_kernel), which seems to make it even easier to hook my own OS stuff into.
Maybe NetBSD is an option for you if you want to produce a new OS with a feature that cannot be seamlessly grafted onto an existing OS. Fair warning, I haven't actually tried this on NetBSD yet, but it looks more doable (to me, and I'm an amateur at kernel dev) than any of the existing alternatives.
Nobody's stopping you from doing that but apparently none of OP's ideas required that, nor has it been worth it for anyone else's yet either. If it ain't broke, don't fix it. Let me know when it's actually broke in this actual reality. Then we can start over*.
Also all those languages are Cs in the same way BSD and Linux are UNIXs. Same family. You should have mentioned Haskell or APL instead.
* Note that many experiments did start over, e.g. Plan9, but were then integrated into a UNIX.
> Also all those languages are Cs in the same way BSD and Linux are UNIXs. Same family. You should have mentioned Haskell or APL instead.
But look! Even though those programming languages are in the same family, it was still worth starting over when they were made! Zig and C are incredibly similar languages, but Andrew Kelley didn't try to convince the C standards committee to adopt his ideas. He just went and made Zig from scratch. And I'm glad he did! It would have taken decades to drag C in that direction - if it's possible at all.
Another example: KHTML wasn't based on Firefox (the big contemporary open-source browser). It was a new browser engine with new ideas. And the design was so good it was used as the basis of Chrome.
Don’t get me wrong - I think it’s great that plan9 experiments did eventually make it back into Linux. But doing the experiments in a separate kernel / OS still makes a lot of sense to me. Old, established technology like linux, FreeBSD, the C programming language, or something like the HTTP spec all need to move slowly because they’re depended on by so many people and companies. That is anathema to wild, new ideas.
>For example, suppose you want a database-like filesystem. Either you implement it in the kernel, and now your special apps barely work on anyone’s computers.
Filesystems already are databases. They organize and catalog data, and provide various other metadata about the data stored in them (creation and modification times, permissions, etc.).
Many people have invented various other databases, some SQL, others noSQL, which all require special apps to use. Many of these have been successful.
The filesystem is a crappy database. Its purely hierarchical nature requires dirty compromises for things like music libraries, where you want songs to be indexed both by song name and by album. Filesystems are also lacking atomic update mechanisms (transactions). There's no way to enforce data integrity - files are weakly typed. You need fsync on Linux to know that your data has been written at all - but fsync is horribly slow. And even fsync doesn't save you from data corruption due to skewed writes.
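To make the missing-transactions point concrete: the closest POSIX idiom is the write-temp/fsync/rename dance, sketched below (directory fsync and most error handling elided for brevity):

    use std::fs::{self, File};
    use std::io::Write;

    // "Atomic" replacement of a file's contents, POSIX-style.
    fn atomic_write(path: &str, data: &[u8]) -> std::io::Result<()> {
        let tmp = format!("{path}.tmp");
        let mut f = File::create(&tmp)?;
        f.write_all(data)?;
        f.sync_all()?;          // fsync: the slow part mentioned above
        fs::rename(&tmp, path)  // rename(2) is atomic within a filesystem
    }

    fn main() -> std::io::Result<()> {
        atomic_write("config.json", b"{\"version\": 1}")
    }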
You can use a userland database on top of Linux. But that has different properties from the OS's actual filesystem, for ecosystem reasons. A Postgres instance will never be a first-class citizen on Unix. I can't "cd" into a SQL table in my terminal, or use SQL queries to query procfs or /etc. What would it mean to pipe into a table? Does that abstraction even make sense? An operating system built on top of a database would be different from Unix because the ecosystem of userland applications would evolve in a different direction. See, for example, HaikuOS.
I’m not saying it’s a good or a bad idea. But simply firing up mongodb on a Linux server isn’t the same thing as building the whole OS with a real database at its core.
>The filesystem is a crappy database. Its purely hierarchical nature
You realize the first databases were hierarchical, not relational, right? Filesystems do well enough in this regard. They're not meant to store tons of metadata, which varies depending on your application.
>requires dirty compromises for things like music libraries, where songs should be indexed both by song name and by album.
What about songs that aren't on albums? What about live performances? What about cover songs? Who gets the credit, the performer or the composer? People's opinions about these things keep changing, which is why the original MP3 tag format was so bad, and had to be replaced by a newer format. Imagine if we were all stuck, forever, with what some clueless people in 1995 thought was good enough for MP3 tag info, for all digital music.
And why should this info be in a filesystem anyway? Most files are not digital music.
>A Postgres instance will never be a first class citizen on Unix, so I can’t “cd” into a sql table in my terminal, or use sql queries to query procfs or /etc.
Right, because then the OS would have had to be designed for that kind of thing from the start, and you'd never be able to change it afterwards. This is why we keep things minimal at the lower levels, because then you can change things easily at the higher levels later on as needs change. Postgres itself has changed a lot in the last 10 years, adding lots of capabilities; if that were baked into the OS, that wouldn't have been so easy. And what if you decide you want something different from SQL? Sorry, you're stuck with it because it's baked into the OS, so now someone else is going to complain about how our ecosystem could evolve in a different way if we adopted some other type of database paradigm.
The benefits you claim just aren't worth the cost. It's easy enough to implement a database on top of a modern OS, and then use tools and applications designed for it to interact with it.
> You realize the first databases were hierarchical, not relational, right?
Yes. Obviously, the first databases humanity ever made were also the worst databases humanity ever made. With the possible exception of mongodb.
The choices Unix made around 1980 made sense at the time. But 40 years is a long time! It is as you say - databases have improved a lot in that time. I don't know if Linux built on top of a modern database would be better or worse. How could we know unless somebody tries it?
Because the effort to build such a thing would be gargantuan. Think of all the work that went into the Linux kernel plus all the userspace programs and applications on top of it: you want to recreate all that effort because of a hunch?
I'm sure some people with more expertise in theoretical CS than me can tell you better why this is a bad idea, but consider we already have databases now, and different databases work better for different tasks than others. How is baking a database into your filesystem going to compete? What if you pick the wrong one? What if it sucks for certain use-cases that current systems (filesystem+DB) work better at?
Every time someone's tried getting better efficiency by baking things in at a low level, it hasn't worked out too well, because by forcing a standard that way, it prevents innovation (e.g., with your DB-as-filesystem, when everyone decides they want to work with JSON right in their DB, it can't be done with yours because it wasn't designed that way and it can't be bolted on because it'll break things, but with Postgres it's easy to add in).
Doing a lot of work on a hunch that it might be better is the basis of all science. And all progress in general. Building a fully verified operating system kernel was a massive amount of work - but the seL4 team still did it because they thought it was a good idea and got funding. (And as I understand it, their OS gets a lot of use in things like the baseband chips in cell phones.)
Yes, making a new toy operating system is a lot of work. But we don't need to reinvent all of the software that has been built on Linux to tell if it's a promising idea. Just enough to learn and see if the juice is worth the squeeze. And maybe have a little fun along the way.
In general I think it’s really sad how little innovation there is now in the OS space, simply because of how dominant Linux is and how much work it takes to make something yourself. How many good ideas are waiting in the wings because it would take too much effort to try them out? What a pity!
>In general I think it’s really sad how little innovation there is now in the OS space, simply because of how dominant Linux is and how much work it takes to make something yourself. How many good ideas are waiting in the wings because it would take too much effort to try them out? What a pity!
People are coming up with all kinds of innovations in computing, just not so much in the OS space because it's considered a solved problem. There's tons of stuff going on at much higher levels, and has been for a long time: virtualization, containerization, microservices, etc. The low-level building blocks are "good enough" for the higher-order things people want to try out now.
We've seen this in many domains: once you have something that works well enough, it's hard to justify effort to optimize it more, when there's other problems to be solved.
Because if you manage to get to some level of POSIX compatibility you can leverage that into having a whole toolchain and lots of other goodies up and running in a relatively short time. This limits the amount of effort required to get to 'first base', a self hosting environment.
At both extreme opposite ends of the scale/funding/people spectrum we already have TempleOS and Fuchsia, and probably countless in between. You tell me why they aren't going anywhere, even though for any property you might name in one, the other has the opposite quality and is also going nowhere.
Maybe "unix-like" is really just a principle that has no expiration date, like "murder is wrong".
Depending on how slavishly you define "unix-like", for instance, I would not say that the guiding philosophy dictates there shall always be a command named "rm" that takes these options and does this task a la the POSIX specs.
But for today and certainly any forseeable time, it's perfectly useful to "merely" reimplement posix.
OS/400 is actually really interesting and very well thought out, particularly as an example of "Not Unix". I wouldn't want to make my living there, but lots of people do.
Oh, you've done what I did in the shadows... I wonder whether, if I later GPL'd mine, the license would be compatible enough to take this code in... But mine runs on x86_64 with a custom QEMU UEFI loader anyway.
Both those licenses require you to add the license text as a file to the codebase. Eg see the "How to apply the Apache License to your work" section in the Apache license link that you have there.
Since it's dual-licensed you can add one as LICENSE-MIT and the other as LICENSE-APACHE.
This is arguable, and I think overall it's actually harder to correctly write unsafe Rust, even if it is sometimes, maybe, in some sense safer than C when you screw up.
In Rust everything has to obey Rust's semantic constraints. For safe Rust that's fine because the language itself promises you're obeying. You can't introduce anything which would be a problem, so you needn't even care what those problems are.
But in unsafe Rust you are responsible for the same guarantees that safe Rust gave everybody. And the rules you're responsible for obeying are truly difficult, to the point that you may not properly understand them. If you screw up, that's instantly Undefined Behaviour.
Let's take a fairly old but brutal real example from Rust's standard library: core::mem::uninitialized<T>(). This function is labelled deprecated (as well as unsafe) in today's Rust, but once upon a time it was the usual way to make some uninitialized buffer in which to construct something.
But it was actually UB almost always†. Because what it says is: OK, I know I didn't initialize a T, but trust me, I'll sort that out later; let's say this is a T anyway. And for a time people persuaded themselves that this is OK for at least some types. After all, if T was u8 (a byte) then who cares what its value is, any value is valid, isn't it? Well, yes, but "uninitialized" isn't a value, it's a 257th possible state; the compiler knows we didn't initialize this, and therefore all optimisations are valid even if they wouldn't be valid for any possible initialized state of the memory - we didn't initialize it, so we're not entitled to assume it had any of those values.
In C you will get away with this but in Rust you've created Undefined Behaviour, which is not OK. Today you would use the MaybeUninit<T> type so that you can explicitly initialize it (once you have something to initialize it with) and then MaybeUninit::assume_init() to get your T instead now that it's initialized, and (if you did it correctly) that is safe.
† If T is a Zero Size Type then this function isn't dangerous, because it makes nothing and then says this nothing is actually a T, and the compiler says well, thanks for telling me, I don't really care but whatever. No UB. Likely this only happens in generic code, but it's safe.
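A minimal sketch of the MaybeUninit pattern just described:

    use std::mem::MaybeUninit;

    fn main() {
        // No u64 exists yet; the compiler knows this is uninitialized,
        // and that's fine, because MaybeUninit says so in the type.
        let mut buf: MaybeUninit<u64> = MaybeUninit::uninit();

        // Explicitly initialize it...
        buf.write(42);

        // ...and only then claim it is a real, initialized u64.
        let value = unsafe { buf.assume_init() };
        assert_eq!(value, 42);
    }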
part of the problem is that rust does not yet have a standardized memory model (there are candidates, WIP)
this means there are limits to the soundness analysis tools and guardrails you can provide in stable rust
though there have been pretty convincing examples of how, under some of the (more promising) memory model candidates, you could provide additional/different functions which are much harder to accidentally misuse
and soundness analysis tools do already exist, too
I believe that rust has the _potential_ to make it easier to write a lot of unsafe code correctly than it is in C -- in the future.
Though the issue of people using an "it's only bits" mindset when writing unsafe code stays around, and it is wrong, not just in rust but in C too, no matter how much some people try to pretend C is a high-level assembly.
> Some invariants you cannot break in rust, no matter if "safe" or "unsafe"
a better description IMHO is that unsafe rust enables additional functions which are normally not usable, as they can create unsoundness if used incorrectly
because these are just additional functions, the checks and type safety of all other code are still always there and normally sound, as long as you don't misuse the additional functions to break invariants
I do count pointer dereferencing as a function converting a pointer to a reference in this context; it's technically not quite right, but conceptually not really wrong either.
So e.g. even unsafe rust doesn't allow you to write through a &T (immutable reference), but using unsafe you could technically bit-wise transmute the &T into a &mut T (100% guaranteed unsound!!!) and write to that. Still, at every point all the constraints on handling &T applied, and so do the constraints on handling &mut T; you just (unsoundly) converted one into the other using an additional function unlocked by unsafe.
Yes! Another apt name would be "trustme". (The normal case in Rust is trust the compiler - and all the people who wrote "trustme" code that you depend on!)
Not related to the project. Most of the code is unsafe. I really find rust counterintuitive.
This is just a note to myself.
* rust uses LLVM as a backend
* rust tries to solve the memory issues commonly found in C by enforcing a programming paradigm which allows the compiler to detect them at compile time.
* it tries to provide zero-cost abstractions
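For instance (a standard illustration of that zero-cost claim, not code from this project), a pipeline like the following is expected to compile to the same machine code as a hand-written loop:

    fn sum_of_squares(xs: &[i64]) -> i64 {
        // no allocation, no intermediate collections
        xs.iter().map(|x| x * x).sum()
    }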