Learn to write a simple OS kernel with keyboard/screen support (2014)

anyfoo · on Jan 26, 2020

Like many people, I have started toy OSes, and like most such projects, all of them but my current one did not make it far past the booting stage.

The problem was that there was just too much to do: Booting, memory management, interrupt/exception handling, video output, keyboard, disc access, file systems, scheduling... most of those things are potentially interesting, but not all at once, and yet they are all pretty much necessary to get your OS to the "critical mass" where working on individual aspects starts being fun.

Until I had the idea to not just boot using plain old 16bit DOS, but to also keep DOS running in my OS, as a task. The thing about DOS is that it is barely an "operating system" in the modern sense. Instead, it is more a crude collection of 16bit x86 routines that do most of the aforementioned things: Console input/output, file system access, crude memory management, a whole shell actually (COMMAND.COM)... and you can even load drivers for networking, sound, and almost everything else, provided some form of it already existed about 15 years ago.

My current toy OS is a modern message passing microkernel, but for everything that I did not feel like implementing yet, I call into the DOS task. The task runs as a vm86 task, essentially within a hypervisor (called the "vm86 monitor"), but specifically for 16 bit real mode code. (64bit x86 got rid of the now obsolete vm86 mode, so there you would have to actually implement something much closer to a modern hypervisor.) This is done using a shim layer that receives messages using the message passing system of my OS and translates them into the (usually) soft-interrupt driven DOS/BIOS function calls.
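
To make the shim idea concrete, here is a minimal sketch in C of how such a translation step might look. It is not the actual code from this OS: the message format (`struct msg`), the register record (`struct vm86_call`) and the function name `shim_translate` are all hypothetical. The only facts it relies on are the classic DOS calls INT 21h AH=02h (write the character in DL) and INT 21h AH=09h (write the '$'-terminated string at DS:DX).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical message format; the real OS's messages are not shown here. */
enum msg_type { MSG_PUTCHAR, MSG_PUTS };

struct msg {
    enum msg_type type;
    uint8_t  ch;        /* MSG_PUTCHAR: character to print */
    uint16_t buf_seg;   /* MSG_PUTS: real-mode segment of a '$'-terminated string */
    uint16_t buf_off;   /* MSG_PUTS: offset of that string within the segment */
};

/* Hypothetical record of the register state the vm86 monitor would load
 * into the DOS task before reflecting the software interrupt into it. */
struct vm86_call {
    uint8_t  int_no;
    uint16_t ax, dx, ds;
};

/* Translate one message into a DOS call.
 * Returns 0 on success, -1 for a message type the shim does not handle. */
static int shim_translate(const struct msg *m, struct vm86_call *out)
{
    assert(m && out);
    out->int_no = 0x21;              /* DOS services */
    out->ax = out->dx = out->ds = 0;
    switch (m->type) {
    case MSG_PUTCHAR:                /* INT 21h, AH=02h: write character in DL */
        out->ax = 0x0200;
        out->dx = m->ch;
        return 0;
    case MSG_PUTS:                   /* INT 21h, AH=09h: write string at DS:DX */
        out->ax = 0x0900;
        out->ds = m->buf_seg;
        out->dx = m->buf_off;
        return 0;
    }
    return -1;
}
```

In this sketch the string for `MSG_PUTS` must already live in memory the DOS task can address, which is why the message carries a real-mode segment and offset rather than a 32-bit pointer.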

This is possible with DOS, unlike modern OSes, because it's such a crude and conceptually simple OS, centered around direct hardware access and with essentially no own memory protection/address space separation, that it just does not get in the way of implementing my own OS.

globular-toast · on Jan 26, 2020

Throughout my career I always had this niggling feeling that I don't really understand what's going on inside the operating system. Like many geeks I dislike magic tools and have to have at least a basic understanding of how I might implement it myself.

But I found that I was able to play with individual elements before having a system that's even remotely useful. I recommend the Xinu book. After only a few weeks' work (which was mostly research and reading, not programming) I had a system that could do a context switch. This was always the most magical part of an OS for me. After that I started to implement memory management, but it just got too hard to be fun (ARM64).

Rerarom · on Jan 26, 2020

So basically you do something like Windows 3.x?

anyfoo · on Jan 26, 2020

Or early OS/2, yap!

Santosh83 · on Jan 26, 2020

Just curious if someone has done OS dev in a top-down manner. That is, work within a modern multi-tasking OS like Windows or Linux and develop a collection of OS components (in the form of host OS modules?) that gradually replace functionality of the host OS? I imagine this would be much easier to do under Linux rather than Windows since the ABI/API you would hook into is open and documented. This way one can start developing just those components that interest you first, although I imagine key areas like memory management, scheduling and interrupts may not be replaceable from within a running OS?

pjmlp · on Jan 26, 2020

That is what rich language runtimes are all about.

When you do Common Lisp, Smalltalk, Java, .NET, Go, D, ... the underlying OS kind of loses its meaning.

The standard library can bind to OS primitives, or you can implement your hardware abstraction layer that allows for the language runtime to run bare metal.

If you squint a bit, UNIX is C's runtime, and gets everywhere via POSIX.

jstimpfle · on Jan 26, 2020

That's a misleading comment, and I can't find a valid point in there. You can run C on a different platform than UNIX, or "run it on bare metal", just as well. If you write C, you make reasonable abstractions, to achieve some amount of OS independency, just as well. And typical usage of "rich language runtimes" is on one of the mainstream OSes, just as well. You even often have some OS specific code in there, because the OS is not entirely interchangeable.

So, the OS does not lose more of its meaning if you move to rich language runtime environments. Those language environments are mostly about garbage collection. Most other things that make the OS "lose its meaning" are just libraries (that you can write in any language).

pjmlp · on Jan 26, 2020

Those language runtimes exist since the 60's, 10 years before C created the urban myth of being the first systems language, and many of them did not have anything to do with garbage collection.

Developing on Windows, and deploying across several UNIX flavours, mainframes and embedded targets has been pretty much OS independent for me during the last two decades, regardless how much OS specific code those runtime libraries might have inside of them.

Even if the CI/CD pipeline had to do OS specific builds.

jstimpfle · on Jan 26, 2020

> Those language runtimes exist since the 60's, 10 years before C created the urban myth of being the first systems language

I don't know what's your point here, or why you keep repeating this on HN hundreds of times. I didn't ever hear anyone claiming it was the "first systems language". Nor do I care.

(We all want a better language. I haven't tried many, but any other systems language I've seen I've found awkward, starting with Pascal with its mess of 1-based vs 0-based indexing, case insensitivity, weird mix of ref-counted / manually managed, weird syntax choices, mess of basic string types, requirement to create tons of typealiases... going to older languages that have upper-case only identifiers, for example).

As to the rest - it's just a matter of libraries. Not runtimes.

EDIT: it seems some definitions of runtime include standard libraries. So, whatever.

bsenftner · on Jan 26, 2020

If one is writing cross platform enterprise software, any aspects of the underlying OSes get abstracted away and replaced with logic specific to your product's business cases. It is not unheard of for an enterprise vendor to promote specific hardware, OS or other environment aspect because their software has been developed with knowledge of error and performance situations native to that configuration. In many ways, a mature enterprise application is software that has been written to include all popular hardware and software environments.

dredmorbius · on Jan 26, 2020

GNU: https://en.wikipedia.org/wiki/GNU_Project

d33 · on Jan 26, 2020

It's x86, isn't it? I'd love to see something as small as this one that wouldn't rely as much on BIOS.

The way I understand it, the following code is based on one of the assumptions you can't make without x86 BIOS, can you?

    /* video memory begins at address 0xb8000 */
    char *vidptr = (char*)0xb8000;

jng · on Jan 26, 2020

That doesn't depend on BIOS, it depends on the standard "frame buffer" address for CGA/EGA/VGA graphics cards (I'm not 100% sure they all use the same address, those three and Hercules were the standard back in the day, and it seems my memory has started dropping nearly-useless information from 30 years ago, not a bad idea).

I quote-unquote frame buffer because it's text mode, one byte for the ASCII code (extended to 256 characters in configurable and often confusing and incompatible ways, MS had "codepages" and ISO has the equivalent Latin-1 through Latin-14 or so), and one byte for the attribute.
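
As a rough illustration of that two-bytes-per-cell layout, here is a small C sketch. The helper names `vga_attr` and `vga_put` are made up for this example, and the 80x25 geometry is an assumption (it is the usual default text mode). The buffer is passed in as a parameter so the code can be tried against an ordinary array; on real hardware it would be the 0xb8000 pointer from the snippet above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { VGA_COLS = 80, VGA_ROWS = 25 };  /* assumed 80x25 text mode */

/* Attribute byte: background colour in the high nibble, foreground in the
 * low nibble (bit 7 may mean "blink" instead, depending on the mode). */
static uint8_t vga_attr(uint8_t fg, uint8_t bg)
{
    return (uint8_t)((bg << 4) | (fg & 0x0F));
}

/* Store one character cell at (row, col). `base` stands in for the frame
 * buffer; on real hardware it would be (volatile uint8_t *)0xb8000. */
static void vga_put(volatile uint8_t *base, size_t row, size_t col,
                    char ch, uint8_t attr)
{
    assert(row < VGA_ROWS && col < VGA_COLS);
    size_t cell = (row * VGA_COLS + col) * 2;  /* two bytes per cell */
    base[cell]     = (uint8_t)ch;              /* character code */
    base[cell + 1] = attr;                     /* colour attribute */
}
```

For example, `vga_put(buf, 0, 0, 'A', vga_attr(7, 0))` stores a light-grey-on-black 'A' in the top-left cell.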

Many modern cards used to keep the basic VGA compatibility upon boot, until you switch them into more sophisticated modes. I'm not sure if they still do these days. I'm pretty sure QEMU, which is the actual target for the minimal demo kernel above, supports it.

BIOS used INT 10h for video operations, from mode setting to printing characters; code that calls INT 10h is actually relying on BIOS. IIRC next to nobody used it except for mode setting, since you could much more easily and flexibly write the right bytes yourself (with a little bit more work for compatibility). Plus you could use DOS INT 21h to actually print output strings with a bit more of a concept of a terminal.

emily-c · on Jan 26, 2020

You're still relying pretty heavily on the BIOS to write to this region that you otherwise wouldn't be. Getting the legacy framebuffer region set up to be decoded to the graphics card is using some hacks to get it to work over PCI[e]. The memory controller/PCI root complex needs to be configured to forward transactions to the legacy A and B segments to the PCI bus using a proprietary bit to enable that MMIO hole. To enable correct decoding of that, all PCI bridges on the path to the graphics device need to be specially marked in their config space bridge control as being "VGA compatible" so that legacy framebuffer hole requests and VGA IO ports can ultimately be forwarded to the graphics device. This needs to exist because the root ports/host bridges in the system base and limit need to encompass the addresses of their children as well as needing to exist in a hole that the memory controller/root complex knows to forward requests down to PCI.
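
For the bridge part, a small C sketch of what checking that marking could look like. It assumes the legacy port-based configuration mechanism (address written to port 0xCF8, data read from 0xCFC) and the standard PCI-to-PCI bridge header, where Bridge Control is the 16-bit register at offset 0x3E and bit 3 is VGA Enable. The function names are invented for this example, and the actual port I/O is left out so the logic can be checked in isolation. The chipset-specific bit on the root complex side is not covered.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_BRIDGE_CONTROL   0x3E    /* 16-bit register in the bridge header */
#define PCI_BRIDGE_CTL_VGA   0x0008  /* bit 3: VGA Enable */

/* Build the value written to port 0xCF8 to select a config-space dword.
 * Config accesses are dword-aligned, so the low two bits of reg are dropped. */
static uint32_t pci_config_address(uint8_t bus, uint8_t dev,
                                   uint8_t func, uint8_t reg)
{
    assert(dev < 32 && func < 8);
    return 0x80000000u | ((uint32_t)bus << 16) | ((uint32_t)dev << 11)
         | ((uint32_t)func << 8) | (reg & 0xFCu);
}

/* Bridge Control occupies the upper 16 bits of the dword at offset 0x3C. */
static uint16_t bridge_control_from_dword(uint32_t dword_at_0x3c)
{
    return (uint16_t)(dword_at_0x3c >> 16);
}

/* Nonzero if this bridge forwards the legacy VGA ranges downstream. */
static int bridge_forwards_vga(uint16_t bridge_control)
{
    return (bridge_control & PCI_BRIDGE_CTL_VGA) != 0;
}
```

A kernel would walk the bridges between the root and the graphics device and expect `bridge_forwards_vga` to be true for each of them before relying on the 0xb8000 window.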

jng · on Jan 26, 2020

Thanks for the very complete explanation. It makes sense that this would work this way.

d33 · on Jan 26, 2020

My bad then, thanks for clarifying. Anyway, would it still work on x86_64?

jng · on Jan 26, 2020

Maybe emily-c knows the details. If the processor starts in real 16-bit mode these days like it used to do in the past (8086-compatible instruction decoding, flat mapping from virtual to physical memory, protected mode features mostly disabled, *16 segment registers, etc), there is no reason there should be a difference between 32-bit and 64-bit CPUs. But who knows, these things can get pretty hairy, and a lot of it is quite arbitrary, so you need to research the boot process of a modern x86_64 CPU.
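
The "*16 segment registers" part is simple enough to write down. A tiny C sketch (the function name `real_mode_phys` is just for illustration): in real mode the physical address is the segment value times 16 plus the offset, which is how the familiar segment 0xB800 lands on the 0xb8000 address from the kernel snippet.

```c
#include <stdint.h>

/* Real-mode address translation: physical = segment * 16 + offset.
 * The result can exceed 20 bits (up to 0x10FFEF); whether it wraps
 * around at 1 MiB depends on the state of the A20 line. */
static uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

So `real_mode_phys(0xB800, 0x0000)` gives 0xB8000, and different segment:offset pairs can name the same byte, e.g. 0x07C0:0x0000 and 0x0000:0x7C00.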

emily-c · on Jan 26, 2020

It is still possible but is BIOS dependent :) With UEFI, graphics is preferred to be done with the GOP driver/protocol (which will ultimately interact with video BIOS/graphics card option ROM for you). During early boot setting up the VGA decoding can either not be done or disabled when booting with UEFI. You might need your UEFI BIOS to be in CSM mode (legacy BIOS compatibility mode for UEFI) to get this to work and a lot of modern systems don't support CSM as of late. You can write to the framebuffer using GOP runtime services, and the VGA IO ports would likely do nothing since they wouldn't be decoded to the graphics card. Full fledged OSes just load a graphics driver and directly talk to the hardware -- something that is unfortunately not wieldy for hobby OSes :(

akkartik · on Jan 26, 2020

Is there an 'Awesome' list of simple OS kernels somewhere?

eadan · on Jan 26, 2020

It's far from comprehensive, but here are a handful of operating system projects that have caught my eye:

[1] MezzanoOS, an OS written in Common Lisp (https://github.com/froggey/Mezzano)

[2] SerenityOS, a Unix-like OS (https://github.com/SerenityOS/serenity/). The primary author has a youtube channel (https://www.youtube.com/c/AndreasKling/)

[3] RedoxOS, a microkernel OS written in Rust (https://www.redox-os.org/)

[4] Nebulet, a microkernel based on webassembly (https://github.com/nebulet/nebulet)

[5] It was never released by Microsoft, but Midori OS had some fascinating ideas like software isolated processes, asynchronous message passing and object capabilities. Nice write-up about the project by Joe Duffy here http://joeduffyblog.com/2015/11/03/blogging-about-midori/

pjmlp · on Jan 26, 2020

As an addendum, Midori was used by Bing in production.

> While never reaching commercial release, at one time Midori powered all of Microsoft’s natural language search service for the West Coast and Asia.

https://www.microsoft.com/en-us/research/project/singularity...

And although not quite the same, at least Singularity's code is available.

The WinRT .NET 8.x compiler was based on Singularity's way of compiling[1][2] and UWP .NET Native kind of had some learnings from Midori [3].

[1] - https://channel9.msdn.com/Shows/Going+Deep/Mani-Ramaswamy-an...

[2] - https://channel9.msdn.com/Events/Build/2012/3-005

[3] - https://channel9.msdn.com/Shows/Going+Deep/Inside-NET-Native

pjmlp · on Jan 26, 2020

Here are some other ones,

Oberon System - http://www.projectoberon.com/

COSMOS - https://www.gocosmos.org/

JNode - https://www.jnode.org/

TamaGo - https://github.com/f-secure-foundry/tamago-go

A2/Bluebottle - http://cas.inf.ethz.ch/projects/a2

deaddodo · on Jan 26, 2020

The osdev wiki has a barebones example for a few languages. However, they're all x86 centered and literally just get you booted to a main entrypoint. Which, arguably, is about as far as they should go, since osdev takes quite a bit of dedication and research.