Let’s write a simple Kernel (2014) (arjunsreedharan.org)
350 points by mmphosis on Oct 20, 2017 | 50 comments



Bare metal projects are always a fun hobby. I've done some projects like this in the past [0], let me share some things I learned the hard way.

If the author is reading, here are a few corrections.

> char * vidptr = (char * )0xb8000;

This needs the "volatile" qualifier, or the writes to video memory will be dropped when optimization is enabled.

Memory mapped I/O like this is one of the few use cases where you need "volatile" in C.
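A minimal sketch of what that looks like (vga_putc is a hypothetical helper; 0xb8000 is the VGA text buffer address from the article):

```c
#include <stddef.h>

/* Write one character and its color attribute into VGA text memory.
   volatile tells the compiler each store is an observable side effect,
   so it must not drop or reorder them even at -O2. */
static void vga_putc(volatile char *vid, size_t cell, char c, char attr)
{
    vid[cell * 2]     = c;    /* character byte */
    vid[cell * 2 + 1] = attr; /* attribute byte (0x07 = light grey on black) */
}

/* In the kernel itself:
 *   vga_putc((volatile char *)0xb8000, 0, 'A', 0x07);
 */
```

Without volatile, the compiler sees stores to memory that is never read back and is free to eliminate them entirely.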

> gcc -m32 -c kernel.c -o kc.o

You need to build a cross-compiler for bare metal projects. Using the system gcc will not work in the long run. The compiler packaged with your operating system is intended for building binaries that run on your operating system; it may carry downstream patches or be configured for a target in ways that will cause problems.

The worst part is that it may seem to work for a while, until it doesn't. I ran into some hard-to-debug issues with my past projects and learned this the hard way. If I recall correctly, the issues were related to redzones and ABI conventions.

You need to build GNU binutils and GCC for the "i686-pc-elf" target. I documented this process for someone else's bare metal project here [1].

It really pays off to do this right from the start. Once you have a cross-compiler and a build system that can produce debuggable ELF images (you also need to build gdb for the target), things get much easier. The built-in debuggers in QEMU or Bochs can only get you so far; a proper debugger with symbols and source view will make your work far more pleasant.

For a build system, plain old Makefiles work best; CMake and other high-level build systems are a pain in the ass with bare metal projects that require linker scripts, etc.
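For reference, a minimal sketch of such a Makefile, assuming the article's file names (kernel.asm, kernel.c, link.ld) and an i686-pc-elf cross toolchain; the CROSS prefix and flags are assumptions you should adjust for your setup:

```make
# Sketch only; file names follow the article, CROSS is an assumed prefix.
CROSS   = i686-pc-elf-
CC      = $(CROSS)gcc
LD      = $(CROSS)ld
CFLAGS  = -ffreestanding -fno-builtin -nostdlib -c

kernel: kasm.o kc.o link.ld
	$(LD) -T link.ld -o kernel kasm.o kc.o

kasm.o: kernel.asm
	nasm -f elf32 kernel.asm -o kasm.o

kc.o: kernel.c
	$(CC) $(CFLAGS) kernel.c -o kc.o

clean:
	rm -f kernel kasm.o kc.o
```

With a proper cross-compiler the article's -m32 flag becomes unnecessary, since the toolchain already targets 32-bit ELF.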

[0] https://github.com/rikusalminen/danjeros [1] https://github.com/Overv/MineAssemble/blob/master/README.md


> You need to build GNU binutils and GCC for the "i686-pc-elf" target.

coreboot has similar issues, and so we have maintained our cross-compiler build script for 10 years now [0] (although historically using i386-elf instead of i686-*-elf). I sometimes wonder if there's value in packaging our compilers (all 8 architectures) for other users.

[0] https://review.coreboot.org/cgit/coreboot.git/tree/util/cros...


>"You need to build a cross-compiler for bare metal projects."

Might you have any useful links or documentation on this that you could share? Thanks.


I created a docker container that hosts the gcc cross-compiler I used to build my own OS:

https://hub.docker.com/r/brett/gcc-cross-x86_64-elf/

It hosts the gcc 7.1 x86_64-elf compiler with the red zone disabled.

If you don't want to use the docker container, you can view the dockerfile to see the steps required to build the cross-compiler yourself:

https://github.com/beevik/docker/tree/master/gcc-cross-x86_6...


I found this a while back, it works great for compiling a cross compiler for Linux and Windows.

https://github.com/lordmilko/i686-elf-tools

There's also a bit of information on OSDEV if you are wondering why you need a cross compiler:

http://wiki.osdev.org/GCC_Cross-Compiler


FWIW, the need to bootstrap a cross-compiler from source is not inherent to cross-compilers; it's possible to implement a compiler in such a way that every version of the compiler is automatically a cross-compiler.

I remember a few years back when I was trying to play with MINIX. Tanenbaum got several million EUR and hired some grad students to work on the thing. They promptly replaced much of the system with NetBSD. ("Perhaps too much", you can hear Tanenbaum say in one of his talks.) As a result of this the system compiler ACK was switched out for LLVM/clang. I complained on the mailing list about this because a full system build from source is something that used to be doable in <10 minutes—something Tanenbaum used to boast about—and it was now taking 3 hours if you decided to blow away your source/build directory and do a from-scratch build. The worst part is that the MINIX core, i.e., all the interesting parts, still only accounted for ~10 minutes of that build time, and virtually all the rest was spent compiling and then recompiling LLVM. The response I got from one of the aforementioned grad students was that this is "just how cross-compilers work". No, pal; that's how the compiler that you chose works.

Later, the Go folks fixed their compiler to be a cross-compiler by default. See https://dave.cheney.net/2015/03/03/cross-compilation-just-go...

IMO, it's unforgivable that any given mainstream toolset wouldn't make this a baseline project goal.


The Plan 9 compilers work exactly like this, every compile is a cross-compile. It's extremely convenient because you can do all of your builds on your really fast machine for all of your other machines or if you're using a slow machine. It's one of the reasons I can live happily with a raspberry pi as one of my main workstations. All you need to do to build for a different target is change $objtype.


Afaik LLVM/Clang doesn't need bootstrapping from source for cross-compiling, but you can choose to include/exclude target support at build time. GCC and binutils can only have one target.

Was the LLVM/Clang build done for building a cross compiler to build Minix or was it to build a compiler that runs on Minix? The latter would be unavoidable, but for the former just having LLVM pre-installed on the build machine should do.


>"I remember a few years back when I was trying to play with MINIX. Tanenbaum got several million EUR and hired some grad students to work on the thing."

This piqued my interest, what was this project exactly? Can you elaborate on this anecdote?


Great post. Thanks for sharing!


Comments like this are why I still use HN. Thanks a bunch!


what is a redzone in your terminology ?


You know what, I'm not actually sure. All I know is that it screws up your stack frames. I think I had to provide -mno-redzone on GCC command line.

It's something in the ABI (I was on x86_64, the ABI is rather complex) related to padding the stack frames when calling functions.

I don't remember if this applied to all functions or was it something special needed for interrupt handlers.


The Unix x86_64 userspace ABI guarantees that the 128 bytes below the stack pointer (known as the red zone) will not be clobbered by signal handlers. This is an optimization that avoids the need for leaf functions to manipulate the stack pointer to allocate their stack frames.

The linux kernel does not follow this part of the ABI, so the compiler needs to be told that there is no red zone or functions will randomly see their stack frame clobbered by interrupt handlers.
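To make the failure mode concrete, here is a sketch (sum_squares is a made-up example) of the kind of code the red zone optimization applies to:

```c
/* A leaf function: it calls nothing, so under the SysV x86-64 ABI the
 * compiler is free to keep its locals in the 128 bytes below %rsp (the
 * red zone) without ever adjusting the stack pointer. That is safe in
 * userland, where the kernel places signal frames past the red zone;
 * inside a kernel, an interrupt pushes state right at %rsp and can
 * clobber those locals, hence -mno-red-zone for kernel code. */
static int sum_squares(int n)
{
    int total = 0; /* with the red zone, may live at a negative offset from %rsp */
    for (int i = 1; i <= n; i++)
        total += i * i;
    return total;
}
```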


> The linux kernel does not follow this part of the ABI, so the compiler needs to be told that there is no red zone or functions will randomly see their stack frame clobbered by interrupt handlers.

You're absolutely right, but I would add the qualifier that it is impossible for a kernel to use the red zone optimization. Certain architectures (like x86) push data onto the stack when an interrupt happens, and there is no way to tell them to put that data below the 128-byte red zone.


you could use some register other than %rsp for your stack :)


Doesn't red-zoning break the fundamental contract of a stack pointer -- i.e., that it marks the end of the stack (and hence the location of the next item to be pushed)? This feature sounds like a bug to me...


In some ways yes, but it only applies to leaf functions, i.e. functions that call no other functions, so "nobody" will ever know they never moved the stack pointer anyway. In those situations, moving the stack pointer is just pointless extra work, because it is a value that will never be used before being restored to its original value.

I say "nobody" because signal handlers in particular (which are basically just userland interrupts) can interrupt any part of userland code. However, since they are generated entirely by the kernel, it is easy enough for the kernel to honor the red zone and place the signal frame 128 bytes below the stack pointer, so in practice this isn't that big of a deal.



My favourite kernel development tutorial is bkerndev [0]. It provides easy-to-read sources that can be used as a good base for simple projects.

Shameless plug: I used bkerndev verbatim for my bare metal project - Nope OS [1] - a C64-like system that I built for my son when he was born, so that he could get to know computers the same way I did. :)

[0] http://www.osdever.net/bkerndev/Docs/intro.htm

[1] https://github.com/d99kris/nopeos


It's not much of a kernel. More like a self-contained bare-metal application. The main goal of a kernel is to provide services to applications.

Although starting to code right away and getting tangible results is somewhat rewarding, it would be nice to see the core concepts of designing a kernel mentioned in a "Kernel 101" article:

* How to choose between monolithic/micro/nano architecture?

* What are primary goals and use cases?

* What hardware should it abstract away in HAL?

* How to design userland-kernel and drivers-kernel API/ABI?

* How much isolation is needed and what are ways to provide it?


I had a question about some of these passages, as they appear contradictory and/or incorrect.

>"Most registers of the x86 CPU have well defined values after power-on. The Instruction Pointer (EIP) register holds the memory address for the instruction being executed by the processor. EIP is hardcoded to the value 0xFFFFFFF0. Thus, the x86 CPU is hardwired to begin execution at the physical address 0xFFFFFFF0. It is in fact, the last 16 bytes of the 32-bit address space. This memory address is called reset vector.

This says that EIP is doing a JMP to an address in RAM, not a memory-mapped I/O address which points to the ROM where the BIOS is stored.

>"Now, the chipset’s memory map makes sure that 0xFFFFFFF0 is mapped to a certain part of the BIOS, not to the RAM. Meanwhile, the BIOS copies itself to the RAM for faster access. This is called shadowing. The address 0xFFFFFFF0 will contain just a jump instruction to the address in memory where BIOS has copied itself."

The second says that the CPU is doing a JMP to a memory-mapped I/O address which points to ROM.

So the first passage says the CPU is just doing a JMP to 0xFFFFFFF0 in RAM, while the second says it is doing a JMP to a memory-mapped I/O address which points to ROM.

Are these not completely contradictory or am I reading this wrong?

It's always been my understanding that the reset vector always pointed to a memory-mapped address located in ROM. Since the BIOS' POST routines contain code to initialize and test memory (the BIOS will actually emit beep codes if memory is missing or faulty), this would be a chicken-and-egg problem.

Also, the post mentions that the chipset loads the BIOS into RAM in a process called "shadowing", which was done because ROM used to be slow. But since BIOS ROM these days is generally NAND flash, I don't believe this is the case any longer.

Also these two statements appear to contradict each other:

>"All x86 processors begin in a simplistic 16-bit mode called real mode. The GRUB bootloader makes the switch to 32-bit protected mode by setting the lowest bit of CR0 register to 1. Thus the kernel loads in 32-bit protected mode."

>"Do note that in case of linux kernel, GRUB detects linux boot protocol and loads linux kernel in real mode. Linux kernel itself makes the switch to protected mode."

The first states that the kernel loads in protected mode, while the second states that the kernel loads in real mode.


Grub itself is mostly written as a 32-bit program (almost its own OS by now, actually). The x86 Linux kernel expects to perform the protected mode switch on its own in the early assembly code, right before the zImage decompression starts. So Grub has to switch back to real mode in order to jump into this part of the Linux kernel.

Grub has other ways to boot kernels. I once used a method that looks for a specific signature inside the kernel blob, telling Grub to stay in 32-bit mode, put the blob at a certain memory address, and jump into it. This was a feature of Grub 1 back in the day; I do not know if it was removed in Grub 2.
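The switch itself amounts to setting bit 0 (PE) of CR0 and then far-jumping to reload CS. A sketch, with cr0_enable_protected_mode as a hypothetical helper just to show the bit arithmetic (the real switch has to be done in boot assembly):

```c
#include <stdint.h>

#define CR0_PE 0x1u /* Protection Enable: bit 0 of CR0 */

/* Compute the CR0 value that enables protected mode. */
static inline uint32_t cr0_enable_protected_mode(uint32_t cr0)
{
    return cr0 | CR0_PE;
}

/* The actual switch, done in boot assembly, is roughly:
 *   mov eax, cr0
 *   or  eax, 1          ; set PE
 *   mov cr0, eax
 *   jmp 0x08:pm_entry   ; far jump to reload CS with a 32-bit selector
 */
```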


Thanks, yeah, I guess I hadn't really thought before about who flips the protected mode bit, Grub or the kernel. It's documented in the section "3.2 Machine state":

>"When the boot loader invokes the 32-bit operating system, the machine must have the following state ... ‘CR0’

Bit 31 (PG) must be cleared. Bit 0 (PE) must be set. Other bits are all undefined."[1]

However, it sounds like Grub also understands the Multiboot specification, so it can keep protected mode set if it's booting a kernel that expects that. It's not clear to me how Grub would determine this.

http://www.gnu.org/software/grub/manual/multiboot/multiboot....


The reset vector must be a mapped ROM address, otherwise how would the jmp instruction be loaded? But I have also seen it said in some places that it can be configured to point to DRAM as well [0]. I am not sure how that would work.

[0] (page 5)- https://firmware.intel.com/sites/default/files/resources/A_T...


Right, and also, since that memory-mapped I/O address points to ROM whose backing store is likely NAND flash these days, there should be no reason to shadow the BIOS in RAM as the article states it does, no?

Also thanks for the link.


Decades ago, I wrote a very simple operating system. It was an MSDOS EXE that when run, took over all available memory and rewrote all the interrupt vectors. You couldn't "exit" back to DOS because by then it had gone - you could only power-cycle the machine and let MSDOS boot up normally.

Because it launched from DOS, many insisted to me it wasn't really an operating system. What was it if it wasn't an operating system?


Novell NetWare actually used DOS as a glorified boot loader. Also, there was at one point a small DOS program that acted as a Linux boot loader: it brought you from a DOS prompt straight to a Linux virtual terminal without a reset. I think SuSE shipped it for a while as an alternative means to launch their setup, but memory fades.


loadlin.exe! It ran under memory-unsafe Windows (95/98/Me) as well.


An operating system exports abstractions of hardware, storage, and other useful facilities to the software we call applications. How you get there doesn't matter; what you do once you are there makes the difference.

So I guess both you and your friends were wrong.


According to the File Manager the linux kernel is a DOS/Windows program:

  Name: vmlinuz-*
  Kind: DOS/Windows executable


You could think of MSDOS as a glorified bootloader...


To be fair, Windows used MS DOS as a bootloader for decades.


Only the consumer line that ended at Windows ME.

The Windows NT line, which subsumed the consumer line starting at Windows 2000, always used its own bootloader. So has Windows CE.


Well yes, clearly. That's why I used past tense rather than saying "still uses" and why I didn't mention NT nor CE (I also didn't mention Microsoft BASIC nor Xenix - just so you know they didn't use DOS as a bootloader either :P). I figured the context was obvious given the audience of HN, but clearly not.

Anyhow, my point was just to reinforce the GP's point that it's ok to think about DOS as a bootloader because Microsoft have done exactly this themselves.


Sure that's what the splash screen was actually hiding right?


The splash screen would start when win.exe (or was it win.com? I forget now) started. So you'd catch a glimpse of DOS before the splash. This was back in the 9x era though. Back in the days of 3.x I think it was still pretty common to boot to DOS and then start Windows manually.


Yes, in DOS/Win 3.1 you might have win.exe at the end of autoexec.bat, or you might have dosshell in there. You might have some Novell login.exe instead. Or maybe it was good old command.com.

Win95 had a full-screen splash, which was an RLE file called logo.sys, loaded by the Win95 boot process. I don't think autoexec even ran under 95/98/Me. I'm sure Wikipedia has the full history though.


Autoexec did run on Win95. I remember spending hours tweaking it back in those days, though I think it was mainly used for DOS programs and maybe some environment variables (e.g. for Java).

I remember writing my own logo.sys boot splashes. If I recall correctly they were just bitmaps with a couple of values altered in a hex editor to define which sequence of colours (in the colour palette) to animate.


"This programming tutorial is great, but it could really use some giant meme images," said no one ever.


Oh, that kind of kernel. There are many kernels - GPU kernels, SVM kernels, etc.


...what Colonel?


Am I the only one who's pretty much done with reading these sorts of rinse-and-repeat articles about writing a bootloader in the name of a kernel?


Nobody is forcing you.

I wish we had more of these, not least because of the first comment: "great post. Didn't know it was this easy ;)"

People assume a lot of low level stuff like this is near magic not for mere mortals, and it's a great shame, because there's a lot of this kind of low level stuff that more people would benefit from playing with.

And because demystifying it is a great path to get people into kernel hacking even if they don't end up writing their own complete kernel.


This stuff is not magic, but it requires a lot of reading and hunting for proper documentation on a platform as convoluted as x86. An ARM board would probably be way easier to play with. I never looked at the Pi; how well does it lend itself to this type of hacking?


Specific ARM boards might very well be easier, in that a lot of them are clean-slate designs, so you're not dragging along nearly 40 years of PC history. But the ARM ecosystem itself is plagued by the fact that OEMs can put pretty much whatever they want in there, so there is a proliferation of really weird beasts.


The article uses the grub bootloader so that part is missing. But I was expecting more when I read "kernel" as well.


You could jump into a project like MikeOS, which has guides for modifying and extending [0], but the know-how from a bootloader should be enough to get you going.

[0] http://mikeos.sourceforge.net/handbook-sysdev.html


I share the same feeling. I even wrote a better bootloader, one that could load a simple shell from a floppy, when I was 13. It was only 16-bit and used BIOS INTs, though.


A simple Kernel shared with the community is better than a full-featured one kept to yourself.




