> It takes about 2 hours to boot to bash prompt ("init=/bin/bash" kernel command line). Then 4 more hours to boot up the entire Ubuntu ("exec init" and then login). Starting X takes a lot longer.
It's this slow not because of the MCU, though, but because the author is emulating another architecture in software. It would be fairly sluggish on a PC too.
This is both very impressive in terms of the sheer amount of work that went into it, and a tiny bit disappointing, because it's not really Linux running natively on an 8-bit chip.
I suspect that native Linux is technically doable and would be a lot faster, but it would require a lot of fiddling with the kernel, to the point where you'd be rewriting as much as you're keeping. A lot of assembly shims to port, gotta get rid of all the MMU code, etc. AVR chips in particular are also a bit weird because they follow the Harvard architecture - separate addressing for code and data - which is not something the Linux kernel is made for.
FWIW, modern 8-bit MCUs are actually pretty darn fast. They have clocks all the way to 50 MHz or so, single-cycle execution for most opcodes, some have DMA and other cool gizmos on the die. They are orders of magnitude faster than the 8-bit tech we had back in the 1980s.
That's basically just NuttX. Along similar lines is Zephyr, which is actually managed by the Linux Foundation and broadly follows kernel conventions where reasonable.
It's really hard to meaningfully compare. They're designed for other things.
Most desktop or server CPUs prioritize massive parallelism, which is useful for OSes and multithreaded apps. In contrast, most MCUs have just one single-threaded core. CPUs are expected to run with sophisticated and bulky active cooling, so they reach speeds up to 3-5 GHz; in contrast, MCUs are almost always used without any added cooling and need to be power-efficient, so they seldom venture above 500 MHz. CPUs require external power controllers for dynamic voltage scaling to realize that performance. MCUs are often expected to run off a single supply, often something like "anywhere between 1.8 and 3.3 V". CPUs have a variety of hardware accelerators for operating on gobs of data (e.g., AVX vector instructions). In contrast, the most an MCU is typically expected to do is handle some low-resolution camera, and only some higher-end models have a hardware floating point unit.
Most benchmarks are optimized for desktop tasks too, so you can expect MCU scores to be two orders of magnitude lower. But that doesn't mean they perform that much worse at the tasks they're intended for.
In situations where you need to do some real-time ML-based image classification, drive a 4K display, and stream H.265 videos, you generally don't reach for a traditional MCU, but for a CPU, with all the extra power supply complexity and thermal management issues this entails.
SoCs blur the line somewhat, because they often combine a CPU core, a GPU, lots of memory, and a bunch of other things in a single package - making them essentially a fully-fledged computer that can run Linux, but is about as easy to integrate as an MCU. These are still slower than top-of-the-line desktops, but they're in the "one order of magnitude" territory.
Do people know about new RISC-V based boards to experiment with? The other day I found Milk-V Duo [0], a $9 computer that runs Linux, and watched a cool video demonstration [1]. Another interesting board is the BeagleV®-Ahead [2], but there must be many more such open source HW/SW projects.
Some kind of Raspberry Pi style desktop replacement (or capable of that). GPU, GBs of RAM, plug in USB peripherals, fast external storage, ...
Microcontroller: on the lower extreme of CPU power, no graphics, small flash/RAM, grab a soldering iron to hook up peripherals. There are ~$0.10 parts there.
Boards like the Milk-V Duo fall in between: uC-like, faster CPU, larger embedded RAM, may produce video output, but not suitable for running a modern desktop OS.
Very different beasts for very different applications.
It's a pretty interesting question, even if a bit tangential to the article!
Their main advantage is the comparatively low transistor count. The chips can be made cheaply and don't need cutting-edge fabs. This also makes them exceptionally power-efficient and tolerant of a wide range of supply voltages, so they're easier to integrate into many designs.
Just as important, they are far easier to master. Single-cycle instruction execution, single chip-wide clock, simple interfaces to all peripherals. If you don't have an OS to abstract it all away, most of the contemporary 32-bit platforms are a serious pain. Adjusting clock speed on a typical Cortex-M chip requires following a carefully-choreographed sequence of steps, possibly fiddling with flash wait states, various bus clocks, PLLs, and so on. On an 8-bit MCU, you just flip a bit or two.
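For comparison, here's roughly what a full clock change looks like on a classic AVR - a minimal sketch assuming an ATmega328P-style part and its avr-libc register names; the timed unlock sequence is the only subtlety:

    #include <avr/io.h>

    /* Switch the system clock to (input clock / 8). CLKPR has to be
       "unlocked" by writing CLKPCE first, and the new prescaler value
       must follow within 4 cycles (do this with interrupts disabled). */
    static void clock_div_8(void)
    {
        CLKPR = (1 << CLKPCE);                  /* enable prescaler change */
        CLKPR = (1 << CLKPS1) | (1 << CLKPS0);  /* prescaler = 8 */
    }

That's the whole procedure - no PLL, no wait states, no bus dividers.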
It's a weird misconception that "8-bit" means antiquated or slow. Modern 8-bit MCUs run at high speeds, have plenty of fast memory, and integrate all kinds of cool on-die peripherals - DACs, ADCs, op-amps, DMA controllers, USB controllers, and so on. And in a good majority of embedded applications, they are enough.
Of course, there's nothing sacred about 8-bit ALUs, so I wouldn't be surprised if 32-bit ALUs eventually take hold in this market segment. But the replacements would need to preserve that "I can learn this in a weekend" property, which most of the high-performance 32-bit chips available today lack.
> This also makes them exceptionally power-efficient ...
> Single-cycle instruction execution
For what it's worth, single-cycle execution is really a detriment to power efficiency. There is an optimal number of logic levels per pipeline stage for CMOS processes which minimizes power (and it's usually quite small!). Extremely deep pipelining lets you hit a higher Fmax, but if you don't crank up the frequency, you will often save quite a bit of power.
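For a first-order feel, the textbook CMOS dynamic-power relation (a rough sketch, not specific to any particular chip) says

    P_{dyn} \approx \alpha \, C_{sw} \, V_{DD}^{2} \, f

and the attainable clock scales roughly with supply voltage, so a deeply pipelined core that could hit a high Fmax but is run well below it can also drop V_DD, which pays off roughly quadratically in energy per operation.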
I'm curious, knowing nothing about the determinants of power draw and process nodes: wouldn't a "nearer-to-cutting-edge" fab process still end up producing processors that draw even less energy? Or has that not been a goal of recent semiconductor progress?
The silicon that drives IO pins & oscillator(s) requires some physical size (for one, due to ESD protection reasons), and requires X amount of power.
A complex, modern CPU can consume many times that, so a cutting-edge process is useful to minimize overall power use.
With small/simple parts like a uC, not so much. So the cost/part, other considerations, and whatever vendor or silicon process provides that, is what matters.
Here, power savings come from sleep modes, disabled peripherals & (maybe) reduced or even stopped clocks. Not so much reduced feature sizes in the IC.
I appreciate your thorough answer. USB is pretty surprising, but with just a 256-byte address space, which I assume needs to account for NULL, plus the various hardware devices represented in that address space, I just don't get how it's possible.
As others are saying, "8-bit" doesn't mean 8-bit address space. It refers to the ALU "word" size instead - i.e., how many bits you can add, subtract, multiply, or divide in a single step. (Larger types are possible, you just need to do the operations in steps, just like 64-bit math on 32-bit platforms.)
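To make that concrete, here's what a 16-bit add looks like when spelled out in 8-bit steps (just an illustration; in practice you write uint16_t and let the compiler emit the equivalent add/add-with-carry pair):

    #include <stdint.h>

    /* 16-bit addition the way an 8-bit ALU has to do it: add the low
       bytes, note whether the result wrapped (the carry), then add the
       high bytes plus that carry. */
    uint16_t add16(uint16_t a, uint16_t b)
    {
        uint8_t lo    = (uint8_t)a + (uint8_t)b;
        uint8_t carry = lo < (uint8_t)a;   /* 1 if the low byte wrapped */
        uint8_t hi    = (uint8_t)(a >> 8) + (uint8_t)(b >> 8) + carry;
        return ((uint16_t)hi << 8) | lo;
    }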
Even in the 1980s, most 8-bit microcomputers had at least 64 kB memory. Today, many AVR chips will have 145 kB of memory (flash + SRAM + EEPROM), and pretty much arbitrarily more can be installed externally.
It’s worth mentioning that the linked article was written over a decade ago.
- Even today, 8-bit MCUs can be significantly cheaper than 32-bit MCUs. A lot of microcontrollers are used for very simple tasks that simply don’t need anything more than what an 8-bit MCU can do. Although these days, looking at Mouser and DigiKey, it seems 16-bit MCUs are comparable in price or sometimes cheaper? RL78-S1 may or may not count as 16-bit; I’m not familiar with it, and I’m seeing some conflicting info.
- Plenty of legacy designs certainly continue to get manufactured, since updating the design and software to support a more advanced/cheaper processor would both require requalification of the product and require engineering time.
- If you need the absolute lowest power consumption, a simpler, smaller, slower core can be helpful.
They're smaller (as in transistor count) and draw much less current than more powerful ones, which in some contexts (embedded/industrial, etc.) is quite important. For simple tasks such as "read a sensor, and if the value isn't between X and Y, send this message over the serial port and turn on the relay on GPIO4", anything more than a smallish 8-bit (sometimes even less) uC would be a huge waste of resources and a potential source of problems, since every piece of hardware or line of code added could hide a bug. You therefore add only what is absolutely necessary, which usually rules out an OS and the beefier hardware required to run it.
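For a sense of scale, a job like that fits in a couple dozen lines of bare-metal C. A purely hypothetical sketch - the pin assignment, thresholds, and the read_sensor/uart_send helpers are made up for illustration:

    #include <avr/io.h>
    #include <stdint.h>

    #define RELAY_PIN   PB4    /* "GPIO4": relay driver on port B, pin 4 (assumed) */
    #define LIMIT_LOW   100    /* value X (assumed) */
    #define LIMIT_HIGH  900    /* value Y (assumed) */

    /* Placeholder helpers: real code would do an ADC conversion and a
       polled UART transmit here. */
    static uint16_t read_sensor(void) { return 512; }
    static void uart_send(const char *msg) { (void)msg; }

    int main(void)
    {
        DDRB |= (1 << RELAY_PIN);                 /* relay pin as output */
        for (;;) {
            uint16_t v = read_sensor();
            if (v < LIMIT_LOW || v > LIMIT_HIGH) {
                uart_send("value out of range\r\n");
                PORTB |= (1 << RELAY_PIN);        /* relay on */
            }
        }
    }

No OS, no drivers to integrate, nothing else to go wrong.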
You don't need a microcontroller to build a realtime system, it could work equally well on an ARM or RISC-V machine. Determinism is key, not feature set; however, you will want to avoid modern features such as dynamic frequency scaling or the x86 system management mode that operate outside of the control of the operating system.
fair, I left out what I really wanted to say. The simplicity of a microcontroller means it's hard to shoot yourself in the foot when building realtime systems.
I laughed at this - amazing work once again.