I’m currently working on a product using rust on stm32 hardware. I’m very thankful to be using rust instead of c/c++ for a variety of reasons. That said, the big pain points so far have been:
Each HAL implements different things, or the same thing in different ways. I’m in the middle of switching from the stm32f1 I started on to the more capable stm32f4 and it’s been a painful switch. I assume this would also suck in other languages, but it seems fixable in rust. There’s also widely varied support for other MCUs.
Lack of emulation. You can emulate an arm cpu in qemu, but there’s no tools for mocking out hardware - which means testing has to happen on device (or you need to fragment your application into qemu-testable pieces)
Shared memory / global initialization is overly complex. If you have a resource (say a gpio pin) that you want to set up in main but use in an interrupt and nowhere else, you have to use like five layers of abstraction to “safely” do that.
You are going to be very interested in Hubris, our de novo Rust operating system for MCUs.[0] We have been doing most of our development on STM32H7s, but Hubris works on both the F3 and F4 as well. We'll be open sourcing it in advance of my colleague Cliff Biffle's presentation at the Open Source Firmware Conference[1]; stay tuned!
I've googled "Pock OS" in every configuration I can think of, and can't find any relevant results. Closest I can find is "POK kernel"... is this a typo or am I googling wrong?
The plan is to open source the repository on github roughly at the same time as the talk, so basically, you can set your calendar to that and then go check it out.
> Each HAL implements different things, or the same thing in different ways.
They are generated programmatically from core description files that are only available to STM. They are designed sloppily because a lot of the implementation work is automated. To really progress in this area FOSS developers are going to need to get ahold of these interface specifications.
> Lack of emulation. You can emulate an arm cpu in qemu, but there’s no tools for mocking out hardware - which means testing has to happen on device (or you need to fragment your application into qemu-testable pieces).
This should be OK and I almost prefer it. Get a JTAG or SWD adapter, learn how to use GDB and the self test hardware, and you'll be fine. Modern MCUs are designed for test and can be used in circuit just fine. You can control any part of the chip via the trace macrocell.
> Shared memory / global initialization is overly complex.
> This should be OK and I almost prefer it. Get a JTAG or SWD adapter, learn how to use GDB and the self test hardware
I'd really like to write a test that e.g. confirms that if PB13 goes high for longer than 10ms, an event is triggered. Yes, I can run that code on my MCU - but actually testing it requires pressing buttons. With an emulator, I could easily write code to mock out the behavior of external hardware and confirm that my code is handling that input correctly.
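Short of a real emulator, one way to get part of the way there is to keep the timing logic free of register access so it can be tested on the host. A minimal sketch, assuming a hypothetical `HighPulseDetector` type (not part of any HAL or PAC) that is fed pin samples and millisecond timestamps by whatever code actually reads PB13:

```rust
/// Hypothetical detector: reports an event once an input has stayed high
/// for longer than `threshold_ms`. It knows nothing about GPIO registers;
/// the caller feeds it pin samples together with a timestamp.
pub struct HighPulseDetector {
    threshold_ms: u32,
    /// Timestamp of the first high sample of the current high period.
    high_since: Option<u32>,
    /// Set once the event has been reported for the current high period.
    fired: bool,
}

impl HighPulseDetector {
    pub fn new(threshold_ms: u32) -> Self {
        Self { threshold_ms, high_since: None, fired: false }
    }

    /// Feed one sample. Returns `true` exactly once per high period, at the
    /// first sample where the pin has been high for more than the threshold.
    pub fn sample(&mut self, now_ms: u32, pin_high: bool) -> bool {
        if !pin_high {
            // A low sample ends the high period and re-arms the detector.
            self.high_since = None;
            self.fired = false;
            return false;
        }
        let start = *self.high_since.get_or_insert(now_ms);
        // wrapping_sub keeps the comparison valid if the tick counter wraps.
        if !self.fired && now_ms.wrapping_sub(start) > self.threshold_ms {
            self.fired = true;
            return true;
        }
        false
    }
}
```

On the device, the interrupt handler or polling loop would call `sample` with the real pin state; on the host, a test feeds it a scripted sequence of samples. This only covers the logic, not the peripheral setup, so it is not a substitute for the hardware mocking asked for above.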
> Related to my first point
I'm actually talking about a constraint of Rust - not the HAL or PAC.
  static G_LED: Mutex<RefCell<Option<LEDPIN>>> = Mutex::new(RefCell::new(None));

  fn main() {
      // get the gpio and pass off to mutex
      let mut led = gpioc.pc13.into_push_pull_output(&mut gpioc.crh);
      // Move the pin into our global storage
      cortex_m::interrupt::free(|cs| *G_LED.borrow(cs).borrow_mut() = Some(led));
  }

  fn interrupt() {
      static mut LED: Option<LEDPIN> = None;
      // load the led from the global mutex
      let led = LED.get_or_insert_with(|| {
          cortex_m::interrupt::free(|cs| {
              // Move LED pin here, leaving a None in its place
              G_LED.borrow(cs).replace(None).unwrap()
          })
      });
      // actually use the led gpio here
  }
If we were genuinely using G_LED from main and from the interrupt, the mutex song-and-dance makes sense, since we want to be careful about how we share use of this resource. But since we just want to set it up in main and use it in the interrupt it's a LOT of confusing overhead/abstraction.
Well, with Rust tooling that automates the creation of crates from these 'core description files' (aka CMSIS SVD, System View Description), there is a large interest from other CPU vendors too so SVD seems to have become a de facto standard.
For example, Espressif has SVD definitions for their CPUs (e.g. ESP32), which are definitely not ARM: https://github.com/espressif/svd, etc.
It's the PACs (peripheral access crates) that are generated from vendor supplied SVD files, including many patches. These are register definitions
Most HALs are actually hand written, often leveraging macros to implement similar functionality for multiple peripheral instances
Can you recommend any resources on using the self test HW (assuming JTAG/SWD/GDB are all setup, although I'd love to see any recommendations on those as well)? I currently don't know anything about the trace macrocell.
Check the documentation for GDB and OpenOCD and the overviews from STMicro. You can poke things into memory, like peripheral configurations into peripheral registers. You can have the trace cell generate events on the debug port based on memory access or other events, analogous to Intel/AMD performance counters and iperf/xperf. It also implements breakpoints and code stepping in hardware.
To do full mockups you usually need an IO board that can drive pins, but you can also have the trace macrocell jump to code that sets an IO pin's internal driver high and triggers the input side of the IO port.
Do you mean the register-level description generated from SVD? I thought one works mainly with the higher-level HAL traits, or is that not yet possible?
You would use the traits from embedded-hal to write e.g. a driver for some IC or when implementing functionality in a device agnostic way and then use (or write) a specific HAL driver implementing these
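To make that concrete, here is a rough sketch of the pattern. It deliberately does not depend on the real embedded-hal crate: the `OutputPin` trait below is a local stand-in modeled loosely on the shape of such traits, and `Led` and `MockPin` are hypothetical names for illustration only.

```rust
use core::convert::Infallible;

/// Local stand-in for a HAL trait (assumption: NOT the real embedded-hal
/// crate, only a trait of similar shape for illustration).
pub trait OutputPin {
    type Error;
    fn set_high(&mut self) -> Result<(), Self::Error>;
    fn set_low(&mut self) -> Result<(), Self::Error>;
}

/// Device-agnostic driver: works with any pin type implementing the trait.
pub struct Led<P: OutputPin> {
    pin: P,
    active_low: bool,
}

impl<P: OutputPin> Led<P> {
    pub fn new(pin: P, active_low: bool) -> Self {
        Self { pin, active_low }
    }

    /// Turn the LED on, driving the pin low if the LED is wired active-low.
    pub fn on(&mut self) -> Result<(), P::Error> {
        if self.active_low { self.pin.set_low() } else { self.pin.set_high() }
    }

    /// Turn the LED off (the opposite level from `on`).
    pub fn off(&mut self) -> Result<(), P::Error> {
        if self.active_low { self.pin.set_high() } else { self.pin.set_low() }
    }

    /// Give the pin back to the caller.
    pub fn release(self) -> P {
        self.pin
    }
}

/// Hypothetical host-side pin that just records its level.
#[derive(Debug, Default)]
pub struct MockPin {
    pub high: bool,
}

impl OutputPin for MockPin {
    type Error = Infallible;

    fn set_high(&mut self) -> Result<(), Self::Error> {
        self.high = true;
        Ok(())
    }

    fn set_low(&mut self) -> Result<(), Self::Error> {
        self.high = false;
        Ok(())
    }
}
```

A chip-specific HAL would provide its own pin type implementing the trait, and `Led` would work with it unchanged; the mock only exists so the driver can be exercised off-device.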
It's great that embedded-hal exists, but somehow the DMA drivers in the stm32f4xx_hal and stm32f1xx_hal crates have a pretty radically different api, which to me means something's a little broken.
One issue with STM devices is the sheer number of peripheral implementations, e.g. at least 3 USART implementations over the years, with all kinds of small differences. Pair this with inconsistent SVD files and you are halfway to the current HAL situation.
It's intended for async usage, but take a look at embassy-rs (not on crates.io yet)
The embassy contributors put a ton of work into unifying the pac for different versions of the peripherals and called it metapac
The F1 series is the original STM32 and it has notable hardware differences with the later family members that show up in the HAL. While the API has its sore spots there really is no good way for them to paper over all the differences between these variants. Avoid selecting F1 if you need optimal forward and cross compatibility. I'd also point out that the newer LL driver code is the "fixed" HAL wherever that's available.
i'm not a Rust-er but I hope this really takes off
after trying to switch to Basic, then Embedded Lua, then MicroPython, then TinyGo... I keep coming back to putting together microcontroller code in C/++, every time
if even one language can break the barrier, there's hope :)
Since I was mostly complaining, it might not sound like it - but I think that embedded Rust has arrived, for some MCUs. If there is good support for your chosen chip / dev board - there's no need to wait around for it to take off. You can even link C/C++ code if you need to use a library that doesn't have a rust equivalent.
Nim works rather well too on pretty much any MCU that can run C. I even tried Forth for a while myself. I’m glad Rust is making progress too, but it seems limited on MCU support.
You might want to take a look at the embassy project for a unified stm32 HAL. The idea is that it defines the APIs per peripheral version as defined by STM (spi v1, spi v2, usart v1 etc). The advantage is that a given peripheral can be used across all stm32 families with that version. This makes maintaining a HAL and keeping a unified API much simpler.
The other part is the stm32-metapac (not specific to async) that generates the PAC for any stm32 chip.
Type state programming with embedded Rust has saved me from a number of bugs so far. The important thing for me is that these bugs were caught at compile time, which makes a world of difference in terms of productivity. The biggest pain point is how different the libraries for different microcontrollers are. I feel they will converge over time as things stabilize; time will tell.
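For anyone who hasn't seen the pattern, a minimal sketch of type state applied to a pin. The `Pin`, `Input` and `Output` names are made up for this example and the "pin" is just a struct on the host, not a real HAL type:

```rust
use std::marker::PhantomData;

/// Marker types for the pin mode. They carry no data; they exist only
/// so the mode is part of the pin's type.
pub struct Input;
pub struct Output;

/// Hypothetical pin whose mode is tracked in the type parameter.
pub struct Pin<MODE> {
    number: u8,
    level: bool,
    _mode: PhantomData<MODE>,
}

impl<MODE> Pin<MODE> {
    pub fn number(&self) -> u8 {
        self.number
    }
}

impl Pin<Input> {
    /// Pins start out as inputs in this sketch.
    pub fn new(number: u8) -> Self {
        Pin { number, level: false, _mode: PhantomData }
    }

    pub fn is_high(&self) -> bool {
        self.level
    }

    /// Consumes the input pin, so the old handle cannot be used afterwards.
    pub fn into_output(self) -> Pin<Output> {
        Pin { number: self.number, level: self.level, _mode: PhantomData }
    }
}

impl Pin<Output> {
    pub fn set_high(&mut self) {
        self.level = true;
    }

    pub fn set_low(&mut self) {
        self.level = false;
    }

    pub fn is_set_high(&self) -> bool {
        self.level
    }

    pub fn into_input(self) -> Pin<Input> {
        Pin { number: self.number, level: self.level, _mode: PhantomData }
    }
}

// Calling `set_high()` on a `Pin<Input>` does not compile, because that
// method only exists on `Pin<Output>`. That is the class of bug that gets
// caught at compile time instead of on the device.
```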
We're using rust on STM32F1 chips as part of the electronics for a liquid rocket engine that my university's SEDS chapter is developing. Rust makes this all so much easier and quicker to write.
Nice article, but I wanted to point out that you seem to imply that Rust "mitigate(s) or completely eliminate(s)" deadlocks and race conditions. Neither of these is true:
Perhaps you were thinking about data races, which are unsafe and therefore not allowed by Safe Rust? The Rustonomicon (https://doc.rust-lang.org/nomicon/races.html) defines data races as:
- two or more threads concurrently accessing a location of memory
- one or more of them is a write
- one or more of them is unsynchronized
Possibly you knew all of this and just got carried away in your enthusiasm for Rust (very much shared!), but lest your readers think Rust is actual magic instead of merely amazing, it might be nice to clarify the article.
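To illustrate the distinction, here is a small sketch in entirely safe Rust. Every access goes through an atomic, so there is no data race by the definition above, yet an update is still lost. The function names are hypothetical, and the interleaving of two imagined threads A and B is written out by hand in a single thread so the outcome is deterministic:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Two "threads" each try to increment the counter with a separate
/// load and store. Each individual access is synchronized, so this is
/// not a data race, but the read-modify-write as a whole is not atomic.
pub fn lost_update() -> u32 {
    let counter = AtomicU32::new(0);

    // Hand-written interleaving: both read before either writes.
    let seen_by_a = counter.load(Ordering::SeqCst);
    let seen_by_b = counter.load(Ordering::SeqCst);
    counter.store(seen_by_a + 1, Ordering::SeqCst);
    counter.store(seen_by_b + 1, Ordering::SeqCst);

    // Two increments were attempted, but B overwrote A's result.
    counter.load(Ordering::SeqCst)
}

/// The same two increments done as single atomic read-modify-write
/// operations, which cannot interleave in the way shown above.
pub fn atomic_update() -> u32 {
    let counter = AtomicU32::new(0);
    counter.fetch_add(1, Ordering::SeqCst);
    counter.fetch_add(1, Ordering::SeqCst);
    counter.load(Ordering::SeqCst)
}
```

With real threads the first version would only misbehave some of the time, which is exactly what makes this kind of race condition hard to find, and the compiler accepts both versions without complaint.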