Very refreshing to see this. It is so much fucking easy to grok bare metal C com...

kennywinker · on Oct 6, 2021

Ah yes, nice easy to grok code like `curval &= ~(field_mask << shift);` :P

But for real - I’ve had way more luck grokking embedded rust than all of the bare metal C examples i’ve looked at. C breeds dense bittwiddling and code that relies on inscrutable compiler behavior. There are easier ways to learn how these systems work at a bare-metal level.

PaulDavisThe1st · on Oct 6, 2021

Would you like to propose or reference a way of doing bit-twiddling that is clearer than this?

Also hint: C doesn't breed bit-twiddling, writing software that actually interacts directly with hardware does.

Veserv · on Oct 6, 2021

They are just implementing a generic contiguous bitfield clear.

field_mask was probably constructed as ((1 << width) - 1) instead of as a manifest constant. So you can just do:

    ClearBitField(input, width, shift) { return input & ~(((1 << width) - 1) << shift) }

Now you just use that everywhere you would clear a contiguous bitfield which is a pretty common operation when operating on hardware. Now all your bit-twiddling is isolated to a single well-defined generically useful function instead of repeating it a billion times.

We know this is a generically valuable operation since this is basically a C implementation of the ARMv8 bfi (b)it(f)ield (i)nsert instruction with a fixed 0 argument or in assembly:

    BFI X{n}, XZR, #shift, #width
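A minimal C sketch of that helper, assuming 64-bit registers to match the X-register form above. The name `clear_bit_field` and the precondition checks are illustrative assumptions, not taken from the codebase under discussion. One detail the one-liner hides: a literal `1 << width` is an `int` shift in C, so the sketch builds the mask from a 64-bit unsigned constant and treats a full-width field as a special case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clear `width` contiguous bits of `input`,
   starting at bit position `shift`. */
static inline uint64_t clear_bit_field(uint64_t input, unsigned width, unsigned shift)
{
    /* Precondition: the field must fit inside the 64-bit word. */
    assert(shift < 64 && width <= 64 - shift);

    /* Shifting a 64-bit value by 64 is undefined in C, so the full-width
       field is handled separately instead of computing (1 << 64) - 1. */
    uint64_t mask = (width == 64) ? UINT64_MAX : ((UINT64_C(1) << width) - 1);

    return input & ~(mask << shift);
}
```

For example, `clear_bit_field(0xFFFF, 4, 4)` clears bits 4 through 7 and returns `0xFF0F`.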

kennywinker · on Oct 6, 2021

I like this answer too. When opaque code is irreducibly opaque, put it in a fn with a well chosen name.

kennywinker · on Oct 6, 2021

That was just a throwaway example of a pretty write-only line of code from the op codebase, but since you asked:

One operation per line. A comment for every operation. Shifts that explicitly say if they are wrapping or overflowing. Rust uses ! instead of ~ but if I had my way it’d be a named function like bitwise_invert().

    // curval &= ~(field_mask << shift); // original line

    // pseudo-rust version
    let shifted_mask = FIELD_MASK.wrapping_shl(shift); // be clear about what kind of shift we’re doing
    let inverted_mask = shifted_mask.bitwise_invert(); // use a fictional invert fn to avoid single-char operators.
    let shifted_val = curval & inverted_mask; // new variable instead of mutating the existing one

Ideally those comments would say WHY we’re doing those ops rather than what’s notable about them - but i didn’t dig into the code enough to write explanations.

And then we let the compiler crush that into an efficient lil one liner like the author of the original code did manually.

NobodyNada · on Oct 6, 2021

That's significantly less readable than the C version. I still have to know what a "left-shift" and "bitwise invert" are, and if I knew that then I wouldn't have a problem with `<<` or `~` either. IMO `<<` is even more intuitive than `shl` because I can just look at the arrow instead of having to think about which way "left" is (and I don't even have a tendency to get "left" and "right" confused).

All the extra verbosity simply obfuscates the actual intent of the code: clear all bits in field_mask (shifted to the left by some offset). That's pretty easy to see at-a-glance from the C code (some comments could make that clearer, but this is simple enough that any experienced systems programmer will know what this does without comments).

I agree that Rust embedded code is often more readable than C, but that's done by creating abstractions to manage complexity rather than just by adding more words. For instance, one could write a wrapper struct that provides a less-tedious interface than a bitfield (like `curval.set_field(false)`).

kennywinker · on Oct 6, 2021

>> All the extra verbosity simply obfuscates the actual intent of the code

I have a preference for verbosity in code, and I know that many people don't share my preference. That's alright - there's no exact right way to write that code. But my point was C encourages you to write code that relies on knowing secrets about specific hidden behavior in your compiler. `shl` isn't more clear than `<<`, but `wrapping_shl` and `overflowing_shl` ARE more clear, because it makes us explicitly aware of behavior that `<<` doesn't surface.

As for clarity, I agree an abstraction would be best. And Rust encourages those abstractions where C discourages them. I'd still argue that the inside of that abstraction should be the verbose version, but other than the wrapping_shl that's mostly just a style/preference thing.
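To make the hidden behavior concrete: in C, `x << n` is undefined when `n` is greater than or equal to the bit width of the promoted left operand, and nothing at the call site says so. The sketch below shows two ways to spell out a policy in C. Both function names are hypothetical. The first mirrors what Rust's `wrapping_shl` does on a `u32` (the shift amount is reduced modulo 32); the second is an alternative policy chosen only for contrast.

```c
#include <stdint.h>

/* Hypothetical C analogue of Rust's u32::wrapping_shl: the shift amount is
   reduced modulo 32, so the call is defined for every value of `shift`. */
static inline uint32_t wrapping_shl_u32(uint32_t value, uint32_t shift)
{
    return value << (shift & 31u);
}

/* Hypothetical alternative policy: shifting by 32 or more yields 0,
   as if the bits had been pushed off the end one at a time. */
static inline uint32_t shl_or_zero_u32(uint32_t value, uint32_t shift)
{
    return (shift < 32u) ? (value << shift) : 0u;
}
```

The two helpers agree for every shift below 32 and differ above it: shifting `1` by 32 gives `1` under the wrapping policy and `0` under the other, which is exactly the kind of difference a bare `<<` leaves unstated.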

NobodyNada · on Oct 6, 2021

In general, I'd agree with you -- I prefer spelling things out explicitly instead of using terse abbreviations. However, really common & fundamental math operations benefit from some shorthand. For instance, 'y = ax + b' is way easier to read than:

    let multiplied = a.wrapping_mul(x); 
    let y = multiplied.wrapping_add(b);

The "terse" equation I can instantly recognize as a linear function, while I'd have to stare at the more verbose version for a while to figure out what it does. In my opinion, bitwise operators work the same way: if you're working in a domain where you have to write thousands of simple bitwise operations, a bit of shorthand can make the code much more expressive.

kennywinker · on Oct 8, 2021

This is a good point. Programming and math are different systems. `y=ax+b` doesn’t need clarification in math because wrapping, overflowing or saturating are not concepts that apply. My pragmatic solution would be a comment pointing out the math-formula, with the code represented in the explicit+verbose way. But i can imagine language constructs that allow the simpler expression without compromising on clarity. E.g. some kind of math scope that forces you to define which operators you’re using.

    fn calc_linear() -> int {
        math(x) -> y {
            let * = saturating_mul
            let + = wrapping_add
            y = a*x+b
        }

        return y
    }

There’s clearly a tension between explicit and concise going on here. Not sure what the solution is, but I firmly believe we can do better than C on this one.
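For comparison, here is roughly what the same choice of operators looks like in plain C today, as a sketch on 32-bit unsigned values. The helper names are assumptions that follow the pseudocode above; C has no built-in saturating multiply, so that one is written out by hand.

```c
#include <stdint.h>

/* Multiply, clamping to UINT32_MAX instead of wrapping on overflow. */
static inline uint32_t saturating_mul_u32(uint32_t a, uint32_t b)
{
    uint64_t wide = (uint64_t)a * (uint64_t)b;  /* cannot overflow 64 bits */
    return (wide > UINT32_MAX) ? UINT32_MAX : (uint32_t)wide;
}

/* Add modulo 2^32; unsigned overflow is defined to wrap in C. */
static inline uint32_t wrapping_add_u32(uint32_t a, uint32_t b)
{
    return a + b;
}

/* y = a*x + b, with the overflow policy of each operator spelled out. */
static inline uint32_t calc_linear(uint32_t a, uint32_t x, uint32_t b)
{
    return wrapping_add_u32(saturating_mul_u32(a, x), b);
}
```

The formula survives only in the comment, which is the trade-off being argued about: the policy is explicit, but `a*x+b` is no longer visible in the code itself.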

adrian_b · on Oct 6, 2021

When booting a real CPU you might easily have to modify from a few tens to a few hundreds of hardware registers, applying one or more such bit operations to each.

If you chose such a deliberately verbose style, and especially if you split each operation across multiple lines, the code would become really unreadable: too much space would be filled with text that carries no information, obscuring the important parts.

Normally the name of the register, the mask constant and the shift constant have informative names that should indicate everything that needs to be known about the operation, and any other symbols should occupy as little space as possible on the line of code.

kennywinker · on Oct 6, 2021

That’s what functions and automatic compiler inlining are for. See Veserv’s answer: https://news.ycombinator.com/item?id=28776751

adrian_b · on Oct 6, 2021

No, using functions for such things is worse.

It does not matter if the compiler inlines them, encapsulating the bit field operations obfuscates the code instead of making it more easily understandable.

It is not possible to make the name of the function provide more information than the triplet register name + bit field name (the name of the shift constant) + the name of the configuration option (the name of the mask constant).

Encapsulating the bit operations into a function just makes you write exactly the same thing twice and when you are reading the code you must waste extra time to check each function definition to see whether it does the right thing.

The C code would look just like a table with the names, where the operators just provide some delimiters in the table that occupy little space.

Replacing the operators with words makes such code less readable and concatenating the named constants into function names or using them as function arguments brings no improvement.

The only possible improvement over explicit bit operations is to define the registers as structures with bit-field members and use member assignment instead of bit string operations.

Unfortunately the number of register definitions for any CPU is huge, so most programmers use headers provided by the hardware vendor, as it would be too much work to rewrite them.

For almost all processors with which I have worked, the hardware vendor has preferred to provide names for mask constants and shift constants, instead of defining the registers as structures, even if the latter would have allowed easier-to-read code.
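A sketch of the mask-and-shift style being described, assuming a made-up UART control register. Every constant name, field position, and the `uart_configure` wrapper is hypothetical and only stands in for what a vendor header would provide. The wrapper exists so the snippet compiles on its own; the four lines inside it are the "table of names" in question.

```c
#include <stdint.h>

/* Hypothetical vendor-style constants for a made-up UART control register. */
#define UART_CR_PARITY_SHIFT   4u
#define UART_CR_PARITY_MASK    0x3u  /* 2-bit field, before shifting */
#define UART_CR_PARITY_EVEN    0x2u
#define UART_CR_STOPBITS_SHIFT 8u
#define UART_CR_STOPBITS_MASK  0x1u  /* 1-bit field, before shifting */
#define UART_CR_STOPBITS_TWO   0x1u

/* Each field gets a clear line and a set line; the operators act as
   column delimiters between the register, option, and field names. */
static inline void uart_configure(volatile uint32_t *uart_cr)
{
    *uart_cr &= ~(UART_CR_PARITY_MASK   << UART_CR_PARITY_SHIFT);
    *uart_cr |=  (UART_CR_PARITY_EVEN   << UART_CR_PARITY_SHIFT);
    *uart_cr &= ~(UART_CR_STOPBITS_MASK << UART_CR_STOPBITS_SHIFT);
    *uart_cr |=  (UART_CR_STOPBITS_TWO  << UART_CR_STOPBITS_SHIFT);
}
```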

kennywinker · on Oct 6, 2021

> you must waste extra time to check each function definition to see whether it does the right thing

I think I see what you're arguing. That this:

    reg1 &= ~(width_mask_1 << shift);
    reg2 &= ~(width_mask_2 << shift);
    reg3 &= ~(width_mask_3 << shift);
    // etc...

is clearer than something like this:

    reg1 = ClearBitField(reg1, 1, shift);
    reg2 = ClearBitField(reg2, 2, shift);
    reg3 = ClearBitField(reg3, 3, shift);
    // etc...

If that's what you're arguing, I simply don't agree. `ClearBitField` is descriptive and readable. It avoids creating all those width_mask_n constants, since you specify the width as input to the fn. You don't have to go digging into `ClearBitField` because you wrote a unit test to confirm that it does what it says on the label and handles the edge cases.

On top of that, the code inside `ClearBitField` can be as verbose or as compact as you desire, because it's contained and separated from the rest of the code.

fouric · on Oct 6, 2021

I find the first code example easier to read and process.

However, that's because I've written a fair bit of C code, and so when my brain goes into "C mode", the symbols &, =, ~, <<, etc. all have clear and unambiguous meanings - whereas ClearBitField does not. Additionally, the pattern ~(foo << bar) is a common C idiom, so beyond the individual symbols, my brain recognizes the whole pattern so it's "semantically compressed" (easier to think about) for me. This would not be the case for a beginner.

Which style is better depends on an individual's preferences and experiences - there's no "right" answer.

This is a stellar example of one of the many reasons why code-as-text is a huge mistake - because structure and representation are conflated and coupled together. A sanely written programming language represents code as code objects, and you can configure those code objects to be displayed however you like, whether that's baz &= ~(foo << bar) or ClearBitField(baz, 1, bar).

adrian_b · on Oct 6, 2021

Obviously this is a matter of personal preferences and experience.

Real register names are usually very long, to indicate their purpose, so you would not want to repeat them on each line.

This can be avoided by redefining ClearBitField.

Even so, writing an extra "ClearBitField" on each line does not provide any information. It just clutters the space.

Anyone working with such code is very aware that &=~ means clear bits and |= means set bits.

When reading the table of names, the repeated function name is just a distraction that is harder to overlook than the operators.

The way to improve over that is not adding anything on the lines, but using simpler symbols by defining the registers as structures, i.e.:

    register_1.bit_field_1 = constant_name_1;
    register_2.bit_field_2 = constant_name_2;
    register_3.bit_field_3 = constant_name_3;

Unfortunately, as I have said, hardware vendors seldom provide header files with structure definitions for the registers, and rewriting the headers is a huge amount of work.

However, if you are able to rewrite just the register definitions that you use, that would be time better spent than attempting to write functions or macros for these tasks.
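A sketch of what one such rewritten register definition could look like, again assuming a made-up UART control register; the struct, its field names, and the constants are all hypothetical. One caveat that may partly explain the vendors' preference: the ordering and packing of bit-fields is implementation-defined in C, so a layout like this has to be checked against the target compiler and ABI before it is pointed at real hardware.

```c
/* Hypothetical layout for a made-up 32-bit UART control register.
   Bit-field ordering and packing are implementation-defined in C, so a
   real header must be verified against the target compiler and ABI. */
struct uart_cr {
    unsigned int enable    : 1;
    unsigned int reserved0 : 3;
    unsigned int parity    : 2;
    unsigned int reserved1 : 2;
    unsigned int stop_bits : 1;
    unsigned int reserved2 : 23;
};

enum { UART_PARITY_EVEN = 2, UART_STOP_BITS_TWO = 1 };

/* Member assignment replaces the explicit clear-then-set bit operations;
   the compiler generates the masking and shifting. */
static inline void uart_configure_fields(volatile struct uart_cr *cr)
{
    cr->parity    = UART_PARITY_EVEN;
    cr->stop_bits = UART_STOP_BITS_TWO;
}
```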

spacedcowboy · on Oct 7, 2021

To me, that C line is (a) crystal clear, and (b) elegantly describes what I want to happen in a concise way.

The three lines of rust, by comparison, are ugly, long-winded and far-and-away harder to grok.

That’s probably because I’ve been an embedded systems engineer, and if you think C is terse, you ought to try verilog… but then I wouldn’t have the temerity to suggest that something that works well for me is how everyone else should do it.

junon · on Oct 6, 2021

No thanks, I'll take the C version any day.

kennywinker · on Oct 6, 2021

Sure, the single line is more aesthetically pleasing. Compact, clever, concise. But try fixing a bug or adding new functionality to that one line. Especially as a beginner. This is supposed to be an educational codebase.

isometimes · on Oct 6, 2021

I've stated in part1 of the tutorial that "This tutorial is not intended to teach you how to code in assembly language or C".

My goal was to demonstrate some basic principles to get code running on bare metal, encourage curiosity, further my own knowledge and document my findings.

I appreciate that more self-documenting code might be desirable, but to some people (me included) a large number of lines can be as off-putting as more esoteric syntax. I acknowledge, however, that it is very hard to please everyone!

kennywinker · on Oct 6, 2021

Hi! Thank you for writing and publishing this project! Just to clarify: no part of my critique was aimed at you or your choices in this codebase. My main point is that unlike the original commenter in this thread, i believe that well written C is not as clear and simple as well written rust (or other modern languages). I then tried to back that up by cherry picking a random line of “c-like” code from your codebase. My beef is with C, not you or anybody else using it :)

isometimes · on Oct 7, 2021

That makes complete sense, and I don’t take it personally. It’s been so cool to read all this valuable feedback!

I must say that I’m very comfortable in C, only because it’s where I landed up as a kid. I actually find it way less confusing than more “modern” languages (I kinda skipped OO etc.!), and I enjoy the “control” it gives. Maybe you can’t teach an old dog new tricks after all! ;-)

userbinator · on Oct 7, 2021

> Especially as a beginner. This is supposed to be an educational codebase.

...which means that it's serving its purpose well. If you're a beginner and it's hard to understand, that's absolutely normal. If it's easy, you probably already knew. That's how learning is supposed to work.