An opinionated and needlessly specific list. For example, this Use minimally two...

Kurtz79 · on Nov 4, 2016

"An opinionated and needlessly specific list."

Opinionated, maybe (although it seems to me most rules are reasoned and a justification is given).

But something practical has to be specific.

It has happened time and again to come across something it appeared to me as an arbitrary an unnecessary rule, but with more experience I think it's better to have some "debatable" specific rules that something "open to interpretation", especially if many people will have to work on the same codebase.

4ad · on Nov 4, 2016

> An opinionated and needlessly specific list

Not really. Gerard Holzmann is in charge of software at NASA/JPL. Each rule comes from experience with a mission loss that would have been prevented if people had implemented that rule. They also help static analysers become more accurate.

These rules are specific to ultra-reliable embedded software, flight software or software for life-critical equipment. They are not really relevant for most normal programming.

adrianN · on Nov 4, 2016

That doesn't mean that the rules are perfect. There haven't been that many mission losses to have a large corpus of data and finding a root cause of an accident might not lead to the best rules. "The rocket wouldn't have blown up if you asserted this here" doesn't mean that adding a large number assertions necessarily leads to better code.

erikbye · on Nov 4, 2016

A lot of wisdom comes from simulations as well.

huhtenberg · on Nov 4, 2016

Sure and this list likely worked well in the NASA/JBL context. The point being is it needlessly specific as a generic advice.

At the risk of stating the obvious, a "goto exit" pattern is the defacto standard for function cleanup in Linux kernel, which may not be flying every space probe, but it's still pretty damn ultra-reliable piece of software.

naasking · on Nov 4, 2016

> [Linux is] still pretty damn ultra-reliable piece of software.

It's really not. I don't think you have a good sense of how reliable the systems in this space really are. Firmwares and other ultra-reliable systems typically use microkernels, and there are billions of units of verified L4 microkernels out there. Linux can't even dream of approaching this level of reliability.

unethical_ban · on Nov 4, 2016

If I wanted to learn more about open source, reliable microkernels, where would you point me?

naasking · on Nov 4, 2016

seL4 is the big one: https://sel4.systems/

Here's a general reference of open source microkernel operating systems: http://www.microkernel.info/

Here's a general reference for open source real time operating systems: http://www.osrtos.com/

JoachimSchipper · on Nov 4, 2016

seL4 is really sexy, but I don't think it's anywhere near the most-used one?

nickpsecurity · on Nov 5, 2016

OKL4 if we're going with the phone example. They stopped open-sourcing it by version 4.0 but someone should be able to dig up OKL4 3.0 somewhere.

4ad · on Nov 4, 2016

Talk about missing the point and the whole context around mission-critical software.

> a "goto exit" pattern is the defacto standard for function cleanup in Linux kernel

There are many ways to do error cleanup that do not involve goto, and there are many bad ways to use goto. A static analyser can't decide if you do "good goto" or "bad goto", but it's easy to ban goto outright and avoid all the "bad gotos". You pay the price in terms of programmer work (he has to do something different), but you gain better static analysis and eliminate a full class of bugs.

This tradeoff makes sense for JPL and other people who do mission-critical of life-critical software. It might not make sense for the Linux kernel or most C software.

Mission-critical is not the place where you want to rely on programmers to do their work correctly (goto fail anyone?). You rely on specific methodology, standards, and static analysis.

nanolith · on Nov 4, 2016

> A static analyser can't decide if you do "good goto" or "bad goto", but it's easy to ban goto outright and avoid all the "bad gotos".

That depends on the goto statement. A good static analyzer can certainly recognize a "goto exit" pattern. There is very little difference in a function graph between an if/then/else clause and a goto statement. In fact, most static analyzers perform a first pass that is pretty similar to a compiler so that it can build a function graph for performing data analysis along each of the potential branches. One this pass is performed, it doesn't matter if you used an if/then/else, a for/while/do-while, a switch statement, or a complex sequence of break or continue statements. A graph is a graph.

The more succinct question is, "What constitutes a 'good' goto statement?" From the perspective of a static analyzer, it would be applying the exact same analysis as it would for any other branch in the function graph. If it detects a memory leak, a use-after-free, or a use of data which isn't always initialized, it makes little difference whether this is because of a bad goto statement or a sloppy set of conditionals. Either the graph shows a problem, or it does not.

OskarS · on Nov 4, 2016

The Linux kernel is not mission critical? Many of those mission critical systems probably run off of a Linux kernel, you know.

I think the point is that while it is potentially a good rule for mission critical code, it is not essential. There are other ways as well.

pjmlp · on Nov 4, 2016

No it isn't.

Operating systems like INTEGRITY RTOS are.

http://www.ghs.com/products/rtos/integrity.html

Which is why GNU/Linux gets packed into his own little sandbox running on top of INTEGRITY.

kristoffer · on Nov 4, 2016

Well, SpaceX claims to be running their flight software on Linux so it is critical to their missions at least :)

4ad · on Nov 4, 2016

SpaceX uses VxWorks for their hard real-time requirements: http://blogs.windriver.com/vxworks/2010/12/vxworks-helping-c...

They use Linux for non-hard-real-time stuff. Their microcontrollers also do not use Linux.

Coincidentally, Tesla uses QNX.

kristoffer · on Nov 4, 2016

Yes for their microcontrollers they use VxWorks. But Linux is still used for the flight software (i.e. mission control).

4ad · on Nov 4, 2016

VxWorks does not run on microcontrollers.

kristoffer · on Nov 4, 2016

Don't know your definition of microcontroller but VxWorks runs on e.g. Cortex-M3/4, MIPS, 68K, SH, SparcV8, PowerPC thus including some devices I could microcontrollers. But there is no clear cut definition of microcontroller, microprocessor, SoC, whatnot.

4ad · on Nov 4, 2016

No, they don't. The Linux kernel is not a RTOS. Most JPL projects use VxWorks.

pjmlp · on Nov 4, 2016

> but it's still pretty damn ultra-reliable piece of software.

Are you sure?

http://arstechnica.com/security/2016/09/linux-kernel-securit...

http://www.cvedetails.com/product/47/Linux-Linux-Kernel.html...

yitchelle · on Nov 4, 2016

An ultra reliable software comes down to knowing its behaviour fully in its released form.

If the code has 100s of gotos and their paths are mapped out and its ramifications of the paths are clearly understood for all scenarios, then it is still reliable as its full behaviour is understood and is safe.

kccqzy · on Nov 4, 2016

I would also say perhaps

    avoid recursion

should be revised to

    avoid non-tail-call recursion

Because the justification to avoid recursion is basically avoid running into the limit of stack, and tail-call recursion uses no stack space at all. Of course you still might end up with situations where the function does not terminate, but simple loops can do that too. In relation to Rule No.2, it is also possible to prove an upper bound for recursive depths.

hvidgaard · on Nov 4, 2016

Recursion is recursion. Tail recursion is optimized away by the compiler except when it isn't (I've seen bugs with this exact setup more than once), but now you have to verify that the compiler actually does this. It's simpler to just avoid it all together unless you want to formally verify the recursion depth.

4ad · on Nov 4, 2016

Apart from that, TCO destroys information that is valuable for debugging (especially for non-recursive tail calls), exactly what you want to avoid in safety-critical software.

pja · on Nov 4, 2016

Manual conversion to a loop also destroys the same information, so it’s a wash either way on that front.

4ad · on Nov 4, 2016

You can't convert arbitrary tail calls to a loop. You can only convert recursive tail calls to a loop. The information loss associated with non-recursive tail calls is great. It's true that recursive tail calls lose less information, but it is not true that it's equivalent to the information lost in a loop. In a loop you always have access to {head, current, tail} while in a tail call you might (subject to algorithm and compiler optimization) only have access to {current, head|tail}, depending if the loop goes forwards or backwards.

spc476 · on Nov 5, 2016

But I've found that non-recursive tail calls are great in handling state machines. I wrote a TFTP client and server in Lua [1] and the code was both much cleaner and more reusable between both client and server. Granted, Lua does proper TCO (as long as you properly code the return statement) and the lack of stack trace wasn't much of an issue there.

These rules are mostly geared towards C, which doesn't guarantee TCO at all.

[1] Partly as way to learn Lua and TCO, and as a way to manage Cisco router config files in a version control system.

bartread · on Nov 4, 2016

Exactly. Tail recursion can always be rewritten as a loop. The code might be uglier but at least you have a better idea of how it's going to behave after compilation.

naasking · on Nov 4, 2016

> The code might be uglier but at least you have a better idea of how it's going to behave after compilation.

The fact that it's uglier suggests you'd have less of an idea how it's going to behave than simple tail recursion. That said, I agree with the idea that if you're after tail recursion, an annotation that the compiler checks that ensures the call is properly tail recursive is a good idea.

to3m · on Nov 4, 2016

Quite often, when time you suggest using a loop rather than a tail call, people pop up to tell you that the code is neater as a tail call. So I'm assuming the comment about ugliness is a way of heading off these people ;)

I'm happy to argue that in C, it's never neater to write a loop as a tail call. Tail calls aren't guaranteed to compile into a goto, so it can be difficult to say whether it's going to end up with a call or a goto - even on a case-by-case basis. And if you write a tail call expecting a goto, and you get a call, obviously you can't just give up and go home - you'll need to rewrite it as a loop, just like you could have done in the first place. This does not strike me as less ugly than just writing that loop.

I know people like to do things the roundabout way, by writing one thing, certain that the compiler will do something else, and then saying this code is more clear rather than less so - but I've never quite understood why. A better rule of thumb to my mind is that if you want something particular to happen: write the code that way.

(I'm not denying an annotation would help - because then at least you'd know statically when you need to rewrite your code as a loop, rather than having to wait until something bad happens. Though... maybe if nothing bad ever happens... it's fine after all...? Sheesh. Ask a philosopher. Too much for me!)

heisenbit · on Nov 4, 2016

And during the maintenance phase someone modifies the code and what was pure tail recursion becomes something else and literally blows up the rocket.

The whole point of these rules is defensive programming. Programming not what is correct but what is robust i.e. if anything breaks it still produces possibly suboptimal but not catastrophic results.

qznc · on Nov 4, 2016

In safety critical projects people often disable compiler optimizations. No tail call elimination.

krylon · on Nov 4, 2016

> tail-call recursion uses no stack space at all

Not all compilers optimize tail-call recursion into - effectively - loops. And even those that can might only do so when certain compiler flags are active.