
Meta: Who flagged this comment, and why? What rule exactly does Slavius break here?

On topic: I can't recall the details now, but I once read a paper about a system that had no shutdown procedure at all; the only way to exit it was to crash it somehow or just shut down the computer. The system made sure to save everything often enough, and to store the data in ways that allowed possibly corrupted parts of it to be restored on the next startup. This design produced a very resilient architecture which worked well for that use case.

The paper was from the '80s or '90s, so it's not like we need to be in the 21st century to design that way. I'll try searching for the paper later.
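
To make the "save often, repair on startup" idea concrete, here is a minimal sketch in Python. It is not how the system from the paper worked (I don't remember those details); the names `save_record` and `recover`, the `.rec` suffix, and the checksum-header layout are all made up for illustration. The only assumption is that each record is written atomically with a checksum, so that startup and crash recovery are the same code path.

```python
import json
import os
import tempfile
import zlib


def save_record(directory, name, value):
    """Write one record atomically: temp file, fsync, then rename over the old copy."""
    payload = json.dumps(value).encode("utf-8")
    checksum = zlib.crc32(payload)
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(b"%08x\n" % checksum)  # hypothetical layout: hex CRC, newline, payload
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())
    # os.replace is atomic on the same filesystem, so a crash leaves either
    # the old record or the new one, never a half-written mix.
    os.replace(tmp_path, os.path.join(directory, name + ".rec"))


def recover(directory):
    """Startup is recovery: keep records whose checksum matches, report the rest."""
    good, damaged = {}, []
    for filename in sorted(os.listdir(directory)):
        if not filename.endswith(".rec"):
            continue  # e.g. temp files left behind by a crash mid-write
        with open(os.path.join(directory, filename), "rb") as f:
            header, _, payload = f.read().partition(b"\n")
        try:
            ok = int(header, 16) == zlib.crc32(payload)
        except ValueError:
            ok = False  # header itself is corrupted
        if ok:
            good[filename[:-4]] = json.loads(payload)
        else:
            damaged.append(filename[:-4])
    return good, damaged
```

There is no `shutdown()` here on purpose: the program calls `recover()` every time it starts, whether the previous run ended cleanly or not.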



You might be thinking of KeyKOS, and of the anecdote which can be found at https://lists.inf.ethz.ch/pipermail/oberon/2010/005734.html (it should also be on the EROS homepage, but it's down for me at the moment).

See also: "Crash-only software" https://lwn.net/Articles/191059/


Yes, exactly this! Thank you.


The flagger was probably uncomfortable with "FFS". After all, colorful expression is bad for HN. b^)

What you're talking about seems like crash-only with Erlang/OTP.


It's similar in effect, but Erlang's ultimate response to errors is redundancy rather than trying to salvage whatever the crashed process left behind. I think the transparent distribution of Erlang nodes over the network is what enables Erlang's "let it crash and forget it ever ran" approach. Joe Armstrong said that they want Erlang to handle all kinds of problems, up to and including "being hit by lightning" - so I think hardware redundancy is the right path here.

The OS[1] I've been talking about was primarily concerned with a single-machine environment, which resulted in a slightly different design.

[1] https://en.wikipedia.org/wiki/EROS_%28microkernel%29
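
For contrast, a rough sketch of the "let it crash and forget it ever ran" side. This is plain Python, not the Erlang/OTP supervisor API; `supervise`, `make_counter_worker`, and `max_restarts` are hypothetical names, and real supervisors deal with processes and nodes rather than closures. The point it tries to show is only that the crashed worker's state is thrown away instead of being repaired.

```python
def supervise(make_worker, jobs, max_restarts=3):
    """Feed jobs to a worker; on a crash, discard it and start a fresh one."""
    worker = make_worker()
    restarts = 0
    results = []
    for job in jobs:
        try:
            results.append(worker(job))
        except Exception:
            if restarts == max_restarts:
                raise  # give up, much like a supervisor exceeding its restart limit
            restarts += 1
            worker = make_worker()  # no salvage: the old worker's state is dropped
            results.append(None)    # the job that caused the crash is not retried
    return results, restarts


def make_counter_worker():
    """Hypothetical worker that counts jobs and crashes on the job "bad"."""
    state = {"seen": 0}

    def worker(job):
        if job == "bad":
            raise ValueError("crash")
        state["seen"] += 1
        return state["seen"]

    return worker
```

With the jobs `["a", "b", "bad", "c"]`, the count starts over after the crash, because the replacement worker knows nothing about the one that died.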



