
Very pragmatic. He sees software in the overall context of getting a job done with a computer, imperfect though it may be, instead of dying because it was not perfect.

Unlike a segfault from a user space program that indeed merits a 'kill', the kernel should strive at all costs to keep running, since kernel panics are so much more inconvenient.



Just wish that the higher layers of the stack would see it that way as well, rather than being angry at Torvalds about it and going on about how users are dumb sheep.


Absolutely not.

If “do no harm” is a principle, then the kernel should ensure that no harm is taking place.

If flaws within the kernel allow harm to occur while otherwise normal transactions are taking place, then it is absolutely preferable to panic and shut down over allowing that potential harm to occur.

To suggest otherwise, that detected errors that allow harm should be allowed, is pure insanity.

Linus is unquestionably wrong in this regard.


A thought experiment that comes up in kernel design classes is: what should happen if the OS were running the flight-control software for an airplane you are on? If there were a bug in the kernel, perhaps a double free or a memory leak, what should happen?

A panic would result in the airplane falling to certain doom. But if it were to keep running, it may be a security vulnerability. Being absolutist in either direction of the discussion will lead to absurd scenarios where you would make the wrong decision.


> double free or a memory leak, what should happen

Both offensive and defensive programming are important in safety critical programs, and I get your point, but those things you mention don't happen in safety critical systems.

There is no dynamic memory allocation. The RTOS used will support "brick wall partitioning" for memory, processing and other resources. Different systems can run on the same OS but they can't compete for processing time, locks or memory access. Everyone has been dealt the resources they can have from the start. It's not possible to run out of file descriptors or memory if you allocate them statically from the start.
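To illustrate, here is a minimal C sketch of the static-allocation approach: a fixed block pool reserved at compile time, where pool exhaustion is a checkable return value rather than undefined behavior. All names and sizes are illustrative, not from any particular RTOS.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed budget decided at build time: no malloc(), no fragmentation,
 * no surprise out-of-memory paths at runtime. */
#define BLOCK_SIZE  64
#define BLOCK_COUNT 8

static unsigned char pool[BLOCK_COUNT][BLOCK_SIZE];
static unsigned char in_use[BLOCK_COUNT];

/* Hand out one fixed-size block, or NULL when the budget is spent. */
void *pool_alloc(void)
{
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            return pool[i];
        }
    }
    return NULL;  /* exhaustion is a testable condition, not a crash */
}

/* Return a block to the pool; a double free trips the assertion. */
void pool_free(void *p)
{
    size_t i = (size_t)((unsigned char (*)[BLOCK_SIZE])p - pool);
    assert(i < BLOCK_COUNT && in_use[i]);
    in_use[i] = 0;
}
```

Because every block lives in a static array, the worst-case memory footprint is known before the system ever boots, which is exactly the property certification regimes ask for.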

Assertion errors or monitoring errors in safety critical systems usually cause a reset or a switch to a backup system. If the program state is large and a reset is not safe, retreating to some earlier state (constant backups) is the likely approach.
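A hedged sketch of that retreat-to-an-earlier-state idea: keep a known-good checkpoint and roll back to it when a plausibility monitor rejects an update. The struct, limits, and function names are invented for illustration.

```c
/* Hypothetical flight state; the plausibility limits are made up. */
struct flight_state { double altitude; double airspeed; };

static struct flight_state current, checkpoint;

/* Monitor: reject physically implausible values. */
static int state_valid(const struct flight_state *s)
{
    return s->altitude >= 0.0 && s->airspeed >= 0.0 && s->airspeed < 400.0;
}

/* Apply an update; on a failed check, retreat to the last good state
 * instead of aborting. The caller can then switch to a backup system. */
int apply_update(struct flight_state next)
{
    if (!state_valid(&next)) {
        current = checkpoint;   /* roll back to known-good snapshot */
        return -1;
    }
    checkpoint = next;          /* constant backups of good states */
    current = next;
    return 0;
}
```

The point is that a failed check is handled as a recoverable event with a defined fallback, not as grounds for halting the whole system.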


> but those things you mention don't happen in safety critical systems.

Errors in logic happen everywhere.


_those_

Dynamic memory allocation errors don't happen when there is no dynamic memory allocation.



The kernel is modular. Literally everything can be enabled/disabled.

Aviation has strict regulations, and that's why most critical systems have redundant parts. Putting a single critical component into a plane is stupid in and of itself. Think of simple freezing at high altitude, or overheating. On the other hand, I would rather fly in a plane whose altimeter shuts down and switches to a redundant circuit than in one that lets it report incorrect values...


The flight control system should absolutely panic. No (sane) person designs flight control systems without multiple redundancies.


You seem to be one of those "security people" Linus refers to. ;)

Harm is relative. You security people think that every single security issue is so important that mitigating it is justified no matter what harm the mitigation itself may cause. Well, that is not what Linus thinks.

The kernel may terminate a process because it did something suspect, but doing that may actually cause way _more_ harm.

The philosophy here is that security bugs are just bugs.


No. First, you never know if an invalid access or integer overflow can actually be exploited, and even if it can, you don't know if it can be exploited remotely. If you run a server hosting sensitive data, by all means use grsecurity, but on my home PC where I browse Facebook and send emails, fuck off with your kernel panic.


But as he explains, these are latent bugs. They may or may not be targeted for exploitation yet. Meanwhile, hard crashing over them leads to a poor user experience, while throwing up warnings and leading to an actual fix would be better from the user's PoV, unless of course userspace actually depended on the buggy condition, but that's another discussion.
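The warn-and-keep-running approach Linus favors in the kernel (via `WARN_ON_ONCE`) can be sketched in userspace C. This is a simplified imitation, not the kernel's actual macro, and it relies on the GCC/Clang statement-expression extension:

```c
#include <stdio.h>

/* Report a latent bug loudly the first time it is seen, then keep
 * running so the system stays usable while a real fix is developed.
 * Each use site gets its own "warned" flag via the local static. */
#define WARN_ON_ONCE(cond) ({                              \
    static int warned;                                     \
    int hit = !!(cond);                                    \
    if (hit && !warned) {                                  \
        warned = 1;                                        \
        fprintf(stderr, "WARNING: %s at %s:%d\n",          \
                #cond, __FILE__, __LINE__);                \
    }                                                      \
    hit;                                                   \
})

int divide(int a, int b)
{
    if (WARN_ON_ONCE(b == 0))
        return 0;   /* degrade gracefully instead of crashing */
    return a / b;
}
```

The warning still surfaces the bug to developers (and bug reporters), but the machine the user is sitting at keeps working.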


It's not so black and white. Medtronic uses Linux. Do you want to be the guy whose medical equipment spontaneously reboots because of a bug that wouldn't otherwise have affected anything?


Linux is a modular kernel. I'm not aware of a single thing you can't disable or make modular during config/compile. I wouldn't like to be the guy whose medical equipment killed him by slowly decreasing his oxygen levels due to a buffer overflow either. If you're in this kind of business, you take responsibility by discovering and fixing bugs which would otherwise go unnoticed. And if the lives of your patients really depend on your equipment, then having a redundant component within your device is a must.
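For what it's worth, the panic-versus-continue policy is itself configurable in Linux. These knobs exist in mainline, though defaults vary by distribution, so a vendor can pick the strict behavior where it makes sense:

```
# Kconfig (build time): choose whether an oops escalates to a panic
CONFIG_PANIC_ON_OOPS=n

# sysctl (run time): the same policy can be flipped on a live system
sysctl kernel.panic_on_oops=1   # oops -> panic (strict, server-style)
sysctl kernel.panic=10          # auto-reboot 10 seconds after a panic
```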


You mistake "do no harm" for "don't let the user do harm". This is "do no harm" in the sense that Hippocrates said it; your job as a kernel security dev is to not harm the user, just as a doctor's job is to not harm his patient.



