> Undefined behavior isn't bad, it's an unavoidable consequence of all languages, programming or otherwise.

This is incorrect.

"Undefined behavior" is a specific technical term for a scenario in a program that the compiler is permitted by the specification to assume will not happen, for the purposes of optimization. For instance, this code:

    int silly(int a) {
        if (a + 5 > a) return 0;
        return 1;
    }
can be optimized to just "return 0", because, as a human would read it, obviously a + 5 > a. So the spec makes signed integer overflow undefined, which lets the compiler assume it cannot occur and optimize this the way a human would want. (Whether the spec actually matches human expectations is a good question, but in general it does, and forbidding all undefined behavior in C and C++ would cost you tons of optimizations that you obviously want.)

"Undefined behavior" does not mean providing invalid input to a function and getting an exception, or a crash, or a particular error result, if the result is well-defined. For instance, this is defined behavior:

    >>> import math
    >>> math.sqrt(-1)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: math domain error
because Python defines the behavior that math.sqrt(-1) raises a ValueError and I can reliably catch that exception. If it were undefined behavior, then I wouldn't have any guarantee of being able to catch the ValueError: the Python interpreter might just choose to return 27 if it's more convenient.
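To make that guarantee concrete, here's a minimal sketch of relying on the defined behavior:

    import math

    try:
        math.sqrt(-1)
    except ValueError:
        # defined behavior: we are guaranteed to land here
        print("caught the domain error")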



Thanks for the clarifications. Indeed this is what I meant. Undefined behavior in particular isn't unavoidable, though the existence of meaningless statements often is.

Python's philosophy of giving well-defined behavior to meaningless code seems wasteful. If your code is unintentionally executing sqrt(-1) (or unintentionally indexing out of bounds, etc.), then something is wrong with your program. You don't necessarily need the behavior to be defined when there is a bug in your program. If you want something predictable to happen, better to just abort(). Catching ValueError/IndexError in those cases is futile: how can one trust a buggy program to handle itself?

Python using exceptions to signal both runtime errors and programmer errors is a design smell. The former should always be caught and handled; the latter should never be caught, and the only reasonable response is to abort().


> If your code is unintentionally executing sqrt(-1) (or unintentionally indexing out of bounds, etc) then something is wrong with your program.

This is not Python's design. You might dislike Python's design, sure (many people do!), but recommended Python practice is to try the operation, catch the well-defined exception, and implement the fallback, rather than to check in advance.

https://docs.python.org/3.5/glossary.html#term-eafp

https://blogs.msdn.microsoft.com/pythonengineering/2016/06/2...
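For what it's worth, a minimal sketch contrasting the two styles (config, key, and default are made-up names here):

    # hypothetical example data
    config = {"host": "localhost"}
    key, default = "port", 8080

    # LBYL ("look before you leap"): check in advance
    if key in config:
        value = config[key]
    else:
        value = default

    # EAFP: just try it, and catch the well-defined exception
    try:
        value = config[key]
    except KeyError:
        value = default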

> Catching ValueError/IndexError in those cases is futile, how can one trust a buggy program to handle itself?

The program is not buggy.

(Also, it's absolutely possible to design software where bugs are contained to part of the code: for instance, if you hit a bug in your Python code while handling a web request, you should return a 500 to that user and carry on instead of aborting the whole web server. In fact, it is precisely the relative lack of undefined behavior in Python that makes this a good approach: if you were doing this in C, a wild pointer could well corrupt some other server thread!)
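A minimal sketch of that pattern (handle() stands in for hypothetical application code):

    import logging

    log = logging.getLogger("server")

    def handle(request):
        # application code; assume bugs may lurk here
        return "200 OK"

    def serve(request):
        try:
            return handle(request)
        except Exception:
            # a bug in handle() is contained to this one request;
            # the rest of the server carries on
            log.exception("unhandled error while processing request")
            return "500 Internal Server Error"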


I said unintentionally. Intentionally running code that may throw an exception and catching it predictably (in the normal EAFP fashion, which I am well aware of) doesn't apply to my argument.

I'm arguing that an unintentional or unexpected ValueError, IndexError, TypeError, etc. means your program is buggy and there is no sane/practical way to 100% safely recover from it outside of restarting the process.

Your example of the catch-all handler in a modular HTTP server is inappropriate and dangerous. Continuing to run a server with persistent in-process state after an unexpected exception risks corrupting data. Just because Python doesn't have C pointers doesn't mean it can't have inconsistent or incorrect state that can affect other server threads.


> Python's philosophy of giving well-defined behavior to meaningless code seems wasteful.

The code has a well-defined meaning; it's just that the input domain is larger than the one defined in C.


Just about all programs have bugs. There's a huge difference between defending against any possibility of undefined behaviour, which may result in very bad, unpredictable things happening to your computer, and defending against bugs whose errors get signaled in a well-defined way.

This is the fundamental difference that makes C/C++ unsafe programming languages.

(Yeah, this specific case of the exported add function is unlikely to be too apocalyptic, but the general guarantees are important.)


Rust, Python, and other so-called safe languages are unsafe in the same fundamental ways C/C++ are. Safety is a larger class than "memory safety", which is what you are referring to. As long as the language permits running despite the existence of a bug / programming error, it is unsafe. RAM may not be corrupted, but state can nonetheless be left inconsistent and cause undesirable ("unsafe") behavior.


That's all true, but there's a reason that we define "safe languages" in this way: it's that you can isolate parts of the program from other parts of the program, and know with confidence that a failure in one part of the program will not corrupt data in another part of the program.

Again, my proposal is to wrap only that part of the program which doesn't interact with shared state in a giant try/catch block (and you can know with 100% reliability what that part is in a safe language). The parts about taking a request off the queue, or storing results in a shared data structure, or whatever, should be outside of the try/catch block, because if they break, your shared state is indeed at risk.
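A minimal sketch of that scoping, using Python's queue module (process() is a made-up stand-in for the bug-prone logic):

    import logging
    import queue

    log = logging.getLogger("worker")
    requests = queue.Queue()
    results = queue.Queue()

    def process(item):
        # application logic that touches no shared state
        return item * 2

    def worker():
        while True:
            item = requests.get()       # shared state: outside the try block
            try:
                result = process(item)  # bug-prone logic: inside the try block
            except Exception:
                log.exception("bug while processing %r", item)
                continue
            results.put(result)         # shared state: outside the try block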


Your proposal doesn't scale and doesn't apply in general. Neither Rust nor Python enforces high-level data structure integrity (e.g. transactionally pulling data off one queue and stuffing it into another). A team of 100 programmers will get this wrong. A big try:except:log:continue should be the exception (rimshot); the rule should be to abort().


> Neither Rust nor Python enforces high-level data structure integrity (e.g. transactionally pulling data off one queue and stuffing it into another).

Rust absolutely does. For instance, if you unwind while holding a lock, the lock gets poisoned, preventing further access to the locked data. If you're not holding the lock, no safe Rust code can possibly corrupt it, unwinding or no unwinding. So you definitely can put a lock around your two queues, operate them from multiple threads running safe (or correct unsafe) code, and be robust to a panic in one of those threads.

I'm less familiar with Python's concurrent structures, but as I keep saying, this is why you leave the stuff that touches the data structure outside of your try/except - Python does guarantee that a thread that doesn't have explicit access to a shared object can't mess with the shared object by mistake.


Just because Rust has safeguards for lock usage during unwinds doesn't mean it prevents all high-level data structure inconsistencies, or even just plain old bugs.

It doesn't matter how you choose to handle invalid semantic forms, whether via undefined behavior, an error code, an exception, or an assert: as long as you silently ignore them, your code is unsafe. Rust doesn't have undefined behavior, but that doesn't mean it doesn't suffer from silent errors, e.g. returning NaN from sqrt(-1) or wrapping on signed integer overflow.

That's my entire point.

As a programmer your intent is to use APIs in the manner they expect. An invalid use is an invalid program. Garbage in, garbage out. No amount of subsequent error handling is going to help you. Better to abort().


Yes, if you break the contract, better abort. But throwing errors can be part of the contract, even for some programming errors. Failure tolerance, resilience etc.


A runtime exception is fine to handle. Errors like ENOENT are expected, and your program can be designed to handle them.

A programming error is a sign that your program is not operating the way you expect. No correct program should ever call sqrt(-1) or overflow arithmetic.

Outside of effectively aborting the process, what other way is there to safely handle a programming error (aka bug) when encountered?


Not all programming errors lead to incorrect programs (correctness being defined by the language).

You shouldn't call sqrt(-1) in C, and if you do, you abort. But maybe you are not supposed to call sqrt(20) either, because 20 is a sign the programmer did not understand the application. In that case, the programming error is still a correct program.

In languages like Python or Lisp, there is a whole set of programming errors (dividing by zero, calling length on something that is not a sequence, etc.) that are designed not to crash the system (nor make it inconsistent), in part because those errors can happen while developing the code, and signaling them is just a way of providing interactive feedback.
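For instance, both of these signal a well-defined exception and leave the Python session running:

    >>> 1 / 0
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ZeroDivisionError: division by zero
    >>> len(5)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: object of type 'int' has no len()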

Now, if you ship a product and there is a programming error that manifests itself while in production, you better not try anything stupid, I agree.


You essentially agree with me. Aborting with a stack trace is still an abort. It doesn't need to be catchable.


You are speaking as if it were an all-or-nothing situation.

> RAM may not be corrupted, but state can nonetheless be left inconsistent

... yes, but at least RAM is not corrupted, and that's a little step towards reliability. And if you can manage your state in a transactional way, that's another step.

Do you restart your OS when your program crashes?


I never said it was all or nothing. My point is that trying to handle unexpected errors caused by programming mistakes is not safe, and Python allows that.

It doesn't matter that throwing ValueError on sqrt(-1) is well-defined; continuing to run the program by ignoring the exception is no less harmful than silent integer overflows or buffer overruns.

I don't restart my OS when a process crashes because it has been designed to use hardware mechanisms to clean up dead processes. I absolutely do restart my OS when it kernel panics, it doesn't try:except:log:continue.


Those are hardware mechanisms backed by software that tells the hardware mechanisms what to do. You have a lot of trust in them!

I recently discovered that my Windows machine wouldn't boot because my boot sector had been replaced with some random XML. That's exactly the sort of thing that hardware protection is supposed to prevent - nothing during a normal run of the OS should be writing to the boot sector, at all.

Do you restart your OS when it oopses and kills a process? Linux in fact catches bad memory accesses from kernelspace and attempts to just kill the current process and not the whole kernel.


I trust the code as long as it's behaving correctly, when it encounters a bug I no longer trust it and I shut it down before it can do further harm. A modular HTTP server should do the same.

The OS/process analogy doesn't hold here. The process has completely isolated state from the kernel.


> The OS/process analogy doesn't hold here. The process has completely isolated state from the kernel.

In one direction. That's why I'm asking you if you reboot your machine when your kernel dereferences a wild pointer while executing a system call on behalf of a process - in theory it could have corrupted the kernel itself or any process on the system, but Linux makes a practice of trying to just abort execution of the system call, kill the process, and keep going.


If that's what Linux does, that seems fully intentional and the possible consequences on kernel state are probably well-thought out. Are you claiming what Linux does normally is unsafe and could possibly corrupt kernel state? Like every EFAULT? If that's not your claim, then the analogy doesn't hold and you're entirely missing my point.


That is absolutely my claim, and I am absolutely claiming that it is not well-thought-out - it's literally doing this in response to any fault from kernelspace. If you were unlucky enough that the wild pointer referred to someone else's memory, well, sucks to be you, some random process on your system (or random file on disk, or whatever) has memory corruption and nobody even knew.



