Concurrency in Rust (rust-lang.org)
282 points by SirNoobsAlot on March 27, 2016 | hide | past | favorite | 155 comments



Send + Sync are great. The downside of concurrency in Rust is:

1) There isn't transparent integration with IO in the runtime as in Go or Haskell. Rust probably won't ever do this because although such a model scales well in general, it does create overhead and a runtime.

2) OS threads are difficult to work with compared to a nice M:N threading abstraction (which, again, is the default in Go or Haskell). OS threads lead to lowest-common-denominator APIs (there is no way to kill a thread in Rust) and some difficulty in reasoning about performance implications. I am attempting to solve this aspect by using the mioco library, although due to point #1, IO is going to be a little awkward.


> 1) There isn't transparent integration with IO in the runtime as in Go or Haskell. Rust probably won't ever do this because although such a model scales well in general, it does create overhead and a runtime.

By "transparent integration with the runtime" you mean M:N threading. M:N threading is just delegating work to userspace that the kernel is already doing. There can be valid reasons for doing it, but the absence of M:N threading isn't a case of us failing to do work that we could have done. In fact, we had M:N threading for a long time and went to great pains to remove it.

In addition to the downsides you mentioned, M:N threading interacts poorly with C libraries, and stack allocation becomes a major problem without a precise GC that can relocate stacks.

M:N will never be as fast as an optimized async/await implementation can be, anyway. There is no way to reach nginx levels of performance with stackful coroutines.

> OS threads lead to lowest common denominator APIs (there is no way to kill a thread in Rust)

This has nothing to do with the reason why you can't kill threads in Rust. We could expose pthread_kill()/pthread_cancel() on Unix and TerminateThread() on Windows if we wanted to. The reason why you can't terminate threads that way is that there's no good reason to: if you have any locks anywhere then it's an unsafe operation.

> some difficulty in reasoning about performance implications.

I would actually expect the opposite to be true: 1:1 is easier to reason about in performance, because there are fewer magic runtime features like moving or segmented stacks involved. Could you elaborate?


There's no way to kill goroutines either. In fact, are there any systems that allow you to cleanly kill threads?


Yes.

In Erlang:

    exit(kill).
or exit(Pid,kill).

Will kill a process. It has an isolated heap, so it won't affect other (possibly hundreds of thousands of) running processes. That memory will be garbage collected, safely and efficiently.

This will also work in Elixir, LFE and other languages running on the BEAM VM platform.

EDIT: user masklinn below correctly pointed out that the example should be exit/2, i.e. exit(Pid, kill). In fact it is just exit(Pid, Reason), where Reason can be some other exit reason, like say my_socket_failed. In that case, however, the process could catch that signal and handle it instead of being unconditionally killed.


This is called from within a thread's execution, right? I think the question is about being able to kill a thread externally.

Java had this in 1.0 or 1.1 and then thought better of it and deprecated the API.


> This is called from within a threads execution, right? I think the question is about being able to kill a thread externally.

Yes, the GP has the wrong arity: exit/1 terminates the current process, but exit/2[0] will send an exit signal to another process and possibly cause it to terminate (depending on the provided Reason and whether it traps exits).

This is normal behaviour in Erlang, and it enables things like seamless supervision trees, where exit signals propagate through the tree, reaping all of the processes until they encounter a process trapping exits, and a supervisor can freely terminate its children[1].

This can work because Erlang doesn't have shared state[2][3], and BEAM implements termination signaling (so processes can be made aware by the VM of the termination of other processes).

[0] http://erlang.org/doc/man/erlang.html#exit-2

[1] http://erlang.org/doc/design_principles/sup_princ.html#id740...

[2] between processes, state is always owned by specific processes and queried/copied out across the process boundary

[3] and thus a process being terminated can't leave inconsistent state behind for others to corrupt themselves (or the VM itself) with


Erlang also uses per-actor message queues and has a kill-safe design philosophy, so it's not a problem.


It works correctly either way -- externally with exit(Pid,kill) or by the process itself as exit(kill). The last one is just a shorthand for exit(self(), kill). Where self() is the process id of the currently running process.


> It works correctly either way

But the way you showed was not the one anyone was interested in: synchronous exceptions work in more or less every language, and you can't assume readers know your self-kill is actually implemented via asynchronous exceptions, since they don't know the language.


Every one I know of has regretted it, and seen it as an antipattern. For example, Java way back in 1.5: http://docs.oracle.com/javase/1.5.0/docs/guide/misc/threadPr...

I think Erlang might be okay with it, because "this thread can fail at any time" is a core value of Erlang. But it's an exception.


Haskell has killThread, which rather than being an anti-pattern is often used as an effective way to accurately enforce a timeout on a thread. This functionality seems like it would be very difficult to achieve with most other runtimes. https://news.ycombinator.com/item?id=11370004


You have a sibling comment elsewhere in the thread which disagrees; I'll leave that argument to that sub-thread.


Yes. In Haskell you use `killThread` which throws an asynchronous exception to the thread. It is certainly difficult to perfectly cleanup resources in the face of asynchronous exceptions. However, once there are functions available to help you with this (e.g. use a bracket function whenever using resources) it becomes tractable.

This functionality is critical to being able to timeout a thread.
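For contrast, here is a hedged sketch of how a timeout looks in Rust, which cannot kill threads: only the wait is bounded, and a worker that overruns the deadline simply keeps running detached until it finishes on its own. The function name is illustrative; it uses only std (`mpsc::Receiver::recv_timeout`).

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn compute_with_timeout() -> Option<i32> {
    let (tx, rx) = mpsc::channel();

    // The worker cannot be killed; if it overruns the deadline it
    // keeps running, detached, until it finishes on its own.
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(10)); // pretend work
        let _ = tx.send(42);
    });

    // Only the *wait* is bounded, not the thread itself.
    rx.recv_timeout(Duration::from_secs(5)).ok()
}

fn main() {
    match compute_with_timeout() {
        Some(v) => println!("result: {}", v),
        None => println!("timed out"),
    }
}
```

This is cooperative rather than preemptive: unlike Haskell's killThread, nothing here can reclaim the worker's resources early.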


AFAIK combining killThread and bracket (or just about anything really) is fraught with issues[0] and hardly qualifies as clean.

[0] http://blog.haskell-exists.com/yuras/posts/handling-async-ex...


Yes, there is still an issue with async exceptions when there is an exception during the cleanup handler of bracket. Probably this has not received the attention it deserves because, fundamentally, if a cleanup handler throws an exception you may well still have resource issues. But the article also proposes ways of solving this issue, so let's not give up on async exceptions.


GHC has throwTo, which raises an exception in another thread:

http://hackage.haskell.org/package/base-4.6.0.1/docs/Control...

This is used to provide the killThread function:

http://hackage.haskell.org/package/base-4.6.0.1/docs/Control...


Note that this isn't exactly a safe operation, since a killed thread may stop in the midst of something. It's safer to have it process messages on a loop and include a quit message.
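In Rust terms, the loop-plus-quit-message pattern might be sketched like this (a minimal std-only example; names are illustrative):

```rust
use std::sync::mpsc;
use std::thread;

enum Msg {
    Work(u32),
    Quit,
}

fn run_worker() -> u32 {
    let (tx, rx) = mpsc::channel();

    let worker = thread::spawn(move || {
        let mut total = 0;
        // The thread exits when it *chooses* to, on Quit, so it
        // never stops in the middle of something.
        for msg in rx {
            match msg {
                Msg::Work(n) => total += n,
                Msg::Quit => break,
            }
        }
        total
    });

    tx.send(Msg::Work(1)).unwrap();
    tx.send(Msg::Work(2)).unwrap();
    tx.send(Msg::Quit).unwrap();
    worker.join().unwrap()
}

fn main() {
    println!("total: {}", run_worker()); // total: 3
}
```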


POSIX has pthread_cancel. It's a big mess.


Can you clarify what you mean by "kill goroutines"? My understanding was that if you return while inside a goroutine, it gets handled by the GC immediately, and (as someone else mentioned) you can use context to send deadlines/cancellation signals to goroutines.


The ability to kill an arbitrary goroutine from the outside. To use context you need to write your specific goroutine such that it checks for cancellation and will eventually handle a cancellation request. This cannot be done with an arbitrary goroutine.


You can use contexts to send a cancellation signal to goroutines: https://blog.golang.org/context.

This is more of an implementation decision you make on a case-by-case basis rather than something built into Go.


That's not a way to "kill goroutines". That's a way to "ask goroutines to die when they get around to it." Useful, but a fundamentally different thing. Go does not have a way to kill goroutines, nor, per some of the other discussion in this thread, do I ever expect it to.


> There is no way to kill a thread in Rust

Think about the interaction with (non-memory) resource ownership. This is just horrible, and I wouldn't even want it in a higher-level language. If you want to carefully notify threads that they must terminate, set up a channel, or write to a shared variable, but please do not just forcibly terminate threads.
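A minimal sketch of the shared-variable approach, using an AtomicBool as the stop flag (the names and timings are illustrative):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

fn run_until_stopped() -> u64 {
    let stop = Arc::new(AtomicBool::new(false));
    let stop2 = Arc::clone(&stop);

    let worker = thread::spawn(move || {
        let mut iterations = 0u64;
        // The worker checks the flag only at points where its own
        // state is consistent, so no lock or resource is stranded.
        while !stop2.load(Ordering::Relaxed) {
            iterations += 1;
            thread::sleep(Duration::from_millis(1));
        }
        iterations
    });

    thread::sleep(Duration::from_millis(20));
    stop.store(true, Ordering::Relaxed);
    worker.join().unwrap()
}

fn main() {
    println!("worker ran {} iterations before stopping", run_until_stopped());
}
```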


If thinking about ownership of orphaned data is horrible, that still doesn't mean that there is no general solution.

I can't support your argument, because I'm not capable. If I were, I probably wouldn't have been asking in the first place.


Let me interpret the situation in managed languages from a Rust programmer's lens: All managed resources are actually owned by the runtime, and merely borrowed by your program. Thus, killing threads is “safe”: no managed resource can possibly become orphaned. The result is very pleasant as long as your program only uses runtime-managed resources. But things quickly become hairy when you want to use foreign libraries (typically written in C, or exposing C-compatible interfaces), because it's very difficult to arrange things so that cleanup routines are guaranteed to be called before your thread is killed.


It would seem silly to use Rust for all its safety features and then call into C libraries anyhow. So much for "Heartbleed wouldn't happen with Rust."


> Would seem silly to use rust for all its safety features to then call into c libraries, anyhow.

Why? One very important use case for Rust's unsafe sublanguage is making wrappers around C libraries that can be used in the safe sublanguage. If anything, using C libraries is more pleasant in Rust (or C++) than in managed languages, because you don't particularly need to accommodate or work around the idiosyncrasies and quirks of a complex runtime system. That's pretty much the entire point of my post above (GP to this one).

> So much for Heartbleed wouldn't happen with rust.

You may not agree, but the position consistently taken by Rust is that, while avoiding unsafe code is highly desirable, it isn't always possible. What is always possible is to isolate unsafe code, so that, if the safety guarantees of the safe sublanguage are ever violated, you know what parts of your program to audit.



I think Steve Klabnik could clarify this, but the book at that link is in the process of being rewritten; it might be good to wait until that's done. I personally found it slightly difficult to follow compared to other options like the soon-to-be-published Programming Rust.


I am in the middle of working on a second draft of the book. This page is one of the oldest bits of docs, overall, and isn't my best work. It's not _wrong_, I just have very high personal standards. It was adapted from older documentation and was written in the time up to 1.0, where I had a LOT on my plate.


It certainly is _confusing_ if not _wrong_. You silently add "move" to the closure without any mention or explanation (then later say "note that we're copying i" without explaining that you're talking about the "move" keyword).

The bit about Mutex also has a "just type this to fix the problem with no explanation of how or why it works" flavour (although I guess if you already grok mutexes then its use here might be obvious to you).

Not a criticism of the tutorial, but it does something common in Rust tutorials which is really a problem with the language at this point: Rust tutorials always spend a lot of time interpreting Rust's notoriously poor error messages (e.g. "what this message that doesn't mention Sync is trying to tell you is that you need Sync on this type"). That's great when you're doing the tutorial, but as soon as you're on your own man are those errors frustrating.


> Rust's notoriously poor error messages

I've actually heard the opposite feedback; Rust's error messages try to be super helpful.

Note that concepts like ownership and Sync are new to most programmers, and it's impossible to explain them in an error message. This is where the extended error messages (via --explain) and the book come in.

The tutorial has this style because Rust espouses catching things at compile time, so the tutorial demonstrates this being done by doing the wrong thing a few times and reiterating why it gave an error.

----

Could you give examples of confusing error messages? I'd love to improve them. The one you mention doesn't exist. This (http://is.gd/keukPm) is what that error message looks like, and (a) it mentions Sync, (b) it also mentions that `Arc<bool>` cannot be shared between threads safely.

If you were referring to "So, we need some type that lets us have more than one reference to a value and that we can share between threads, that is it must implement Sync." from the book, the last part about Sync has nothing to do with that error message. If you follow the book, the error message is clear without the context of threads -- `data` was moved into the first spawn() call and the subsequent ones can't use it. One does not conclude that Sync is necessary from this error message, and that's not what the book is trying to say.

This sentence is actually skipping a step, one like http://is.gd/RPNOm4, where the compiler asks you for a Send type (or a Sync type, depending on the exact code). Instead of stepping through this example, it just introduces Sync directly by noting that we're dealing with threads anyway and the reader already knows what Sync/Send are. It doesn't conclude that Sync is necessary from the error message.

> The bit about Mutex also has a "just type this to fix the problem with no explanation of how or why it works" flavour (although I guess if you already grok mutexes then its use here might be obvious to you).

The previous sentence says "for example a type that can ensure only one thread at a time is able to mutate the value inside it at any one time.", which is exactly what a mutex does. Unless you want a lower level explanation which IMO isn't necessary. It could explain locking more though; I'll fix that.


I've found the Rust compiler to provide very useful messages. That said, it does take a certain amount of experience in the language before you grok the terminology and the appropriate remedies for some compiler errors. As a beginner, I'd typically open about 5-10 different documentation, blog post, SO, and tutorial pages to try to figure out what I was doing wrong. Once I understood the underlying concept, I could go back and understand what the compiler message was telling me, as plain as day. After a while, I had a strong enough understanding of the borrow checker that I could usually interpret the compiler error messages in a pretty straightforward fashion. But it takes time if you're coming from a non-C/C++ language!

If you're learning Rust, I would recommend tracing back every compiler error you encounter to this page [1] to see some other examples and resolutions (if the compiler's suggestions don't make it clear as to what to do). It's the most underrated page in the entire Rust documentation set, and could be the launch point for a lot of teaching and learning. Read the book to start, certainly (and the revisions Steve Klabnik has been making are very, very good), but that error index page is very valuable for the day to day issues.

Rust rewards gumption [2] and perseverance.

[1] https://doc.rust-lang.org/error-index.html [2] https://en.wikipedia.org/wiki/Gumption_trap


> you encounter to this page [1]

I love the extended errors. Note that you can just use `rustc --explain <error code>` without opening a web page, but of course it won't be formatted then.

> But it takes time if you're coming from a non C/C++ language!

Ultimately there's little substitute to learning the concepts behind the language :) Error-driven-development is fun, and quite useful once you know the language, but if used as the only way to learn the language it can be problematic. That said, I enjoy using errors to explain Rust concepts -- as long as you explain them. If a new user is hit with a steaming pile of errors you can't really expect them to understand it directly.

I have talked with a lot of folks coming from a non-systemsy background, and sort of have an idea of the things that get confusing (I once had similar problems when learning C++). I'm planning a blog post series (no ETA, pretty swamped with other work :/ ) that teaches both Rust (specifically, the safety bits like the borrow checker) and what's going on under the hood (regarding memory, the stack/heap, etc) using each as a crutch to explain the other in a leapfrog way, to introduce the language and low-level programming to folks coming from languages like JS.


Yeah, this is what I mean. So this was originally written before "move" or even any of the current closure implementation existed. And with so much to do, the focus was on making what existed accurate more than a holistic approach.

Furthermore, almost a year after 1.0, we have a lot better understanding of what people struggle with when trying to learn Rust, so we have a better idea of how to teach it.

Please file bugs for notoriously poor errors. We have put a lot of work into many of them, but there's a lot of ways to go. It seems like most people really like them or really hate them.


  Rust tutorials always spend a lot of time interpreting 
  Rust's notoriously poor error messages
I'm wondering where you got the sense that Rust's error messages are "notoriously poor." Can you expand on that?

I have always found Rust's error messages to be quite good. Of course, they can be hard to interpret occasionally if you don't already grasp the concepts they are referring to. But I don't see any way to solve that in error messages; at some point, you have to learn enough about the language for the error messages to make sense. And occasionally, the suggestion they provide for how to fix the issue isn't the right suggestion, but I don't know if there's a way to always do that without strong AI that understands what you mean. The best you could hope to do is to provide suggestions for all of the possible fixes, though that could lead to some messages that are too verbose to the point of being useless.

I think it would help to provide some examples of error messages you have had trouble with, to help figure out ways to make them better.


FWIW, I fixed some of these concerns here: https://github.com/rust-lang/rust/pull/32529


> what this message that doesn't mention Sync is trying to tell you is that you need Sync on this type

There's no such error message that I'm aware of.


Thanks Steve, you're doing great work. Can't wait to read the book once its done!


Thanks! You can follow along with the progress here, if you're interested: https://github.com/rust-lang/book


Another thing to know about Rust concurrency is that it supports safe "scoped" threads: threads which hold plain references into their parent thread's stack.

This makes it very easy to write, for instance, a concurrent in-place quicksort (this example uses the scoped-pool crate, which provides a thread pool supporting scoped threads):

    extern crate scoped_pool; // scoped threads
    extern crate itertools; // generic in-place partition
    extern crate rand; // for choosing a random pivot

    use rand::Rng;
    use scoped_pool::{Pool, Scope};

    pub fn quicksort<T: Send + Sync + Ord>(pool: &Pool, data: &mut [T]) {
        pool.scoped(move |scoped| do_quicksort(scoped, data))
    }

    fn do_quicksort<'a, T: Send + Sync + Ord>(scope: &Scope<'a>, data: &'a mut [T]) {
        scope.recurse(move |scope| {
            if data.len() > 1 {
                // Choose a random pivot.
                let mut rng = rand::thread_rng();
                let len = data.len();
                let pivot_index = rng.gen_range(0, len);

                // Swap the pivot to the end.
                data.swap(pivot_index, len - 1);

                let split = {
                    // Retrieve the pivot.
                    let mut iter = data.into_iter();
                    let pivot = iter.next_back().unwrap();

                    // In-place partition the array.
                    itertools::partition(iter, |val| &*val <= &pivot)
                };

                // Swap the pivot back in at the split point by putting
                // the element currently there at the end of the slice.
                data.swap(split, len - 1);

                // Sort both halves (in-place!).
                let (left, right) = data.split_at_mut(split);
                do_quicksort(scope, left);
                do_quicksort(scope, &mut right[1..]);
            }
        })
    }
In this example, quicksort will block until the array is fully sorted, then return.


Reading all this is making me happy about pursuing Elixir (which is of course a language addressing largely different use-cases)


I'm having a tough time trying to understand this snippet

    for i in 0..3 {
        thread::spawn(move || {
            data[i] += 1;
        });
    }
What is the 'move' thing here before the ||?


So, for those who may know JavaScript, you may have seen code like this:

  var closures = [];
  for (var i=0;i<5;i++) {
      closures.push(function() {
          console.log(i);
      }); 
  }
The code above will print 5, 5, 5, 5, 5 rather than 0 through 4, because you're capturing `i` by reference.

To avoid this, JS devs typically do this:

  closures.push((function(i) {
      return function() {
          console.log(i);
      };
  })(i));
Or, if you can afford the ES6 support:

  for (let i=0;i<5;i++) {
      /* `let`-scoped `i`s are created individually at each iteration, so it is safe to capture them by references. */
  } 
Rust supports this pattern with built-in syntax, and that's what `move` means: if you prepend the `move` keyword to the closure, the variables will be captured by value, not by reference.
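A rough Rust counterpart to the JS example above — `move` copies each iteration's `i` into its own closure (a std-only sketch; the function name is illustrative):

```rust
use std::thread;

fn spawn_and_collect() -> Vec<i32> {
    let mut handles = Vec::new();
    for i in 0..5 {
        // Without `move` this wouldn't compile: the closure would
        // borrow `i`, which doesn't live as long as the thread.
        // `move` copies `i` into the closure, one value per iteration.
        handles.push(thread::spawn(move || i));
    }
    let mut results: Vec<i32> =
        handles.into_iter().map(|h| h.join().unwrap()).collect();
    results.sort();
    results
}

fn main() {
    println!("{:?}", spawn_and_collect()); // [0, 1, 2, 3, 4]
}
```

So where JS quietly gives you the stale-binding bug, Rust refuses to compile the by-reference version at all.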


The ES6 "fix" is an abomination in my opinion. You're closing over a mutable variable, so you should see the mutations! IMO the real fix would be to write

    for (var i = 0; i < 5; i++) {
        let i_ = i;
        /* use i_ in the closure */
    }
In your ES6 solution, if you e.g. increment "i" inside the loop after the closure, the closure will see the mutation!

The real cause of confusion is mutation and javascript's scoping rules.


If you write `let data = data;` in the closure, then you can achieve the effect of `move` without the special syntax (unless `data` can be copied, in which case there is really no way to force it to be moved inside the closure without `move`).

However, in Rust, you get an error if you accidentally capture by reference in the closure passed to `thread::spawn` so it's not as hard to get right as it is in JS.
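A small sketch of that rebinding trick (the function name is made up for illustration):

```rust
fn consume_via_rebinding() -> usize {
    let data = vec![1, 2, 3];

    let closure = || {
        let data = data; // forces the non-Copy `data` to be moved in
        data.len()
    };

    // `closure` is FnOnce, because calling it consumes `data`.
    // Using `data` out here would be a compile error: value moved.
    closure()
}

fn main() {
    println!("length: {}", consume_via_rebinding()); // length: 3
}
```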


ES6 fix is:

      for(let i = 0; i < 5; i++) {
          // no hacks needed if you don't use ES5 var
      }


Thanks for the explanation, but how does the move keyword know that you're referring to the i variable? I.e., what if there were multiple variables (such as a for loop with j in there)?

Wouldn't it have been a better approach to add some sort of demarcation, such as i* or i^ (or whatever) to indicate this?

Just curious.


> Wouldn't it have been a better approach to add some sort of demarkation, such as i* or i^ (or whatever) to indicate this?

That's the path C++ took[0]. The Rust people thought it had too much syntactic and semantic overhead, and that having just "move" and "referring" closures would be much simpler. If you want to mix them up, it's easy enough to create references outside the closure (and capture them by value with a move closure).

[0] http://en.cppreference.com/w/cpp/language/lambda#Lambda_capt...


There was a really good thread on this on /r/rust: https://www.reddit.com/r/rust/comments/46w4g4/what_is_rusts_...

In particular, don't miss this post: https://www.reddit.com/r/rust/comments/46w4g4/what_is_rusts_...


A closure captures all variables from its environment that it is using (no more than that). They can be captured by reference (which is the default), or by-move (which is done with the `move` keyword). In case it is being captured by move, _all_ captured variables will be moved. If you wish to capture a specific variable by reference in a move closure, create a reference (`let y = &x`) outside of the closure and use the reference y instead of x inside.
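Sketching that reference trick (illustrative names, std-only): the move closure takes ownership of `name` and of the reference `big_ref`, while `big` itself is only borrowed.

```rust
fn mixed_capture() -> (String, usize) {
    let big = vec![0u8; 1024];
    let name = String::from("worker");

    let big_ref = &big; // capture `big` by reference, explicitly

    let closure = move || {
        // `name` is moved in; `big_ref` is a reference that was
        // itself moved in (moving a reference just copies a pointer).
        format!("{}: {} bytes", name, big_ref.len())
    };

    let label = closure();
    // `big` was only borrowed, so it is still usable here.
    (label, big.len())
}

fn main() {
    let (label, still) = mixed_capture();
    println!("{} / original still {} bytes", label, still);
}
```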


In short, the "move everything" nature of "move" keyword here is not an issue because you can just make references for items you don't want moved. References themselves are "moved" by just copying their addresses.


It moves everything you are referencing in the closure from the outer scope to the closure.


I agree that this seems ugly


A move closure takes copies of the environment, instead of mutating the parent environment.

https://doc.rust-lang.org/book/closures.html#move-closures


It's more accurate to say that it takes ownership of the captured environment. Whether that means copying the values or moving ownership depends on whether the values are of a type that implements Copy.


But isn't it confusing that it is called "move" instead of "copy" if it takes copies?


Move and Copy are identical at the assembly level. The only difference is what you can do with the older binding. Semantically speaking, both cause a memcpy, though the optimizer may elide them.

That said, you're right that saying "copy" is misleading, for this reason. But moves _are_ a kind of copy.
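A small sketch of that difference in practice — the Copy binding stays usable after the move closure captures it, while the String does not (function name is illustrative):

```rust
fn copy_vs_move() -> (i32, usize, i32) {
    let n: i32 = 5;             // i32 implements Copy
    let s = String::from("hi"); // String does not

    let closure = move || (n, s.len());
    let (captured, len) = closure();

    // `n` is still usable: Copy types are duplicated into the closure.
    // `s` is NOT: it was moved, and using it here would be a compile error.
    (captured, len, n)
}

fn main() {
    let (captured, len, n) = copy_vs_move();
    println!("captured {}, length {}, n is still {}", captured, len, n);
}
```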



Great thanks.


Does Rust have a way to work with SIMD concurrency as opposed to just fork/join concurrency? Something along the lines of how openmp or cilk let you do a parallel for all?


You seem to have concurrency confused with parallelism. Concurrency just means working on another task before the prior one has completed; parallelism is doing multiple tasks at the same time.


Parallelism necessarily implies concurrency.[1]

So, "SIMD concurrency" is not incorrect (although SIMD parallelism is more correct :)

1: http://programmers.stackexchange.com/a/155110


This is not true. CPUs have instruction-level parallelism, even on a single core, but there is no observable concurrency.

SIMD is data-level parallelism, not concurrency.


By SIMD concurrency, I meant that I was curious about a construct that actually translated into simultaneously executing code, not just a construct that denotes work that "could" be done simultaneously.


Rust supports SIMD, with some utility structs that let you do it easily. I don't know of any libraries that auto-SIMD things, though IIRC LLVM can sometimes do this on its own.


I think the parent was referring to "lightweight task"-based concurrency like C#'s PLinq:

    // C#
    int sum = 0;
    Parallel.ForEach(myCollection, item => sum += item);
Which operates using a reasonably sized thread pool rather than a thread per item. (Note that the shared `sum` would still need synchronization in real code.) Something similar in Rust (excuse my rusty Rust) would look like:

    let someList = ...
    parallel::for(someList.iter(), |item| {
       // Do thing with each item
    });
I think there is ongoing work on this topic, but it will likely only be library-level and not language-level (Just like Linq is a language level feature in C# but PLinq is a library). There are some third party crates that do this like simple_parallel.

Edit: Low-level SIMD exists as intrinsics and of course through LLVM vectorizations.
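The hypothetical `parallel::for` above can be approximated with nothing but the standard library. This sketch uses std::thread::scope, which only landed in much later Rust (1.63), so treat it as an anachronistic illustration of the chunk-per-worker idea rather than what was available at the time:

```rust
use std::thread;

fn parallel_sum(data: &[u64], workers: usize) -> u64 {
    if data.is_empty() {
        return 0;
    }
    let chunk_size = (data.len() + workers - 1) / workers;
    // Scoped threads may borrow `data` directly, with no Arc needed,
    // because the scope guarantees they finish before `data` does.
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk_size)
            .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}

fn main() {
    let data: Vec<u64> = (1..=100).collect();
    println!("sum: {}", parallel_sum(&data, 4)); // sum: 5050
}
```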


The simplest library for this kind of data parallelism is Rayon[1]. The README provides a nice overview and links to a series of blog posts about the details.

[1] https://github.com/nikomatsakis/rayon/


simple_parallel exists and is pretty neat, yes. Rayon is another cool library here (https://github.com/nikomatsakis/rayon), which lets you do foreach/mapreduce operations via work stealing on a pool with an extensible API that looks exactly like the regular iterator api (https://github.com/nikomatsakis/rayon#parallel-iterators, for example).

Note that a simd library exists which abstracts over the intrinsics (https://github.com/huonw/simd)


A simple Google search would have told you that there is work being done to introduce SIMD in Rust. There appear to be some basics there right now, but not much, if I understood my quick search correctly.


An active discussion about concurrency in rust with first developers commenting is a very appropriate place to ask that question, and ultimately much more likely to yield valid and current information than Google.


This looks terribly overcomplicated/overengineered to me, to the point where I doubt many are going to adopt/switch to this style, especially when they're used to more convenient approaches [even the standard C++ approach, faulty as it may be].

Also note how much boilerplate one has to write, and how the code snippets bypass error handling (they tell you to do it differently in "real" code but don't show how). Bleh.


Try it before you mock it.

This prevents boilerplate issues and allows the compiler to help you discover threading issues at compile time rather than runtime.

It's easy enough to just mark all your structs Send+Sync and still shoot your foot off, just like in any language. The point is, you need to be explicit that you're trying to shoot your foot off, as opposed to other languages, which basically pull the trigger for you.


> overcomplicated/overengineered to me

You don't have to worry about most of this. Doing concurrent things in Rust is pretty clean. Designing new concurrent abstractions from scratch is where you need to worry about Send and Sync and be careful. And it's totally worth it, entire classes of concurrency errors just go away.

The error handling can get verbose, though with the new `?` operator and `catch` syntax it's much cleaner now.


Can someone enlighten me as to why the first snippet has a data race? Won't the resulting array become [2,3,4]?

    let mut data = vec![1, 2, 3];

    for i in 0..3 {
        thread::spawn(move || {
            data[i] += 1;
        });
    }



I don't think there's a data race there, but the compiler can't check that. What the compiler sees is more than one thread accessing the variable `data`, which could cause a data race.
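For reference, the standard fix is to wrap the Vec in Arc<Mutex<...>> so every thread gets its own handle and the lock proves exclusive access (a minimal sketch):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn increment_all() -> Vec<i32> {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));
    let mut handles = Vec::new();

    for i in 0..3 {
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            // The lock guarantees exclusive access, which is exactly
            // what the compiler could not prove about the raw `Vec`.
            data.lock().unwrap()[i] += 1;
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    let result = data.lock().unwrap().clone();
    result
}

fn main() {
    println!("{:?}", increment_all()); // [2, 3, 4]
}
```

Joining the handles also fixes a second problem with the original snippet: nothing there waits for the spawned threads to finish.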


What about the assumption (fatally flawed decision?) that malloc never fails when Rust asks? Sounds like something that could affect concurrency.


A failed malloc aborts the process. If this is important to you, then don't use the heap abstractions in the stdlib. This is no different from the situation in C++.

(You can also plug in a custom allocator which behaves differently)
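As a hedged aside from later Rust: the stdlib eventually grew fallible allocation entry points such as Vec::try_reserve (stabilized in 1.57, well after this thread), which return a Result instead of aborting:

```rust
fn can_reserve(n: usize) -> bool {
    let mut v: Vec<u8> = Vec::new();
    // try_reserve reports failure (capacity overflow or allocator
    // error) as an Err value rather than aborting the process.
    v.try_reserve(n).is_ok()
}

fn main() {
    println!("64 bytes: {}", can_reserve(64));                 // true
    println!("usize::MAX bytes: {}", can_reserve(usize::MAX)); // false
}
```

Note this only covers the allocation call itself; it does nothing about kernel overcommit, which is discussed below.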


On Linux malloc never fails actually. Instead, the kernel kills processes if it runs out of memory.


Overcommit doesn't mean that malloc never fails. malloc will fail if you ask for an allocation that won't fit in the virtual address space (for instance because of your rlimit settings, or if you asked for a 3G chunk on a 32-bit system).


A lot of folks are pointing out that malloc can fail, which is true, but the important part is that there are situations where your application will just abort randomly in the middle of nowhere (i.e. not during malloc) and there's nothing you can do about it. There are also situations where malloc fails and returns null, but given that aborts can happen with no error from malloc at all, handling the error in those cases isn't close to a full solution. No language or stdlib can solve this problem 100%.

At a baremetal level, it helps having this, but you're probably better off not using the stdlib anyway.


Bullshit. Every part of the C++ standard library (which is _much_ bigger than Rust's) can gracefully propagate resource exhaustion errors, including memory allocation failure, to callers. That Rust can't is a design flaw.

> there are situations where your application will just abort randomly

That's sure as hell not the case in any environment I choose to use.


> can gracefully propagate resource exhaustion errors, including memory allocation failure, to callers

only on malloc. If the kernel overcommits, your process will abort when you try to use the memory, possibly way after the malloc and there's nothing you can do about it. That's the point being made here.

> That Rust can't is a design flaw.

(This is false, see Steve's reply above about this)


The world is not Linux. I happen to believe that overcommit in the Linux kernel is a disgrace. It is, however, at least possible to disable it. It's not possible to retroactively add real exceptions to Rust, or to change the signature of all memory-allocation functions to return Result.

Rust is supposed to be a general-purpose systems programming language, not a Linux programming language. Windows does not overcommit. A correctly configured Linux system does not overcommit. Lots of embedded systems don't (and can't) overcommit. Are you saying all of these people should avoid Rust's standard library?

> > That Rust can't is a design flaw.

> (This is false, see Steve's reply above about this)

It's clear that my opinion differs from that of many Rust developers and users. I still think I'm correct, that these developers and users are misguided, and that as Rust attempts to fill more niches, experience will show that my position is the correct one.

All I can say is that I personally will not use any language that bakes cornucopian assumptions about memory into its core library. I know that you say that it's possible to just avoid stdlib --- but the temptation to use it will be irresistible, and once somebody succumbs to the temptation, the entire program is now capable of aborting irrecoverably.

I will stick with other languages. Modern C++ is safe and expressive enough, and it correctly reacts to resource exhaustion.


> The world is not Linux.

Sure, but if linux has this issue, then C++ programs on linux will also have this issue, and the language can't solve that. That's all my point was.

> or to change the signature of all memory-allocation function to return Result.

When custom allocators part 2 happens, you can. I've already argued the "real exceptions" part above.

> Rust is supposed to be a general-purpose systems programming language, not a Linux programming language. Windows does not overcommit. A correctly configured Linux system does not overcommit. Lots of embedded systems don't (and can't) overcommit. Are you saying all of these people should avoid Rust's standard library?

No. My point was simply that no language has a complete solution to this problem.

Most people don't need to worry about OOM; abort-on-OOM is the expected behavior. For the people who do, there is a mechanism to handle it, as explained above. I can't help it if you have an ideological issue with that mechanism. But ultimately, it works and can be used.


While quotemstr's reaction is over the top, I do see the need to have a memory allocation approach that can handle OOM gracefully. Many types of software that could benefit from Rust's compile-time safety will want to allocate right up to the limit of available memory, such as audio/video processing software where more memory equals more simultaneous effects and less I/O.

I am endlessly frustrated by poorly designed audio software that aborts without saving if an OOM occurs. At the very least, a process should have the opportunity to save its state to disk, or ideally continue operating at a reduced capacity (e.g. a video codec might use fewer reference frames) after freeing some resources.


> I do see the need to have a memory allocation approach that can handle OOM gracefully.

Do you find anything wrong with inserting an allocator that panics on OOM (IIRC the default one aborts on OOM) and using `std::panic::recover` to catch the panic? This is the same as throwing and catching an exception. Note that `recover()` is designed to be exception safe by default.

(There soon will be a way to make std heap APIs like box and vec use Result, which might be neater)
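For anyone reading along later: `recover` ended up stabilizing under the name `std::panic::catch_unwind`. A sketch of the panic-and-catch pattern being described, using an ordinary `panic!` to stand in for an OOM-panicking allocator (the helper function is my own):

```rust
use std::panic;

// Run a closure, converting any panic into an Err instead of
// letting it unwind further -- the same shape as try/catch.
fn run_guarded<F>(f: F) -> Result<i32, ()>
where
    F: FnOnce() -> i32 + panic::UnwindSafe,
{
    panic::catch_unwind(f).map_err(|_| ())
}

fn main() {
    // A closure that completes normally comes back as Ok.
    assert_eq!(run_guarded(|| 1 + 1), Ok(2));
    // A closure that panics is caught and reported as Err.
    assert_eq!(run_guarded(|| panic!("simulated allocation failure")), Err(()));
}
```
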


I don't myself, no, but I haven't dived into Rust yet. I'm assuming that recovery can happen at a point where data can still be saved.


Yeah.


> Sure, but if linux has this issue, then C++ programs on linux will also have this issue

Why even bring Linux into the discussion? A Rust program running on Windows has the same problem.

> No. My point was simply that no language has a complete solution to this problem.

A correct C++ program running on Windows will not spuriously abort. Neither will a C++ program running on a Linux system configured not to overcommit. That some Linux systems can be configured to kill processes at arbitrary times is not an excuse for Rust to be sloppy with memory allocation.

C++ and many other languages do, in fact, have complete solutions to this issue, and that Rust does not is a serious deficiency, one serious enough to prompt me to prefer other languages despite Rust's other advantages.


Can you point to an example of such a "correct C++ program" that is larger than some sample code or a toy program demonstrating this technique?

I'm just wondering if this argument is all hypothetical, or if there are any teams of C++ programmers who are actually disciplined enough to be able to handle this case in practice, in large scale software. I know that in most code that I've seen, the only use of std::bad_alloc has been to log the error and abort.


...and the same is true of Rust, just with different defaults. Switch allocation failure from an abort to a panic and catch the panic just like you would in C++.

What's incomplete about that? It's not like C++ can't be switched in the other direction with -fno-exceptions or equivalent.


-fno-exceptions is not part of C++. C++ is not whatever GCC and Clang happen to accept.


> On Linux malloc never fails actually.

Not true. For example, it can fail if there is no big enough chunk of virtual address space available (i.e. your heap is fragmented enough and your attempted allocation is big enough). I've even seen 64-bit processes manage to do this, by mmapping lots of multi-GB things at once and then trying to do large allocations.


Allocation will fail if you go over the overcommit ratio. In addition, the process will die if you start using the fake pages the kernel gave you. But yes, malloc can fail on Linux.


You're correct in the way that it's default behavior, but the behavior can be turned off.

Also, if you're using cgroups (or anything that leverages them, like containers) you can put a limit on the memory resource, and then malloc will fail. Which is common enough, and only becoming more common as people colocate a lot of disparate workloads on nodes (using the Kubernetes scheduler).


No custom allocator can make a task that fails to allocate gracefully report an error. Rust's error handling design is just terrible, and mostly a consequence of eschewing exceptions.

Had Rust opted for exceptions, it'd be a much better, and actually usable, language. Rust's terrible error-handling strategy is the chief reason not to use it.


> No custom allocator can make a task that fails to allocate gracefully report an error.

Yes, you can. You can panic the thread, and you can recover from panics. But honestly this is never enough to gracefully recover from OOM. I can't think of any software that uses exceptions to gracefully recover from OOM (i.e. without crashing the process) in a way that works. How many Java applications do you see that catch OutOfMemoryError on a fine-grained level?

> Had Rust opted for exceptions, it'd be a much better, and actually usable, language. Rust's terribly error-handling strategy is the chief reason not to use it.

It's hyperbolic to call Rust's error handling strategy not "actually usable". I use it every day and never have any issues with it.


> How many Java applications do you see that catch OutOfMemoryError on a fine-grained level?

I wrote exactly that sort of code, in Java, last week.


I believe you that you do. That doesn't change the fact that few people actually do it or need to do it. There are many more people who think they need to recover from OOM but actually don't and would be better off if they didn't try. (There have been many security vulnerabilities that have resulted from attempting to handle OOM gracefully that wouldn't have been an issue if malloc just aborted the process.)


> That doesn't change the fact that few people actually do it or need to do it.

I am one of those few people. We are running some scientific code (currently, unfortunately, written in C++) on a heterogeneous bunch of compute nodes. Some computations can be extremely memory-intensive, and sometimes in ways that we didn't predict. So it's useful to be able to fail gracefully and record that computation X on node Y with input parameters [Z] failed specifically due to running out of memory at step W - so that e.g. the queue manager can try relaunching the computation on a beefier node or adjusting how many instances of which computation are allowed on Y.


That's something that Rust fully supports with panic handlers. Doing arbitrary work before the process goes down is useful and supported. (But you will have to be running Linux in a non-default configuration for it to be reliable, of course!)


Thanks, that's good to know!

> But you will have to be running Linux in a non-default configuration for it to be reliable, of course!

Of course. It does seem to work in practice with our C++ code, although probably that's due to our usage pattern.


I agree with you, but a general-purpose systems programming language needs to let _me_ make that determination. It can't abort on my behalf for my own good. It's depressing that Java, of all things, does a better job in this respect than Rust.

And we probably shouldn't be writing so much software in general-purpose systems programming languages.


> I agree with you, but a general-purpose systems programming language needs to let _me_ make that determination. It can't abort on my behalf for my own good.

You can decide. You can use the standard library and deal with exceptions, or you can not use the standard library and deal with malloc failure yourself. The Rust standard library is opinionated in this regard, because it's rarely ever a good idea to try to recover from malloc failure for userland processes.

That said, with recover, which will probably be stabilized, you can recover from malloc problems, which are turned into panics. But I'm sure you know that this can be unreliable on Linux with the default overcommit turned on, and so forth. https://doc.rust-lang.org/std/panic/fn.recover.html

Note all of the debate on the linked issue as to whether recover is a good idea. Most of the Rust community is very hesitant to even allow catching panics at all; they certainly don't find the current situation "unusable".

> It's depressing that Java, of all things, does a better job in this respect than Rust.

I think that Java shouldn't throw exceptions on OOM. It should just abort the process.


I profoundly disagree with your assertions about the correct way to handle malloc failure. While abort may be acceptable for some specific applications, general-purpose systems don't get to impose that opinion on programmers. Memory is a just another resource, and programs need to deal with resource exhaustion generally. Do you think programs should abort when the disk fills up?


Most programs do not need to deal with memory exhaustion. There's often little that can be done other than terminating anyway, many OS configurations remove your ability to effectively recover from it (overcommit and swapping making your app unusably slow so you would better off terminating), and adding rarely tested code paths is a good way to introduce bugs and vulnerabilities.

Programs should abort when the disk fills up due to swap exhaustion, yes. They shouldn't abort if I/O fails, but that's because (a) I/O failure potential occurs in many fewer places than memory allocation failure potential, so it's easier to test; (b) I/O failure can occur for many reasons other than disk space exhaustion, and it's usually fine to handle disk space exhaustion the same way you handle other types of I/O failure, so it isn't any extra burden to handle that case.


> Most programs do not need to deal with memory exhaustion.

You keep making this assertion, but it doesn't appear to be true. There are several examples on this subthread. I don't think you're justified in treating memory and disk space separately. The concerns that apply to one apply to the other. I know you cite relative frequency of failure as a reason to distinguish, but I don't buy it, because it's not a qualitative difference. Resource exhaustion is resource exhaustion.


> and mostly a consequence of eschewing exceptions.

A panic is the same thing as an exception. If you want to catch a panic, use recover(), it's meant to be used exactly for these end-of-the-world panic scenarios (and for FFI/etc).

You can plug in a custom allocator that panics on OOM (I think the standard one aborts).

As Steve mentioned, custom allocators can mean two things. The type that exists in rust today is one where you can make OOM panic, but not have allocation methods return Result. A planned extension will let you have allocators with different semantics entirely work with stdlib types (via defaulted type parameters); and this will let you use regular error handling with stdlib types too.


> A planned extension will let you have allocators with different semantics entirely work with stdlib types

And what about all the code that doesn't? It's because so much code exists that's completely oblivious to the possibility of these stdlib functions failing that I don't think that merely adding the option to do the right thing is good enough. The existing failure-oblivious APIs need to be explicitly deprecated.

The only ways to redeem Rust is to either support exceptions as first-class citizens with mandatory runtime support or to convert all existing allocating stdlib functions to return Result and mark all the existing failure-oblivious ones as being as deprecated as gets(3) in C.


> The existing failure-oblivious APIs need to be explicitly deprecated.

That's total overkill. For 99% of applications, process abort is fine, and dealing with it is just noise. Those 1% are usually things like kernels that use custom standard libraries anyway.

We're not doing the 99% a favor by making them think about OOM every time they do something that might allocate.

> The only ways to redeem Rust is to either support exceptions as first-class citizens with mandatory runtime support or to convert all existing allocating stdlib functions to return Result and mark all the existing failure-oblivious ones as being as deprecated as gets(3) in C.

This is silly hyperbole. Ask anyone who works in security whether the danger of xmalloc() is comparable to the danger of gets(). In fact, I've seen many security folks recommend only using xmalloc() with process abort instead of trying to explicitly handle OOM failures!


Even C++ doesn't have mandatory exception support, and even Rust can catch panics from failure-oblivious code.


> Even C++ doesn't have mandatory exception support

Yes it does. That some compilers provide a way to disable mandatory language features is no argument.

> even Rust can catch panics from failure-oblivious code.

Not while maintaining that code's invariants it can't.


> Yes it does. That some compilers provide a way to disable mandatory language features is no argument.

It's actually very relevant that huge amounts of C++ deployed in the world use -fno-exceptions, and many shops (for example, Google!) have a policy of "we do not use exceptions". I don't care about how well languages handle OOM in theory; what matters is how well they handle it in practice.


> for example, Google!

Google's C++ coding standards have done tremendous harm to the C++ community by perpetuating obsolete programming practices like two-phase initialization and lossy error reporting. Google's C++ standards also teach people that it's okay to use the STL and not worry about allocation failure, which hurts program robustness generally.

I'm not the only one who thinks so: see https://www.linkedin.com/pulse/20140503193653-3046051-why-go...

My C++ code is exceptional, modern, and robust, and anyone using -fno-exceptions can go fly a kite.


I think it's hard to argue that Google is in the wrong by not wanting to rely on std::bad_alloc for dealing with OOM.

> Google's C++ standards also teach people that it's okay to use the STL and not worry about allocation failure, which hurts program robustness generally.

Actually, I think making std::bad_alloc call std::terminate improves program robustness by a lot over trying to gracefully recover from all allocation failure. Certainly it reduces security vulnerabilities.


> Certainly it reduces security vulnerabilities.

So does the power button. You can't get away with justifying breaking arbitrary functionality in the name of security.


Are you saying C++ makes it easy to write exception-safe code? Because Rust explicitly encodes exception safety into the type system with the RecoverySafe trait, you need to write unsafe code to bypass that, and the documentation on unsafe explicitly covers exception safety.


Rust doesn't consider exception safety to be a matter worth 'unsafe's time. All code must simply be memory-safe in the face of unwinding. RecoverySafe is basically "it's hard to leak busted state out of a region of code that panicked". That is, mutable references aren't RecoverySafe, and mutexes and the like poison their contents if they witness a panic while locked.

But RecoverySafe is just preventing things like "your binary heap was only partially heapified" and not "your heap is now full of uninitialized memory". You can get poisoned values out of mutexes just fine, so everything needs to put itself in a memory-safe state if a panic occurs.

One can bypass RecoverySafe in safe code with the AssertRecoverySafe wrapper.

It does however turn out that safe code in Rust is generally quite exception-safe by default. This is because safe code can't do anything too dangerous, panics are generally only caught at thread or application boundaries (so data that witnesses a panic is usually well-isolated) and there's way less places that can unwind compared to "override everything" C++. But exception safety is indeed something unsafe code needs to fight for (see the aforementioned binary heap in std).


Rust's type system doesn't attempt to guard against resource leaks.


It doesn't guarantee destructors will run, that's true, but that's for things like Rc cycles. Take a look at the RFC for std::panic::recover- it definitely takes exception safety into account: https://github.com/rust-lang/rfcs/blob/master/text/1236-stab...

Also take a look at things like the design of the Drain iterator- the stdlib is definitely (intended to be) exception safe.


  > No custom allocator can make the task gracefully report failure
  > instead of panicing.
So, first of all, "custom allocators" means two things:

  * overloading the allocator that's used by liballoc, and
    the crates that depend on it, like libstd
  * other allocators entirely
The first is described here: https://doc.rust-lang.org/book/custom-allocators.html

And the second is still in RFCs: https://github.com/rust-lang/rfcs/pull/1398

Both of these things are not yet stable. The second does, in fact, give you the ability to return an error code, by returning a Result.
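(Historical note for later readers: the first flavor, overloading the allocator used by liballoc and libstd, eventually stabilized as the `#[global_allocator]` attribute with the `GlobalAlloc` trait. A minimal sketch of that stabilized form, delegating to the system allocator while counting allocations:)

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

static ALLOCS: AtomicUsize = AtomicUsize::new(0);

// Delegates to the system allocator but counts every allocation;
// a real replacement could panic or log on failure here instead.
struct Counting;

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCS.fetch_add(1, Ordering::Relaxed);
        System.alloc(layout)
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout)
    }
}

#[global_allocator]
static A: Counting = Counting;

fn main() {
    let before = ALLOCS.load(Ordering::Relaxed);
    let v = vec![1, 2, 3]; // this heap allocation goes through Counting
    assert!(ALLOCS.load(Ordering::Relaxed) > before);
    assert_eq!(v.len(), 3);
}
```
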

However, on top of that, I don't see how

  >  mostly a consequence of eschewing exceptions.
and

  > No custom allocator can make the task gracefully report failure
  > instead of panicing.
Work together. Or rather, why is panic-ing bad, but an exception good?


> why is panic-ing bad, but an exception good?

Because the Rust people don't believe in making "catch" a first-class primitive in the language, and in fact, fully support a runtime option to turn all panics into aborts.

Even if abort-on-panic were to be killed as a legal mode of operation, and even if the stigma were to be removed from std::panic::recover, we'd still be left with a language with two error handling strategies and endless programmer confusion over which to use.

Rust's designers have done permanent damage to the language by not making exceptions the primary error reporting mechanism available to programmers, and it's not a mistake they can undo now.


> Because the Rust people don't believe in making "catch" a first-class primitive in the language, and in fact, fully support a runtime option to turn all panics into aborts.

recover() exists. You're right, there's a stigma to it, because you're not supposed to use it unless you really need to (hence, no programmer confusion). It's supposed to be used for situations like:

- Catching panics before crossing an FFI boundary

- End-of-the-world situations like OOM where you want to still handle it somehow

- Ensuring that applications can recover from internal panics in libraries (though there should be little to no panics in the libraries anyway)

The stigma for recover is for using it where you're not supposed to; as a substitute for regular error handling. In this situation, you are supposed to, so the stigma doesn't apply.

The fact that it's not a first-class primitive seems mostly irrelevant to me. Rust does a lot of things in library functions and types, even our concurrency safety mechanisms are something that can be duplicated in a library. As long as it can be used, what does it matter?

The fact that you can set the panic handler at runtime is also irrelevant. If you want to catch panics, don't do that.


The problem with the dualistic error handling strategy you're proposing is that the "severe" path gets even less testing than normal error recovery schemes do. Imagine you're working with a big non-exceptional C++ codebase (e.g., Firefox) and somebody throws std::bad_alloc. Even if you don't abort immediately and let the exception unwind the stack, the unwinding process will still leave lots of invariants broken, since all the cleanup paths are wired to return codes and will not run on unwinding.

The result is that your program can be almost arbitrarily broken after throwing. You might as well have just called longjmp.

It's because unwinding that only happens in rare cases so often produces bad results that I favor making unwinding the only error-reporting machinery in a language. If you use exceptions to report all errors, everyone starts caring about exception safety again.


Note that recover() uses Rust's type system to enforce certain things about exception safety. It's harder to mess up, even if libraries are written without unwinding in mind.


Exceptions can be turned into aborts in C++ as well, and are in many types of programs, because exceptions do have downsides for some problem domains. If Rust forced exceptions on everyone, there'd be people complaining about that just like you're complaining now.

I see the split between `Result` and `panic!` as more like Java's split between checked and runtime exceptions, except `Result` is much more usable than checked exceptions because it's part of the main data flow path, and so can use method chaining combinators instead of unwieldy try/catch blocks. OOM in Java is, like in Rust, not a checked exception, because it's not something you'd want to handle everywhere it can happen, but rather something to propagate up the stack transparently.
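To illustrate the "main data flow path" point, a small sketch of `Result` combinators (the function here is a made-up example; the methods are standard library):

```rust
use std::num::ParseIntError;

// Error handling rides along the data path via combinators,
// with no try/catch block in sight.
fn parse_and_double(s: &str) -> Result<i32, ParseIntError> {
    s.trim().parse::<i32>().map(|n| n * 2)
}

fn main() {
    assert_eq!(parse_and_double(" 21 "), Ok(42));
    assert!(parse_and_double("twenty-one").is_err());
    // A fallback value needs no exception machinery either.
    assert_eq!(parse_and_double("x").unwrap_or(0), 0);
}
```
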


> Exceptions can be turned into aborts in C++ as well,

No you can't. -fno-exceptions does not appear in the C++ standard. You can write a compiler for any language. C++-that-aborts-on-throw is not C++, although, sure, it's closely related.

The ability to turn off C++ exceptions was a temporary workaround for compiler deficiencies in the 1990s that snowballed into an extremely harmful schism that's still doing tremendous damage to the C++ community.

The difference between -fno-exceptions and Rust's abort-on-panic is that the former is an unofficial, disgusting hack, while the latter is getting full official support for some reason.


That's not a very meaningful distinction to make- Rust doesn't even have a standard right now. Besides, -fno-exceptions is quite useful today, not just because of 90s compiler deficiencies, and is pretty well-supported by compilers.


The existence of -fno-exceptions means that library authors must either use the language as intended and accept losing a portion of their potential user base, or write less-than-optimally elegant and clear code, which punishes everyone so that a few can turn off a core feature of the language. It fragments the community.


This is an area where you just can't actually please everyone. I have heard the same opinions you've expressed in this thread, just as strongly, for even including unwinding at all. That aborts should be the only option, and that the cost of unwinding is far too high to be included in a true systems language.

Language design is tough. I'm glad we have multiple languages.


It's _because_ Rust tried to please everyone that it painted itself into this corner. If the exception people had won, life would have been great.

But if the error-code people had won, then life would still be good, because then Rust's stdlib might have been a bit uglier, but it would at least be correct with respect to error propagation. It's because Rust tries to satisfy both camps --- because it tries to give you the concision of exception code and, er, the lack of actual exceptions --- that it's forced into the terrible position of needing to abort internally on error, lacking a way to report errors to higher level code.

The lesson here is that optimizing for happiness and harmony leads to bad design.


I prefer "taking all use-cases seriously instead of abandoning a segment of users" to "happiness and harmony," as a characterization here. If serious use cases were not presented for both options, we would have enforced one. Or, if Rust weren't a systems language, we could have enforced one.

At the end of the day, if you have exceptions, you can still call abort in your exception handlers, so the split exists regardless. And without first-class support, those users are paying for a feature that they aren't using, which is against a core value of Rust.


You are arguing for replacing bad behavior "abort on OOM" with something even worse, exceptions. I honestly don't think you know what exceptions entail wrt what compilers do and the resulting bloat.


What, unwind tables? The ones that go untouched in normal operation? They're hardly catastrophic, and you need unwind support as a mandatory part of some ABIs in the first place. I know perfectly well what exceptions entail, and I maintain they're vastly better than other error handling strategies. You're the one who doesn't know what he's talking about.


You do understand it's usually in the ballpark of +20% or more added to the text that's in the loaded part of the program, right? Also, define mandatory: what requires eh_frame?


The Windows 64-bit ABI, for starters. The world is not Unix. As for additional text: compare that with the code spent on explicit checks of error values. You can't just enable exceptions on an existing codebase and point to how awful they are without account for the now-extraneous code that exception support allows you to remove.


The x86_64 psABI also requires eh_frame, but it's actually not used for much except exceptions, stack unwinding, etc., and thus is mostly useless. Of course, all of this is sort of moot with Rust, as Rust requires eh_frame... and the resulting bloat.


There is an exception-like mechanism in Rust, in the form of the "try!" macro. It's a lot more flexible, but somewhat more verbose (Haskell has the same mechanism in a way that looks a lot more like exceptions, so that's not an inherent flaw). The best explanation I've seen is this:

http://www.jonathanturner.org/2015/11/learning-to-try-things...

tl;dr: "Result"s are like exceptions which are caught by default. You can (explicitly) propagate them upwards by using try!(...). This is nice because it means that you can tell what exceptions can occur in a block of code only using "local" information.
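A sketch of that explicit propagation in practice (the `try!(...)` form was later given the `?` sugar used here; `sum_fields` is a made-up example):

```rust
use std::num::ParseIntError;

// Each propagation site is visible in the source: the error either
// gets handled right here or is explicitly passed to the caller.
fn sum_fields(line: &str) -> Result<i32, ParseIntError> {
    let mut total = 0;
    for field in line.split(',') {
        total += field.trim().parse::<i32>()?; // propagate on failure
    }
    Ok(total)
}

fn main() {
    assert_eq!(sum_fields("1, 2, 3"), Ok(6));
    assert!(sum_fields("1, oops, 3").is_err());
}
```

The caller sees a `Result` in the signature, so "what errors can escape this block" really is local information.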


> There is an exception-like mechanism in Rust, in the form of the "try!" macro.

Correct. That's not the problem. If Rust's standard library returned Result in all cases where allocation could fail, I'd be satisfied. My primary issue is that they didn't, because Result is awkward.

Rust's designers went wrong in trying to have their cake and eat it too. They wanted to avoid exceptions and not make people care locally about errors. That's why they assert that errors just don't happen and abort if they do.

Throwing exceptions is a reasonable design choice. Returning error codes is a reasonable design choice. Pretending errors don't exist is not.


> Pretending errors don't exist is not.

We don't and we never have.


I don't think that there's any guarantee in Rust that malloc failure will abort rather than panic. That just happens to be the current implementation. I'm not sure I've ever heard of anyone running into that being an issue in practice, as opposed to this kind of abstract discussion. But I think that it wouldn't be considered a breaking change to switch from aborting to panicking if there were any kind of demand for it.

In Rust, exceptions (panic) are used for truly exceptional situations, like programmer error (indexing beyond the end of an array, division by zero) or things that practically are not expected to happen in a recoverable way in the course of ordinary use, like malloc failure. On modern virtual memory operating systems, malloc failure is so unlikely, and in application code there's so little you could reasonably do if it happened, that it is considered to be a truly exceptional case.

On the other hand, Result is used for those kinds of errors that are expected to happen in practice even with working code on reasonable systems. IO errors, errors decoding UTF-8, etc.

Right now, catching exceptions (panics) using recover() is still considered unstable. There is some work ongoing to try and work out the API to help ensure safety, by marking types based on whether they are exception-safe or not; so you can use recover() with types that are built in an exception-safe way, or you can wrap types in AssertRecoverSafe to assert that you are providing exception-safety guarantees yourself, but you can't just arbitrarily recover from panics in code that has access to arbitrary data without someone having added an annotation somewhere that they believe that the code is exception-safe. https://github.com/rust-lang/rust/issues/27719 Note that based on the latest discussion, recover() will likely be named something else involving "unwind" to be more explicit about what it's doing.

And exception safety is quite important to the Rust authors. Note that Mutex has a built-in exception safety mechanism, poisoning the mutex on panic so that other users can't accidentally access the protected resource without being aware that another thread panicked while holding it.

Now, there are times when handling memory allocation failures properly is more important, such as in embedded systems or in operating system kernels, where you don't have a virtual memory abstraction with over-provisioning. However, in those cases you couldn't use the standard library anyhow, as the standard library depends on OS support; so you might as well use alternate data types that do return Result on allocating operations.

I'm just not sure about the utility of providing a convenient way to recover from malloc failure in applications running on virtual-memory operating systems. Can you show me an example in C++ (or any other language) where this is handled properly in application code in any way that doesn't simply log and abort, in which all unwinding code in the same application also avoids allocation as it may occur while unwinding from an allocation failure, and in which these code paths are actually tested in the test suite to ensure they behave properly?


> Right now, catching exceptions (panics) using recover() is still considered unstable. ... you can't just arbitrarily recover from panics in code that has access to arbitrary data without someone having added an annotation somewhere that they believe that the code is exception-safe

And it's for this reason that I don't think I'll be choosing Rust for any of my projects in the near future. This cavalier attitude toward memory exhaustion is not only concerning in itself, but also makes me doubt the robustness and design principles of the rest of the system.

Besides, if you make exception-safe code difficult to write, nobody in practice will write it, so you'll end up with a system that's tantamount to one that just aborts. Saying that "Rust the language handled OOM just fine without stdlib!" and "we can convert OOM to panic!" is useless when these measures don't help real world code.

> In Rust, exceptions (panic) are used for truly exceptional situations

I've never accepted the argument that we need to use one error-recovery scheme for "normal" errors and another for "exceptional" ones. That kind of claim sounds reasonable, sober, and measured, but it leads to bad outcomes in every system I've seen, because the "exceptional" case in practice becomes a hard abort. A unified error handling scheme is a boon because it greatly simplifies the cognitive analysis of errors.

Java is a good example of how to do this right-ish. Serious errors are Throwables not derived from Exception, so normal catch blocks are unlikely to catch them. But serious errors are still exceptions (if not Exception), and all the usual language features for processing exceptions, including unwinding, stack trace recording, and chaining, operate normally.

Uniformity of error processing in Java is a great feature, and the language gets it without sacrificing the ability to distinguish between serious and expected errors. Now, I'm not arguing that Rust get checked exceptions, but I do have to insist that experience shows that you don't need two completely different error handling mechanisms (say, panic and Result) to mark problem severity.

> But I think that it wouldn't be considered a breaking change to switch from aborting to panicking if there were any kind of demand for it.

I'm not comfortable with casual changes to core runtime semantics.

> On modern virtual memory operating systems,

Are you just defining "modern" as "overcommit"? People (especially from the GNU/Linux world) constantly assert that allocation failure is rare, but I've seen allocations fail plenty of times, due to both address space exhaustion and global memory exhaustion. I don't have any firm numbers, but I haven't seen any from the abort-on-failure camp either.

> Can you show me an example in C++ (or any other language) where this is handled properly in application code in any way that doesn't simply log and abort, in which all unwinding code in the same application also avoids allocation as it may occur while unwinding from an allocation failure, and in which these code paths are actually tested in the test suite to ensure they behave properly?

SQLite [1] and NTFS [2] come to mind, as well as lots of tools I've discovered.

[1] https://www.sqlite.org/malloc.html

[2] guaranteed to make forward progress; pre-reserves all needed recovery resources; yes, I know NTFS runs in ring zero, but it's not the case that the kernel doesn't have to deal with dynamic memory allocation


  Besides, if you make exception-safe code difficult to 
  write, nobody in practice will write it, so you'll end up 
  with a system that's tantamount to one that just aborts. 
  Saying that "Rust the language handled OOM just fine 
  without stdlib!" and "we can convert OOM to panic!" is 
  useless when these measures don't help real world code.
I'm not sure where you get the "difficult to write" part from. It's no more or less difficult to write than in any other language, as far as I know; you just do have to go through the effort to indicate that "yes, I did think this through and believe this is exception safe" for types that you want to be able to use across an exception-catching boundary.

As I said, work is ongoing to determine if this AssertUnwindSafe approach is actually workable in practice. The initial implementation had some usability issues, but it looks like it may be more workable now that you can use it on the entire closure if you need to. It's still a speedbump, but a very minor one.

  That kind of claim sounds reasonable, sober, and 
  measured, but it leads to bad outcomes in every system 
  I've seen, because the "exceptional" case in practice 
  becomes a hard abort.
Can you point out what these bad outcomes or bad systems have been? I agree that in practice, the most common case is that the exceptional case becomes a hard abort, but I don't necessarily agree that that's a bad thing.

For people who are not trying to write extremely fault-tolerant code like SQLite (and going to great lengths to do so), that is a good thing; adding some half-assed normal error handling around these truly exceptional cases is more likely to lead to mistakes and problems down the line than just aborting is.

For people who are trying to write extremely robust, fault-tolerant code, you can either handle panics, or avoid the standard library and do error handling via results. Both approaches should be viable, depending on your requirements; the standard library does take exception safety into account, so it shouldn't on its own cause issues if you handle errors via panics.

  I'm not comfortable to casual changes in core runtime 
  semantics.
But you are comfortable with the sheer amount of undefined and unspecified behavior in C and C++? Remember, at the moment Rust only has a single implementation and no formal specification, while C and C++ have many different implementations, and the standards allow very wide amounts of leeway in how implementations differ.

Now, Rust not having a formal specification or multiple implementations is not a good thing; it's just a fact of life for a language that is not yet very mature. But I think that this particular behavior is something that should be considered similar to unspecified behavior at the moment. Just like out of memory situations or stack overflow behave differently on different platforms in C and C++ at the moment, how the Rust runtime behaves on out of memory could also be subject to change or different implementations. Given the standard library API, you couldn't return a result, but either aborting or panicking would both be consistent with the language as currently defined.

  People (especially from the GNU/Linux world) constantly 
  assert that allocation failure is rare
I'm not asserting that allocation failure is rare. Just that there are some cases where you don't have a chance to handle it at all, like GNU/Linux where you overcommit, and that handling it in any way other than abort is rare.

  SQLite [1] and NTFS [2] come to mind, as well as lots of 
  tools I've discovered.
Neither SQLite nor NTFS use exceptions, nor are they applications, so they aren't very good examples of applications using exception handling to deal with memory allocation failure.

SQLite is written in C, which doesn't have exceptions, nor a standard library similar to the C++ or Rust standard library. SQLite has had to implement all of their data structures by hand. You can do exactly the same in Rust by using #![no_std] and just using the core library, which only defines basic data types and never allocates.

NTFS is written in the NT kernel, which doesn't have support for exceptions either, nor does it use the C++ standard library.

So yes, you can actually write code that handles allocation failure properly. The examples you've given both eschew a high-level standard library, and instead implement all of their data structures and memory handling themselves, reporting errors by passing error values back. All of which you can do in Rust using #![no_std].

Meanwhile, there are lots of user-space applications that people use all the time which have no special handling for OOM situations; they rely on the OS to provide them with sufficient virtual memory, and either die by not handling an exception, abort explicitly on getting NULL from malloc, or get killed by the OOM killer if they exceed the capacity of the machine and touch an overcommitted page.

I'm sure there are some examples out there, somewhere, of user-space applications that actually do catch such issues, and attempt to do graceful cleanup. On the other hand, I don't know how successful they will be, especially if they have to be cross-platform; since any kind of cleanup you may do, such as writing state out to disk before dying, will hit the kernel's page cache, which may involve allocating memory, which may fail in such a situation, even if you do try to handle the issue gracefully in user-space you may not have anything you can do.


There's more to the world than end-user applications though. I think your mental model is that there are two kinds of Rust user: OS kernel writers and people who create applications with menu bars and save buttons.

What about network services that would rather begin failing requests on overload than shut down entirely and restart, incurring potentially big delays in the process? What about scientific computing projects that are happy delaying work once they've hit pre-defined limits? I think you're suffering from a failure of imagination.

If Rust's goal is to supplant C, it needs to be capable of everything C is capable of doing. Arguing that applications in general need X or Y is a canard, because most of those applications have no specific need of the kind of direct memory control that Rust affords.

To put it another way: who are you trying to satisfy? Are you trying to compete with Go, Nim, Python, and Java and provide high-level facilities that work most of the time, at the cost of control, or are you trying to compete with C and C++, which still fill an essential niche?

By appealing to arguments about the requirements of applications in general instead of requirements of systems programming languages specifically, you're suggesting that the former audience is the better bet.

That kind of targeting is sad, since one of the promises of Rust is that its memory safety would save us from the plague of security holes in low-level software. The decisions the Rust project is making right now make it less likely that Rust will be able to fully fill C and C++'s niche.

One of the purposes of having a standard library for a project is to be a universal resource for all users of a language. If Rust's standard library isn't suitable for all environments where Rust might be used (like C++'s standard library is), then maybe it should be packaged as a separate project, like Qt.


  There's more to the world than end-user applications 
  though. I think your mental model is that there are two 
  kinds of Rust user: OS kernel writers and people who 
  create applications with menu bars and save buttons.
Not at all. I myself work with more types of applications than that; I work with high-reliability networked daemons, GUI applications, and web applications.

  What about network services that would rather begin 
  failing requests on overload than shut down entirely and 
  restart, incurring potentially big delays in the 
  process?
High reliability network services generally need to be distributed across multiple machines anyhow, to provide reliability against the machine going down, so they have some notion of processes that can be stopped without shutting the whole system down. If your system can't handle one of the daemons being restarted, then it has bigger problems.

However, even for this case, you can handle OOM more gracefully if you change allocation failure to panic rather than abort (either by changing the default in Rust's standard library, or using a custom allocator). At that point, you can define a proper task boundary on which you catch unwinding, make sure that everything shared across that task boundary is exception safe, and recover gracefully.
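A sketch of that task boundary in modern Rust (catch_unwind is recover()'s stabilized name; `handle_request` is a hypothetical per-request handler, and the panic here simulates an allocation failure converted to a panic by a custom allocator):

```rust
use std::panic::{self, AssertUnwindSafe};

// Any panic inside the closure is caught at the request boundary,
// failing only this request instead of taking down the daemon.
fn handle_request(id: u32) -> Result<String, String> {
    panic::catch_unwind(AssertUnwindSafe(|| {
        if id == 2 {
            panic!("simulated allocation failure");
        }
        format!("response for request {}", id)
    }))
    .map_err(|_| format!("request {} failed; continuing to serve", id))
}

fn main() {
    // Silence the default panic hook so the simulated failure
    // doesn't spam stderr.
    panic::set_hook(Box::new(|_| {}));
    for id in 0..4 {
        match handle_request(id) {
            Ok(body) => println!("{}", body),
            Err(e) => println!("{}", e),
        }
    }
}
```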

  What about scientific computing projects that 
  are happy delaying work once they've hit pre-defined 
  limits? I think you're suffering from a failure of 
  imagination.
How many of these applications use malloc failure as their backpressure mechanism against over-allocation of resources? In general, I think they have a tendency to distribute small jobs across a large cluster, balancing them based on resource utilization, and accepting that some jobs may fail for various reasons with the ability to re-run said jobs if necessary.

  If Rust's goal is to supplant C, it needs to be capable 
  of everything C is capable of doing. Arguing that 
  applications in general need X or Y is a canard, because 
  most of those applications have no specific need of the 
  kind of direct memory control that Rust affords.
Rust's goal is not necessarily to supplant C or C++; they are far too widely used for that ever to be realistic.

The goal is to provide a reasonable, safe alternative, that offers better abstractions and greater safety, and can be used in situations that other high-level safe languages are unsuitable for.

As far as replacing C goes, Rust absolutely is capable of that; just use #![no_core] and handle allocation failure however you want. C++'s standard library is the more apt comparison to Rust's standard library.

  To put it another way: who are you trying to satisfy? Are 
  you trying to compete with Go, Nim, Python, and Java and 
  provide high-level facilities that work most of the time, 
  at the cost of control, or are you trying to compete with 
  C and C++, which still fill an essential niche?

  By appealing to arguments about the requirements of 
  applications in general instead of requirements of 
  systems programming languages specifically, you're 
  suggesting that the former audience is the better bet.
As an aside, when you say "you", it sounds like you may be addressing me as a member of the Rust team. I am not; I am a user of Rust, and have contributed a few small patches, but I am only speaking for myself and not anyone else.

Rust is a general purpose programming language, that is designed to appeal to a wide audience, but fill needs that other high-level languages cannot, and provide safety and abstraction that C or C++ cannot.

The first audience is likely a much larger audience, and so it is worth keeping their needs in mind, while the second audience can take the most advantage of Rust's safety and performance guarantees.

  That kind of targeting is sad, since one of the promises 
  of Rust is that its memory safety would save us from the 
  plague of security holes in low-level software. The 
  decisions the Rust project is making right now make it 
  less likely that Rust will be able to fully fill C and 
  C++'s niche.
There are many, many applications, including not just GUI-facing applications but also servers, high-performance computing, etc, written in C and C++ that do not, and do not need to, handle allocation failure explicitly. In fact, in this entire discussion, you still have not pointed to a single example of a C++ application that does anything other than abort on allocation failure.

However, even for applications that do not need to handle allocation failure, they would be able to take advantage of type safety, memory safety, and easy, safe concurrency. You are focusing on one, small issue, and ignoring the huge swath of other issues that you run into when writing C or C++ code that can go away by using Rust.

  One of the purposes of having a standard library for a 
  project is to be a universal resource for all users of a 
  language. If Rust's standard library isn't suitable for 
  all environments where Rust might be used (like C++'s 
  standard library is), then maybe it should be packaged as 
  a separate project, like Qt.
But C++'s standard library is not suitable for all environments in which it's used. Other examples that have already been brought up in this discussion are in kernels, embedded systems, in any code running at Google, and heck, as you mention there are third-party libraries like Qt that are widely used frequently to the exclusion of the standard library.

Something like C++ or Rust's standard library cannot be used in all situations, and even in places where it could run, no general purpose standard library is ever going to satisfy all users. What Rust aims to provide is one that works best, and most naturally, for a wide variety of use cases, which includes GUI applications, web apps, network daemons, and scientific applications.

Since handling allocation failure as anything but an abort is so uncommon, Rust avoids both of the alternatives: requiring everything that allocates to return a Result, which makes the interfaces to every collection type much more painful to use, or having pervasive exceptions and exception handling, which means you need to think about exception safety everywhere.

The approach that Rust takes is a moderate approach; it uses return values for those errors that pretty much any user will have to handle, and panics for truly exceptional situations that normally should lead to an abort but which you can add special handling for at task boundaries if you need to provide higher availability, which means that you limit the number of places in which exception-safety needs to be considered to just those boundaries.

At the moment, it uses aborts for allocation failure, but there's nothing inherent to the language about that, just the current implementation.

I think the main point where our opinions diverge is that I see handling memory allocation failure with anything other than an abort as much, much more rare than the extremely common cases of exceptional situations leading to much worse results in C or C++: the sheer amount of undefined behavior, the mysterious bugs caused by buffer overruns overwriting random bits of the stack, the security vulnerabilities, the bugs caused by some undefined behavior you didn't realize was there causing the optimizer to do something strange to your code, and so on.

If allocation failure causing an abort when pushing to a Vec (unless you supply a custom allocator that panics instead and implement proper panic handling) is something that you think is fatal in terms of choosing a language, why is it not fatal that one single missed buffer-length check buried in one library somewhere can cause completely unrelated parts of your application to fail mysteriously? As far as appropriateness for the kinds of projects you describe: other than the greater library and tool support due to being much more mature ecosystems, I can think of very few cases in which C or C++ would be preferable to Rust; so much of their behavior in unexpected situations is so much worse than an abort.


> If Rust's goal is to supplant C, it needs to be capable of everything C is capable of doing.

We have demonstrated this multiple times. You can either use your own stdlib like SQLite does, or use recover(). You may not like the solution, but the fact remains that it is a tangible solution to the problem (well, the latter one is -- "your own stdlib" is a pretty specialized solution which you shouldn't need). Given that a solution exists, the only issue is usability -- and you have to ask whether there are any improvements to the OOM-handling API that can be made without burdening the users who don't care about OOM. There is one improvement which can be made that doesn't affect non-OOM users at all: custom allocators v2, which lets you use Rust error handling with stdlib heap types. This improvement is something the core team cares about and will probably happen (I don't know about the time frame, since it covers a lot more than just Result-y heap types). Other improvements would either mean having regular users check for null all the time, or making panics standard fare, neither of which is a good idea.

Please stop ignoring the fact that Rust does have a solution to the OOM problem; I'm tired of reiterating this argument. One can argue that it's not as usable as C++'s or C's -- that's okay, but ignoring it entirely is just silly.

(As far as usability wrt C++ and C: I still don't see why it's less usable. C has the horrible check-every-time-or-else situation, and Rust's solution is more or less identical to C++'s, with the exception that it's the road less traveled. Given that the API handles exception safety explicitly, this should not be that big a problem.)

> you're suggesting that the former audience is the better bet

Not necessarily. The former audience encompasses the latter. Rust doesn't want to put undue burden on general users (like having to check all allocations or having to think about exception safety). That's a reasonable ask. It similarly doesn't want to put undue burden on systems users, and it doesn't -- not any more than C or C++. I don't think the Rust designers feel that they have, recover() is a pretty decent API with a lot of thought put into exception safety.

> If Rust's standard library isn't suitable for all environments where Rust might be used (like C++'s standard library is)

The reason #![no_std] was brought up was because you gave the example of SQLite, which does the same thing. It's meant to be used in certain embedded programming or kernel-writing situations (note that Rust still has a core stdlib, called libcore, which is available and doesn't need malloc) where things like malloc may not even exist. Embedded programming in C++ does something similar.


You haven't changed my mind about Rust being unfit for purpose.

> Please stop ignoring the fact that Rust does have a solution to the OOM problem;

I disagree that what you're calling a solution is, in fact, a solution. It's more like defining away the problem. It's the case that most Rust programs, those that use stdlib, will never be able to rigorously respond to all allocation failures.

You don't get to wave away problems with Rust stdlib with appeals to an unhosted environment when C++'s stdlib doesn't have the problems I'm highlighting. There's no reason std::vector couldn't be used in a kernel --- just no history.

The SQLite criticism is not the point. The request was for a tested component that recovers from allocation failure. Now you're saying that this example isn't good enough because it's written in C. You're moving the goalposts.

I've already outlined what it would take for me to agree that Rust's OOM problem is solved. It looks like Rust is just adding a few ways of optionally doing more stringent checks, not actually propagating failure from core routines appropriately.

> Not necessarily. The former audience encompasses the latter. Rust doesn't want to put undue burden on general users (like having to check all allocations or having to think about exception safety)

Should these poor users get a pony too? Programming is about managing resources. I've outlined elsewhere the kind of trap you force yourself into when you simultaneously avoid both exceptions and error codes. By avoiding both, you're not making the world simpler. You're just hiding the nasty bits that can go wrong, and users deserve better.


> It's the case that most Rust programs, those that use stdlib, will never be able to rigorously respond to all allocation failures.

I'm not talking about using a different stdlib, I'm talking about recover().

> You don't get to wave away problems with Rust stdlib with appeals to an unhosted environment when C++'s stdlib doesn't have the problems I'm highlighting.

I didn't do that. I'm asserting that Rust's stdlib is appropriate for more or less all situations where you would use C++'s stdlib. I have already explained why recover() should be adequate when you want to handle OOM, and recover() is part of the regular stdlib.

I was just putting the raison d'etre for no_std out there, and noting that the situations where you would use it in Rust exist in C++ too. I was trying to dispel the argument that "no_std exists in Rust, hence the stdlib isn't appropriate for all use cases, hence it shouldn't be part of the distribution", which you might have been making in the grandparent comment (I'm not sure if you were).

> The SQLite criticism is not the point. The request was for a tested component that recovers from allocation failure. Now you're saying that this example isn't good enough because it's written in C. You're moving the goalposts.

Fair. I'm not the one who made the original request, so I forgot about that.

> It looks like Rust is just adding a few ways of optionally doing more stringent checks, not actually propagating failure from core routines appropriately.

I'm not sure what you mean here.

Rust already has the ability to catch all panics and handle OOMs at an abstraction boundary of your choice as a global solution, similar to exceptions in C++.

Rust is getting the ability to do fine-grained C-like (or "C++ with try/catch around every `new`" -like) allocation failure handling in custom allocators v2, which can also tie in with your regular error propagation machinery.

> I've already outlined what it would take for me to agree that Rust's OOM problem is solved.

You really haven't. You've just attacked Rust's lack of exceptions incessantly without much argument to back it up. You've not mentioned why recover() (given that it has exception safety built in and exception safety was a first-class concern during its design) is inadequate.

> you simultaneously avoid both exceptions and error codes.

Rust's Result type is basically a safer and more robust error code. Custom allocators v2 gets you error-code-like allocation that can tie in with your regular error handling.

(FWIW you can do errno-like error handling of OOM using the current support for custom allocators already, though making this safe might be tricky)


There was a really interesting article on error handling in languages recently:

http://joeduffyblog.com/2016/02/07/the-error-model/

It makes the case that you do in fact want two different error handling mechanisms, because there are two quite different kinds of errors. The author argues that running out of memory is most practically treated as an unrecoverable error which aborts the process.


That's not a property of rust the language, but the standard library. I believe you could use other allocators that behave differently. In any case, the standard library panics on OOM, and panics are described at the bottom of the linked page.


The standard library calls abort() on OOM, not panic!(). This is important, since unwinding does not happen with the former.


Malloc never fails, but you might die if you touch the memory. In general, modern OSes don't have a good story about exhausting available memory beyond "let's kill a bunch of processes to free up memory".


> Malloc never fails

malloc can fail, even on default Linux (overcommit enabled), if you go above the process's virtual memory limit, for instance because the process is 32-bit or rlimit-constrained. And of course not all OSes overcommit; Windows famously does not.



