
That's a good subject. This Rust blog entry[1] is perhaps a better explanation.

When you pass data to another thread, there are three options: 1) pass a copy, 2) hand off ownership to the other thread, and 3) transfer ownership to a mutex object, then borrow it from the mutex object as needed. All of these are memory and race condition safe due to compile time checking.

Those are the concepts. The Rust syntax needed to support it is somewhat complicated, but if you get it wrong, you get compile time error messages.
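
Roughly, in current Rust the three options look like this (a minimal sketch, using std's thread, Mutex, and Arc; the mutex case also uses Arc to share the mutex handle itself):

    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        // 1) Pass a copy: i32 is Copy, so the new thread gets its own value.
        let n = 42;
        thread::spawn(move || println!("copy: {}", n)).join().unwrap();

        // 2) Hand off ownership: the Vec moves into the new thread;
        //    using `v` here afterwards is a compile-time error.
        let v = vec![1, 2, 3];
        thread::spawn(move || println!("owned: {:?}", v)).join().unwrap();

        // 3) Give ownership to a mutex and borrow through the lock as needed.
        let shared = Arc::new(Mutex::new(0));
        let worker = {
            let shared = Arc::clone(&shared);
            thread::spawn(move || *shared.lock().unwrap() += 1)
        };
        worker.join().unwrap();
        println!("mutex: {}", *shared.lock().unwrap());
    }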

This may be the biggest advance in software concurrency management since Dijkstra's P and V. Almost everything in wide use is either P and V under a different name, or some subset of P and V functionality. Locks are not a basic part of most languages; the language doesn't know what a lock is locking. The Ada rendezvous and Java synchronized objects are exceptions. Those were good ideas, but too restrictive. Finally, we're past that.

Go could have worked this way. Go originally claimed to be concurrency safe, but it's not. You can pass a reference across a channel, and now you're sharing an unlocked data object. This is easy to do by accident, because slices are references. Because Go is garbage collected, it's almost memory safe (there's a race condition around slice descriptors that can be exploited), but it doesn't protect the program's data against shared access. In Rust, when you pass a non-copyable object across a channel, the sender gives up the right to use it, and the compiler enforces that.
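
For example (a minimal sketch):

    use std::sync::mpsc;
    use std::thread;

    fn main() {
        let (tx, rx) = mpsc::channel();
        let receiver = thread::spawn(move || println!("{:?}", rx.recv().unwrap()));

        let v = vec![1, 2, 3]; // Vec is not Copy
        tx.send(v).unwrap();   // ownership moves to the receiving thread
        // println!("{:?}", v); // error[E0382]: borrow of moved value: `v`

        receiver.join().unwrap();
    }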

[1] http://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.ht...




(Note: The Rust blog post is more about how to use Rust's concurrency safety; mine focuses on the design and how the pieces build up into a great system)

> This may be the biggest advance in software concurrency management since Dijkstra's P and V.

+1. I'm not sure whose idea it was, but the idea behind the Send/Sync traits is brilliant. It's easy to say "we can mark a type as thread-safe or not thread-safe". It takes some thought to come up with a design that works in a world of data sharing. Someone realized that thread safety is actually two intertwined, interdependent concepts -- thread-safe and share-safe[^1] -- and together they allow for a good degree of safety without being too restrictive (unlike just having "non-threadsafe" types).

[^1]: I am greatly oversimplifying the situation by calling them these names, but .. for a less simplified characterization, read the post :P
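
The classic illustration of the split is Rc vs. Arc (a sketch):

    use std::rc::Rc;
    use std::sync::Arc;
    use std::thread;

    fn main() {
        // Rc uses non-atomic reference counts, so it is neither Send nor Sync.
        let rc = Rc::new(5);
        // This would fail to compile:
        // thread::spawn(move || println!("{}", rc));
        // error[E0277]: `Rc<i32>` cannot be sent between threads safely
        let _ = rc;

        // Arc uses atomic reference counts, so it is Send and Sync.
        let arc = Arc::new(5);
        thread::spawn(move || println!("{}", arc)).join().unwrap();
    }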


> When you pass data to another thread, there are three options: 1) pass a copy, 2) hand off ownership to the other thread, and 3) transfer ownership to a mutex object, then borrow it from the mutex object as needed. All of these are memory and race condition safe due to compile time checking.

There's more than those three: data can also be shared via references or reference-counted pointers[Arc], which allow immutable access (more generally, concurrent access to any type for which this is safe, e.g. atomics and mutexes). A mutex is only necessary for mutating (nearly) arbitrary shared data. This is particularly powerful for literally zero-overhead read-only shared memory while still maintaining Rust's safety guarantees.

[Arc]: http://doc.rust-lang.org/std/sync/struct.Arc.html
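
A sketch of lock-free read-only sharing through Arc:

    use std::sync::Arc;
    use std::thread;

    fn main() {
        let data = Arc::new(vec![1, 2, 3, 4]);
        let handles: Vec<_> = (0..4)
            .map(|i| {
                let data = Arc::clone(&data);
                // Shared immutable access: no mutex, and no data race is
                // possible because nothing can get a mutable reference.
                thread::spawn(move || data[i] * 2)
            })
            .collect();
        for h in handles {
            println!("{}", h.join().unwrap());
        }
    }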


That's not true: Rust permits opt-in data races in safe code.


If what you say is true, it's a bug: there should only be a risk of data races if `unsafe` is used. Do you have a code example?


I'm guessing he's referring to atomics with relaxed memory ordering. That doesn't give much in terms of guarantees beyond atomicity and no "out-of-thin-air" values. I'm not sure whether this is a data race under Rust's definition, though.


The atomicity guarantee means it's not a data race.


Rust doesn't get to redefine "data race".


It doesn't. We use the same definition of "data race" as tools like Thread Sanitizer and Eraser do.


From tsan documentation: "A data race occurs when two threads access the same variable concurrently and at least one of the accesses is write"


That's right, and Rust bans that. "Concurrently" typically means "without synchronization in between", and an atomic access is by definition a form of synchronization.


Relaxed access of atomics performs no synchronization.


"Relaxed atomic operations can be reordered in both directions, with respect to either ordinary memory operations or other relaxed atomic operations. But the requirement that updates must be observed in modification order disallows this if the two operations may apply to the same atomic object. (The same restriction applies to the one-way reordering of acquire/release atomic operations.)"

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n233...

There is a lot more explanation here that should cast some light on the situation: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n248...


It guarantees that the value read was some value that was in the variable at some point, e.g. it prevents a value that toggles between 3 and 5 from being read as 117. It also prevents writes from being eaten (e.g. an increment always actually occurs).

This necessitates some level of synchronization.
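
Concretely (a sketch using std::sync::atomic): the assertion below never fails, which an unsynchronized counter could not guarantee.

    use std::sync::atomic::{AtomicUsize, Ordering};
    use std::thread;

    static COUNTER: AtomicUsize = AtomicUsize::new(0);

    fn main() {
        let handles: Vec<_> = (0..8)
            .map(|_| {
                thread::spawn(|| {
                    for _ in 0..1000 {
                        // Relaxed imposes no ordering on other memory, but each
                        // read-modify-write is atomic, so no increment is lost.
                        COUNTER.fetch_add(1, Ordering::Relaxed);
                    }
                })
            })
            .collect();
        for h in handles {
            h.join().unwrap();
        }
        assert_eq!(COUNTER.load(Ordering::Relaxed), 8000);
    }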


(It also guarantees happens-before for a given thread, so if a thread changes it from 3 to 5 and other threads are not changing it back, that thread will never read a 3 again. This also needs some level of synchronization.)


And the line right next to that:

> C++11 standard officially bans data races as undefined behavior.

Relaxed access of atomics is not UB in C++; that would be ridiculous. Clearly one of these two sentences is wrong or imprecise; I would bet it's the first one.


There is definitely something special going on with relaxed updates though. See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n248...

Also https://news.ycombinator.com/item?id=9796245

I think your initial suspicion is right. The next section:

> There are several possible definitions of a data race. Probably the most intuitive definition is that it occurs when two ordinary accesses to a scalar, at least one of which is a write, are performed simultaneously by different threads. Our definition is actually quite close to this, but varies in two ways: [...] Instead of restricting simultaneous execution, we ask that conflicting accesses by different threads be ordered by happens-before. This is equivalent in simpler cases, but the definition based on simultaneous execution would be inappropriate for weakly ordered atomics.

It continues at: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n248...


From wikipedia:

    A race condition or race hazard is the behavior of an
    electronic, software or other system where the output is
    dependent on the sequence or timing of other
    uncontrollable events.
Example: https://gist.github.com/anonymous/eb72f1091bd1592df552

Output:

    $ for i in {0..10000}; do ./race; done | sort | uniq
    0
    1
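
A program along these lines shows the effect (a sketch; the gist's exact code may differ):

    use std::sync::atomic::{AtomicUsize, Ordering};
    use std::thread;

    static X: AtomicUsize = AtomicUsize::new(0);

    fn main() {
        let t = thread::spawn(|| X.store(1, Ordering::Relaxed));
        // Whether the store has happened yet depends on thread timing,
        // so this prints 0 on some runs and 1 on others.
        println!("{}", X.load(Ordering::Relaxed));
        t.join().unwrap();
    }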


"Data race" is a more specific term than "race condition."

See http://blog.regehr.org/archives/490 for more. But the TL;DR is:

A data race happens when there are two memory accesses in a program where both:

    * target the same location
    * are performed concurrently by two threads
    * are not reads
    * are not synchronization operations
I would argue you have a race condition but not a data race.


Ah, that's a race condition, something that Rust doesn't protect against in general (it's fairly difficult: whether a piece of non-determinism is a race condition depends on the semantics the programmer desires). A data race is more specialised, and we try to be careful to use "data race" when discussing this aspect of Rust.

http://blog.regehr.org/archives/490 has a fairly good discussion of the generally accepted definition of data race.



