Right, but in rust, not using one is a compile time error. In Java (as you can s...

pjmlp · on July 12, 2023

Only for in-memory data structures under Rust's control, if it is related to OS IPC, Rust cannot do anything.

jjnoakes · on July 12, 2023

This isn't true.

kaba0 · on July 12, 2023

This is a heavily optimized system library - you don’t use mutexes here. Rust wouldn’t help here; if mutexes were fine, they would have been used. Especially since this is the result of C++ and Java code interacting.

Hell, it’s probably one area where rust’s benefits are a “hard sell” — you would have to constantly be in unsafe rust manipulating pointers manually as the compiler can’t reason statically about what a layer built on top does without a huge runtime cost (huge, as in you really don’t want to lock/unlock, or even refcount in these hot paths).

Arnavion · on July 12, 2023

No idea why Thaxll and the other comments are mentioning mutexes.

The equivalent (*) API to this Java API in Rust does exist; it's `String::from_utf8(Vec<u8>) -> String`. And the bug in TFA does not exist there. Since the signature consumes the `Vec<u8>` it's impossible for the caller or any other code to still have access to it to be able to modify it concurrently.
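To make the "consumes the `Vec<u8>`" point concrete, here is a minimal sketch. The helper name `bytes_to_string` is hypothetical and only exists for illustration; the one standard-library call used is `String::from_utf8`.

```rust
// Hypothetical helper: takes ownership of the bytes, just like String::from_utf8 does.
fn bytes_to_string(bytes: Vec<u8>) -> Option<String> {
    // from_utf8 validates the bytes; on failure we simply return None.
    String::from_utf8(bytes).ok()
}

fn main() {
    let bytes = vec![b'h', b'i'];
    let s = bytes_to_string(bytes);
    // bytes.push(b'!'); // does not compile: `bytes` was moved into the call above
    println!("{:?}", s); // prints Some("hi")
}
```

Uncommenting the `push` line is rejected by the compiler, so no caller can keep a handle to the buffer and change it while or after the string is built.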

Also consider the similar API `str::from_utf8(&[u8]) -> &str`. The bug in TFA does not exist here either. Since the signature takes a `&` borrow of the slice, it is not possible for anything else in the program to have a `&mut` borrow of that slice to be able to modify it concurrently. After the function returns, other parts of the program could mutate the slice, but they would only be able to do so after the `&str` that is derived from the slice is dropped. So once again nothing would be able to mutate the slice and observe the effects in the `&str` itself.
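A small sketch of the borrowing case, assuming a hypothetical helper `first_word` that is not part of any library; only `std::str::from_utf8` and `split_whitespace` are real standard-library calls.

```rust
// Hypothetical helper: view the bytes as UTF-8 and return the first word.
// The returned &str borrows from `bytes`, so the buffer stays frozen while it is in use.
fn first_word(bytes: &[u8]) -> Option<&str> {
    let text = std::str::from_utf8(bytes).ok()?;
    text.split_whitespace().next()
}

fn main() {
    let mut buf = *b"hello world";
    let word = first_word(&buf);
    // buf[0] = 0xff; // does not compile here: `buf` is still borrowed by `word`
    println!("{:?}", word); // prints Some("hello")

    // After the last use of `word` the shared borrow has ended, so mutation is allowed.
    buf[0] = b'j';
    println!("{:?}", std::str::from_utf8(&buf)); // prints Ok("jello world")
}
```

The mutation is only accepted once nothing derived from the borrowed slice is used any more, which is the ordering described above.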

All these "unable to do" are enforced at compile-time, because "consuming a value makes it unavailable to other parts of the program" and "cannot get a `&mut` to a value as long as a `&` from that value is still in scope" are all typesystem concepts. No mutexes or other runtime checks are involved.

(*) "Equivalent" in that it's an API to convert a sequence of bytes into a string. The Rust API doesn't have the encoding thing of the Java API because the Rust String / str are required to be utf-8 internally. But if an exact equivalent of the Java API did exist in Rust, the signature would still be the same wrt consuming `Vec<u8>` / borrowing `[u8]`, so it doesn't change the overall point re: concurrent modification. Furthermore, concurrent modification would cause problems even with Rust Strings if it was possible, because it would allow a String / str to become invalid utf-8 after they'd already been checked to be valid utf-8, which Rust considers to be UB.

Someone · on July 12, 2023

> No idea why Thaxll and the other comments are mentioning mutexes.

Thaxll mentioned mutexes in a reply to the statement

> Java has no way to express the concept of "something that nothing else can modify while I'm looking at it"

Even ignoring the performance aspect, that is not the perfect answer, though. AFAIK, the JVM doesn’t have a notion of “you can only modify foo if you hold mutex bar”. That remains something the programmer must enforce.

On the other hand, tooling exists to help them, for example https://www.javadoc.io/doc/com.google.code.findbugs/annotati...

kaba0 · on July 12, 2023

The scenario I was imagining and commenting on was about “implementing a JVM with Java’s semantics in Rust”. Of course if we limit the language itself to safe Rust, we get data race freedom, but at a quite significant price for a high level language (it constrains the set of possibly correct programs a lot). But Rust would not help in relation to the primitives here at all (implemented in C++/Java).

TheDong · on July 12, 2023

"Rust wouldn't help"

"This bug can't be implemented in rust"

"I meant that Rust doesn't fix the bug in Java. Even if you write rust code, you can also write buggy java code too so rust didn't fix the java code"

You're the only one here who thought "rust" meant "java semantics implemented in rust" in this context.

kaba0 · on July 12, 2023

Because in the case of the problem at hand, this is a complex interplay between the Java code in Java's standard library and the underlying JVM. There is not much to discuss regarding "rust would make the code safe", because so would JS, as it is single-threaded. That's hardly interesting.

If we put Java on top of Rust, then no, Rust can no longer help with this. That was my whole point.

TheDong · on July 12, 2023

> That's hardly interesting

Rust and javascript having differences which prevent this class of bugs might not be very interesting, but it's more interesting than your point.

Unless I'm misunderstanding, your point is that a bug in Java cannot be avoided by switching languages to Java.

kaba0 · on July 12, 2023

No, my point is that changing the implementation language of java wouldn't have helped here.

invalidname · on July 12, 2023

The problem here is that we don't want a mutex. Once you have it, the performance cost applies at runtime. In fact, to write this code in Rust you would need to write unsafe code to get around the problem where Rust forces you to write correct but inefficient code.

This code is intentionally not thread-safe. This isn't so much a bug but an interesting thought experiment.

Sharlin · on July 12, 2023

Rust absolutely helps here because in Rust it’s simply impossible for someone else to mutate something concurrently to you holding a reference to it. Code equivalent to that in the article simply won’t compile in Rust. This is, like, the very point of Rust’s borrow system. You can share, xor you can mutate, but not both at the same time. This holds equally for single and multi-threaded code.

ironmagma · on July 12, 2023

In safe Rust, that is. For unsafe Rust, I don't know exactly which bets are off but it's more than none.

masklinn · on July 12, 2023

In unsafe Rust this is a concurrent modification of an object with shared references, which is UB.

SpaghettiCthulu · on July 12, 2023

Unless everyone is just holding pointers

lostmsu · on July 12, 2023

Huh? This is exactly where Rust would help. In Rust the caller of the constructor would either have to add mutex if they needed concurrency, or just use the constructor without mutex overhead if they did not.
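A rough sketch of that trade-off, under the assumption that the concurrent caller wraps the buffer in `Arc<Mutex<Vec<u8>>>`; the helper `snapshot` is hypothetical and only illustrates one way to build a string from a buffer that other threads can also modify.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: copy the bytes while holding the lock, then build the String
// from that private copy, so later writers cannot affect it.
fn snapshot(shared: &Mutex<Vec<u8>>) -> String {
    let bytes = shared.lock().unwrap().clone();
    String::from_utf8_lossy(&bytes).into_owned()
}

fn main() {
    // Single-threaded caller: owns the Vec outright, so no lock is involved.
    let owned = vec![b'a', b'b'];
    let s = String::from_utf8(owned).unwrap();
    println!("{s}"); // prints ab

    // Concurrent caller: shared mutation only compiles behind something like a Mutex.
    let shared = Arc::new(Mutex::new(vec![b'a', b'b']));
    let writer = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            shared.lock().unwrap().push(b'c');
        })
    };
    writer.join().unwrap();
    println!("{}", snapshot(&shared)); // prints abc
}
```

Only the caller that actually shares the buffer across threads pays for the lock; the owning caller goes straight through `String::from_utf8`.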

ironmagma · on July 12, 2023

It's a compile-time cost instead of a runtime cost.

gizmo686 · on July 12, 2023

99% of the time, the calling code trivially owns the array. If you are in a situation where the compiler cannot figure that out, then you need to deal with it regardless of what String does, because the exact same problem exists by the caller itself having a reference to the object.

j16sdiz · on July 12, 2023

The Java compiler has a (not too bad) escape analysis engine. For something as low level as String intern/optimisation, it can be done at compile time.