No idea why Thaxll and the other comments are mentioning mutexes.
The equivalent (*) API to this Java API in Rust does exist; it's `String::from_utf8(Vec<u8>) -> String`. And the bug in TFA does not exist there. Since the signature consumes the `Vec<u8>` it's impossible for the caller or any other code to still have access to it to be able to modify it concurrently.
Also consider the similar API `str::from_utf8(&[u8]) -> &str`. The bug in TFA does not exist here either. Since the signature takes a `&` borrow of the slice, it is not possible for anything else in the program to have a `&mut` borrow of that slice to be able to modify it concurrently. After the function returns other parts of the program could mutate the slice, but they would only be able to do after the `&str` that is derived from the slice is dropped. So once again nothing would be able to mutate the slice and observe the effects in the `&str` itself.
All these "unable to do" are enforced at compile-time, because "consuming a value makes it unavailable to other parts of the program" and "cannot get a `&mut` to a value as long as a `&` from that value is still in scope" are all typesystem concepts. No mutexes or other runtime checks are involved.
(*) "Equivalent" in that it's an API to convert a sequence of bytes into a string. The Rust API doesn't have the encoding thing of the Java API because the Rust String / str are required to be utf-8 internally. But if an exact equivalent of the Java API did exist in Rust, the signature would still be the same wrt consuming `Vec<u8>` / borrowing `[u8]`, so it doesn't change the overall point re: concurrent modification. Furthermore, concurrent modification would cause problems even with Rust Strings if it was possible, because it would allow a String / str to become invalid utf-8 after they'd already been checked to be valid utf-8, which Rust considers to be UB.
> No idea why Thaxll and the other comments are mentioning mutexes.
Thaxll mentioned mutexes in a reply to the statement
Java has no way to express the concept of "something that nothing else can modify while I'm looking at it"
Even ignoring the performance aspect that is not the perfect answer, though. AFAIK, the JVM doesn’t have a notion of “you can only modify foo if you hold mutex bar”. That remains something the programmer must enforce.
The scenario I was imagining and commenting on was about “implementing a JVM with Java’s semantics in Rust”. Of course if we limit the language itself to safe Rust, we get data race freedom, but at a quite significant price for a high level language (it constraints possibly correct programs down a lot). But Rust would not help with relation to the primitives here at all (implemented in C++/Java).
Because in case of the problem at hand, this is a complex interplay between Java's standard library's Java code and the underlying JVM. There is not much to discuss regarding "rust would make the code safe", because so does JS as it is single threaded.. That's hardly interesting.
If we put Java on top of Rust, then no, Rust no longer can help about this. That was my whole point.
The problem here is that we don't want a mutex. Once you have it the performance cost would apply in runtime. In fact, to write this code in rust you would need to write unsafe code to get around the problem where Rust forces you to write correct but inefficient code.
This code is intentionally not thread-safe. This isn't so much a bug but an interesting thought experiment.
The equivalent (*) API to this Java API in Rust does exist; it's `String::from_utf8(Vec<u8>) -> String`. And the bug in TFA does not exist there. Since the signature consumes the `Vec<u8>` it's impossible for the caller or any other code to still have access to it to be able to modify it concurrently.
Also consider the similar API `str::from_utf8(&[u8]) -> &str`. The bug in TFA does not exist here either. Since the signature takes a `&` borrow of the slice, it is not possible for anything else in the program to have a `&mut` borrow of that slice to be able to modify it concurrently. After the function returns other parts of the program could mutate the slice, but they would only be able to do after the `&str` that is derived from the slice is dropped. So once again nothing would be able to mutate the slice and observe the effects in the `&str` itself.
All these "unable to do" are enforced at compile-time, because "consuming a value makes it unavailable to other parts of the program" and "cannot get a `&mut` to a value as long as a `&` from that value is still in scope" are all typesystem concepts. No mutexes or other runtime checks are involved.
(*) "Equivalent" in that it's an API to convert a sequence of bytes into a string. The Rust API doesn't have the encoding thing of the Java API because the Rust String / str are required to be utf-8 internally. But if an exact equivalent of the Java API did exist in Rust, the signature would still be the same wrt consuming `Vec<u8>` / borrowing `[u8]`, so it doesn't change the overall point re: concurrent modification. Furthermore, concurrent modification would cause problems even with Rust Strings if it was possible, because it would allow a String / str to become invalid utf-8 after they'd already been checked to be valid utf-8, which Rust considers to be UB.