When I was learning Rust I leaned heavily on C-based analogies like this but I found they ultimately ended up being kinda harmful, because I was thinking (in C mode) "do I want to pass a pointer or a value to this function" rather than (in Rust mode) "do I pass ownership or a borrow to this function".
For an example of the difference, here's a function that consumes a 1kb buffer, returning the sum of some of its contents. No pointers or references at all, everything "passed by value".
https://godbolt.org/g/icoVvz
To my C programmer's eye this looks wrong, but from the output you see there is no memcpy. (Edit: this is wrong, both C and Rust do memcpy, see below comments, but I think my point stands.)
I'd be shocked to see a C programmer write the equivalent code ("passing by value a big struct to a function"), while in Rust it's not only idiomatic to write that code but frequently also necessary, depending on whether the function wants to take ownership of its argument or some of its contents. (Imagine a more complicated 'stuff' struct that contained embedded pointers to other things.)
I'm a little embarrassed to say I ended up looking at the output on rust.godbolt.org a lot to convince myself that the bits of code I was writing were "ok", which is something I never did when learning e.g. Haskell or Go. I think the reason here is that Rust feels so close to C that there's a bit of an uncanny valley effect, where you can mostly pretend it's C up until you hit a wall.
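To make the "ownership or a borrow" question concrete, here is a minimal sketch. The `Stuff` struct and both function names are made up for illustration; they are not taken from the godbolt link.

```rust
/// Hypothetical stand-in for the 'stuff' struct: a 1 KiB buffer plus owned heap data.
pub struct Stuff {
    pub buf: [u8; 1024],
    pub names: Vec<String>,
}

/// Borrowing version: the caller keeps ownership and can keep using `stuff` afterwards.
pub fn sum_borrowed(stuff: &Stuff) -> u32 {
    stuff.buf.iter().map(|&b| u32::from(b)).sum()
}

/// Consuming version: takes ownership so it can move `names` out without cloning.
/// The rest of `stuff` (the buffer) is dropped when the function returns.
pub fn into_names(stuff: Stuff) -> Vec<String> {
    stuff.names
}
```

After calling `into_names(stuff)` the caller can no longer touch `stuff`, and the compiler enforces that. That is the decision the signature encodes, independent of how the bytes end up being passed.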
This is wrong. Passing a large struct by value does introduce a memcpy... in the caller: https://godbolt.org/g/26annS
The same is true in Rust, though it's trickier to show on godbolt because the equivalent of `extern` is less accessible in that context.
Rust programs often put those large structs in Boxes when they need to transfer ownership of something large, which is something straight out of "C mode."
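A rough sketch of what that looks like, assuming a 1 KiB buffer like the one in the original example (the function names here are hypothetical):

```rust
use std::mem::size_of;

/// Takes ownership of a heap-allocated 1 KiB buffer. Only the pointer is moved
/// into the callee; the 1024 bytes stay where they are on the heap.
pub fn sum_boxed(buf: Box<[u8; 1024]>) -> u32 {
    buf.iter().map(|&b| u32::from(b)).sum()
}

/// A `Box` of a sized type is a single thin pointer, so moving it is cheap.
pub fn box_is_pointer_sized() -> bool {
    size_of::<Box<[u8; 1024]>>() == size_of::<usize>()
}
```

The caller would write something like `sum_boxed(Box::new([2u8; 1024]))`. Ownership still transfers, but what gets copied at the call is one pointer rather than the whole buffer.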
You are absolutely right. The Rust book makes it sound like passing structs is free, but it’s absolutely not. It’s always a bitwise copy. Relevant Reddit and Stack Overflow discussion with godbolt examples:
The point I (poorly) was trying to make here was that passing things by value in Rust is fairly common, for e.g. Rc<T> or Option<T>, while in the C I've seen you basically never pass structs by value. But I guess, continuing the analogy, in C++ you do pass the equivalent of Rc<T> by value.
It's also common in C++ (at least in some coding styles) to pass large structs by const reference, creating the dilemma of when to pass by value and when to pass by reference, which has no perfect general solution.
In C you should be equally cognizant of whether you are passing ownership or a borrowed reference. The difference is that this is not encoded in the type system. There are conventions revolving around function names that can encode this.
Actually, &mut isn't equivalent to restrict, since two restrict pointers are allowed to alias in C as long as none of the intervening accesses are writes.
split_at_mut doesn't allow you to read through aliased "active" &mut pointers either (I would assume that restrict pointers that cannot be either read from or written to don't count).
Hm, in that case you don't need to appeal to split_at_mut--reborrowing is sufficient to demonstrate that point. I think what czwarich is saying is more subtle--that restrict allows "multiple readers, exclusive writer", which means that both shared pointers without UnsafeCell and mutable references in Rust are restrict in this sense, while &mut is more restrictive.
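For anyone following along, here is a small sketch of the two mechanisms mentioned above, `split_at_mut` and reborrowing. The function names are made up, and the first one assumes the slice has at least two elements.

```rust
/// Splits a slice into two non-overlapping halves that can both be mutated.
/// Assumes `v.len() >= 2` so that each half has a first element.
pub fn bump_both_halves(v: &mut [i32]) {
    let mid = v.len() / 2;
    let (left, right) = v.split_at_mut(mid);
    left[0] += 1;
    right[0] += 10;
}

/// Reborrows `x` as `y`. While `y` is live, `x` cannot be read or written.
pub fn reborrow_then_resume(x: &mut i32) -> i32 {
    let y = &mut *x; // reborrow: `x` is inactive from here...
    *y += 1;
    // ...until the last use of `y` above; now `x` is usable again.
    *x += 1;
    *x
}
```

In both cases there are two `&mut` values that trace back to the same original borrow, but only one of them can be used for any given memory location at a time.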
It’s true that &mut T is always unique; restrict says “this does not alias”, and &Ts trivially do alias, so it goes on &mut T. For example, see https://github.com/rust-lang/rust/pull/31545; we had to remove this for a few releases due to an LLVM bug, and it'll be back in the next release.
It’s phrased in the reference as
> &mut T and &T follow LLVM’s scoped noalias model, except if the &T contains an UnsafeCell.
“noalias” is restrict here. But it’s not on &T, it’s the system as a whole that follows the guarantees, that is, both with and without.
(And see the stuff on that page about how some of this is still in flux. And, maybe I’ve made some mistake with my words here; this stuff is hard and not what I deal with day to day...)
Right, so if &mut is always unique anyway then the presence of an UnsafeCell shouldn't affect it, since it's about weakening guarantees.
In fact, looking at the compiler output (https://godbolt.org/g/kt2xfW), it looks like only &T has noalias, and not &mut T. It does go away in the presence of UnsafeCell, but for &mut T it was never there in the first place.
So either that's a place where we can still add more noalias annotations, or it's off for some subtle reason I'm not familiar with.
Ah, right; missed that part of what you were saying.
If you switch the compiler version on my godbolt link to beta, the &mut i32 and &mut UnsafeCell<i32> both have the noalias attribute, so looks like it's on track. :)
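As a sketch of why the attribute matters (the function name is made up, and whether a given compiler version actually exploits this depends on whether it emits noalias for `&mut`):

```rust
/// Because `a` and `b` are both `&mut`, they cannot point at the same `i32`.
/// With that guarantee the optimizer is allowed to return the constant 1
/// instead of reloading `*a` after the store through `b`.
pub fn store_both(a: &mut i32, b: &mut i32) -> i32 {
    *a = 1;
    *b = 2;
    *a
}
```

The equivalent C function with plain `int *` parameters would have to reload `*a`, since the two pointers might alias; with `restrict` it would not.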
Aside from the aspect of restrict pointers being a footgun for C programmers who aren't used to them and apart from being able to convert C to unsafe Rust with e.g. Corrode, what use cases (within the scope of things that one has to use unsafe Rust for as opposed to writing C-like code) are there for pointers that are allowed to alias?
Having unchecked pointers is important for Rust to be a systems programming language. But, since they're unchecked, they may alias. That's just the nature of not being able to make guarantees.
That doesn't explain it from the use case perspective. E.g. dereferencing a pointer with the asterisk operator requires the pointer to be aligned, so unchecked things can still have requirements that need to be met.
I think the primary reason why Rust doesn't have explicit unsafe restrict pointers is that there's virtually no demand for them (that I've seen, anyway).
Unlike raw pointers with no aliasing restrictions, restrict is used super rarely in C because it's so hard to figure out when it's safe--I think it was only added to make C competitive with Fortran in certain benchmarks. Restrict means you can never alias, so storing one in a data structure is just tempting fate without the aid of the compiler, and if you just need it temporarily casting a *mut to &mut works fine. Rust programs certainly use restrict a lot more than any C program does, and I don't think adding an additional layer of "restrict without a lifetime" would be terribly beneficial in most cases.
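A minimal sketch of the "cast a *mut to &mut temporarily" pattern, with a hypothetical function name. The safety conditions in the doc comment are the caller's responsibility; nothing checks them.

```rust
/// # Safety
/// `p` must be non-null, aligned, and point to a valid `i32`, and nothing else
/// may read or write that `i32` while this function runs.
pub unsafe fn increment_raw(p: *mut i32) {
    // For the lifetime of `r`, this memory gets the usual `&mut` uniqueness
    // assumptions, which is the temporary "restrict" described above.
    let r: &mut i32 = unsafe { &mut *p };
    *r += 1;
}
```

A caller with a local `let mut x = 41;` could write `unsafe { increment_raw(&mut x) }`, since `&mut i32` coerces to `*mut i32` at the call site.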
Another important detail around pointers in Rust: it's common for "smart pointers" like reference counting (std::rc::Rc in Rust, std::shared_ptr in C++) and even boxes to be structs that contain pointers plus other information (like a reference count).
However, for ergonomic reasons, Rust wants you to be able to use a Box<T> or Rc<T> in the same way you would a &T, i.e. dereferencing with *t returns the underlying value in every case. Therefore, Rust allows you to overload the dereference operator with the Deref trait, e.g. the implementation of Box [0] does a double dereference to access the inner struct member.
Note that this is distinct from "auto-deref" (or deref coercion), another Rust feature that will automatically dereference pointers as necessary in certain cases, like calling a method on a struct (so there is no arrow operator "->" in Rust like in C++). See the Book [1] for more details.
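Here is a minimal sketch of overloading dereference with the Deref trait, in the spirit of the Book's `MyBox` example. `Wrapper` and the two helper functions are hypothetical names.

```rust
use std::ops::Deref;

/// Hypothetical smart-pointer-like wrapper around a value.
pub struct Wrapper<T>(pub T);

impl<T> Deref for Wrapper<T> {
    type Target = T;

    // `deref` takes `&self` and hands back a reference to the inner value.
    fn deref(&self) -> &T {
        &self.0
    }
}

/// Explicit dereferencing: the first `*` undoes the `&`, the second goes
/// through `Deref::deref` to reach the inner `String`.
pub fn explicit_len(w: &Wrapper<String>) -> usize {
    (**w).len()
}

/// Method-call auto-deref: the compiler inserts the dereferences itself.
pub fn auto_len(w: &Wrapper<String>) -> usize {
    w.len()
}
```

Both helpers return the same value; the second just lets method resolution find `String::len` by walking through the reference and the `Deref` impl.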
That's not quite true: the reference counts for `Rc` and `shared_ptr` are not stored adjacent to the pointer. In Rust these types are guaranteed to be pointer-sized (when the target is Sized) or fat-pointer-sized otherwise.
The reference counts are necessarily stored on the heap at the target of the pointer (the reference count must be shared between all the pointers) pseudocode:
Additionally, when you use `into_raw`/`from_raw` you're actually getting a pointer to the value member of the `RcBox` struct (ie. the reference count is stored at a negative offset from that pointer).
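A rough sketch of the layout being described, plus a check of the size claims. `RcBoxSketch` is an illustrative struct written for this comment, not the standard library's actual definition, and the helper function names are made up.

```rust
use std::cell::Cell;
use std::mem::size_of;
use std::rc::Rc;

/// Illustrative only: roughly what lives in the heap allocation behind an `Rc<T>`.
/// The counts sit next to the value, not next to the pointer.
#[allow(dead_code)]
struct RcBoxSketch<T> {
    strong: Cell<usize>,
    weak: Cell<usize>,
    value: T,
}

/// Sizes of the `Rc` handle itself for a sized and an unsized target.
pub fn rc_handle_sizes() -> (usize, usize) {
    (size_of::<Rc<i32>>(), size_of::<Rc<[i32]>>())
}

/// Round-trips through `into_raw`/`from_raw`; the raw pointer points at the
/// value, not at the start of the allocation that holds the counts.
pub fn raw_round_trip(value: i32) -> i32 {
    let rc = Rc::new(value);
    let raw: *const i32 = Rc::into_raw(rc);
    // SAFETY: `raw` came from `Rc::into_raw` and is converted back exactly once.
    let rc = unsafe { Rc::from_raw(raw) };
    *rc
}
```

On a typical target the first function returns one pointer width and two pointer widths respectively, which matches the "pointer-sized or fat-pointer-sized" description.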
Sure, agreed that they aren't literally adjacent in the struct. My broader point is just that there's a level of indirection which is abstracted over by the Deref trait.
> there's a level of indirection which is abstracted over by the Deref trait.
There isn't, though. Box and Rc have the same number of levels of indirection as the built-in pointers: one.
The double * you see in the Box source you linked only arises because the deref method takes self by reference, the same way unique_ptr's operator* takes this as a pointer. (And further, that implementation looks circular? I believe that part of Box is still built-in.)
These are both functions that should always be inlined (in fact the unique_ptr implementation I'm looking at goes through at least four functions to retrieve the actual pointer) so any extra address-taking and dereferencing you see is merely compile-time bookkeeping.
> Note that this is distinct from "auto-deref" (or deref coercion), another Rust feature that will automatically dereference pointers as necessary in certain cases, like calling a method on a struct (so there is no arrow operator "->" in Rust like in C++).
Is this actually distinct? The docs for the reference type explicitly say that &T implements Deref for T, so it seems like it's the exact same thing to me.
> The simple answer here is that you cannot make a [T]. That actually makes perfect sense when you consider what that type means.
While this is true, I believe there's ongoing work to allow allocating [T] (and other dynamically sized types) on the stack under certain circumstances, using alloca. Which is quite nice since this was an area where Rust lagged behind C.
Ada can do this conveniently and it can be nice for efficiency and embedded systems (which may not allow heap allocation). It would be nice to see the same feature in Rust.
I'm not sure why the distinction for Box between a pointer and a struct with a single pointer in it matters. For all intents and purposes, aren't they the same thing?
At the binary level, yes, they're the same thing: a struct with one member is the same as just the member.
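A quick way to check that, as a sketch. `Handle` is a hypothetical one-member struct; `#[repr(transparent)]` is added so the layout match is guaranteed rather than just what the compiler happens to do.

```rust
use std::mem::size_of;

/// Hypothetical struct with a single pointer member.
#[repr(transparent)]
pub struct Handle(pub Box<i32>);

/// True when the wrapper is the same size as its member and as a raw pointer.
pub fn same_size_as_member() -> bool {
    size_of::<Handle>() == size_of::<Box<i32>>()
        && size_of::<Handle>() == size_of::<*const i32>()
}
```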
However, in the end, it's all just binary: that doesn't mean that using different phrasing doesn't help understanding. If it helps it make sense to the OP I'm all for it.
Sometimes I have a feeling that there is something wrong with the world in which the tools are much more complex than the things you make with them. (This has not always been the case; these days the signs of over-engineering and over-design are everywhere.)
It's sort of the opposite. If something is very simple, it's likely it is simple because you're standing on top of a massive tower of complexity and not worrying about it.
Rust is complex because it makes you worry about a bunch of stuff that other languages allow you to forget. But this also means that it allows you the control to create simpler things than can be created with the tools that make you feel like everything is simple.
I really want to like Rust, but coming from C++, the language just feels incomplete. There’s a lot you should be able to do safely but the compiler just isn’t there yet to figure it out. I was surprised to find you can’t safely pass ownership of a Box through a channel for example (that is, Box doesn’t implement Send).
I’m also not a fan of the crazy chained functions everywhere, and making ? do an implicit return makes it hard to visually scan code for control flow.
I’m hoping Rust matures into a better language. For now I think C++ is still far more productive and ergonomic, and there are ways of making it safe too.
I preferred the "try!" syntax. The ? is harder to parse visually, harder to find, and makes the language a little more complex (that's one more operator, one more thing to understand, one more thing to parse for tools, etc.)
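For readers who haven't seen both forms, here is a sketch of what `?` stands in for. The function names are made up; the `match` version is roughly what `try!` and `?` expand to, with the early return written out.

```rust
use std::num::ParseIntError;

/// With `?`: each failure returns early, but the return is implicit.
pub fn sum_with_question_mark(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.parse::<i32>()?;
    let y = b.parse::<i32>()?;
    Ok(x + y)
}

/// Written out by hand: the same control flow with the `return` visible.
pub fn sum_with_match(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = match a.parse::<i32>() {
        Ok(v) => v,
        Err(e) => return Err(From::from(e)),
    };
    let y = match b.parse::<i32>() {
        Ok(v) => v,
        Err(e) => return Err(From::from(e)),
    };
    Ok(x + y)
}
```

Both functions behave the same for the same inputs; the disagreement in this thread is only about which one is easier to scan.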
I guess? If you’re transferring ownership of something to another thread, that’s perfectly safe. Needing to implement (??) Send for a struct or pod type is cumbersome when it’s something you’d do pretty often.
You don't need to implement Send for POD types – it is automatically implemented. Almost all types implement Send. There are only a few exceptions that I know of: Rc (the non-atomic reference counted pointer) doesn't implement Send because it doesn't support atomically updating the count. Also raw pointers don't implement Send.
Given that Send is auto-implemented for things that should be Send, this is a strong indication that your code is attempting to send things that are not actually thread-safe across threads (things containing borrowed references or Rc<T>).
The error message does drill down and tell you the actual type causing the lack of Send impl.
The only time you have to manually impl Send is for custom container types that are built from raw primitives.
I like templates, because to date there is no other popular language that can do the compile-time stuff they can do. And no joke, some of the stuff you can do is quite amazing.
Good thing Send is implemented automatically for types based on their contents, then.
If you're trying to send a value to another thread, Rust will only stop you if the author of some type somewhere inside it has opted out, and none of its containers ever opted back in (as Box does).
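A small sketch of all three cases: an auto-implemented Send, a Box going through a channel, and a manual opt-in. `RawHandle`, `assert_send`, and `send_boxed` are hypothetical names, and the `unsafe impl` is only sound under the assumption stated in its comment.

```rust
use std::sync::mpsc;
use std::thread;

/// Compiles only if `T` is Send; used purely as a compile-time check.
pub fn assert_send<T: Send>() {}

/// Contains a raw pointer, so the compiler does not make it Send automatically.
#[allow(dead_code)]
pub struct RawHandle(pub *mut i32);

// SAFETY (assumption for this sketch): the pointee is only ever accessed by
// whichever thread currently owns the `RawHandle`.
unsafe impl Send for RawHandle {}

/// Moves a `Box<i32>` to another thread through a channel and gets a result back.
pub fn send_boxed(value: i32) -> i32 {
    let (tx, rx) = mpsc::channel::<Box<i32>>();
    let worker = thread::spawn(move || {
        let boxed = rx.recv().unwrap();
        *boxed + 1
    });
    tx.send(Box::new(value)).unwrap();
    worker.join().unwrap()
}
```

`assert_send::<Box<i32>>()` compiles without any extra impls, while `assert_send::<std::rc::Rc<i32>>()` would be rejected at compile time with an error naming `Rc<i32>`.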