If I'm understanding correctly, the major change here for Rust users (rather than people who hack on the compiler) is that mutable references will not be considered to be "interfering" with other references made at the same time until they're actually written to for the first time. This makes intuitive sense to me, but I suspect there may be a bit of concern that it will make things more confusing when reading code and trying to understand what's going on. I'd be lying if I said that thought didn't occur to me, but at this point being surprised at how much I end up liking the way things turned out has become the norm for me; I remember having misgivings about nested import paths (rather than only being able to use `{`...`}` at the very end), match ergonomics, and `.await` as a postfix keyword, but pretty quickly became glad they decided things the way they did after using each of them a bit once they finally got stabilized. I think I did realize I'd like NLL (i.e. the borrow checker detecting the final use of a reference and not considering it as conflicting for the remainder of the scope) before it landed, but I know a lot of people had misgivings about that as well. I imagine this will be one of those things that in a few years will seem weird that it wasn't always how it worked!
To be clear, this doesn't change what programs get accepted by the borrow checker, so for most Rust users it changes absolutely nothing.
It changes the abstract rules behind Rust's safety model, which impacts which unsafe functions are considered sound and which optimizations the compiler is allowed to perform.
> In particular, for &(i32, Cell<i32>), TB allows mutating both fields, including the first field which is a regular i32, since it just treats the entire reference as “this allows aliasing”.¹
> ¹ This does not mean that we bless such mutation! It just means that the compiler cannot use immutability of the first field for its optimizations. Basically, immutability of that field becomes a safety invariant instead of a validity invariant […]
Ah, I guess I did misunderstand then. I see now, rereading, that the "don't treat as a mutable borrow until first write" behaviour is already how things work today, and has been since NLL.
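To make the quoted footnote a bit more concrete, here's a rough sketch (my own invented example, not something from the article) of the kind of optimization it's talking about:

use std::cell::Cell;

// `unknown` stands for arbitrary code the compiler cannot see into
// (it might reach the pair through some other pointer and mutate it).
fn read_twice(pair: &(i32, Cell<i32>), unknown: impl Fn()) -> (i32, i32) {
    let a = pair.0;
    unknown();
    let b = pair.0;
    // A model that guarantees immutability of the first field would let the
    // compiler fold the second load into the first. Under Tree Borrows that
    // immutability is only a safety invariant, not a validity invariant, so
    // the reload can't be optimized away on that basis.
    (a, b)
}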
> ...by the time x.len() gets executed, arg0 already exists...
So, I realize that this is the way that Java does it--and, presumably, one still doesn't get fired for doing whatever Java does ;P--but, would it not actually make more sense for the arguments to be evaluated before the target reference, making the argument order more like Haskell/Erlang (but very sadly not Elixir, which makes it awkwardly incompatible with Erlang and breaks some of the basic stuff like fold/reduce)? Particularly so, given that, as far as I can tell from this example, what makes arg0 have the type that it does is the type of the function that hasn't even been called yet? (As in, the semantic gap I am seeing between what the user probably meant and what the compiler wants to do is that "x" shouldn't really be mutably-borrowed until the call happens, and the call here clearly shouldn't happen until after the arguments are evaluated.) (Note: I do not program in Rust currently; I just have spent a number of decades analyzing languages and at times teaching college language design courses. I might be missing something obvious elsewhere that forces Rust to do this, but that one example, in isolation, at least feels like an unforced error.)
In Rust, `receiver.some_method(whatever)` is supposed to be relatively thin sugar for `TypeOfReceiver::some_method(receiver, whatever)`. So the evaluation order should be the same for those two forms.
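For concreteness, a small example of that equivalence (using Vec::push, since that's the method from the article's snippet):

fn main() {
    let mut v = vec![1, 2, 3];
    // Method-call syntax...
    v.push(4);
    // ...is (modulo auto-ref) sugar for the fully qualified form,
    // with the receiver passed as the first argument:
    Vec::push(&mut v, 5);
    assert_eq!(v, [1, 2, 3, 4, 5]);
}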
Sure, but I am saying it would actually have made more sense to put the receiver as the last such argument--as one might expect from having used Haskell/Erlang--given the other design decisions clearly in play here, as the target reference isn't really the first argument for any obvious reason other than visual effect and some historical baggage from implementations of some object-oriented languages (including Java) with different constraints.
You're not wrong, but making self the last argument would cause confusion for almost everyone coming from other languages, and for Rust it is too late to change that now. You could special-case the behavior of the method call syntax to operate that way without breaking backwards compatibility (at the cost that going back and forth between that syntax and the fully qualified call would no longer be a straightforward syntactic transformation).
Doesn't Rust support having a function as a field of a struct?
If it does, then the order of evaluation of a.foo(b) would depend on whether foo is a field or a "free-standing" function of a, which seems horrible.
Also, there is a simple elegance in having the order of evaluation match the order in which the symbols are written, and reversing that should require clearing a very high bar, in my opinion at least.
It does, but Rust also has separate namespaces for methods and variables. That is, a.foo(b) will always be a method foo and never a field foo, because the syntax is that of a method. In order to access a function object, then call it, you would use (a.foo)(b). The parentheses cause the contents to be parsed as a variable expression.
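A tiny sketch of that distinction (the Widget type here is just invented for illustration):

struct Widget {
    foo: fn(i32) -> i32, // a field holding a function pointer
}

impl Widget {
    fn foo(&self, n: i32) -> i32 {
        n + 1
    }
}

fn main() {
    let w = Widget { foo: |n| n * 2 };
    // Method-call syntax always resolves to the method `foo`...
    assert_eq!(w.foo(10), 11);
    // ...while parenthesizing the field access calls the stored function.
    assert_eq!((w.foo)(10), 20);
}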
It doesn't remove or break anything; it just changes the order of arguments to be compatible with Elixir's (|>) operator and its consistent data-first design across the language.
I do partial application in Elixir all the time. I have a function `curry/1` which takes any function and gives a curried version of it, and I have a `p` macro which introduces Scala-like "holes" (e.g. `(p Enum.map(list, _))`). The order of arguments still doesn't matter and nothing is broken or impossible because of it. I have also tweaked the (|>) macro (as well as some other operators) to support holes, so that I can "pipe" into whichever position of the call.
I think there are many more severe problems with Elixir which make it not even a remotely functional PL for me (rather, procedural + macros), but argument order is not one of them.
I applaud your efforts, but that is a lot of work against the grain. I've yet to see a curried function in the wild outside of a handful of anonymous functions passed to HOFs.
> I think there are many more severe problems with Elixir ... but argument order is not one of them.
Cheers, I have no dog in this game, merely expanding on what (I think) OP was alluding to.
It's definitely against the grain but also not really a lot of work. The currying function is ~3-4 lines. The macros are 5-10 lines each. As for examples of such things being used "in the wild", many use the Witchcraft library, which strives to give a Haskell-like experience in Elixir.
Yeah, but at this point currying seems to be a loss in ergonomics in most use cases we have found in the wild. I love the idea, I love the principle.
But the jury is definitely still out on whether surfacing it in syntax and semantics is actually beneficial. And yes, I know nearly all the examples you can come up with to show how useful it can be.
But the cost of having it seems to outweigh the benefits.
I have not programmed in Rust yet. But this article gives me the feeling that this looks similar to database transactions. This might be wildly wrong, but for me I see an analogy:
Once you get the &mut reference, you have your tree, which then looks to me like you have created a transaction. And in this transaction context you do your things.
fn two_phase(mut x: Vec<usize>) {
    // compiler-internal desugaring of x.push(x.len())
    let arg0 = &mut x;       // two-phase borrow: "reserved", not yet active
    let arg1 = Vec::len(&x); // the length is read while arg0 is only reserved
    Vec::push(arg0, arg1);   // the mutable borrow becomes active here
}
> This code clearly violates the regular borrow checking rules since x is mutably borrowed to arg0 when we call x.len()! And yet, the compiler will accept this code
Does anybody else wish the compiler wouldn't accept this and would be even more verbose? I know one of the biggest learning curves (personally) for Rust is the borrow checker complaining hardcore and "getting in your way", preventing you from doing basically anything you're used to (passing around pointers in C or objects in JavaScript, even though you should be following immutable practices and not mutating objects... most of the time).
I'm sure there have probably been discussions on how to make the borrow checker less "mean/rigid/obtuse", but as for silently passing something as "non mut" when it actually does "mut" stuff, I wouldn't have guessed Rust allowed that.
Edit: gah, I did not realize the function signature is (mut x), I thought it was just (x) and the mut was implied which is what I was trying to call out, apologies.
This should clearly be accepted (it's self-evident in my opinion); if you need to jump through hoops to write code like this then the language is too restrictive to write normal code.
The standard implementation of Rust does indeed accept this, and there is no soundness hole here.
The existing semantics for aliasing and borrowing from Miri (Stacked Borrows) don't allow this, which means those semantics are overly restrictive; we want this to be accepted.
This work “fixes” this issue by extending the semantics to admit the behaviour exhibited by the standard implementation.
The rules for the borrow checker are not fully formalised, and to some extent the rustc implementation is the specification; formalising the rules (e.g. RustBelt, Stacked Borrows, etc.) is important, but we don't want to formalise something that is strictly more restrictive than the reference implementation, especially if there's no soundness hole.
The borrow checker was made for correctness, not correctness for the borrow checker.
You have ownership of a Vec, you get its length, then you push to it through a mutable reference; nothing evil happens here except the order of the statements (which is an implementation detail that people might not think about when writing the short form x.push(x.len())). The code above is perfectly safe if written in C, which is why the borrow checker was extended to also allow it in Rust. You could make the argument that simpler borrow checker rules lead to a simpler mental model. The counterargument (that won in the end) is that "if it's safe, the borrow checker allows it" is a mental model worth pursuing.
> silently passing something as "non mut" and it actually does "mut" stuff
No, it's the opposite that's happening here: a mutable borrow of the vector is made, and then a non-mutable thing is done with it (getting the length), before finally mutating it (pushing).
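Putting that back into the sugared form from the article (the function name here is just a label for the sketch):

fn two_phase_sugared(mut x: Vec<usize>) {
    // The sugared form of the earlier snippet: accepted thanks to two-phase
    // borrows, because the &mut taken for `push` is only "reserved" (not yet
    // active) while the argument x.len() is evaluated.
    x.push(x.len());

    // Without two-phase borrows you'd have to hoist the read yourself:
    let n = x.len();
    x.push(n);
}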
I've been learning Rust and I spend the vast majority of my time dealing with lifetimes and borrow checking. Common ways I'm used to doing things simply don't work in Rust, and a lot of effort has to go into keeping track of how and where data is used.
I’ve worked in OOP languages, functional languages, and dynamic languages but all of them were essentially garbage collected, so having to keep track in my head of how data ownership is managed is a big learning curve.
As a C++ programmer, one of the great things about Rust is that I no longer have to keep track of data ownership and management in my head.
I can outsource this to the compiler, and if I get it wrong the program won't compile.
In C++ you still need to do all the same tracking and management if you want safe and correct programs, but you don't get nearly as much help from the compiler if you make a mistake.
I think this is largely overblown if one uses modern C++. One of the things I do is stateful multi-threaded business servers, and frankly, compared to the overall project, this "data ownership maintenance" is small to the point of being practically absent.
The compiler being obtuse and not being able to figure out when it is safe to "break the rules" is the problem, not twisting the programmer's brain into being a "safe compiler". This sounds like Stockholm syndrome.
>"you should be following immutable practices"
No, I should not. I should do what makes sense in a particular situation and not bend over for some zealots trying to enforce the one and only way.
I do not have this impression. As for managing lifetimes in modern C++, I've already stated elsewhere that from my personal experience this problem practically does not exist for application-level programming. People writing OS-level code will of course disagree, but luckily I am not in that domain. I do write code for low-power microcontrollers, but I use plain C and do not have any real problems there, as there are no allocations/freeing. Just be careful with interrupts when handling shared data.
I think one measurable outcome here is what kind of error message you get when you do violate a rule, and whether Rust users know what to do to fix their code. As a person who loves to explore the complexity behind seemingly simple interfaces, this stuff is really cool. On the other hand, I don't relish having people break their brains to understand why similar code is accepted vs not.
I'm not a Rust user myself, but I'm guessing from all the references to raw pointers that a lot of the code referenced here is actually not idiomatic outside of small snippets of high-perf code, so maybe the complexity is not going to affect too many people.
I work with many people who are quite intelligent but early in their career or not domain experts in PL implementation. These people are perfectly respectable, but how long would it take to teach them how to map their source to the lifetime dependency tree with subtle rules in order to understand a borrow checker result that triggers an error? Without that understanding, a dev using rust would maybe try poking at their code unsystematically in hopes of getting it to work. I've seen this happen in other domains while people are ascending the learning curve.
> Compiler being obtuse and not being able to figure when it is safe to "break rules" is the problem.
The compiler, afaik, will never be able to correctly identify 100% of the time whether you are or aren't breaking some property, due to Rice's Theorem.
That said, you're committing the Nirvana fallacy: the fact that it can't be perfect doesn't mean it isn't an improvement.
E.g. "seatbelts don't prevent you from being stabbed by a large metal pole, ergo they're useless."
Every week I see newbies coming in and asking why the compiler won't allow something - and then pointing to a hugely unsafe action.
Hell, I ran into a similar issue. I wanted to expose something mutable as immutable. My argument was that it was immutable at the time of calling. However, as someone on the Rust Discord pointed out, you could trivially cause UB with that.
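I don't remember the exact code, but the general trap (a made-up sketch, not the actual snippet from that conversation) tends to look something like this: handing out a plain &T to data that can still be mutated through another path. The &T asserts the data won't change for its entire lifetime, not just at the moment of the call.

use std::cell::UnsafeCell;

// Invented for illustration; the real case was presumably different.
struct Slot {
    value: UnsafeCell<i32>,
}

impl Slot {
    // "It's immutable at the time of calling" -- but the returned &i32
    // promises the value stays untouched for as long as it's alive.
    fn peek(&self) -> &i32 {
        unsafe { &*self.value.get() }
    }

    fn set(&self, v: i32) {
        unsafe { *self.value.get() = v; }
    }
}

fn main() {
    let s = Slot { value: UnsafeCell::new(1) };
    let snapshot = s.peek(); // shared reference handed out
    s.set(2);                // mutation while `snapshot` is still live
    println!("{snapshot}");  // UB: reads through a reference invalidated by the write above
}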
>"Compiler afaik will never be able to correctly 100% identify"
Nobody here is talking about 100%. I responded to a post that left me with the impression that it is up to the user to bend over backwards and make their brains work as a compiler rather than try to improve the compiler.
> that it is up to the user to bend over backwards and make their brains work as a compiler
What do you mean? You always have to track lifetimes and what outlives what (i.e. the work of a compiler), especially in C++. Not doing that results in UB.
In Rust you have a compiler double-checking you. And it errs on the side of caution. And no, the errors aren't horrible; they come with suggestions for fixing them.