Memory safety without borrow checking, reference counting, or garbage collection (verdagon.dev)
111 points by modernerd on June 16, 2023 | 93 comments



This is not memory safe. Having generational pointers to do epoch checking on dereference is a stochastic mitigation. It is equivalent to PAC or Chromium MiraclePtr but with even less defense against adversarial attackers, because any infoleak of the generation allows for forging of UaF pointers.
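
For readers who haven't seen the scheme, the check is roughly this shape (a minimal sketch with made-up names, not Vale's actual implementation):

    // Every slot carries a generation; every reference remembers the
    // generation it was created against, and the check happens on dereference.
    struct Slot {
        generation: u64,        // bumped every time this slot is freed/reused
        value: Option<String>,
    }

    #[derive(Clone, Copy)]
    struct GenRef {
        slot: usize,
        generation: u64,
    }

    fn deref<'a>(slots: &'a [Slot], r: GenRef) -> &'a String {
        let slot = &slots[r.slot];
        // The whole safety argument lives in this comparison: if the object
        // was freed (and possibly reused), the generations no longer match
        // and we fail loudly instead of reading stale data.
        assert_eq!(slot.generation, r.generation, "use after free detected");
        slot.value.as_ref().expect("use after free detected")
    }

    fn main() {
        let mut slots = vec![Slot { generation: 1, value: Some("ship".into()) }];
        let r = GenRef { slot: 0, generation: slots[0].generation };

        println!("{}", deref(&slots, r)); // generations match, prints "ship"

        // "free" slot 0: clear the value and bump the generation
        slots[0].value = None;
        slots[0].generation += 1;

        // deref(&slots, r); // would now panic: stale generation
    }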

The "high RAII" concept is also not actually misuse-proof. It doesn't look like there's anything actually preventing the user from calling free themselves instead of RemoveShipFromDisplayCache. The types are linear, but the final sink is always free, which doesn't encode the full usage semantics.


It depends. There are two variants of generational references we've been experimenting with:

* Table-based generational references where the generations are stored in a separate table, and we retire overflowed slots. This is guaranteed memory-safe.

* Random generational references which are stochastic and enable storing objects on the stack, and inline in arrays and objects.

The latter's memory safety depends on definitions and baseline:

* It's not as safe as completely memory-safe languages like TypeScript/JavaScript.

* It's safer than C or C++.

* It could be more or less safe than Rust, where most programs use unsafe directly or in (non-stdlib) dependencies [0]. For mutable aliasing, instead of trying to properly use unsafe as one would in Rust, a Vale program would use generational references, which are checked (leaving aside the RC capabilities of both languages).

It's a very strong stochastic measure. Whereas simple memory tagging uses 4 bits and so has a 1-in-16 (~6%) chance of missing an invalid access, these use 64 bits for a miss rate close to zero. Combined with how they fail very loudly, invalid-dereference bugs don't lie in stealth for years as they can in unsafe code and C programs.

Additionally, one can code in a way that doesn't use any generational references at all. Ada has SPARK, Rust has Ferrocene, and Vale has linear style + regions [1].

Still, if one doesn't want to think about any of this, then safer languages like TS/JS are great options.

You're correct that if a generation leaks then it could be a problem for this C-like's approach. Vale largely solves this by making it so the user code can't read a generation (though like in any language, unsandboxed untrusted code can get around that of course).

[0] https://2020.icse-conferences.org/details/icse-2020-papers/6...

[1] https://verdagon.dev/blog/first-regions-prototype


> where most programs use unsafe directly or in (non-stdlib) dependencies [0].

Interesting study I hadn't seen before. I think stdlib-vs-external-crate is a bit of an arbitrary distinction. Rust deliberately chooses to have a relatively small standard library, and there is a core set of special crates written by similar people and with similar quality to the standard library. Hashbrown is a particularly obvious example: the standard library HashMap and HashSet are a thin wrapper around the external crate, so using the external crate is just as safe.

I'd be interested in a more recent paper that includes miri test coverage.

The authors also have a good caveat that it's 60% of popular libraries, not 60% of codebases. If you browse around crates.io comparing download stats to dependency stats you'll see download stats are far, far larger, but the difference varies wildly depending on the type of crate. Crates that tend to be used by applications but not libraries tend not to be depended on by crates published to crates.io. I'd expect these crates to be less likely to use unsafe. But it's still clearly a lot. (And I think my point above that there isn't a clear stdlib-vs-external-crates distinction cuts the other way as well: there have been soundness bugs in the standard library.)


as someone with an interest in the subject but not a domain expert, I have an imprecise understanding of what you both are referring to when you say "a stochastic measure."

do you have any resources that you could point me to for reading up on stochastic memory management?

(apologies if this is a stupid question)


IIUC "stochastic" in this case means selected via a random or otherwise difficult-to-predict process, in contrast to deterministic.


Mostly I think you're right, though I don't think this is pointer tagging a la PAC or auto-zeroing (etc.) a la MiraclePtr; this just looks like compiler support for unique_ptr.

I'm with you on the high RAII: it sounds like destructors with some important missing steps. It also looks like the idea of "SOPs must eventually be freed" runs into some immediate Turing problems, e.g. you can't know an SOP will be freed when you have if statements, or a cycle of functions passing an SOP around that conditionally free it depending on some external state, or program behavior contingent on external messaging, etc.

I think another place you'd run into problems with this is... you actually do want references to heap allocated memory. It's pretty laborious to get by with unique_ptr all on its own; you want to lend out shared references that you get back as their scopes close, and so on. It doesn't seem like there's any facility for that here.

To people trying to build systems like this without building Rust's system: honestly just try it in C++. Sure there won't be compiler support for it, so you'll have to manually insert stuff, but just see how workable it is. If you were working with this system in a program of any appreciable complexity you'd immediately realize:

- You don't reliably know when things haven't been freed

- You really want references

---

An interesting alternative way to look at what's proposed here is that it's just an oblique message-passing system.


By this definition Rust is also not memory safe since you can use unsafe to free a pointer. You normally don't have access to the generation so you can't forge it without unsafe code.


But Rust doesn’t claim to be safe if you use unsafe blocks. It’s in the name.


There's no way in Vale to get the generation count of a generational reference without unsafe code, so I really don't see the difference.


> There doesn't look like there's actually anything preventing the user from calling free themselves

True, but wouldn't that still be memory safe? It seems to me that it could lead to a memory leak, but not to double use of memory. But another annotation to the type so that free() can only be called at one point (or perhaps a few points) might solve that.


It doesn't leak memory, but the example use case of "high RAII" isn't about not leaking memory; it's that "the compiler can ensure you remember to do something in the future and don't accidentally forget it" and "remember to remove something from a cache that you previously added to" - which isn't actually achieved, because the consumer can call free and the data will still erroneously be in the cache, for example.


An adversary can turn a double free into a use-after-free by causing an object to be allocated at the same address in between the two free operations.


You can't free a buffer/object twice in the proposal. After a free, you can't use the pointer any longer.

But it might very well be that this approach is not hard enough for use in e.g. the kernel. It might still be useful for userland applications, though.


This is memory safe if you never re-use a (generational index, address) pair.

They seem to use 64-bit ones, so to overflow a single memory cell you would have to allocate and free it every cycle (!) at 4 GHz for about 150 years.

Basically as long as you accept this incredibly small "memory leak" you're fine.
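
The back-of-the-envelope arithmetic, for reference:

    fn main() {
        // One allocate+free per clock cycle at 4 GHz against a 64-bit generation.
        let increments = u64::MAX as f64;               // ~1.8e19 to wrap
        let seconds = increments / 4.0e9;               // at 4 GHz
        let years = seconds / (365.25 * 24.0 * 3600.0);
        println!("~{years:.0} years");                  // ~146 years
    }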


By memory leak you mean that every memory location has a limited number of alloc+free cycles, and once it passes that, the memory is unusable?

Maybe you should have said, "to overflow the generational index you would have to exclusively allocate and free it with no work being performed for 150 years. Once the overflow happens the only consequence is that this memory cell will never be allocated again."


It was an absolute pleasure having Verdagon speak in detail [0] about memory safety at Handmade Cities [1] last year. Check out the conversation if you were left wanting more after this essay.

[0] https://handmade.network/podcast/ep/afc72ed0-f05f-4bee-a658-...

[1] https://handmadecities.com


A great read!

Might be worth pointing out: leptos, a Rust WASM UI library, uses the "store an index (or ID) into some central data structure" idea extensively.

Sycamore, another Rust WASM UI library, has been using arena allocation, but will also be transitioning into the leptos approach in coming versions.

(I believe) this is partly because the borrow checker's enforcement of tree-like data structures / ownership doesn't always work well with the JS event loop.


I’ve always thought this is a pretty big blind spot in Rust. Having a raw pointer and having an index into an array are essentially equivalent, and doing this in Rust is a way to turn off the borrow checker for this particular object.

The index is bounds-checked, sure, but lifetime issues like “double free” or “use after free” are every bit as present with indexes. For instance, if you hold an index in one place, but the entry it refers to is deleted and/or reinitialized from some other place, you are now holding on to an object in an undefined state. Using it will cause the same issues “use-after-free” does in C. The best-case scenario in that case is that it crashes, but it’s just as likely your program will just silently be wrong.

I’m not saying not to do it, I’m just saying this is something Rust developers need to be aware of: storing indexes like that removes a whole bunch of safety guarantees the borrow checker would otherwise provide, without any code marked “unsafe”.
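
A concrete sketch of what I mean (there's nothing Rust-specific about the bug):

    fn main() {
        let mut dudes = vec!["alice", "bob", "carol", "dave"];

        // Elsewhere we stash an index as a long-lived pseudo-reference to "carol".
        let carol = 2;

        // Later, unrelated code removes an earlier element.
        dudes.remove(0);

        // The stored index still compiles, still passes the bounds check,
        // and now quietly refers to the wrong element.
        assert_eq!(dudes[carol], "dave");
    }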


I haven't had need to use it, but an improvement on the "I have an index to an object which has been replaced" problem is to add a generation to the collection and the index, a la the generational_arena crate. If the generation attached to your index doesn't match the generation in the collection at that index, your index is stale.


> I’ve always thought this is a pretty big blindspot in Rust. Having a raw pointer and having an index into an array are essentially equivalent, and doing this in Rust is a way to turn off the borrow checker for this particular object.

You've been poisoned by C, where this is literally how arrays work: they're just pointer arithmetic plus syntax sugar (which is why arr[x] == x[arr]).

Rust's array type is an actual array, and so a lot of the problems you're worried about (which would be big pitfalls out of the box in C) only happen in Rust if you went to some effort to dig a hole for yourself so you could fall into it.

> For instance, if you have an index in one place, but it’s deleted and/or reinitialized to some other place, you are now holding on to an object in an undefined state

This idea of "objects in an undefined state" is a normal part of C, but Rust doesn't have it. So you'll have to go invent it for your types, so as to inflict this injury on yourself, and I have no doubt you'd eagerly do so maybe even persuading yourself that this way is "faster" or some such nonsense.

You might sensibly store the indices as owned pseudo-reference types, akin to Unix file descriptors and various Windows handle types, both of which Rust manages as opaque owned objects (ie they don't implement Copy) and borrowed objects (which do Copy)

Contrast:

https://doc.rust-lang.org/std/os/fd/struct.OwnedFd.html

https://doc.rust-lang.org/std/os/windows/io/struct.BorrowedH...

When you treat them as opaque handles, it becomes more apparent whose job it is to ensure the handles work as intended. Nobody would say "Oh that's your problem, you used it wrong" if your open file handle to the settings file suddenly morphed into stdin because of unrelated code.
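
Roughly the shape I mean (made-up types, just to illustrate the owned-vs-borrowed handle split; this is a sketch, not a recommendation of any particular crate):

    // Owned handle: not Copy/Clone, so there is exactly one of it, and the only
    // way to "close" it is to give it back to the container that issued it.
    struct DudeHandle(usize);

    // Borrowed handle: Copy, cheap to pass around, but can't be used to remove.
    #[derive(Clone, Copy)]
    struct DudeRef(usize);

    struct Dudes(Vec<Option<String>>);

    impl Dudes {
        fn insert(&mut self, name: String) -> DudeHandle {
            self.0.push(Some(name));
            DudeHandle(self.0.len() - 1)
        }
        fn borrow(&self, h: &DudeHandle) -> DudeRef {
            DudeRef(h.0)
        }
        fn get(&self, r: DudeRef) -> Option<&String> {
            self.0.get(r.0).and_then(|slot| slot.as_ref())
        }
        // Consumes the owned handle, so removing twice is a compile error.
        fn remove(&mut self, h: DudeHandle) -> Option<String> {
            self.0.get_mut(h.0).and_then(|slot| slot.take())
        }
    }

    fn main() {
        let mut dudes = Dudes(Vec::new());
        let h = dudes.insert("alice".to_string());
        let r = dudes.borrow(&h);
        println!("{:?}", dudes.get(r));   // Some("alice")
        let gone = dudes.remove(h);
        println!("{:?}", gone);           // Some("alice")
        // dudes.remove(h); // error[E0382]: use of moved value: `h`
    }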


I wrote a longer response to a sibling comment, but I highly dispute this:

> Rust's array type is an actual array, and so a lot of the problems you're worried about (which would be big pitfalls out of the box in C) only happen in Rust if you went to some effort to dig a hole for yourself so you could fall into it.

It is an extremely easy hole to fall into. If you hold on to an index X into an array, and some different part of the program deletes an entry in the array with an index smaller than X, your X is no longer valid: it's pointing to the wrong object. This is exactly equivalent to an "invalidated pointer", which is an issue the borrow checker was designed to solve. But there is no way for it to know this, because by using indexes instead of pointers, you've disabled its ability to analyze this situation.

I agree that pointers in C and indexes in Rust are not the same thing, and there are other issues with C pointers that Rust's borrow checker solves. But imagining that you're magically safe from any lifetime issues "because Rust" is foolish.


> some different part of the program deleted an entry in the array

That's not a thing. This is the linchpin of your argument, and it just doesn't exist.

Try it, make an array of, I dunno, Strings, and then "delete" one. You can't, there is no "delete" operator, that would be meaningless.

People with some brief exposure to Rust might think, "OK smart arse, it's called drop, I can just drop the String". Nope. That's an array of Strings, you can't just drop one of them; it's part of the type [String; N] and it's not OK if one of them isn't a String. So you're going to need to consciously swap it for a different String: you could core::mem::take that element of the array, leaving its default (an empty String) behind, and then drop the one you took. But what we've got here isn't an "invalidated pointer", we've just got an ordinary logic error in our program where we mistakenly emptied a String we still wanted.

You have to go build your own pit, and fill it with spikes, and jump into it, all so that you can moan, "Oh, Rust basically has the same terrible pit traps as C or C++" except, you built the trap, so don't do that.


> Try it, make an array of, I dunno, Strings, and then "delete" one. You can't, there is no "delete" operator, that would be meaningless.

Your argument is that there's no such function as `std::Vec::remove`? I can assure you, it exists: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.rem...

If you use that function, and you've held on to an index that is no longer valid, it is pointing to the wrong string. Or worse: it's pointing outside of the array, and your program will panic. The index is no longer valid, just like a pointer in C is no longer valid after calling free().

> You have to go build your own pit, and fill it with spikes, and jump into it, all so that you can moan, "Oh, Rust basically has the same terrible pit traps as C or C++" except, you built the trap, so don't do that.

I am not saying "Rust basically has the same terrible pit traps as C or C++". I have never said that, and I never would. What I'm saying is that if you're using indexes like this, you no longer have the benefit of lifetime safety guaranteed by the borrow checker. You have, in essence, turned the borrow checker off for this part.

BTW, making the argument "as long as you're not a bad programmer, this isn't a problem" is exactly the argument that some Rust haters make: "well, just don't call free() twice, and it's not a problem!". Either way, it's bull: the whole point of the borrow checker is that it statically guarantees that these kinds of lifetime issues can't happen. But if you turn it off using indexes, that guarantee is out the window. The whole discussion started with the parent commenter pointing out a project that does exactly this, and me pointing out that this is a problem from a static safety perspective.


I think the disagreement here is about what happens when you mess up.

In C, if you have a UAF, the attacker likely owns your entire process. They can call any function with any arguments. They can modify entirely unrelated objects. Your backups get encrypted and your secrets show up on dubious websites. You lose, badly.

If you roll your own arena in Rust and you “UAF” a struct, you either panic or you continue executing, with a correct control flow graph, with no language-defined UB. The attacker can do only what your code can be convinced to do. Sure, if you implemented an entire VM based on your arena, maybe the attacker can convince your VM to encrypt your backups and leak your private data. But this only happens if your VM can do that in the first place. If your code just draws creatures in 3D, then all the attacker can do is mess with the creatures.


There's no disagreement. OskarS is just plain wrong and now backpedaling. They specifically said that:

>Having a raw pointer and having an index into an array are essentially equivalent, and doing this in Rust is a way to turn off the borrow checker for this particular object. [...] The index is bounds-checked, sure, but lifetime issues like “double free” or “use after free” are every bit as present with indexes. For instance, if you have an index in one place, but it’s deleted and/or reinitialized to some other place, you are now holding on to an object in an undefined state. Using it will cause the same issues “use-after-free” does in C.

... which is false. Later they shifted the goalposts to:

>If you use that function, and you've held on to an index that is no longer valid, it is pointing to the wrong string. Or worse: it's pointing outside of the array, and your program will panic.

... which nobody would've disputed if they'd said that initially.


That's not an array, that's a Vec. Notice that you need an exclusive mutable reference to the Vec to call this method, so if somebody else gave you a slice reference backed by that Vec, they can't do this to the underlying Vec as they no longer have exclusive access to it while you have that borrowed reference.

If you've decided to treat indexes into a Vec as handles, then yeah, you mustn't use Vec::remove because it's nonsense in your context. You also mustn't Vec::retain and numerous other "I don't care about ordering" operations, because you do care about ordering.

I'd argue it's probably sensible to make your own custom container which cares about ordering for this work since that preserves the invariant you've decided upon, rather than trust yourself not to make mistakes. You can build one out of a Vec pretty easily.

And sure enough that's exactly what the project you're referencing seems to actually do, they've got themselves a custom container which preserves their intended semantics.


This entire argument is dumb.

No one is using a Vec<T> with indexes for long-term storage if they know they will remove and replace items there; it's obvious regardless of programming language that it's a bad idea.

You use a dedicated data structure for this where the keys have a generation and that solves the ABA problem.

Rust has multiple crates that provide such data structures.


Agree with you but not for your reasons. Too many folks are trying to make this argument about security when its more about language design and engineering.

If most Rust libs migrate to use index-based memory references, then what is the point of the borrow checker? It would be demoted to a rarely used 1% feature leveraged only by Language Complexity Fanatics - might as well remove it from the language and speed up the slow Rust compile times by orders of magnitude.

The growing adoption of handle/index based resource management is also demonstrating that borrow checking is effectively a failure. Most library authors are beginning to move away from borrow checking as it imposes severe limitations on software design and usability and severe increases to cognitive complexity.


No one has claimed that “most Rust libs are migrating to index-based memory management”.

I mentioned two crates that are DOM frontend frameworks, that also are based on fine grained reactivity (and not a virtual DOM). That’s an incredibly specific niche of rust libs.


Here's the implementation of "remove a value" [1] from the popular Rust crate that implements this pattern.

It completely sidesteps the issue you are describing because "deleting from array" does not mean "adjust every greater index by one".

However, I'll grant you that using this crate means you have to extend _some_ trust (even if very small) to the crate maintainer to maintain safety invariants, in addition to Rust's compiler maintainers.

But the fact that I could dive into its code (that I've never seen before) in 2 minutes and verify that it is not susceptible to the issue you describe gives me more confidence that auditing+trusting this crate will not be an undue burden.

[1] https://docs.rs/slotmap/latest/src/slotmap/basic.rs.html#395


> Having a raw pointer and having an index into an array are essentially equivalent

You can split the handle (index) into an upper "magic" and a lower "index" part, with the handle being an opaque type that can only be accessed and modified by the container giving out handles. In C++, you'd likely see operator* and operator-> overloaded for usage. This technique is described in the original "Game Programming Gems" book.

Not 100% foolproof, but on use, the handle's magic is checked against what's currently in the data structure to let you know if the handle has been freed and/or reused. Since handles are opaque, you can muck with the underlying objects as you want, e.g. relocate them in memory, have multiple internal arrays, etc.
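
A minimal sketch of that layout (made-up names and bit widths, shown in Rust here rather than C++):

    // Packed handle: upper bits are the "magic"/generation, lower bits the index.
    const INDEX_BITS: u32 = 20;
    const INDEX_MASK: u32 = (1 << INDEX_BITS) - 1;

    #[derive(Clone, Copy)]
    struct Handle(u32);

    impl Handle {
        fn new(index: u32, magic: u32) -> Self {
            Handle((magic << INDEX_BITS) | (index & INDEX_MASK))
        }
        fn index(self) -> u32 { self.0 & INDEX_MASK }
        fn magic(self) -> u32 { self.0 >> INDEX_BITS }
    }

    struct Slot<T> { magic: u32, value: Option<T> }
    struct Pool<T> { slots: Vec<Slot<T>> }

    impl<T> Pool<T> {
        // On lookup, the handle's magic must match the slot's current magic;
        // otherwise the slot was freed (and possibly reused) since the handle
        // was handed out, and we refuse the access.
        fn get(&self, h: Handle) -> Option<&T> {
            let slot = self.slots.get(h.index() as usize)?;
            if slot.magic == h.magic() { slot.value.as_ref() } else { None }
        }
    }

    fn main() {
        let pool = Pool { slots: vec![Slot { magic: 7, value: Some("turret") }] };
        let live = Handle::new(0, 7);
        let stale = Handle::new(0, 6); // magic from before the slot was reused
        assert_eq!(pool.get(live), Some(&"turret"));
        assert_eq!(pool.get(stale), None);
    }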


Reusing a dead index is bad, but not as bad as full memory unsafety. It can cause application errors for sure but the integrity of the runtime is preserved.

Segregated, typed heaps (never reuse the memory for a different type after a free) could allow similar safety.


I would say that this is an extremely narrow view of "memory safety", and totally ignores the broader concept of "lifetime safety". Yes, if the thing you want to prevent is crashes due to segfaults, indexes in Rust prevent that. But I would argue there's no practical difference between "crash due to segfault" and "panic due to index overflow" (which is definitely possible with the Rust model): both are just different ways of looking at "you dereferenced an invalid pointer". It does prevent the pointer writing into different data structures though, which is definitely an advantage of Rust (which comes with the runtime cost of bounds-checking, but that is probably worth it).

But more generally: imagine you're making a game where you shoot a bunch of dudes, so you have a `std::Vec<Dude>` to store them, and elsewhere in the code, you reference this vec using indexes. When you shoot a dude, you want to remove it from this array, but being a programmer concerned about performance, you don't want to literally delete the entry, as that is an O(n) operation. Instead, you mark the dude as "despawned" or whatever, and then when you want to spawn a new dude, you keep a list of despawned dudes and just recreate the new dude in the same place in the array. If other places in the code are holding on to indexes to this array, the index is now invalid, but it will not be detectable: it'll just point to the wrong data, which can wreak all sorts of havoc.

This is almost exactly equivalent to "use-after-free" issues in C. In most cases, `free()` will not literally free the memory, it will mark the memory as unused and stored in a free-list, to be available for the next `malloc()` call. If you're holding on to a pointer that has been free()d, it's not that the program will segfault, it's much more insidious than that: it will point to memory that is apparently valid, but actually garbage.

This is precisely the kind of lifetime issue the Rust borrow checker was designed to solve, but by using indexes like this, you've basically just turned the borrow checker off without realizing it. If the borrow checker is intrusive enough that this becomes a very common pattern in Rust, I personally think it does serve as a pretty good counter-argument to a lot of things people say about Rust. Rust people are very quick to sing the borrow checker's praises and say that "as long as you learn to work with it, it stops becoming a problem!". If it stops becoming a problem only because you sneakily turn it off using tricks like this, maybe it's a bigger problem than people are willing to admit.


> But I would argue there's no practical difference between "crash due to segfault" and "panic due to index overflow"

One leads to undefined behavior and is potentially exploitable in a way that allows for full control over your program's execution. That's the difference.

No one is arguing that the bug you've described isn't expressible in Rust - it obviously is. It's just not what the borrow checker is for because the borrow checker is for memory safety issues, and you're describing a logical bug.


From a security perspective, yes Rust is better, but it's still a security issue: if you can make a Rust program panic due to overflow, that's a DoS attack.

I think this description is a very narrow way to look at memory safety (and again: totally ignoring the broader issue of lifetime safety), if I'm going to be honest. In my Rust program, I have two functions: `spawnDude()` returning an index and `despawnDude()` taking an index. In C, I have `malloc()` returning a pointer and `free()` taking a pointer. The lifetime issues are the same: just like I shouldn't `free()` a pointer twice, I shouldn't `despawnDude()` twice, and I shouldn't use a dude after I've despawned him. The implementation could even be very similar: using arenas (which is essentially what the Rust array is) and free-lists.

Again: these were the issues Rust was designed to solve, and the borrow checker is the tool it uses to solve them. And it absolutely does do that, if you use the native Rust constructs: this is the true super-power of Rust. The reason why it's so much easier to work with indexes is because you've deliberately chosen not to have the borrow checker analyze this situation. If that is something you have to do a lot of (and I've seen it a number of times, not just in the project the parent mentioned), it does say something important about how the borrow checker limits the expressive power of the language, if you have to turn it off in this way.


> Again: these were the issues Rust was designed to solve

No. That's the end of it. The answer is no, you are wrong, that is not the case. The rest of your post is fine but irrelevant.


Your failure to distinguish memory errors from logic errors does not mean that Rust didn't succeed at eliminating the former.


I fail to understand the distinction between "memory safety bugs" and "logical bugs". E.g. a logical bug is what can lead to memory safety bugs, so I don't quite understand the point being made.


Can we not take malicious advantage of panicking the same way we do with use-after-free?


Panicking can be leveraged for a DoS attack. Use after free can be leveraged for arbitrary code execution.


How about exceptions?


Panics are implemented in the same way as exceptions are in C++ (stack unwinding); beyond that, Rust doesn't have a concept of exceptions at all. Fallibility is expressed in terms of sum types that signal success or failure, namely Result.
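
For example, the usual shape (a trivial sketch):

    use std::num::ParseIntError;

    // Failure is a value you handle (or explicitly propagate with `?`),
    // not an exception that unwinds past you invisibly.
    fn parse_port(s: &str) -> Result<u16, ParseIntError> {
        s.parse::<u16>()
    }

    fn main() {
        match parse_port("8080") {
            Ok(p) => println!("listening on {p}"),
            Err(e) => eprintln!("bad port: {e}"),
        }
    }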


> Panics are implemented in the same way as exceptions are in C++

If this is true or at least fundamentally very close to each other, and given that exceptions can be abused for arbitrary code execution [1] [2], then it follows that Rust is no different/safer with respect to that, no?

[1] https://billdemirkapi.me/exception-oriented-programming-abus...

[2] https://billdemirkapi.me/abusing-exceptions-for-code-executi...


No, you can't.


> But I would argue there's no practical difference between "crash due to segfault" and "panic due to index overflow"

Nope. If you're lucky, you get a segfault.

> imagine you're making a game

Actually, I would love to be making games. The benefits Rust provides aren't huge benefits in Game Dev, we get it. In fact, isn't this whole thing just a rehash of Jonathan Blow's ECS critique? But most of us aren't making games.

Imagine instead you're not making a game, and you see CVE after CVE because of memory safety. Customer data gets stolen, things like SSNs. People's lives are impacted. I suppose a game crash is annoying, but way less annoying than someone's/my personal info being leaked.

Sure, we could go on as usual, blame "human error" and say everybody needs to do better. Or, we could accept better over perfect, and try to mitigate common errors that seem to happen often. You can still use `unsafe`. I guess by this argument, seat belts are useless, too, because they don't stop traffic fatalities completely. And so are all aviation safety improvements.

We're still here having the "Rust offers no/marginal benefits (to game dev and other niches)" argument. Ok.


I want to be clear: I think Rust offers ENORMOUS security benefits over C. No one is disputing that. I personally love Rust, I think it's a great language. This is not an attack on the language, this is an observation that if you use this particular pattern (indexes instead of pointers), you are doing something dangerous, and you can no longer rely on the borrow checker to validate lifetimes.

It seems to me to be a pretty unarguable point. It's frankly a bit shocking to me to see this vehement of a disagreement.


I think we are all in strong agreement, the only disagreement is on the level of badness of the issue :)


I think there are security vulnerabilities I’ve read about in Mach that stem from tricking the kernel into accessing an object as the wrong type. This kind of type confusion would be possible if you had a heterogeneous container and stale index references.


> But I would argue there's no practical difference between "crash due to segfault" and "panic due to index overflow"

There is a huge difference between the controlled and guaranteed teardown from a panic in Rust and the maybe-segfault, maybe-nasal-demons behavior in C.


This is a really good analysis, and it points to a central flaw in Rust. As a practical matter, you have to have more than one pointer to a data structure all the time. You really can't implement some things without it.

Rust makes this really, really difficult. If you can live with immutable pointers, you can have as many as you like, but the lifetime syntax you have to juggle can become a horror show, and can force lifetime annotations on a whole chain of structures. If you can't live with immutable references, then you are forced to use the unnecessarily ugly Rc<> syntax.

I often find myself using the array indexes trick, which leads to buggy code.

In Java, I never had to do this. Multiple "pointers" are fine. The garbage collector takes care of memory safety. In this one sense, Rust is a less-safe language than Java.

I have been writing Rust for two years now. I would not use it again if I could avoid it. It is a giant waste of time.


I'd say this is a case of the 'inner platform' effect (using a tool to build a poor replica of the tool). Pointers are nothing more than indexes into the memory array. It turns out keeping track of which index does what is so hard that Rust has a borrow checker to help with it. But sometimes it's annoying, so we create a workaround, and if we go too far with that workaround, we forget why we had a borrow checker to begin with.

Notice how the array technique has parallels to memory: you can have multiple arrays of different types, a.k.a. segmented memory. You can mark unused array objects, a.k.a. overwrite free'd memory with 0xDEADBEEF.


It's all memory in the end. Hide from it, fear it, run from it? It'll only cause you to develop maladaptive coping patterns like writing Go instead of beautiful C.

Instead, attain Nirvana, open emacs and embrace Fearless Development.


> Pointers are nothing more than indexes in the memory array.

Pointers in C point to objects and not memory locations afaik, which has subtle semantic implications. See the whole discussion around pointer provenance.


I feel like the "store an index to get around the borrow checker" approach largely moves the problem - now you have A) array-bounds-checking bugs, and B) either "list was re-ordered, you now have the wrong item" or "can't compact the list, it just keeps growing even when old items are no longer used" bugs.

Sure, you no longer have to satisfy the borrow checker - but it was a tool to help you catch bugs, and you've cleverly avoided getting that help.


> array-bounds-checking bugs.

Rust tries very, very hard to avoid bounds-checking bugs, because it's central to the safety promise; maybe I'm misunderstanding you?

> either "list was re-ordered, you now have the wrong item" ... "can't compact the list, it just keeps growing even when old items are no longer used" bugs.

I have no direct experience implementing this pattern, but it's good to know that leptos (and others) tend to use one well-tested library that addresses these concerns[1].

> Sure, you no longer have to satisfy the borrow checker - but it was a tool to help you catch bugs, and you've cleverly avoided getting that help.

My impression is that one can perhaps claim "one traded compile-time checks of safety for runtime _panics_ when unsafe things are about to happen", and not that using this pattern exposes one to the same issues that the borrow checker avoids.

[1] https://docs.rs/slotmap/latest/slotmap/index.html#why-not-in...


Yes, the failure here is a panic, typically a crash (of at least the affected subsystem), not memory corruption.

Slotmap avoids this via unsafe + implements its own handle system, which is probably the right approach.

The borrow checker avoids many other issues in addition to memory safety.


> “store an index (or ID) into some central data structure”

Sounds like OpenGL object management. Shudder. This is not a step forward.


Interesting. So basically the index storage is a way to create a reference to an object that is not compiler checked?


I would say that it's a way to trade compile-time checks of reference validity for runtime-checks of reference validity.


It would be nice if this explained how to deal with data structures containing cycles. Want to solve a Traveling Salesman problem? Each city object would logically have pointers to neighbors. Multiple pointers to each city, no clear rule for which ones own which others, cycles everywhere. Yes, you can solve this by putting the owning pointers into an array and using indices elsewhere, or similar tricks, but that still leaves you prone to leaks. It's easy to get memory safety - no use after free, buffer overflow, etc. - if you don't care about leaks. The real trick is to come up with something that's also provably leak-free even in the presence of cycles.
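
To make the "indices elsewhere" trick concrete, the usual encoding looks something like this (made-up types; note that it does nothing for the leak concern):

    // The "owning array + indices elsewhere" encoding of a cyclic city graph.
    struct City {
        name: String,
        neighbors: Vec<usize>, // indices into the owning Vec below; cycles are fine
    }

    fn main() {
        // The Vec owns every City; the neighbor "pointers" are plain indices,
        // so ownership stays a tree even though the graph itself has cycles.
        let cities = vec![
            City { name: "A".into(), neighbors: vec![1, 2] },
            City { name: "B".into(), neighbors: vec![0, 2] },
            City { name: "C".into(), neighbors: vec![0, 1] },
        ];

        for c in &cities {
            let names: Vec<&str> =
                c.neighbors.iter().map(|&i| cities[i].name.as_str()).collect();
            println!("{} -> {:?}", c.name, names);
        }
    }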


This should work just fine in Vale; instead, what will happen is that once you delete a city and then try to access it from another city through a reference, your program will panic.

Generational references are nothing new; they are often used whenever you have an object pool. The novelty is in applying them to the whole program, effectively turning use-after-free into an out-of-bounds panic of sorts.


I'm not sure that "it's possible for your program to do this thing and if it does the program will panic" really qualifies as memory-safe. Maybe it's better than what you'd get in C, but in some kinds of programs it could still be a DoS vector. In a garbage-collected language or unsafe-free Rust it would not be possible to crash because of a deleted city, and that would be strictly better.


The vast majority of Rust application code has unwrap()s everywhere, not to mention bounds checks on arrays, both of which trigger panics.

You can always check the validity of a generational reference before every use, just as you can check the array length or the Result<T, E>.


Vale's “generational references” sound like some kind of holy grail, which makes me worry I'm missing something. Surely someone has tried them before? Why aren't they already in common use? I think they would constrain memory allocation and promote heap fragmentation, which might be fine on a system with an ample 64-bit address space, but could be troublesome for WebAssembly, for example. They also don't seem to avoid the concurrency issues that Rust's borrowing model can.

Also, it's unfortunate that Vale is one letter away from Vala.


The (related) idea of a generation counter combined with an index handle for spatial and temporal runtime memory safety has been quite popular recently in the gamedev space:

- (shameless plug): https://floooh.github.io/2018/06/17/handles-vs-pointers.html

- Seb Aaltonen's recent engine design presentation (search for "Fast & safe object lifetime tracking"): https://enginearchitecture.realtimerendering.com/downloads/r...

Some gamedev engines probably invented a similar concept in parallel over the last two decades.

...having things like this baked into the language is quite a bit nicer of course, and should allow for better performance.


There are also programming languages named "Dala" and "Dale".

Dala and Vale are memory-safe. Vala and Dale are not.


There's also "Val", again banking on safety.


It was first explained in the old Game Programming Gems book (published 2000), in the chapter titled "A Generic Handle-Based Resource Manager".


> It was first explained in the old Game Programming Gems book (published 2000), in the chapter titled "A Generic Handle-Based Resource Manager".

It's much older than that. The capability-secure operating system EROS from the early 90s [1] used this technique in its capabilities. Capabilities were kernel-protected 64-bit numbers that mapped to system objects, and they carried a 16-bit generation index that was used to revoke the capability (basically a free operation). It might have even been used in its predecessor KeyKOS from the 80s; I don't recall at this point.

[1] https://en.wikipedia.org/wiki/EROS_(microkernel)


Huh, didn't know that - thanks for the link. The very ambitious name for the OS - Extremely Reliable Operating System - made me smile.


Ah! Reading about Vale, I was actually thinking about Vala, now I see! (I barely know Vala, just from compiling some GTK/GNOME-related code).


So he mentions CHERI, but fails to look into SPARC ADI, ARM PAC and MTE, all already in production, with SPARC ADI shipping for at least a decade.


He links to a page about ARM MTE in the section where he talks about memory tagging and CHERI for the first time.


Indeed, but then maybe it should have been more explicit.

Memory tagging as a concept is quite old, so I would even expect that link to point to Burroughs Large Systems, Lisp machines, or other hardware of similar vintage.


I'm a bit confused on the meta level.

If it actually works well in practice, why aren't there quite a few memory-safe, single-ownership-based systems programming languages out there? Is it such a new idea? (The author argues that it's not: C programmers have been doing it without compiler support for ages.)

Or is only a part of it new? If yes, which part?


The example somehow reminds me of proof-carrying code. If you obtain memory in PCC, you also get a proof that the memory is valid. If you want to use the memory, you need to pass the proof as well. Finally, the proof must be disposed of, and the only function that can do that for memory is free. This could be done for the reminder example as well.
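
A loose sketch of that idea with a non-copyable "proof" token (made-up types; Rust can't force the proof to eventually be consumed, it can only prevent it from being consumed twice):

    // alloc hands back a "proof", use requires the proof, free consumes it.
    mod pcc {
        pub struct Proof(usize); // field is private: can't be forged outside this module

        pub struct Arena(Vec<Option<String>>);

        impl Arena {
            pub fn new() -> Self { Arena(Vec::new()) }
            pub fn alloc(&mut self, v: String) -> Proof {
                self.0.push(Some(v));
                Proof(self.0.len() - 1)
            }
            pub fn read(&self, p: &Proof) -> &String {
                self.0[p.0].as_ref().expect("freed")
            }
            pub fn free(&mut self, p: Proof) { // takes the proof by value
                self.0[p.0] = None;
            }
        }
    }

    fn main() {
        let mut arena = pcc::Arena::new();
        let proof = arena.alloc("hello".to_string());
        println!("{}", arena.read(&proof));
        arena.free(proof);
        // arena.read(&proof); // error[E0382]: borrow of moved value: `proof`
        // arena.free(proof);  // error[E0382]: use of moved value: `proof`
    }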


> Rule 4: When we move out of a variable, we can't use it anymore.

> C++ doesn't have rule 4, in C++ the compiler lets us accidentally use the variable after we move out of it.

Why is it so? Moving is not a legacy operation, so why did they allow using a moved-out variable?


C++ move semantics was added to the existing standard library and class syntax in a way that avoided breaking old code. The language added "move constructor" conventions and the like, rather than making more drastic internal changes.

So moving from a std::string variable means it's still a valid object, only its internal state is now owned by someone else, and the std::string is left in a valid but unspecified (in practice usually empty) state. Most of the "moving" is actually handled by extra code in std::string rather than the core language having rules for how the underlying bytes are moved around, because the core language wasn't changed that much.
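
For contrast, the Rust rule quoted above is enforced at compile time (trivial example):

    fn main() {
        let s = String::from("hello");
        let t = s;           // move: `s` can no longer be used
        println!("{t}");
        // println!("{s}");  // error[E0382]: borrow of moved value: `s`
    }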


Great article, but the OCD is still twitching from that opening line - “once of those concepts”


Fixed, thanks =)


Obligatory link on memory safety without garbage collection: https://devblogs.microsoft.com/oldnewthing/20180228-00/?p=98...


I think Rust is actually encouraging programmers to make extra copies of objects, because that's the simplest path to getting things done.

Did I miss anything?


That it's often similar in C/C++, except instead of the borrow checker being in the way it's fear of memory management or concurrency bugs.


Still, in C++ you would pass an object by reference to a function call (if that object is not expected to be very small in size).

However in Rust you do have a strong incentive to do otherwise.


Is this hypothetical or can I actually use ^ in C?


^ is actually a language symbol in Apple's C compilers, used for a different purpose (blocks, i.e. lambdas): https://en.wikipedia.org/wiki/Blocks_(C_language_extension)


Microsoft's C++/CX extension is probably more relevant here:

    Foo^ foo = ref new Foo();
https://en.m.wikipedia.org/wiki/C%2B%2B/CX#Objects


It's hypothetical. The use of ^ for pointers comes from Pascal.


You can, but it is the XOR operator.

It says a bit above:

  If we were to craft a C-like language


You can use it in C++/CLI, basically C# with much worse syntax.


Now he just needs to look into Parrot (the old Perl 6 VM) or Pony, which do expand on single ownership and compiler-checked move semantics.

Not just memory safe, but also concurrency safe.

Concurrent Pascal was also similar.


I think Pony is one of the most interesting languages out there. I love actors, I love capabilities, I love the whitelisting of C/C++ usage. Sadly there's no ARMv7 support, cross-compiling it is really hard too (never got it to work), the package management support is weird and deviates quite a bit from the mainstream (IIRC having multiple versions of the same dependency is a feature), and it lacks users.

I honestly would love to have a mix between Rust and Pony.


Too bad nobody was ever able to build a practical, concurrent programming language on top of Parrot (nor any well-performing non-concurrent language, for that matter). And quite a few talented compiler writers tried (I'm thinking of Patrick Michaud and Jonathan Worthington at least, probably forgetting quite a few).



