Hacker News new | past | comments | ask | show | jobs | submit login
Mistakes with Rust smart pointers: when Deref goes wrong (fuzzypixelz.com)
205 points by xy2_ on July 10, 2023 | hide | past | favorite | 61 comments



I think I've hit the same kind of problem once. What I learned is that true smart pointers are types that (at least in Rust) aren't really supposed to have methods on their own, to avoid this type of ambiguity during method resolution.

For example, Box<T> implements Deref so that you can conveniently use T's methods. If you look at the documentation of Box, the things you can do with the Box itself, like Box::leak, are non-method functions. This means that you always have to qualify them, like Box::leak(my_box), and thus you can't get this type of conflict.

Using Deref for types like String or Vec doesn't really fall into this category. They implement something more analogous to an "is-a relationship" in OO. Even outside of those examples, I've seen people manually implement inheritance-like behavior, using Deref to delegate to a field that stores the "base class" instance.

If you do that, you're likely to have a name conflict between the two types sooner rather than later. That's fine as long as you're aware that Rust doesn't really do overload resolution -- AFAICT, the "outer" methods always wins regardless of the arguments at the call site.


Deref for Vec<T> and String is fine because they are owning containers that decay to their "view" types (slices). Strings and vectors own memory, while views borrow it and can be used to operate on them without reallocating or moving the container itself, treating it as a contiguous list of values. A similar relationship exists between C++'s `std::string` and `std::string_view`, and `std::vector` and `std::span`.

IMHO there the logic behind that is very clear and makes a lot of sense. I think it's natural that view-related traits (the ones who operate on the elements of a contiguous area of memory) should be associated with slices, while memory-related ones (the ones that operate on an owning container) should be implemented on the owning types. If you mix them up, you are bound to mess up first and foremost because _it is not logically sound_. You don't need to own a string in order to check if all of its ASCII characters are all caps, for instance, because it's an operation that's also valid on any array of bytes, without the whole concept of "owning memory".


> You don't need to own a string in order to check if all of its ASCII characters are all caps, for instance, because it's an operation that's also valid on any array of bytes, without the whole concept of "owning memory".

Interestingly you can ask whether a string is ASCII, "this".is_ascii() is a predicate which does exactly that, but for what I think is your question, "are they all ASCII caps?" you'd need to go via an iterator and a lambda: "THIS".bytes().all(|b| b.is_ascii_uppercase())

The iterator is because there is no provision of each such predicate over the whole string, which seems reasonable (the character class predicates are provided on u8 (ASCII only), and on char (all of them)), but the lambda is due to Rust's 1.0 compatibility commitment plus an unfortunate historical choice. If not for compatibility we'd fix it so you could just write "THIS".bytes().all(u8::is_ascii_uppercase)


Am I correct in thinking this is the type of thing change that fits the criteria to be implemented in a new edition? Assuming the issue is the same one I often run into (i.e. 'a versus for<'a>), it seems like special casing this for closures passed as arguments would be quite nice, but the fact that it wasn't done for 2018 or 2021 makes me think that maybe it's not something that would qualify for introducing in this way.


It's not "just" syntax, so it couldn't "just" be a new edition, but I suppose it's conceivable that a hack could be invented and then the old behaviour grandfathered into prior editions as was done for impl IntoIterator for [T; N] in 2021 Edition.

That's a heavy lift though, you may need to carry this special code around in every compiler, forever, and I haven't even figured out what such code should look like, it might be really hard to even write it.


Good to know! I appreciate the insight


How would removing the lambda affect back compat?


Thank you article, parent, GP. You've saved me just in time from never having read the Deref documentation properly. I have some code to delete before I make this mistake again.


> AFAICT, the "outer" methods always wins regardless of the arguments at the call site

I imagine this is explicitly state in some portion that I have never read of the Deref documentation. The thing is, every introductory text ignores this, and there is no itemized rule in a more global document so that things like Rc and Cell can point directly to it.

The Rust docs have a very general discoverability problem, and this is one of its facets. Since things have a very generic and reusable definition, the docs have no good place to put information like that.


It definitely would make sens to include this in the docs for `Deref`. I suspect the reason people hadn't thought to do that yet is because there's nothing specific to `Deref` about this behavior; it works the same whenever methods are present on a type due to trait definition[0] (although I imagine it's potentially less confusing for other traits given that the methods they provide are explicitly listed rather than implicit via the associated type).

It's also worth noting that if the ambiguity is between two trait impls rather than the concrete type and one of its trait impls, the compiler does require explicit disambiguation (either by calling the method as a top-level function on the trait and passing in the `self` parameter or by casting before calling the method)[1].

[0]: https://play.rust-lang.org/?version=stable&mode=debug&editio... [1]: https://play.rust-lang.org/?version=stable&mode=debug&editio...


This feels analogous to hiding in C++. In C++ by default member functions in a base class don't overload with member functions of the same name in a derived class. They get hidden.


Not disagreeing with you but what's interesting is that C++ doesn't hide in this particular situation. Smart pointers implement operator-> and operator* but then ptr.foo() is still always the foo() method of the pointer type (and won't compile if it doesn't have it) - you need ptr->foo() or (*ptr).foo() to get the method of the pointed-to type.

It's one of the few places that C++ is more explicit than Rust. I wonder if there's a reason Rust went a different way? I'm mainly a C++ programmer so I'm genuinely curious.


One of the downsides of the C++ approach is that it makes it more cumbersome to make changes in your code. If you for instance make a trivial change where you change a variable from a stack-allocated object accessed directly via .foo() into a heap-allocated one accessed via ->foo(), you have to either manually change all .foo()s into ->foo()s or do some sort of weird work-around, such as making a separate variable that's a reference to the heap allocated object so that you can keep using those .foo()s. This is all relatively pointless busywork, which also pollutes the commit history with relatively meaningless changes.


those changes in access are semantically different, so i prefer being forced to change code when semantics change


There is no sematic difference between passing in a reference and passing in a copy of a smart pointers though.

Yes there is a difference at the call site between taking a reference to an owned thing and taking potential ownership of the thing.

However inside the function block it is noise. Nothing inside the code is semantically different between the two forms, both do work on preexisting data.

Certainly the second version could have a change where it does something with respect to the fact it is a smart pointers, changing what used to be a grabbed pointer to a weak pointer to be more explicit about lifetimes for instance.

However that code change would be just as explicit as long as there aren't any methods on the smart pointer in the auto deref world.


I've always liked that feature of C and C++. There's a lot of people that really hate it and in new languages they unify it to be just a dot. Zig does this as well, for instance. I always liked that the distinction was very explicit: use a.b() if you're calling b() on the object itself, use a->b() if you're de-referencing a first.

There are lots of things wrong with C and C++ that Zig and Rust fixes, but I never thought of this one being problematic in any way. It's as easy to type, the difference is obvious, and it's easy to see from a glance what's going on.


I'm among the people who would have liked to eliminate even this special case and have a convenient postfix derefence for this. So instead of a->b() you'd have a*.b(), and if you need to dereference again, write a**.b() and so on. But as you said, lots of people really seem to hate this kind of thing. Zig is almost there with its dereference syntax, but still kept the magic dereferencing dot.


Object Pascal strikes again: if a: ^record ... end is a pointer to a record, then a^ is the pointed-to record as a value and a^.b is the way you access a field of it.

Incidentally, the authors of Go were asked why they didn’t use a postfix dereference when Sethi noted[1] back in 1981 that C syntax would have been simpler that way, was quoted in Ritchie’s 1993 retrospective[2] on that point, and even their own blog post on declaration syntax pointed out that possibility[3], but IIRC the answer amounted to “but then we wouldn’t have gotten those sweet, sweet C-bandwagon popularity points”.

(I also cannot help calling out “Language X” here: One nice approach that nevertheless basically forces implicit dereferencing is that of Algon-68, where all values are constant values; the way to declare a mutable variable is to declare a pointer—a constant value—to a mutable cell; and the way to make the whole thing bearable is to have the compiler infer the minimum possible amount of dereferencing in an expression from the types of the values involved. Then of course it becomes impossible to infer the types, so the ML family goes with the inconvenient dereferencing, and you end up not really using a lot of mutability in ML.)

[1] https://dx.doi.org/10.1002/spe.4380110606

[2] https://dx.doi.org/10.1145/154766.155580

[3] https://go.dev/blog/declaration-syntax


I have limited Zig experience, but actually zig is different from rust or go in this regard, correct?

Rust and Go have the pointer have the unified syntax, but I thought in zig it does differentiate between the pointer itself and the value it points to.

Rust and Go:

ptr.method_on_ptr() ptr.method_from_underlying_val()

Zig:

ptr.method_on_ptr() ptr.*.method_from_underlying_val()


I'm mostly indifferent to it because it doesn't really harm readability, and it's not hard to know when to use it, but I can seem why it might be more strongly disliked. You can't entirely rely on it to know a dereference is happening because references (e.g. `int&`) aren't subject to the `->` requirement. It's also annoying if you find that you can refactor `func(T*)` to `func(T&)` and now you have to replace all `->`s with `.`s.


> You can't entirely rely on it to know a dereference is happening because references (e.g. `int&`) aren't subject to the `->` requirement.

That's true but I think the bigger motivation is avoiding ambiguity than seeing when an indirection is happening. (You also can't see whether a method is virtual, i.e. indirecting via the vtable, from the call site.) In C++ references don't have any standalone methods or operators so there's no ambiguity from using methods via a reference just using a dot.


> You can't entirely rely on it to know a dereference is happening because references (e.g. `int&`) aren't subject to the `->` requirement.

To the extent that this is a problem, it's a problem with references as a language feature in general, not with the arrow syntax.


People have been trying to get 'operator.' overloading in C++ for the last 20 years. Unfortunately this confusion between pointer operations and pointee operations has always been an obstacle :(.


"The issue, it turns out, is that method search does not check method parameter types against argument types."

Right. Rust's doesn't allow much overloading of function names. Partly because C++ did. The C++ overload resolution rules got very complex, especially since they interact with implicit conversions. To keep those rules from introducing errors, there's a rule in C++ that the overload chosen must be at least one step "better" than any matching alternative. Something that was supposed to make thing simpler thus became very complicated.

So Rust makes the programmer write that stuff out, which is more verbose but less confusing.


I worked on a C++ codebase where there was a method that had two overloads - one with 36 char* arguments, and one with 37. Obviously one hardly knows where to begin pointing out the problems. I'd like to report that it was some sort of elaborate joke, but honestly not sure. I think Rust, as in so many areas, made the right call here.


I'm currently trying to figure out who calls this C++ method in a large code base:

    bool operator()(const LLDispatcher* dispatcher, const std::string& key
    ...
Yes, they actually overloaded "()".


I don't follow. How does this declaration overloads "()"?


Yes, I've had equally confusing compile errors from C++ due to its lookup rules. For example removing an unused parameter from a function can cause that function not to be found anymore! I'd argue that is more confusing than this Rust issue.


What does this comment mean? Function parameters used or not are part of the function signature, so obviously would participate in any form of argument dependent lookup.


I mean, you can do as small a change as adding a const somewhere, and cause an (almost) independent file somewhere else in your codebase to hit the wrong overload.


Perhaps a case where after removing the unused parameter and updating all the call sites, none of the calls resolve to the original function anymore.


And we would expect this behavior why?


If you start with

    void foo(std::string x, int y);
    void foo(const char* x);
Then calls to foo("bar", 0) would resolve to the first overload: the second isn't a match at all, but the first is a match with an implicit conversion from const char* to std::string. But if you removed the second parameter, including arguments at all call sites, then foo("bar") would match the second better because it doesn't involve any conversions, so it would be selected instead of the first.


There was an argument that was not used by the function implementation. I removed it from the signature entirely - at the definition and the call sites.

Boom. No longer compiles.

Yes I understand the complex and surprising rules that cause this to happen. That doesn't make them any less complex and surprising.


"not used by the function implementation" doesn't mean "not passed in by callers".

The callers won't find a matching signature once there is 1 less parameter. I don't see what is confusing about it.

You probably misunderstood that warning as "No callers pass this parameter", which btw would only make sense for a parameter with a default value.


I removed the parameter from the definition and call sites.

I went from

    void foo(A a, B b) {
       // Only use a
    }
    ...
    foo(x, y);
To this

    void foo(A a) {
       // Only use a
    }
    ...
    foo(x);
And it stopped compiling. There were no other functions called `foo` so it is nothing to do with overloading.

I'll give you the answer if you want or we can leave it a mystery so you can experience how confusing that behaviour is...


Sigh. This is entirely expected behavior, and in fact, is _necessary_ behavior. If changing signatures in that way continued to allow code to "compile" with reckless abandon, so many things would break. I think before complaining, you need to understand how general symbol resolution, name mangling, and argument dependent lookup work. And if you're too impatient to learn, you're not actually in a position to criticize.


All I can think of is Koenig lookup [1]: B and foo were defined in a namespace, and the calls to foo were not in that namespace and didn't explicitly qualify foo but still found it.

[1] https://en.m.wikipedia.org/wiki/Argument-dependent_name_look...


Yes I’d like to know the answer.


Perhaps the argument had a default value so call sites didn't need to mention it. Although I can't think of removing an argument with a default value would change its priority in the overload set. But I definitely don't know all the rules so I can believe that it could!


Deref polymorphism is a known anti pattern.

https://rust-unofficial.github.io/patterns/anti_patterns/der...


Yep. The docs for Deref also dedicate a whole paragraph, with a conclusion in bold, near the start of the page, to why it's a bad idea. https://doc.rust-lang.org/std/ops/trait.Deref.html


Couldn't one equally see that as a footgun, given the lack of great solutions for something like this example?


> Moral of the story: don't implement Deref kids!

One of the reasons for Deref (ab)use is that Rust doesn't have a good story for delegation for newtypes yet...


As a side note, rust does not have inheritance, at not point did `AsciiString` inherited the `Index` implementation through `Deref`.

What happened is that in many usages the compiler would insert an implicit deref which would lead to an `Index` implementation being available.

But this does not work in all cases. Not just the case mentioned in the article but many others. The simplest is that it doesn't implement the `Index` trait and hence any generic bound requiring `Index` will fail if you try to use it with a `AsciiString`.

So it important to _not think of `Deref` as a form of inheritance ever_ nor use this as a oversimplifications when teaching rust or similar, it will only lead to a lot of confusion.

Additionally smart pointer in rust are "pointer like" which means you don't expect them to have methods by themself or anything like that. Hence why special methods on smart pointers are implemented as static function on the type, instead of methods.

Through `Deref` can also idiomatically be implemented for types with a owned/borrowed duality (e.g. `String`/`&str`) so `AsciiString/AsciiStr` would work but `AsciiString/str` won't be ideomatic.

Then if you idiomatically implement traits on this types you won't run into the issue described in the post, as you implement all you `Index` types and similar on the `&AsciiStr`/`&mut AsciiStr` etc.

Anyway there is a way to get a reference to a different kind of type, e.g. getting a `&str` from a `AsciiString`. It's `AsRef`. But that is not automatically applied like `Deref` for the reasons many reasons including stuff like mentioned in the Article.

Lastly a better way to think about why it doesn't work is that the type resolver looks if the type is implemented on your concrete type in some form and only if it isn't it will potentially dereference. And due to technical details of how type resolving reliable in multiple direction can be tricky `Index<T>` does count here as "the type is implemented" even if the `T` doesn't match.

The articles conclusion of not implementing `Deref` (in most cases) is something I can fully agree with.


Great article, and pretty unusual for Rust to silently start doing the unexpected thing.

From the start, I wished that Rust had used different punctuation for method calls on a `T` and a `SmartPtr<T>`. For instance, `my_t.f()` but `my_boxed_t->f()` (or get rid of auto Deref and just require `(my_boxed_t).f()`). For subscripting it could be something like `my_vec->[index]` (or just require `(my_vec)[index]`. Just something there to remind you that you are in fact using a smart pointer. This would also let you distinguish between `box.leak()` and `box->leak()` and not require `Box::leak(box)`.


In case the author happens to see this thread – there seems to be a missing less-than sign:

    str::from_utf8( as Deref>::deref(&self.0))
Thanks for the article! ^_^


You're welcome :)

I finally found out what was going wrong here. All angle brackets and other characters were being escaped in other parts of the document, even in <code></code> blocks. Except when it came to code blocks inside <pre></pre> blocks.

For context, I use Zola and Cloudflare pages.

It turns out there is a bug[^0] in Zola 1.40.0 which reads:

> Fix code blocks content not being escaped when not using syntax highlighting

I presume that this is the culprit, as I use an external syntax highlighting library.

Cloudflare Pages has been famously sluggish with updating the Zola version. For reference, 1.14.0 was released in July 2021.

This explains why everything was working fine on my machine™.

Thanks for pointing this out, though I unsure how to work around it without changing my setup.

[^0]: https://github.com/getzola/zola/blob/master/CHANGELOG.md#014...


I'm using Zola and Cloudflare Pages, and it is currently possibly to use ZOLA_VERSION=0.17.1. I thought I had opted into the Pages v2 beta (I remember joining a Discord for this purpose a while ago), but on the actual page the setting is at v1.

You can change this under Settings > Builds & deployments > Build system version. Cheers!


Thanks!

Apparently you can now ZOLA_VERSION to _any_ version you desire in v1 and the build system will install it on demand. I didn't realize that part had changed.

On the other hand, v2 doesn't even have Zola; my builds failed with a "failed to find zola command".

I guess we're now set for life, and will never have to wait for Cloudflare to update the image in order to bump the version. Less headaches for everyone.

Source: https://developers.cloudflare.com/pages/platform/language-su...


For the record, I have noticed two other places where the angle brackets are missing:

    struct AsciiString(Vec);
and

    impl Index> for AsciiString {


> the idiomatic course of action would be to use a newtype to wrap a Vec<u8>

But isn't that wrong already? The element type `u8` is not an ASCII char. We need a subtype of `u8`, like `u7` if that existed.


>Rather, it only checks for method names and where-clauses such as T: Trait. Types are only resolved in the confirm phase, at which point rustc would have already picked its method.

Not following here, the compiler iterates over the deref-chain and acquires a list of method names, but it also "checks for ... where-clauses", what does that mean? A quick look at the linked probe code indicates what we filter on Self traits and return value traits.


Typo: `impl Index> for AsciiString`


There are a bunch of typos and missing generics all over the place, looks like something there doesn't like angle brackets.


Possibly an HTML-transparent toolchain, something like the original markdown use case would likely have this sort of effect (as HTML transparency was a feature to Gruber).


This is very misleading:

str::from_utf8( as Deref>::deref(&self.0))


That was a good read


Great post, and great links; thanks!


Wow, if I were you, I would never think of analyzing the rustc source code. Great job!


Explicit is better than implicit.

And I totally agree with the conclusion of the article: don't implement Deref yourself.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: