The difficulty in Rust is reflecting how there's not much difference between a c...

dataflow · on Feb 23, 2018

Fun fact: The `unique_ptr` approach you mention is also broken (and I would go so far as to say it's wrong entirely). The destructors will blow the stack. I really do think std::list's approach is the only correct solution for a generic linked list.

dbaupp · on Feb 23, 2018

Using the default destructor is broken, yes, but the approach isn't: one can avoid it in the list's destructor using a loop that walks over the list to not let nodes be destroyed recursively (i.e. clearing the next pointers).

    node = head
    while node:
        next = transferOwnershipAndClear(node.next)
        node = next

Where the third line would be 'std::unique_ptr<...> next(std::move(node.next)); ' in C++ that uses unique_ptr<...>, or 'let next = node.next.take()' in Rust that uses Option<Box<...>>.

Of course, you could definitely argue that there's little point using unique_ptr if you're still having to write a destructor for the list itself.

dataflow · on Feb 23, 2018

You can keep putting bandages over it, but the approach itself is fundamentally broken. It fundamentally intertwines the memory management with the data structure management when the two really are pretty orthogonal to each other. For example, if you ever want to use a different allocation function, you'll run into trouble. How do you get the right function to deallocate each node, and at what penalty? Does every node keep a pointer to a common allocator now? Or, for example, if you have a move constructor/assignment like that, now you're declaring that it makes sense to "move" one node onto another, even if they use different allocators. But does it really? Their ownership simply doesn't have anything to do with their sibling relationship, and you're forcing them to be tied together. Like, yeah, you can keep putting bandages on top of this, rub some alcohol on it, and giving it crutches to make it work, and I'm sure you'll eventually make it work, but the right thing to do is to just step back and realize the flaw is in the approach itself: some concerns are global across the nodes (like memory management), and some are local (data structure management), and hence it doesn't make sense to mix the two.

(Oh, and did I point out that all of this means you'll have to find a different solution when the data structure is no longer linear to allow a destructor hack like that? The fact that the scope of the approach is limited by a property that really isn't all that relevant to the actual problem is another sign that it isn't the gift one.)

dbaupp · on Feb 23, 2018

I am not advocating for it being a great way to implement a linked list, just using it as an example that can be easily understood in a throwaway comment.

As an alternative:

  # Delete all the elements from node 'from' to (but not including) node 'to'
  erase(list, from, to):
    from.prev.next = to
    to.prev = from.prev
    while from and from != to:
      from = from.next
      list.allocator.free(from)
    # update 'tail' if necessary

  destroy(list):
    node = list.head
    while node:
       next = node.next
       list.allocator.free(node)
       node = next

This code is also wrong. Calling erase(list, list.tail, list.head) or similar will create a loop, and then destroy will loop forever/read dangling pointers. This is also very close to correct (I believe 'destroy' looking for node == tail instead of node being null would avoid the loop), but isn't.

Yes, it's easy to write a linked list, but it's also easy to screw up in subtle ways, and that line is thin. The safe subset of Rust restricts what is legal to be able to get some sort of control/understanding on pointer-soups, so that it can verify and validate that problems like the above don't turn into really bad problems. (The equivalent thing couldn't happen with vector-and-indices or arena-allocated nodes, and would result in the less dangerous[1] problem of an infinite loop/memory leak with Rc/Weak or garbage collection.)

[1]: A denial of service is better than remote code execution (proof: the latter can be turned into the former).

dataflow · on Feb 23, 2018

So now your rebuttal to me pointing out of this design flaw is that it is possible to write a buggy implementation even in the absence of such design flaws, and that Rust can help you avoid those bugs too. I'm not sure entirely what else I'm supposed to say in response, but yes, that is a true statement.

dbaupp · on Feb 23, 2018

I'm arguing against your top-level original comment; I have no attachment to any particular scheme for implementing linked lists, they're just examples of things that are almost right, but are wrong enough to result in memory corruption.

My thesis is "it does get tricky to implement a linked list without garbage collection".

My original example was simple to avoid having to work through a longer example. Now that I reread, I agree that that particular example isn't relevant.

My second example was demonstrating that even a list fitting into your proposed ownership scheme (which, you say, "doesn't get tricky") does get tricky. You apparently agree that this is problematic, so you also apparently agree that it is tricky to implement a linked list.

Rust rejects most attempts because of the risk of memory unsafety, and it's hard to convince it otherwise, because there's little difference between 'safe' and 'unsafe' in a pointer soup.

dataflow · on Feb 23, 2018

Oh, so all you're arguing against is my statement that "it doesn't get tricky"? In that case I think you're somehow missing the entire point of these discussions. I didn't make that statement in the abstract; it had some context behind it. The thesis of the article (and hence the basis for my comment) was that linked lists are tricky in Rust -- and presumably not (as much) elsewhere, or Rust wouldn't the focus of the article. I am saying, no, that trickiness only comes about because he's insisting on the wrong design, and if he took the same approach as in C++ (or C#) he would no longer encounter it. That's what "it doesn't get tricky" means. If you pull it out of context and make your thesis that "it does get tricky to implement a linked list without garbage collection" in general, then okay, yeah, sure... linked lists are tricky, pointers are tricky, programming is tricky, life is tricky, etc... but then we get nowhere since all of those are entirely missing the Rust context behind the discussion.

dbaupp · on Feb 23, 2018

No, I am disagreeing with your explicit disagreement with "implementing a data structure like this in a non-garbage collected language without Rust is also quite tricky".

Now that I reread for a third time, I suspect you may've picked up on the "a data structure like this" to mean "a linked list with this exact ownership arrangement", whereas I interpreted it as the looser "a data structure like a linked list". That puts the rest of your comments into more context, and sure, I grant you that it's a suboptimal choice, but I don't think it's what the author meant.

In any case, I think I was rather assuming too much Rust context: taking your suggestion does not resolve the trickiness in Rust (as in, you won't be able to convince the compiler to accept it without using `unsafe` or reference counting). It's a very common meme about Rust that there's no way to write a safe C++-stye linked list with pointers, something independent of the choice of ownership scheme. "Fixing" the design doesn't actually help: it's still way too much of a pointer soup.

dataflow · on Feb 23, 2018

> No, I am disagreeing with your explicit disagreement with "implementing a data structure like this in a non-garbage collected language without Rust is also quite tricky". [...]

I see what you're saying, and again: it's missing the entire baseline and context for that claim. My thesis was that if he had known/considered/used the actual std::list design (which I assume he hadn't, or he wouldn't have proposed nodes that owned siblings), he would not have considered linked lists to be tricky in non-GC'd languages according to whatever his baseline is for that (presumably, tricky enough to blog about it). But somehow you simply extracted my reply with that one quote I replied to, discarded all the rest of the context (his blog post and my comment and all), lowered the baseline for "tricky" (from ≈"tricky enough for him to call it 'tricky' and blog about it" to ≈"possible for the average programmer to write an initially-buggy implementation thereof"), and then ran away with this beautiful straw man to refute. =P Except that (obviously, I had thought) wasn't my claim in the first place...

dataflow · on Feb 23, 2018

Oops, typo... s/gift/right/

pjmlp · on Feb 23, 2018

I think the C++ example is safer modeled like this

    class ListNode {
        std::unique_ptr<ListNode> next;
        std::weak_ptr<ListNode> prev;
        // some data or whatever
      };

dbaupp · on Feb 23, 2018

That doesn't work. std::weak_ptr is a reference counted pointer matched to std::shared_ptr; there's no concept of weak pointers (i.e. non-owning with dynamic checks) for std::unique_ptr, which corresponds almost exactly to Rust's Box.