Again, generalisations about performance like that is just not very useful. Ther...

MillenialMan · on Oct 24, 2021

I don't really agree. Cache friendliness is almost always a relevant factor as soon as performance becomes an issue. I get what you're saying but as I see it immutability gives you architectural efficiency and in some cases space efficiency, but rarely processing speed.

> Think video games like braid

Braid doesn't use immutable data structures, it uses a historical record (and immutability would be incompatible with some of the mechanics).[1] The author of Braid is actually quite famous for his dislike of functional languages and immutability. He doesn't even like garbage collection because of the overhead it introduces.

Interestingly, he was talking about data structures for codebase representation (in the context of a compiler) a while back, and someone mentioned HAMTs. I'm definitely curious if they would work well there.

[1] https://www.youtube.com/watch?v=8dinUbg2h70

dgb23 · on Oct 24, 2021

You’re picking out an admittedly bad example in order to miss the point. Different data structures have different tradeoffs. Not every problem lends itself to the same type of optimization, so it’s not helpful to make these generalizations.

Continuing with the example regardless, change records and persistent data structures have different performance characteristics. The former is going fast if you move incrementally between states, the latter enables arbitrary access, comparison and efficient in memory caching of multiple views.

It would be interesting to explore and measure the trade offs under certain use cases.

MillenialMan · on Oct 25, 2021

I understand your point. I'm saying: the subset of problems that benefit in a performance sense from immutability is very small. The vast majority of the time, cache misses slow down algorithms. That's a perfectly reasonable generalisation and I don't really understand why you think it's untrue.

Re: change records, I believe Braid uses a historical record of absolute states, not a series of diffs. The state at a particular time is recreated from a minimal representation (a couple of arrays). That's much more efficient than throwing multiple iterations of the state in a HAMT.

dgb23 · on Oct 25, 2021

> The vast majority of the time, cache misses slow down algorithms. That's a perfectly reasonable generalisation and I don't really understand why you think it's untrue.

I don't say it is untrue. But not every use case lends itself to CPU cache optimization. Your access patterns might just happen to be arbitrary or fragmented from a layout perspective.

I would argue that this is a very common case, especially for IO bound applications that operate on tiny pieces of data and have deeply nested dispatch rules that each query some section of memory that you don't know it will query in advance.

Or in other words: The clearer your computation pipeline can be, the more you can benefit from CPU cache optimization.

> Re: change records, I believe Braid uses a historical record of absolute states, not a series of diffs. The state at a particular time is recreated from a minimal representation (a couple of arrays). That's much more efficient than throwing multiple iterations of the state in a HAMT.

You caught my interest, I will have to study this. I can't really imagine how it works from your explanation (my bad, not yours). I assumed when you said "change records" it meant that it would be suboptimal to access an arbitrary state (linear as opposed to constant).