> The trick is to use the allocations themselves as linked list nodes so you don’t have to waste any extra memory for tracking.
There was an article that I'm struggling to find—posted to hn iirc—recommending against this approach. IIRC there were two reasons stated:
* Efficiency due to CPU cache. Freed allocations might have been unused for a long time and thus out of cache. Writing the pointer to the linked list on free puts it back into cache—a whole cache line when you only need 8 bytes or whatever for the singly-linked list—possibly at the expense of something hotter. (I think there are special instructions on many platforms to store without adding to cache, but that's probably not wise either because a bunch of frees might come together and thus need to access a previous one's book-keeping info. Better to keep all the bookkeeping info together in a dense region so you use more of the cache line.)
* Debugging troubles on buggy applications. They're more likely to overwrite adjacent allocations and thus the memory allocator's book-keeping, resulting in hard-to-debug failures. (I recall not being super convinced of this reason—I think having external tracking is insufficient for making it easy to debug this kind of error. I think you'd want to make it possible for ASAN to work well, thus you'd want to implement its "manual poisoning" interface.)
edit: I found this tcmalloc issue talking about not wanting to use singly-linked lists because of cache effects when moving stuff from the thread cache to the central list. Kind of similar. https://github.com/gperftools/gperftools/issues/951
> The trick is to use the allocations themselves as linked list nodes so you don’t have to waste any extra memory for tracking.
There was an article that I'm struggling to find—posted to hn iirc—recommending against this approach. IIRC there were two reasons stated:
* Efficiency due to CPU cache. Freed allocations might have been unused for a long time and thus out of cache. Writing the pointer to the linked list on free puts it back into cache—a whole cache line when you only need 8 bytes or whatever for the singly-linked list—possibly at the expense of something hotter. (I think there are special instructions on many platforms to store without adding to cache, but that's probably not wise either because a bunch of frees might come together and thus need to access a previous one's book-keeping info. Better to keep all the bookkeeping info together in a dense region so you use more of the cache line.)
* Debugging troubles on buggy applications. They're more likely to overwrite adjacent allocations and thus the memory allocator's book-keeping, resulting in hard-to-debug failures. (I recall not being super convinced of this reason—I think having external tracking is insufficient for making it easy to debug this kind of error. I think you'd want to make it possible for ASAN to work well, thus you'd want to implement its "manual poisoning" interface.)
edit: I found this tcmalloc issue talking about not wanting to use singly-linked lists because of cache effects when moving stuff from the thread cache to the central list. Kind of similar. https://github.com/gperftools/gperftools/issues/951