My understanding was that allocators nowadays have cache pools per cpu or thread...

gsvelto · on Oct 11, 2022

They do, and so does jemalloc in Firefox. However, that only lowers contention; it doesn't let you do away with synchronization. Consider this simple scenario: a thread allocates a chunk of memory from its per-thread pool, but the object is later released by another thread. You can't do that without synchronization around the pool. Per-CPU pools are even trickier because the scheduler might move a thread between CPUs while it's allocating memory. So you need special facilities to implement those in user space, like restartable sequences: https://lwn.net/Articles/650333/
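
To make the cross-thread free scenario concrete, here is a toy sketch in Python. It is not how jemalloc or any real allocator is implemented; `PerThreadPool`, its slot-index "memory", and `demo` are hypothetical names made up for illustration. The owning thread allocates and frees on a lock-free fast path, while a free coming from any other thread has to go through a lock-protected list.

```python
import threading


class PerThreadPool:
    """Toy pool of fixed-size slots owned by one thread (illustrative only)."""

    def __init__(self, n_slots):
        self.owner = threading.get_ident()      # thread that created the pool
        self.local_free = list(range(n_slots))  # touched only by the owner
        self.remote_free = []                   # filled by other threads
        self.remote_lock = threading.Lock()     # guards remote_free

    def alloc(self):
        # Only the owning thread may allocate from its own pool.
        assert threading.get_ident() == self.owner
        if not self.local_free:
            # Slow path: take back slots that other threads released.
            with self.remote_lock:
                self.local_free, self.remote_free = self.remote_free, []
        if not self.local_free:
            raise MemoryError("pool exhausted")
        # Fast path: no synchronization needed.
        return self.local_free.pop()

    def free(self, slot):
        if threading.get_ident() == self.owner:
            # Same-thread free: fast path, no lock.
            self.local_free.append(slot)
        else:
            # Cross-thread free: this is where synchronization is unavoidable.
            with self.remote_lock:
                self.remote_free.append(slot)


def demo():
    pool = PerThreadPool(2)
    a, b = pool.alloc(), pool.alloc()

    # Another thread releases objects it did not allocate.
    def release_both():
        pool.free(a)
        pool.free(b)

    t = threading.Thread(target=release_both)
    t.start()
    t.join()

    remote_count = len(pool.remote_free)  # slots waiting on the locked list
    c = pool.alloc()                      # owner drains the remote list
    return remote_count, c in (a, b), len(pool.remote_free)
```

The per-thread pool keeps the common case cheap, but the lock (or an equivalent atomic structure) on the remote list can't be removed as long as frees may come from other threads.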

saagarjha · on Oct 11, 2022

Amusingly, macOS also has this as SPI; it's used for the Objective-C runtime.

saagarjha · on Oct 10, 2022

You are correct, but there are still places where synchronization is necessary.