Yes, it’ll reduce latency, but doesn’t it also increase parallelism? A single-threaded program ought to improve overall, unless the extra overhead you mentioned dominates; a parallel program might or might not improve.
I think if you wanted to do deferred destruction right, ideally you’d modify an allocator to have functions like (alloc_local, alloc_global, free_now, free_deferred) so it can avoid exhausting memory. Traits could make this ergonomic.
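Roughly the shape I have in mind, as a sketch only (the trait name, method names, and signatures here are hypothetical, not taken from any real allocator crate):

```rust
// Hypothetical trait, purely to illustrate the shape of such an API.
use std::alloc::Layout;

pub trait DeferredAllocator {
    /// Allocate from a thread-local pool (cheap, no cross-thread sync).
    unsafe fn alloc_local(&self, layout: Layout) -> *mut u8;

    /// Allocate from the shared/global pool.
    unsafe fn alloc_global(&self, layout: Layout) -> *mut u8;

    /// Free immediately on the current thread.
    unsafe fn free_now(&self, ptr: *mut u8, layout: Layout);

    /// Queue the free for a background thread, but allow the allocator
    /// to apply backpressure (e.g. fall back to freeing inline) if the
    /// queue of pending frees gets too deep.
    unsafe fn free_deferred(&self, ptr: *mut u8, layout: Layout);
}
```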
Also, I admit I don’t understand why “you won’t have any backpressure on your allocations”; shouldn’t deferred destruction give you more backpressure, if anything? I’m probably confused.
> Also, I admit I don’t understand why “you won’t have any backpressure on your allocations”; shouldn’t deferred destruction give you more backpressure, if anything? I’m probably confused.
I think the point is that, if the same thread is doing both allocation and de-allocation, the thread is naturally prevented from allocating too much by the work it must do to de-allocate. If you move the de-allocation to another thread, your first thread may now be allocating like crazy, and the de-allocation thread may not be able to keep up.
In a real GC system, this is not that much of a problem, as the allocator and de-allocator can work with each other (if the allocator can't allocate any more memory, it will generally pause until the de-allocator can provide more memory before failing). But in this naive implementation, the allocator thread can exhaust all available memory and fail, even though there are a lot of objects waiting in the de-allocation queue.
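To make the failure mode concrete, here’s a toy sketch (hypothetical code, not from the original post): with an unbounded mpsc::channel the allocating thread can run arbitrarily far ahead of the thread doing the drops, while a bounded sync_channel makes the sender block once a fixed number of objects are pending, which restores some backpressure.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

fn main() {
    // Bounded queue of pending deallocations: the allocating thread
    // blocks on send() once 1024 objects are waiting to be dropped,
    // so it can't run arbitrarily far ahead of the dropper thread.
    // (With an unbounded mpsc::channel there is no such limit, and the
    // allocator can exhaust memory while the queue just grows.)
    let (tx, rx) = sync_channel::<Vec<u8>>(1024);

    let dropper = thread::spawn(move || {
        // Receiving and letting the value go out of scope runs its
        // destructor (the actual free) on this thread.
        for _buf in rx {}
    });

    for _ in 0..100_000 {
        let buf = vec![0u8; 4096]; // "allocation" on the hot thread
        tx.send(buf).expect("dropper thread hung up"); // blocks when full
    }

    drop(tx); // close the channel so the dropper exits
    dropper.join().unwrap();
}
```

Of course, blocking the hot thread reintroduces some of the latency you were trying to hide, which is why something like the fall-back-to-freeing-inline behaviour sketched above might be preferable to a hard block.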