This is pretty cool! As they mention in the paper, this is similar to biased locking in Java, where “synchronized” blocks only ever get promoted to a full mutex when a second thread tries to lock them. It’s the same idea here, except applied to a reference count instead of a lock.
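Concretely, the shape I have in mind is something like this (made-up C++ names, not the paper’s actual field layout, and the owner/non-owner race at the end needs a real handshake that I’m skipping):

    #include <atomic>
    #include <thread>

    // Toy sketch of a biased refcount: the owning thread bumps a plain
    // counter with no atomics, every other thread goes through a shared
    // atomic counter.
    struct BiasedRC {
        std::thread::id owner = std::this_thread::get_id();
        long biased = 1;                // only ever touched by the owner
        std::atomic<long> shared{0};    // touched by everyone else

        void retain() {
            if (std::this_thread::get_id() == owner)
                ++biased;                                        // fast path, no lock prefix
            else
                shared.fetch_add(1, std::memory_order_relaxed);  // slow path
        }

        // Returns true when the caller should free the object.
        bool release() {
            if (std::this_thread::get_id() == owner) {
                // Owner's last release also checks the shared count.
                return --biased == 0 &&
                       shared.load(std::memory_order_acquire) == 0;
            }
            // Non-owner release: in this toy the final free is always left
            // to the owner; the real scheme has a handshake/merge step so
            // the last decrement on either side gets noticed.
            shared.fetch_sub(1, std::memory_order_acq_rel);
            return false;
        }
    };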

This made me think of two other approaches, and I wonder if anyone has tried them. The first is using something like per-CPU counters. The number of threads is usually large, but maybe the number of CPUs is bounded enough to keep the memory overhead manageable? Linux implements restartable sequences [https://blog.linuxplumbersconf.org/2016/ocw/system/presentat...] for this kind of thing, but I don’t know whether that’s any faster than the BRC described here. The other thought was - in a similar vein to the static analysis the paper mentions - is there any benefit to somehow marking objects that are “about to escape” so the owner thread ID can be switched? One of the common patterns in Swift is bouncing a sequence of tasks across different dispatch queues to get on/off the UI thread, but it’s really a sequence of operations on the same object.
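For the per-CPU idea, this is roughly what I was picturing (all names made up). It still uses atomics, because without rseq a thread can migrate CPUs between reading its CPU id and doing the update, and the per-object slot array is exactly the memory-pressure question:

    #include <atomic>
    #include <sched.h>   // sched_getcpu(), Linux-specific

    // Hypothetical sharded ("per-CPU-ish") refcount: one padded slot per
    // CPU so cores don't fight over a single cache line.
    constexpr int kMaxCpus = 128;

    struct alignas(64) Slot {            // 64 ~ typical cache line size
        std::atomic<long> count{0};
    };

    struct ShardedRC {
        Slot slots[kMaxCpus];            // 8 KB per object -- the memory
                                         // pressure question above

        void retain() {
            int cpu = sched_getcpu();
            if (cpu < 0 || cpu >= kMaxCpus) cpu = 0;
            slots[cpu].count.fetch_add(1, std::memory_order_relaxed);
        }

        void release() {
            int cpu = sched_getcpu();
            if (cpu < 0 || cpu >= kMaxCpus) cpu = 0;
            slots[cpu].count.fetch_sub(1, std::memory_order_relaxed);
            // Detecting "total hit zero" needs a separate protocol; naively
            // summing 128 padded slots on every release would defeat the point.
        }

        long total() const {             // not linearizable; illustration only
            long sum = 0;
            for (const Slot& s : slots)
                sum += s.count.load(std::memory_order_relaxed);
            return sum;
        }
    };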

Great stuff in either case - I’m a fan of no-effort memory management with minimal performance overhead, and this is one step closer.




I played with per-thread counters. The problem is you don't want them sharing a cache line, so having them inside the object is problematic. Having external per-thread counter arrays involves bookkeeping and indirection I could not make fast enough. YMMV, and maybe someone has done a better job.
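Concretely, the external-table version I mean is roughly this (toy C++, names made up):

    #include <unordered_map>

    struct Obj;   // whatever is being refcounted

    // Each thread keeps its own object -> local count, so counts for the
    // same object never share a cache line. The catch is the map lookup on
    // every retain/release, plus the fact that a thread's local count can
    // go negative (retain here, release there), so the deltas have to be
    // reconciled somewhere to learn when the true total reaches zero --
    // that reconciliation is the bookkeeping that was hard to make fast.
    thread_local std::unordered_map<const Obj*, long> local_rc;

    inline void retain(const Obj* o)  { ++local_rc[o]; }
    inline void release(const Obj* o) { --local_rc[o]; }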

A known variant is deferred counting, where a single thread mutates the refcounts and the other threads post their increments and decrements to it through lightweight per-thread queues.
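Roughly this shape (toy C++, with a mutex-guarded vector standing in for whatever lightweight queue you’d actually use):

    #include <mutex>
    #include <vector>

    // Only one thread ever touches the real refcount; everyone else
    // enqueues +1/-1 deltas that the owning thread applies when it
    // drains its queue.
    struct DeferredRC {
        long count = 1;            // touched only by the mutator thread

        std::mutex qmu;
        std::vector<int> pending;  // deltas posted by other threads

        void enqueue(int delta) {  // called from any non-mutator thread
            std::lock_guard<std::mutex> g(qmu);
            pending.push_back(delta);
        }

        bool drain_and_check() {   // called by the mutator; true == freeable
            std::vector<int> batch;
            {
                std::lock_guard<std::mutex> g(qmu);
                batch.swap(pending);
            }
            for (int d : batch) count += d;
            return count == 0;
        }
    };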



