
While lock-free code gets you part of the way to efficient parallelism (by removing suspension-induced dead time), as he mentions, the performance impact of repeated cache-line invalidation can be a significantly bigger problem.

In many modern processor architectures (x86, for example), a cache-coherence protocol ensures that cache lines serve data according to the architecture's memory-ordering discipline. On x86, that discipline is Total Store Ordering. (See http://en.wikipedia.org/wiki/Memory_ordering.)
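
For the curious, here's a minimal sketch (mine, not from the article) of the classic "store buffering" litmus test that makes TSO visible: each core's store can sit in its store buffer while the following load executes, so r1 == 0 && r2 == 0 is an allowed outcome on x86. Relaxed atomics are used so the hardware behavior shows through; a serious harness would reuse threads rather than spawning a pair per iteration.

    #include <atomic>
    #include <cstdio>
    #include <thread>

    std::atomic<int> x{0}, y{0};
    int r1, r2;

    int main() {
        for (int i = 0; i < 100000; ++i) {
            x.store(0, std::memory_order_relaxed);
            y.store(0, std::memory_order_relaxed);

            std::thread t1([] {
                x.store(1, std::memory_order_relaxed);
                r1 = y.load(std::memory_order_relaxed);
            });
            std::thread t2([] {
                y.store(1, std::memory_order_relaxed);
                r2 = x.load(std::memory_order_relaxed);
            });
            t1.join();
            t2.join();

            if (r1 == 0 && r2 == 0) {  // both loads ran before either store drained
                std::printf("store reordering observed at iteration %d\n", i);
                return 0;
            }
        }
        std::printf("not observed in this run (timing-dependent)\n");
    }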

This means that if two processors are contending for the same value (e.g., trying to increment it, set it to true, or even read it), the CPU is forced to invalidate the cache line containing that value on all the other cores, leading to massive scalability bottlenecks. If multiple cores are contending for the same location in memory, lock-free or not, performance will suffer.
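
As a concrete illustration (the thread and iteration counts are arbitrary): every fetch_add below forces the counter's cache line to bounce between cores in exclusive mode, so adding threads can make this loop slower rather than faster, even though it's lock-free.

    #include <atomic>
    #include <cstdio>
    #include <thread>
    #include <vector>

    int main() {
        std::atomic<long> counter{0};
        std::vector<std::thread> threads;

        for (int t = 0; t < 8; ++t) {
            threads.emplace_back([&counter] {
                for (int i = 0; i < 1000000; ++i)
                    counter.fetch_add(1, std::memory_order_relaxed);  // line ping-pongs between cores here
            });
        }
        for (auto& th : threads) th.join();

        std::printf("%ld\n", counter.load());
    }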

More devious is the case of false sharing, where two different values just happen to fall on the same cache line. Even though they don't conflict, the processor must still invalidate the line on every core. Modern compilers do their best to prevent this, but sometimes they need a little help.
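
A sketch of what that looks like, assuming the usual 64-byte lines (the names are illustrative): a and b are logically independent, yet two threads updating them still invalidate each other's caches on every write, because the fields share a line.

    #include <atomic>
    #include <thread>

    struct Counters {
        std::atomic<long> a{0};  // lands on the same cache line as b
        std::atomic<long> b{0};
    };

    int main() {
        Counters c;
        std::thread t1([&] { for (int i = 0; i < 10000000; ++i) c.a.fetch_add(1); });
        std::thread t2([&] { for (int i = 0; i < 10000000; ++i) c.b.fetch_add(1); });
        t1.join();
        t2.join();
    }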

The takeaway is this: don't try to implement your own locks using CAS -- even something as simple as a lock is very hard to get right (performance-wise) when scaling to dozens or hundreds of cores/threads. People have solved this problem (people.csail.mit.edu/mareko/spaa09-scalablerwlocks.pdf). Writing fast concurrent code (especially lock-free code) is a minefield of weird architecture gotchas. Watch your step.
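
To make the warning concrete, here's the kind of hand-rolled CAS lock being warned against (a sketch of the anti-pattern, not anyone's real code): every failed compare_exchange demands the lock's cache line in exclusive mode, so under contention dozens of cores thrash that one line. Spinning on a plain load first (test-and-test-and-set) helps somewhat; the scalable locks in the linked paper go much further.

    #include <atomic>

    class NaiveSpinLock {
        std::atomic<bool> locked{false};
    public:
        void lock() {
            bool expected = false;
            // Naive: hammer CAS directly; each failed attempt still pulls
            // the cache line to this core in exclusive mode.
            while (!locked.compare_exchange_weak(expected, true,
                                                 std::memory_order_acquire)) {
                expected = false;
                // Less bad: spin read-only until the lock looks free, so
                // waiters share the line instead of fighting over it:
                //   while (locked.load(std::memory_order_relaxed)) { }
            }
        }
        void unlock() { locked.store(false, std::memory_order_release); }
    };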




Actually, I didn't claim that lock-free is more efficient -- I explicitly said that it isn't necessarily more efficient, though I didn't discuss the reasons you do. I only said that "lock-based" is defined by the need to deal with blocking upon suspension -- not that this is necessarily a bad property in any way, in particular in its efficiency impact.


I think recommending against CAS altogether is a little too general -- at a low enough level it's useful -- but cache lines, as you say, are probably the most important issue in practice.

It's possible to implement algorithms that don't depend on the order data appears on the memory bus, relying on execution order instead, and such algorithms can therefore be barrier-free -- but cache lines still get synced. Complete lines can be dedicated to single values, a case where performance increases at the cost of effective cache size (see the sketch below).
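
A sketch of that trade, assuming 64-byte lines (C++17's std::hardware_destructive_interference_size can replace the constant where available): each value is padded out to its own line, so writes to one never invalidate the other -- at the price of the struct occupying two full lines.

    #include <atomic>

    struct PaddedCounters {
        alignas(64) std::atomic<long> a{0};  // gets a whole cache line
        alignas(64) std::atomic<long> b{0};  // gets a whole cache line
    };

    static_assert(sizeof(PaddedCounters) >= 128,
                  "each counter occupies its own 64-byte line");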

The different SPARC memory models (TSO, PSO, and RMO) were always useful for explaining the performance effects of cache coherency and memory barriers.


All this and more at http://www.1024cores.net/



