I actually don't think that's true. My understanding is that on x86, some atomic instructions carry an implicit LOCK prefix. (Or you can make certain instructions atomic by adding a LOCK prefix yourself.) Such instructions lock the bus (or, on modern CPUs, just the affected cache line) and prevent other cores or SMT threads from accessing that memory while the operation completes. In that way, you can safely perform an atomic operation on a value in the cache.
Note that this implies that atomic operations slow down other cores and SMT threads.
Locks are often implemented using an xchg instruction, which is implicitly locked.
All processors' caches are committed/flushed for the affected cache line, so it's correct to say other processors are slowed down. But in that sense it also IS a main memory operation, just not yours.
Just because a lock is shared does not mean that it's contended.
For example, some multi-threading techniques attempt to access only CPU-local data but use locks purely to guard against the case where a process is moved across CPUs in the middle of an operation (thus defeating the best-effort CPU-locality).
Maybe I'm missing something, but if the cache line is only in use by one CPU, I don't see why the value would need to be immediately propagated to main memory or to any other CPU's cache until it is written as part of the normal cache write-back.
Correct. Typically the cache snoops the main memory bus. If a remote CPU starts a read on a cached memory location, the caching CPU sends a "stall" or "retry" signal to the reader, does a cache flush to main memory, and then lets the remote CPU proceed with the (now correct) main memory read.