I assume the mutex's memory is already warm in cache, so the processor doesn't have to go to RAM. It does, however, have to synchronize with any other processors/cores in the machine to prevent them from locking the mutex at the same time.
On an Intel Nehalem or Sandy Bridge system, this goes over the QuickPath Interconnect, which has a latency of ~20ns. HyperTransport fills the same role on AMD systems and probably has a similar latency, but I don't have numbers for that.
I'm basing this on the following presentation, especially the architecture diagram at 2m 40s:
http://www.infoq.com/presentations/Lock-free-Algorithms
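
If you'd rather measure than estimate, here is a minimal sketch that times the uncontended lock/unlock path on your own machine. The function name `average_lock_unlock_ns` and the warm-up count are my own choices for illustration, not from the presentation. Note that it runs on a single thread, so it only covers the warm-cache, no-contention case I assumed above; it won't show the extra cost of a cache line bouncing between sockets.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>

// Hypothetical helper: average cost, in nanoseconds, of one uncontended
// lock()/unlock() pair on a std::mutex, measured from a single thread.
double average_lock_unlock_ns(long iterations) {
    assert(iterations > 0);
    std::mutex m;

    // Warm-up so the mutex's cache line is hot before timing starts
    // (the count of 1000 is an arbitrary assumption).
    for (int i = 0; i < 1000; ++i) {
        m.lock();
        m.unlock();
    }

    auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        m.lock();
        m.unlock();
    }
    auto stop = std::chrono::steady_clock::now();

    // Convert the elapsed time to nanoseconds and average over all pairs.
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

Calling `average_lock_unlock_ns(10000000)` from your `main` gives a rough per-pair figure. The result includes loop overhead and depends heavily on the CPU, OS, and standard library, so treat it as a ballpark number only.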