Exploiting parallelism on a single multi-core machine closely resembles exploiting it across a cluster of machines. For example, memory access costs are likely to be non-uniform in both cases: the cores on chip X are "closer" to memory region A than the cores on chip Y, and so on.

This paper examines Map/Reduce performance on multi-core/SMP machines:

http://csl.stanford.edu/~christos/publications/2007.cmp_mapr...
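To make the idea concrete, here is a minimal sketch of a Map/Reduce-style word count on a single multi-core machine, using Python's standard multiprocessing module. It is not the implementation from the paper above; the names map_chunk, merge_counts and word_count are hypothetical, and the sketch ignores memory placement entirely, which is exactly the kind of cost that can matter on a real SMP box.

```python
from collections import Counter
from functools import reduce
from multiprocessing import Pool


def map_chunk(text):
    """Map step: count the words in one chunk of text."""
    return Counter(text.split())


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def word_count(chunks, workers=2):
    """Run the map step across worker processes, then reduce in the parent."""
    with Pool(processes=workers) as pool:
        partials = pool.map(map_chunk, chunks)
    return reduce(merge_counts, partials, Counter())


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

Each worker process handles its own chunk, much like a node in a cluster would, and only the small partial counts travel back to the parent for the reduce step.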