Professor Hellerstein stresses, above all, the distributed nature of MapReduce:
" it works on 'shared-nothing' clusters of computers in a data center", "the MapReduce framework is a parallel dataflow system that works by partitioning data across machines"
To what extent does MapReduce leverage parallelism on a single machine?
Would "Distributed Programming in the Age of Big Data" be a more appropriate title?
Leveraging parallelism on a single multi-core machine has a definite resemblance to leveraging parallelism in a cluster of machines. For example, memory access costs are likely to be non-uniform (NUMA): the cores on chip X are "closer" to memory region A than the cores on chip Y are.
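To make the resemblance concrete, here is a minimal sketch of the MapReduce pattern on a single multi-core machine, using Python's multiprocessing module. The names (word_count, map_phase, merge_counts) and the worker count are illustrative assumptions, not part of any framework Hellerstein describes; the point is only that data gets partitioned across workers just as a cluster framework partitions it across machines.

    # Illustrative sketch: MapReduce-style word count on one multi-core
    # machine. All names here are hypothetical, chosen for this example.
    from collections import Counter
    from functools import reduce
    from multiprocessing import Pool

    def map_phase(chunk):
        # Map step: count words within one partition of the input.
        return Counter(word for line in chunk for word in line.split())

    def merge_counts(a, b):
        # Reduce step: merge two partial word counts.
        a.update(b)
        return a

    def word_count(lines, workers=4):
        # Partition the input across worker processes, much as a cluster
        # framework partitions data across machines.
        chunks = [lines[i::workers] for i in range(workers)]
        with Pool(workers) as pool:
            partials = pool.map(map_phase, chunks)
        return reduce(merge_counts, partials, Counter())

    if __name__ == "__main__":
        lines = ["the quick brown fox", "the lazy dog jumps", "the fox"]
        print(word_count(lines))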
This paper examines MapReduce performance on multi-core/SMP machines:
" it works on 'shared-nothing' clusters of computers in a data center", "the MapReduce framework is a parallel dataflow system that works by partitioning data across machines"
To what extent does MapReduce leverage parallelism (on a single machine)?
Would "Distributed Programming in the Age of Big Data" be a more appropriate title?