Introduction to MapReduce and Hadoop - Computer Science Division | EECS at UC Berkeley
What is MapReduce?
• Data-parallel programming model for clusters of commodity machines
• Pioneered by Google
  – Processes 20 PB of data per day
...
What is MapReduce used for?
• At Google:
  – Index building for Google Search
  – Article clustering for Google
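To make the "data-parallel programming model" concrete, here is a minimal single-process sketch of the classic word-count example. It is not Hadoop's API and it does not distribute work across machines. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. The point is the shape of the computation: a map step emits key/value pairs independently per input record, a grouping step collects values by key, and a reduce step combines each key's values.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group step: collect all values that share the same key.

    In a real cluster the framework does this between map and reduce;
    here it is an in-memory dictionary (an assumption of this sketch).
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, group, and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["the quick brown fox", "the lazy dog", "the fox"])
```

Because each call to `map_phase` looks at only one document and each call to `reduce_phase` looks at only one key, both steps could in principle run on different machines at the same time, which is what makes the model suitable for clusters.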
www.cs.berkeley.edu