Hadoop Map-Reduce Explained with an Example
This article walks through the key steps of a Hadoop MapReduce job using a word count example. Please feel free to comment or suggest additions if I have missed any important points. Apologies for any typos.

The key steps of how Hadoop MapReduce works in a word count problem are:

1. Input is fed to a program, say a RecordReader, that reads data line-by-line or record-by-record.
2. The mapping process starts, which includes the following steps:
   - Combining: combines the data (word) with its count, such as 1
   - Partitioning: creates one partition for each word occurrence
   - Shuffling: moves words to the right partition
   - Sorting: sorts the partition by word
3. The last step is reducing, which comes up with …
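The flow described here can be imitated in a few lines of plain Python. This is only a minimal sketch and not Hadoop's API: the function names (`read_records`, `map_record`, `partition_for`, `word_count`) are hypothetical, the CRC32-based partitioning and the default of two partitions are assumptions made for illustration, and the reduce step is assumed to sum the counts per word, as is usual for word count.

```python
from collections import defaultdict
from zlib import crc32


def read_records(text):
    """Stand-in for a RecordReader: yield one non-empty line at a time."""
    for line in text.splitlines():
        if line.strip():
            yield line


def map_record(line):
    """Map step: emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def partition_for(word, num_partitions):
    """Pick a partition from a stable hash so equal words always land together.

    Assumption for this sketch: CRC32 modulo the partition count.
    """
    return crc32(word.encode("utf-8")) % num_partitions


def word_count(text, num_partitions=2):
    # Shuffle: route each emitted pair to the partition chosen for its word.
    partitions = defaultdict(list)
    for line in read_records(text):
        for word, count in map_record(line):
            partitions[partition_for(word, num_partitions)].append((word, count))

    totals = {}
    for index in sorted(partitions):
        # Sort: order the pairs in this partition by word.
        pairs = sorted(partitions[index])
        # Reduce (assumed behaviour): sum the counts for each word.
        for word, count in pairs:
            totals[word] = totals.get(word, 0) + count
    return totals


if __name__ == "__main__":
    sample = "the quick brown fox\nthe lazy dog\nthe fox"
    print(word_count(sample))
```

Because every occurrence of a word hashes to the same partition, each partition could be reduced independently, which is what lets the real framework spread the work across machines. Changing `num_partitions` changes how the pairs are grouped but not the final counts.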
I found it very helpful. However, the differences are not very clear to me.