Consider the diagram below.
Take the sample WordCount example, where most of the words are repeated half a million times or more.
In that case, after the Mapper phase, each mapper's output will contain words numbering in the range of half a million.
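To make the size of the mapper output concrete, here is a minimal sketch of a WordCount-style map step in plain Python. It is not Hadoop code: `map_words` and the toy input are hypothetical names chosen for illustration, and the input is scaled down from half a million repetitions to a handful.

```python
from collections import Counter


def map_words(split_text):
    """Emit one (word, 1) pair per word occurrence, as a WordCount mapper would."""
    return [(word, 1) for word in split_text.split()]


# Toy input split: a few heavily repeated words (scaled down for illustration).
split_text = " ".join(["hadoop"] * 5 + ["mapper"] * 3)

pairs = map_words(split_text)

# The mapper emits one record per occurrence, so its output grows with repetition.
per_word = Counter(word for word, _ in pairs)
```

Because the map step emits one pair per occurrence, a word repeated half a million times produces half a million records in that mapper's output.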
While the data is being transferred from the Mapper to the Sort and Shuffle (S & S) phase, network bandwidth or connection issues can prevent one or more mapper outputs (the Mapper 2 output in the above case) from reaching Sort and Shuffle. If that happened with no way to recover, the complete MR job would fail.
To avoid this situation, Mapper output is always stored in the Local File System (LFS) until the MR job completes. If, because of any of the issues mentioned above, a mapper's output does not reach the Sort and Shuffle phase, the copy stored in the LFS is read and resent to the S & S phase.
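The idea of keeping a local copy and resending it can be sketched as a small simulation. This is plain Python, not the Hadoop implementation: `save_map_output`, `send_to_shuffle`, and `FlakyTransfer` are hypothetical names, and the "network" is just a callable that fails a fixed number of times.

```python
import json
import os
import tempfile


def save_map_output(local_dir, mapper_id, pairs):
    """Persist a mapper's output to the local file system and return the path."""
    path = os.path.join(local_dir, f"mapper_{mapper_id}.json")
    with open(path, "w") as f:
        json.dump(pairs, f)
    return path


def send_to_shuffle(path, transfer, max_attempts=3):
    """Deliver the stored output; on failure, reread the local copy and resend."""
    for attempt in range(1, max_attempts + 1):
        with open(path) as f:
            pairs = json.load(f)
        if transfer(pairs):
            return attempt
    raise RuntimeError("mapper output could not be delivered")


class FlakyTransfer:
    """Hypothetical network link that drops the first `failures` transfers."""

    def __init__(self, failures):
        self.failures = failures
        self.received = []

    def __call__(self, pairs):
        if self.failures > 0:
            self.failures -= 1
            return False  # simulated bandwidth or connection problem
        self.received = pairs
        return True


local_dir = tempfile.mkdtemp()
path = save_map_output(local_dir, 2, [["hadoop", 1], ["hadoop", 1]])

link = FlakyTransfer(failures=1)
attempts = send_to_shuffle(path, link)
```

Here the first transfer of the Mapper 2 output is dropped, and the second attempt succeeds because the output is still available on local disk and does not have to be recomputed.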
Points to remember
- This local storage step applies only to Mapper output in MapReduce.
- It is NOT used for storing Sort and Shuffle output or Reducer output.
- The Mapper output lives only until the job completes, i.e., once the job finishes, whether in success or failure, the local copies of the mapper output are removed automatically by the framework.
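The lifetime rule in the last point can be illustrated with a short sketch. This is an assumption-laden toy in plain Python, not how Hadoop performs its cleanup: `run_job` and the file name `mapper_1.out` are hypothetical, and a `try`/`finally` block stands in for the automatic cleanup.

```python
import os
import shutil
import tempfile


def run_job(reduce_step):
    """Run a toy job; the local map output is removed on success and on failure."""
    local_dir = tempfile.mkdtemp(prefix="map_output_")
    with open(os.path.join(local_dir, "mapper_1.out"), "w") as f:
        f.write("hadoop\t1\n")
    try:
        status = "SUCCEEDED" if reduce_step() else "FAILED"
    finally:
        # Cleanup happens regardless of how the job ended.
        shutil.rmtree(local_dir)
    return status, local_dir


ok_status, ok_dir = run_job(lambda: True)
failed_status, failed_dir = run_job(lambda: False)
```

In both runs the local directory no longer exists after the job ends, which mirrors the point that the mapper output is temporary and is not kept as a job result.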
Certification Note: Mapper output copies will always be stored in LFS.