Map Reduce in Hadoop

Posted on August 12, 2015 by admin, in Big Data, Business Intelligence

MapReduce is designed for processing large volumes of data in parallel.

It is an execution model in the Hadoop framework that is divided into two separate phases:

Mapper phase
Reducer phase

Mapper Phase: During this phase, the input data is divided into splits, which are analyzed by map tasks running in parallel across the Hadoop cluster. Each map task extracts the required output key and output value from its input and writes them to local disk.
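The mapper phase can be sketched in plain Python. This is only a simulation of the idea, not Hadoop code: the names make_splits and map_task, the fixed split size, and the word-count style (word, 1) output are assumptions chosen for illustration, and the map tasks run in a loop here rather than in parallel.

```python
from typing import Iterator, List, Tuple

def make_splits(lines: List[str], split_size: int) -> List[List[str]]:
    """Cut the input into fixed-size splits, one per map task (hypothetical helper)."""
    return [lines[i:i + split_size] for i in range(0, len(lines), split_size)]

def map_task(split: List[str]) -> Iterator[Tuple[str, int]]:
    """Hypothetical map function: emit an output key (word) and value (1) per word."""
    for line in split:
        for word in line.split():
            yield (word.lower(), 1)

lines = ["big data", "big cluster", "data node"]
splits = make_splits(lines, 2)

# On a cluster each split would go to its own map task; here they run one by one.
map_outputs = [list(map_task(split)) for split in splits]

for task_id, output in enumerate(map_outputs):
    print(task_id, output)
```

Each inner list stands in for the key/value pairs that one map task would write to its local disk.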

Reducer Phase: It has two responsibilities:

Grouping the data based on key
Aggregation
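The two reducer responsibilities can be illustrated with a small plain-Python sketch. It is not Hadoop code: group_by_key and reduce_sum are hypothetical names, and summing the values is just one possible aggregation.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

def group_by_key(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Responsibility 1: collect all values that share a key, ordered by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))

def reduce_sum(key: str, values: List[int]) -> Tuple[str, int]:
    """Responsibility 2: aggregate the grouped values (here, a simple sum)."""
    return (key, sum(values))

# Assumed mapper output: (word, 1) pairs from a word-count style job.
mapper_output = [("big", 1), ("data", 1), ("big", 1),
                 ("cluster", 1), ("data", 1), ("node", 1)]

grouped = group_by_key(mapper_output)
reduced = [reduce_sum(key, values) for key, values in grouped.items()]
print(reduced)
```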

Once the job's output has been returned, the intermediate mapper output on local disk is deleted immediately.

If a query or job does not need grouping or aggregation, we can skip the reducer, for example by setting the number of reduce tasks to zero. In that situation, the mapper output is the job's final output and is permanent.
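A map-only job can be sketched in plain Python as follows. This is a simulation under stated assumptions, not Hadoop code: run_job and filter_errors are hypothetical names, and the log lines are made-up sample input. In Hadoop's Java API the corresponding switch is Job.setNumReduceTasks(0).

```python
from collections import defaultdict

def run_job(records, mapper, reducer=None):
    """Run a toy job; with no reducer the mapper output is returned as the final output."""
    mapped = [pair for record in records for pair in mapper(record)]
    if reducer is None:
        # Map-only job: no shuffle, no sort, no aggregation.
        return mapped
    grouped = defaultdict(list)
    for key, value in mapped:
        grouped[key].append(value)
    return [reducer(key, grouped[key]) for key in sorted(grouped)]

def filter_errors(line):
    """Hypothetical mapper: keep only lines containing 'ERROR', keyed by timestamp."""
    if "ERROR" in line:
        yield (line.split()[0], line)

log = ["10:01 ERROR disk full", "10:02 INFO ok", "10:03 ERROR timeout"]
map_only_output = run_job(log, filter_errors)
print(map_only_output)
```

A filter like this needs no grouping or aggregation, so nothing is lost by leaving the reducer out.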

For both the mapper and the reducer, input and output must be key/value pairs.

Identity Mapper: It is like the identity function in mathematics.

The Identity Mapper takes the input key/value pair and emits it unchanged, without any processing.
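As a plain-Python analogy (not the Hadoop class itself), an identity mapper is a map function that emits exactly the pair it receives. The function name and the sample records below are assumptions for illustration.

```python
def identity_mapper(key, value):
    """Emit the input key/value pair unchanged."""
    yield (key, value)

# Assumed input records: (byte offset, line) pairs.
records = [(0, "big data"), (9, "big cluster")]
output = [pair for key, value in records for pair in identity_mapper(key, value)]
print(output == records)
```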

Identity Reducer:

With the Identity Reducer, the reduce step still takes place, and the related shuffling and sorting are still performed, but there is no aggregation.

So if we want to sort the data coming from the map phase, do not need any grouping or aggregation, and are fine with one output per reducer, we can use the Identity Reducer.
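The behaviour described above can be simulated in plain Python. This is a sketch, not Hadoop code: run_identity_reduce is a hypothetical helper, integer keys are assumed, and key % num_reducers stands in for the partitioner.

```python
from collections import defaultdict

def identity_reducer(key, values):
    """Emit every value unchanged under its key; no aggregation."""
    for value in values:
        yield (key, value)

def run_identity_reduce(pairs, num_reducers):
    """Partition by key, sort within each partition, then apply the identity reducer."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        # Assumed partitioner: integer key modulo the number of reducers.
        partitions[key % num_reducers][key].append(value)
    outputs = []
    for partition in partitions:
        out = []
        for key in sorted(partition):
            out.extend(identity_reducer(key, partition[key]))
        outputs.append(out)
    return outputs

mapper_output = [(5, "e"), (2, "b"), (4, "d"), (1, "a"), (2, "b2")]
reducer_outputs = run_identity_reduce(mapper_output, 2)
for reducer_id, out in enumerate(reducer_outputs):
    print(reducer_id, out)
```

Each reducer's output is sorted by key, but the outputs of different reducers are not merged into one globally sorted result, which is why this approach suits cases where multiple reducer outputs are acceptable.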

Combiner in Map Reduce:

A combiner is used as an optimization for a MapReduce job. The combiner function runs on the output of the map phase and acts as a filtering or aggregating step that reduces the amount of intermediate data passed to the reducer. In most cases, the reducer class is also set as the combiner class, which works when the reduce operation is commutative and associative (such as a sum or a maximum). The output of the combiner is the intermediate data that is passed to the reducer, whereas the output of the reducer is written to the output file on disk.
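The effect of a combiner can be seen in a small plain-Python simulation. It is a sketch under assumptions, not Hadoop code: aggregate and sum_values are hypothetical names, the two lists stand in for the outputs of two map tasks, and the same summing function is used as both combiner and reducer.

```python
from collections import defaultdict

def sum_values(key, values):
    """Sum is commutative and associative, so it can serve as combiner and reducer."""
    return (key, sum(values))

def aggregate(pairs, func):
    """Group pairs by key and apply func to each group, in key order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return [func(key, grouped[key]) for key in sorted(grouped)]

# Assumed outputs of two separate map tasks.
map_outputs = [
    [("big", 1), ("data", 1), ("big", 1), ("big", 1)],
    [("data", 1), ("data", 1), ("node", 1)],
]

# Combiner: runs on each map task's output before anything is sent to the reducer.
combined = [aggregate(output, sum_values) for output in map_outputs]

# Shuffle: the reducer receives the combined pairs from all map tasks.
shuffled = [pair for output in combined for pair in output]
final = aggregate(shuffled, sum_values)

pairs_without_combiner = sum(len(output) for output in map_outputs)
pairs_with_combiner = len(shuffled)
print(pairs_without_combiner, pairs_with_combiner, final)
```

The final result is the same with or without the combiner; only the number of intermediate pairs sent to the reducer changes.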

Thanks,

Rupam Bhardwaj
