Install Hadoop on Windows

Posted by admin, in Big Data

Introduction :

This post walks Hadoop users through installing Hadoop on Windows. Hadoop is usually built and run on Linux, so Windows installation is relatively new. The steps below cover downloading Hadoop and its prerequisites and installing YARN-based Hadoop 2.5 and above.

Prerequisites :

  • Oracle JDK versions 1.7 and 1.6 have been tested by the Hadoop developers and are known to work.
  • Make sure that JAVA_HOME is set in your environment and does not contain any spaces. If your default Java installation directory has spaces, you must use the Windows 8.3 path name instead, e.g. c:\Progra~1\Java\… instead of c:\Program Files\Java\….
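The JAVA_HOME requirement above is easy to check before going any further. The following is a minimal Python sketch, not part of Hadoop itself; the function name java_home_problems is made up for this example.

```python
import os


def java_home_problems(java_home):
    """Return a list of problems with a JAVA_HOME value (empty list means OK)."""
    problems = []
    if not java_home:
        problems.append("JAVA_HOME is not set")
    elif " " in java_home:
        problems.append("JAVA_HOME contains a space; use the 8.3 short path instead")
    return problems


# Check the value from the current environment and report anything suspicious.
for problem in java_home_problems(os.environ.get("JAVA_HOME")):
    print(problem)
```

If the script prints nothing, JAVA_HOME is set and free of spaces. It does not verify that the path actually points to a JDK.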

Downloading Hadoop sources :

Build and Copy Binary Packages :

  • To build the binary distribution from the Hadoop source tree, run the following from the command prompt: mvn package -Pdist,native-win -DskipTests -Dtar

Installation :

  • Pick a target directory for the installation. This post uses c:\Hadoop. Extract the tar.gz file (e.g. hadoop-2.5.0.tar.gz) under c:\Hadoop.

After installation, the folder structure looks like this in the command prompt:

[Image hadoop_1: folder structure under c:\Hadoop]

Starting a Single Node (pseudo-distributed) Cluster

Example HDFS Configuration

Before you can start the Hadoop Daemons you will need to make a few edits to configuration files. The configuration file templates will all be found in c:\Hadoop\etc\hadoop, assuming your installation directory is c:\Hadoop.

First edit the file hadoop-env.cmd to add the following lines near the end of the file.

[Image hadoop_2: lines to add to hadoop-env.cmd]

Edit or create the file core-site.xml and make sure it has the following configuration key:

[Image hadoop_3: core-site.xml configuration key]

Edit or create the file hdfs-site.xml and add the following configuration key:

[Image hadoop_4: hdfs-site.xml configuration key]

Finally, edit or create the file slaves and make sure it has the following entry: localhost

The default configuration puts the HDFS metadata and data files under \tmp on the current drive. In the above example this would be c:\tmp. For your first test setup you can just leave it at the default.
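Hadoop's *-site.xml files all share one simple layout: a configuration root element holding property elements, each with a name and a value. As an illustration, here is a small Python sketch that renders such a file from a dictionary. The helper name render_site_xml is made up for this example, and the property values shown (fs.defaultFS pointing at hdfs://localhost:9000, dfs.replication set to 1) are assumed values for a single-node test setup, not necessarily the ones used in this post.

```python
import xml.etree.ElementTree as ET


def render_site_xml(properties):
    """Build a Hadoop-style *-site.xml document from a dict of name -> value."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Assumed example values for illustration only.
core_site = render_site_xml({"fs.defaultFS": "hdfs://localhost:9000"})
hdfs_site = render_site_xml({"dfs.replication": 1})
print(core_site)
print(hdfs_site)
```

The output is a single unindented line per file, which Hadoop reads fine. Writing the strings to core-site.xml and hdfs-site.xml under c:\Hadoop\etc\hadoop is left to the reader.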

Example YARN Configuration :

Edit or create mapred-site.xml under %HADOOP_PREFIX%\etc\hadoop and add the following entries, replacing %USERNAME% with your Windows user name.

[Image hadoop_5: mapred-site.xml entries]
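Replacing %USERNAME% by hand is easy to get wrong when the placeholder appears more than once. The sketch below shows one way to do the substitution in Python. The template is hypothetical: the two property names in it (mapreduce.framework.name and mapreduce.job.user.name) are assumptions chosen for illustration and may not match the full set of entries this post intends.

```python
import os

# Hypothetical template; the property names are assumptions for illustration.
MAPRED_SITE_TEMPLATE = """<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.job.user.name</name>
    <value>%USERNAME%</value>
  </property>
</configuration>
"""


def fill_username(template, username):
    """Replace every literal %USERNAME% placeholder with the given user name."""
    return template.replace("%USERNAME%", username)


# On Windows the USERNAME environment variable holds the logged-in user name;
# the fallback string is only a stand-in for other systems.
print(fill_username(MAPRED_SITE_TEMPLATE, os.environ.get("USERNAME", "your-user-name")))
```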

Finally, edit or create yarn-site.xml and add the following entries:

[Image hadoop_6: yarn-site.xml entries]

Initialize Environment Variables

Run c:\Hadoop\etc\hadoop\hadoop-env.cmd to set up the environment variables that will be used by the startup scripts and the daemons.

Format the filesystem with the following command:

  • %HADOOP_PREFIX%\bin\hdfs namenode -format

Start HDFS Daemons

Run the following command to start the NameNode and DataNode on localhost.

  • %HADOOP_PREFIX%\sbin\start-dfs.cmd

Start YARN Daemons :

  • %HADOOP_PREFIX%\sbin\start-yarn.cmd
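Once both scripts have run, it helps to confirm that the daemons are actually up. The JDK ships a jps tool that lists running Java processes, one per line, as a process id followed by the main class name. The Python sketch below parses such output and reports which of the four expected daemons are missing. The function name missing_daemons and the sample output are made up for this example; it assumes the NameNode and DataNode from start-dfs.cmd plus the ResourceManager and NodeManager that start-yarn.cmd launches.

```python
EXPECTED_DAEMONS = {"NameNode", "DataNode", "ResourceManager", "NodeManager"}


def missing_daemons(jps_output):
    """Return the expected daemons that do not appear in `jps` output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) >= 2:  # lines look like "<pid> <main class name>"
            running.add(parts[1])
    return EXPECTED_DAEMONS - running


# Made-up sample; in practice, paste or capture the real output of `jps`.
sample_output = "4120 NameNode\n5316 DataNode\n6048 Jps\n"
print(sorted(missing_daemons(sample_output)))
```

With this sample, only the HDFS daemons are running, so the script reports the two YARN daemons as missing. An empty list means all four were found.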

 


Courtesy :

https://wiki.apache.org

http://hadoop.apache.org/

