Introduction :
This blog enables Hadoop users to install Hadoop on windows. As Hadoop is usually built and run on LINUX, windows installation in relatively new. The following Blog contains steps to download Hadoop and its prerequisites, install YARN based Hadoop 2.5 and above.
Prerequisites :
- Oracle JDK versions 1.7 and 1.6 have been tested by the Hadoop developers and are known to work.
- Make sure that JAVA_HOME is set in your environment and does not contain any spaces. If your default Java installation directory has spaces then you must use the Windows 8.3 Path name instead e.g. c:\Progra~1\Java\… instead of c:\Program Files\Java\….
Downloading Hadoop sources :
-
From the ASF Hadoop download page or a mirror.
-
Subversion URL: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2.5
Build and Copy Binary Packages :
- Command to install binary package directly from command prompt “mvn package -Pdist,native-win -DskipTests -Dtar”.
Installation :
- Pick Target Directory for installation. Here target directory used is c:\Hadoop, and Extract the tar.gz file (e.g.hadoop-2.5.0.tar.gz) under c:\Hadoop.
After installing the folder structure would look like this in command prompt.
Starting a Single Node (pseudo-distributed) Cluster
Example HDFS Configuration
Before you can start the Hadoop Daemons you will need to make a few edits to configuration files. The configuration file templates will all be found in c:\Hadoop\etc\hadoop, assuming your installation directory is c:\Hadoop.
First edit the file hadoop-env.cmd to add the following lines near the end of the file.
Edit or create the file core-site.xml and make sure it has the following configuration key:
Edit or create the file hdfs-site.xml and add the following configuration key:
Finally, edit or create the file slaves and make sure it has the following entry :– localhost
The default configuration puts the HDFS metadata and data files under \tmp on the current drive. In the above example this would be c:\tmp. For your first test setup you can just leave it at the default.
Example YARN Configuration :
Edit or create mapred-site.xml under %HADOOP_PREFIX%\etc\hadoop and add the following entries, replacing %USERNAME% with your Windows user name.
Finally, edit or create yarn-site.xml and add the following entries:
Initialize Environment Variables
Run c:\Hadoop\etc\hadoop\hadoop-env.cmd to setup environment variables that will be used by the startup scripts and the daemons.
Format the filesystem with the following command:
- %HADOOP_PREFIX%\bin\hdfs namenode -format
Start HDFS Daemons
Run the following command to start the NameNode and DataNode on localhost.
- %HADOOP_PREFIX%\sbin\start-dfs.cmd
Start YARN Daemons :
- %HADOOP_PREFIX%\sbin\start-yarn.cmd
Courtesy :
https://wiki.apache.org
http://hadoop.apache.org/
Best Open Source Business Intelligence Software Helical Insight is Here
Great blog.. Installation procedure are very clear and step by step so easy to understand..
After reading this blog i very strong in this topics and this blog really helpful to all… explanation are very clear so very easy to understand… thanks a lot for sharing this blog