How to install Hadoop on Ubuntu in Stand-Alone Mode
Before looking at how to install Hadoop on Ubuntu in stand-alone mode, let’s first go over a short introduction to Hadoop and its types of installation mode.
Introduction to Hadoop
- Is open-source software for handling and processing data in a reliable, scalable, distributed way.
- Is a framework that allows for the distributed processing of large datasets (petabytes to exabytes in size) across clusters of commodity servers using simple programming models.
- Is designed to scale up easily from a single server to many servers, each offering local computation and storage.
The Different Modes of Hadoop Installation
In this article, we are going to learn how to install Hadoop on Ubuntu in stand-alone mode, but Apache Hadoop supports 3 different installation modes, listed below.
- Local Mode or Stand-Alone Mode: Used for learning and testing purposes.
- Pseudo-Distributed Mode: Single-node server where the name node and data node reside on the same machine.
- Fully Distributed Mode: Multi-node installation used in production.
- Prerequisites for Hadoop Installation in Standalone mode
Make sure that you are working as a non-root user with sudo privileges.
- Update package list on Ubuntu
First, update the package list on Ubuntu.
$ sudo apt-get update
- Install Java
Hadoop is a Java-based open-source framework, so first we need to install Java on the system.
$ sudo apt-get install default-jdk
$ java -version
Note:- The java -version command shows details about the installed Java version.
- Install Apache Hadoop
First, go to the Hadoop releases page and select a Hadoop version. I am going to install Hadoop version 2.9.1.
Click on “binary” under the “Tarball” section. You can click on this link directly.
Note:- If you prefer to use the wget command in the Ubuntu terminal, you can download the file directly:
$ wget http://www-us.apache.org/dist/hadoop/common/hadoop-2.9.1/hadoop-2.9.1.tar.gz
- Download SHA-256 file
Once you have completed step 4, click on “Checksum file” next to version “2.9.1”, the version we downloaded in step 4.
Note:- If you prefer to use the wget command in the Ubuntu terminal, you can download the file directly:
$ wget https://dist.apache.org/repos/dist/release/hadoop/common/hadoop-2.9.1/hadoop-2.9.1.tar.gz.mds
- Verify the SHA-256 file
For verification, run the command below:
$ shasum -a 256 hadoop-2.9.1.tar.gz
Then compare its output with the SHA-256 value in the checksum file:
$ cat hadoop-2.9.1.tar.gz.mds
If both values are the same, you have the correct file.
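The comparison can also be scripted instead of being checked by eye. The following is a minimal sketch, not part of the original steps: verify_sha256 is a hypothetical helper name, it uses sha256sum from GNU coreutils (which should give the same digest as shasum -a 256), and it assumes the expected digest might be written in uppercase or with spaces between groups, so it normalizes both before comparing.

```shell
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
# Usage: verify_sha256 <file> "<expected digest>"
verify_sha256() {
    file="$1"
    # Remove all whitespace and lowercase the expected digest, so that
    # values written as spaced, uppercase groups still compare equal.
    expected=$(printf '%s' "$2" | tr -d '[:space:]' | tr '[:upper:]' '[:lower:]')
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: checksum matches"
        return 0
    fi
    echo "MISMATCH: expected $expected, got $actual" >&2
    return 1
}
```

To use it, pass the downloaded tarball and the SHA-256 value copied from the checksum file, for example verify_sha256 hadoop-2.9.1.tar.gz "<value copied from the .mds file>". A non-zero exit status means the two digests differ.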
- Extract Hadoop tar file
Now extract the Hadoop tar file that you downloaded in step 4.
$ tar -xzvf hadoop-2.9.1.tar.gz
- Move the extracted directory to a local directory
$ sudo mv hadoop-2.9.1 /usr/local/hadoop
Here I am moving the extracted Hadoop directory into /usr/local and naming it “hadoop” so that it is easy to identify.
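Before going further, it can help to confirm that the move produced the layout the remaining steps rely on. The sketch below is only an illustration: check_hadoop_layout is a hypothetical helper name, and it checks just the two paths this tutorial uses (bin/hadoop and etc/hadoop/hadoop-env.sh), not a complete installation.

```shell
# Hypothetical helper: check that a Hadoop directory contains the files
# this tutorial uses. Defaults to /usr/local/hadoop when no argument is given.
check_hadoop_layout() {
    dir="${1:-/usr/local/hadoop}"
    status=0
    # The launcher script must exist and be executable.
    if [ ! -x "$dir/bin/hadoop" ]; then
        echo "missing or not executable: $dir/bin/hadoop" >&2
        status=1
    fi
    # The environment file is where JAVA_HOME is set.
    if [ ! -f "$dir/etc/hadoop/hadoop-env.sh" ]; then
        echo "missing: $dir/etc/hadoop/hadoop-env.sh" >&2
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "layout looks OK: $dir"
    fi
    return "$status"
}
```

Running check_hadoop_layout with no argument checks /usr/local/hadoop; any message about a missing file suggests the archive was extracted or moved to a different place.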
- Set JAVA_HOME location for Hadoop
Open the “hadoop-env.sh” file inside the “etc/hadoop” directory:
$ sudo vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh
Then set JAVA_HOME to your Java installation path, just below the #export JAVA_HOME line in the “hadoop-env.sh” file.
- Save and exit the hadoop-env.sh file.
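The value to use for JAVA_HOME depends on where Java was installed on your machine. As a hedged sketch (the helper name java_home_from_bin is hypothetical, and it assumes GNU readlink -f as shipped with Ubuntu), one way to find a candidate is to resolve the symlinks behind the java command and strip the trailing bin/java:

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from a java binary path.
# Usage: java_home_from_bin /path/to/java
java_home_from_bin() {
    # Follow symlinks to the real binary
    # (on Ubuntu, /usr/bin/java usually points through /etc/alternatives).
    real=$(readlink -f "$1") || return 1
    # Drop the trailing /bin/java to get the installation root.
    dirname "$(dirname "$real")"
}
```

For example, java_home_from_bin "$(command -v java)" prints a directory that could serve as the value of an export JAVA_HOME=... line in hadoop-env.sh. Check that the printed directory exists before relying on it.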
- Run Hadoop
Now we are ready to start Hadoop.
$ /usr/local/hadoop/bin/hadoop version
This command shows the installed Hadoop version.
In this tutorial, we learned how to install Hadoop on Ubuntu in stand-alone mode. Please comment below about your experience with this article, and let me know if any improvements are needed. Please check my website for upcoming tutorials on big data and Hadoop.