
Install Apache Hadoop on Ubuntu on Single Cloud Server Instance

By Abhishek Ghosh, January 21, 2017

Previously, we talked about the Apache Hadoop framework. Here is how to install Apache Hadoop on Ubuntu on a single cloud server instance in stand-alone mode, with minimum system requirements and the needed commands. Apache Hadoop is designed to run on standard dedicated hardware that provides the best balance of performance and economy for a given workload.

 

Where Will I Install Apache Hadoop?

 

For a cluster, two quad-core or hexa-core (or better) CPUs running at at least 2 GHz, with 64 GB of RAM, are expected. We are installing a single-node cluster, for which a minimum of 6-8 GB of RAM on a virtual instance is practical. You can try a VPSDime 6 GB OpenVZ instance at $7/month; however, Hadoop is written in Java, and OpenVZ is not exactly great for running Java applications: the host can kick you out if you whip their machine into a high load average. If you want VMware, then Aruba Cloud is cost-effective and great. You can do testing and learning work on OpenVZ, but it is not practical to run high-load work on it.

 

Steps To Install Apache Hadoop on Ubuntu on Single Cloud Server Instance

 

We will install a single-node Hadoop cluster on Ubuntu 16.04 LTS. First, prepare the system :


cd ~
apt update
apt upgrade
apt install default-jdk

OpenJDK is the default Java Development Kit on Ubuntu 16.04. Now check the Java version :

java -version

Sample output :

openjdk version "1.8.0_91"
OpenJDK Runtime Environment (build 1.8.0_91-8u91-b14-3ubuntu1~16.04.1-b14)
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)
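If the output shows some other version, you can first list the Java installations known to the system before proceeding; update-alternatives and the /usr/lib/jvm directory are standard on Ubuntu :

update-alternatives --list java
ls /usr/lib/jvm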

We will create a group named hadoop and add a user named hduser :

sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser
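To confirm that the group and user were created as expected, a quick check (the numeric uid/gid values will vary) :

id hduser
# expected output mentions the hadoop group, e.g. uid=...(hduser) gid=...(hadoop) groups=...(hadoop)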

Next we will install extra software, switch to hduser, generate a key, and set up passwordless SSH for hduser on localhost. The last command, which gives hduser sudo rights, must be run from a sudo-capable account after exiting back :

apt install ssh rsync
su hduser
ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
ssh localhost
exit
sudo adduser hduser sudo
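Before moving on, it is worth confirming that passwordless SSH really works. As hduser, run the check below; BatchMode=yes makes ssh fail instead of prompting for a password, so printing OK means the key setup is correct :

ssh -o BatchMode=yes localhost 'echo OK'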

Here are the releases of Apache Hadoop :

http://hadoop.apache.org/releases.html
https://dist.apache.org/repos/dist/release/hadoop/common/

Apache Hadoop 2.7.3 is the latest stable release at the time of publishing this guide. We will do these steps :

wget https://dist.apache.org/repos/dist/release/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
tar xvzf hadoop-2.7.3.tar.gz
rm hadoop-2.7.3.tar.gz
cd hadoop-2.7.3
sudo mkdir -p /usr/local/hadoop
sudo mv * /usr/local/hadoop
sudo chown -R hduser:hadoop /usr/local/hadoop
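A quick sanity check that the files landed in the right place with the right ownership :

ls -ld /usr/local/hadoop
ls /usr/local/hadoop/bin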

/usr/bin/java is a symlink to /etc/alternatives/java, which in turn is a symlink to the default Java binary. We need the correct value for JAVA_HOME :

readlink -f /usr/bin/java | sed "s:bin/java::"

If the output is :

/usr/lib/jvm/java-8-openjdk-amd64/jre/

then we should open :

nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh

and adjust :

/usr/local/hadoop/etc/hadoop/hadoop-env.sh
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/

Now if we run :

/usr/local/hadoop/bin/hadoop

We will get output like :

Usage: hadoop [--config confdir] [COMMAND | CLASSNAME]
  CLASSNAME            run the class named CLASSNAME

Up to this step is the minimum, basic setup of Apache Hadoop on an Ubuntu single cloud server instance. It means Hadoop is ready to be configured.

 

Configuring Apache Hadoop

 

We need to modify the following files to get a complete Apache Hadoop setup:

~/.bashrc
/usr/local/hadoop/etc/hadoop/hadoop-env.sh
/usr/local/hadoop/etc/hadoop/core-site.xml
/usr/local/hadoop/etc/hadoop/mapred-site.xml.template
/usr/local/hadoop/etc/hadoop/hdfs-site.xml

Run :

update-alternatives --config java
nano ~/.bashrc

Add these :

~/.bashrc
#HADOOP START
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP END
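After saving the file, reload the shell configuration and confirm the new variables are in effect; with the PATH change, which hadoop should resolve to /usr/local/hadoop/bin/hadoop :

source ~/.bashrc
echo $HADOOP_INSTALL
which hadoop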

Now run :

javac -version
which javac
readlink -f /usr/bin/javac

Note the values; /usr/bin/javac is from the output of the which javac command. Run :

nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh

Modify :

/usr/local/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

The above is from previous outputs. Do not blindly copy-paste. Save the file. Now do these :

sudo mkdir -p /app/hadoop/tmp
sudo chown hduser:hadoop /app/hadoop/tmp
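Verify the ownership of the new directory before continuing; it should show hduser and hadoop :

ls -ld /app/hadoop/tmp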

Open :

nano /usr/local/hadoop/etc/hadoop/core-site.xml

Modify :

/usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
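Once core-site.xml is saved, Hadoop itself can confirm the value is being picked up. A small check; note that fs.default.name is deprecated in Hadoop 2.x in favour of fs.defaultFS, though the old key still works and maps to the new one. This should print hdfs://localhost:54310 :

/usr/local/hadoop/bin/hdfs getconf -confKey fs.defaultFS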

Run :

cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
nano /usr/local/hadoop/etc/hadoop/mapred-site.xml

Modify :

/usr/local/hadoop/etc/hadoop/mapred-site.xml
<configuration>
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>
</configuration>

Run :

sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
sudo chown -R hduser:hadoop /usr/local/hadoop_store
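Both storage directories should now exist and belong to hduser :

ls -lR /usr/local/hadoop_store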

Open :

nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml

Modify :

/usr/local/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>
<property>
   <name>dfs.namenode.name.dir</name>
   <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
   <name>dfs.datanode.data.dir</name>
   <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>
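A malformed XML file is a common cause of startup failures. If you want a quick syntax check of all three files, xmllint can do it (assuming libxml2-utils is installed; apt install libxml2-utils otherwise). No output means the file is well-formed :

xmllint --noout /usr/local/hadoop/etc/hadoop/core-site.xml
xmllint --noout /usr/local/hadoop/etc/hadoop/mapred-site.xml
xmllint --noout /usr/local/hadoop/etc/hadoop/hdfs-site.xml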

Try to run :

cd ~
hadoop namenode -format

The above command must be executed before we start using Hadoop. These commands are basically meant for a real physical server. You can read this guide :

https://wiki.apache.org/hadoop/Virtual%20Hadoop

The last command can fail under a given host-virtualisation technology. For that reason, in the last step we will show how to use the bundled MapReduce example program; if the above fails, you can still use Hadoop that way. As a new user with a limited budget, we tried to emulate a physical server for learning, plus offer a universally working example.


Now, from a fresh SSH session, switch to hduser and start the daemons :

sudo su hduser
cd /usr/local/hadoop/sbin && ls
start-all.sh
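After start-all.sh finishes, the running Java daemons can be listed with jps (a tool shipped with the JDK). On a healthy single-node setup, the list typically includes NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager :

jps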

Once the daemons are running, on localhost you can browse to :

http://localhost:50070/

Change localhost to the server's fully qualified domain name to really see it remotely. We have successfully configured Hadoop to run in stand-alone mode. We will now run the example MapReduce program. Run :

mkdir ~/input
cp /usr/local/hadoop/etc/hadoop/*.xml ~/input
/usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar grep ~/input ~/grep_example 'principal[.]*'
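When the job completes, the matches land in the output directory given above; part-r-00000 is the conventional name of the reduce output file, alongside an empty _SUCCESS marker :

cat ~/grep_example/*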

It is not possible to cover more in this guide; you may read further here :

https://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html
