Configuring Hadoop Services Using Ansible Playbook

We have been given two tasks by Vimal Sir: to automate and manage Hadoop using Ansible, and to use an Ansible playbook to manage the HTTPD service efficiently.

Task Description 📄

🔰 11.1 Configure Hadoop and start cluster services using Ansible Playbook

🔰 11.3 Restarting the HTTPD service is not idempotent in nature and also consumes more resources. Suggest a way to rectify this challenge in an Ansible playbook.

11.1:- Configure Hadoop and start cluster services using Ansible Playbook

Run this command in your VM to install Ansible.

pip3 install ansible

Now create an inventory file with any name. In my case, I created a file named /etc/myhosts.txt and wrote in it the IP addresses of the other virtual machines (the VMs on which you want to configure and set up the Hadoop namenode and datanode), along with other details such as the user (root) and the password.

Now check the Ansible version by typing:
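For reference, an inventory with the two groups that the playbook in this section targets (namenode and datanode) could look like the sketch below. It uses Ansible's YAML inventory format, which is an assumption on my part: the YAML inventory plugin by default only accepts files ending in .yml, .yaml, or .json, so the sketch assumes a hypothetical file name such as /etc/myhosts.yml instead of /etc/myhosts.txt. The datanode IP and the password are placeholders, and the namenode IP is assumed to be the same as name_ip in the var file.

```yaml
# Hypothetical inventory sketch (for example /etc/myhosts.yml); adjust to your VMs.
all:
  children:
    namenode:
      hosts:
        192.168.43.102:            # assumed namenode IP (matches name_ip in var.yml)
    datanode:
      hosts:
        192.168.43.103:            # placeholder datanode IP, not from the original setup
  vars:
    ansible_user: root
    ansible_password: "CHANGE_ME"  # placeholder; password login also needs sshpass on the controller
    ansible_connection: ssh
```

Keeping the connection details under all.vars avoids repeating them for every host; per-host values can still override them.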

ansible --version

According to the above image, Ansible reads its configuration from the /etc/ansible/ansible.cfg file, so configure this file to point to the inventory file created above.

List all the hosts by typing ansible all --list-hosts.

Ping the hosts to check whether there is SSH connectivity between the virtual machines.
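The same check that the ad hoc command ansible all -m ping performs can also be written as a tiny playbook. This is only a sketch; the file name ping.yml is hypothetical.

```yaml
# ping.yml (hypothetical name): verifies SSH login and a usable Python on every host.
- hosts: all
  gather_facts: false        # skip fact gathering, only connectivity matters here
  tasks:
    - name: Check connectivity to each managed host
      ping:
```

A host that answers with "pong" is reachable over SSH and ready for the playbooks that follow.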

Now I am ready with my playbook code.

- hosts: namenode
  vars_files:
    - var.yml
  tasks:
    - name: Copy Java Software
      copy:
        src: "/root/jdk-8u171-linux-x64.rpm"
        dest: "/root/"
    - name: Copy Hadoop Software
      copy:
        src: "/root/hadoop-1.2.1-1.x86_64.rpm"
        dest: "/root/"
    - name: Install Java Software
      shell: "rpm -i /root/jdk-8u171-linux-x64.rpm"
      register: java_install
    - name: java install information
      debug:
        var: java_install
    # Hadoop is installed only if the Java installation succeeded
    - name: Install Hadoop Software
      shell: "rpm -i /root/hadoop-1.2.1-1.x86_64.rpm --force"
      register: hadoop_install
      when: java_install.rc == 0
    - name: hadoop install information
      debug:
        var: hadoop_install
    - name: Create Directory
      file:
        state: directory
        path: "{{ name_dir }}"
    - name: Copy hdfs-site.xml file
      template:
        src: "n_hdfs-site.xml"
        dest: "/etc/hadoop/hdfs-site.xml"
    - name: Copy core-site.xml file
      template:
        src: "n_core-site.xml"
        dest: "/etc/hadoop/core-site.xml"
    - name: Format the namenode directory
      shell: "echo Y | hadoop namenode -format"
    - name: Start Namenode Service
      shell: "hadoop-daemon.sh start namenode"

- hosts: datanode
  vars_files:
    - var.yml
  tasks:
    - name: Copy Java Software
      copy:
        src: "/root/jdk-8u171-linux-x64.rpm"
        dest: "/root/"
    - name: Copy Hadoop Software
      copy:
        src: "/root/hadoop-1.2.1-1.x86_64.rpm"
        dest: "/root/"
    - name: Install Java Software
      shell: "rpm -i /root/jdk-8u171-linux-x64.rpm"
      register: java_install
    - name: java install information
      debug:
        var: java_install
    # Hadoop is installed only if the Java installation succeeded
    - name: Install Hadoop Software
      shell: "rpm -i /root/hadoop-1.2.1-1.x86_64.rpm --force"
      register: hadoop_install
      when: java_install.rc == 0
    - name: hadoop install information
      debug:
        var: hadoop_install
    - name: Create Directory
      file:
        state: directory
        path: "{{ data_dir }}"
    - name: Copy hdfs-site.xml file
      template:
        src: "d_hdfs-site.xml"
        dest: "/etc/hadoop/hdfs-site.xml"
    - name: Copy core-site.xml file
      template:
        src: "d_core-site.xml"
        dest: "/etc/hadoop/core-site.xml"
    - name: Start Datanode Service
      shell: "hadoop-daemon.sh start datanode"

And here is my var file, where I store the variables.

name_ip: 192.168.43.102
name_port: 9001
name_dir: /nn8
data_dir: /dn8

Now check the syntax of the main playbook with ansible-playbook --syntax-check hadoop.yml, and after that run the playbook by typing ansible-playbook hadoop.yml. It will give output like this.

ansible-playbook --syntax-check hadoop.yml
ansible-playbook hadoop.yml

Now I check on the Namenode virtual machine whether everything went well.

In the above image, you can see that at first Java and Hadoop are not installed and the jps command does not work, but after running the playbook everything is configured.

In the above image, you can see that the /etc/hadoop/hdfs-site.xml and /etc/hadoop/core-site.xml files are configured after running the playbook.

Now I check on the Datanode virtual machine whether everything went well.

In the above image, you can see that at first Java and Hadoop are not installed and the jps command does not work, but after running the playbook everything is configured.

In the above image, you can see that the /etc/hadoop/hdfs-site.xml and /etc/hadoop/core-site.xml files are configured after running the playbook.

You can check the report of the Hadoop cluster by typing hadoop dfsadmin -report.

hadoop dfsadmin -report
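If you prefer to collect the report from the controller node instead of logging in to the namenode, the same command can be wrapped in a short play. This is a sketch under the assumption that the hadoop command is on the PATH of the remote user; the variable name dfs_report is hypothetical.

```yaml
# Sketch: run the cluster report on the namenode and print it on the controller.
- hosts: namenode
  tasks:
    - name: Collect the HDFS cluster report
      command: hadoop dfsadmin -report
      register: dfs_report
      changed_when: false      # read-only command, so never report a change

    - name: Show the report
      debug:
        var: dfs_report.stdout_lines
```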

Hadoop setup completed.

11.3:- Restarting the HTTPD service is not idempotent in nature and also consumes more resources. Suggest a way to rectify this challenge in an Ansible playbook

Ping the hosts to check whether there is SSH connectivity between the virtual machines.

Now I am ready with my playbook code.

---
- hosts: all
  vars_files:
    - var1.yml
  tasks:
    - name: "Create directory for dvd mount"
      file:
        state: directory
        path: "{{ dvd_dir }}"
    - name: "Mount the dvd to the directory"
      mount:
        src: "/dev/cdrom"
        path: "{{ dvd_dir }}"
        state: mounted
        fstype: "iso9660"
    - name: "Configure AppStream for yum"
      yum_repository:
        baseurl: "{{ dvd_dir }}/AppStream"
        name: "dvd1"
        description: "dvd1 for AppStream packages"
        gpgcheck: no
    - name: "Configure BaseOS for yum"
      yum_repository:
        baseurl: "{{ dvd_dir }}/BaseOS"
        name: "dvd2"
        description: "dvd2 for BaseOS packages"
        gpgcheck: no
    - name: "Install package"
      package:
        name: "httpd"
        state: present
      register: x
    - name: "Create directory for web server"
      file:
        state: directory
        path: "{{ doc_root }}"
      register: y
    # A change in the configuration file notifies the restart handler
    - name: "Copy the configuration file"
      template:
        dest: "/etc/httpd/conf.d/lw.conf"
        src: "lw.conf"
      when: x.rc == 0
      notify:
        - Start service
    - name: "Copy the web page"
      copy:
        dest: "{{ doc_root }}/index.html"
        content: "this is neeew web page\n"
      when: y.failed == false
    - name: "start httpd service"
      service:
        name: "httpd"
        state: started
    - name: "Create firewall rule"
      firewalld:
        port: "{{ http_port }}/tcp"
        state: enabled
        permanent: yes
        immediate: yes
  handlers:
    # Runs only when a notifying task reports a change
    - name: Start service
      service:
        name: "httpd"
        state: restarted

And here is my var file, where I store the variables.

doc_root: "/var/www/arya"
dvd_dir: "/dvd5"
http_port: 8082

Now check the syntax of the main playbook with ansible-playbook --syntax-check hadoop.yml, and after that run the playbook by typing ansible-playbook hadoop.yml. It will give output like this.

ansible-playbook --syntax-check hadoop.yml
ansible-playbook hadoop.yml

Now you can check on the virtual machine with IP 192.168.43.131, where I want to deploy the web server.

Now you can check from the browser whether the web server is running.

If you run the playbook again, the output shows that the service is already started, so there is no need to restart it. This is possible because of the handlers and notify keywords in Ansible: the restart handler runs only when the task that notifies it reports a change.
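Stripped down to its core, the pattern looks like the sketch below. It assumes the same lw.conf template as my playbook; the handler name "Restart httpd" is hypothetical and chosen only to make its purpose obvious.

```yaml
# Sketch of the notify/handler pattern on its own.
- hosts: all
  tasks:
    - name: Copy the configuration file
      template:
        src: lw.conf
        dest: /etc/httpd/conf.d/lw.conf
      notify: Restart httpd        # queued only when this task reports "changed"

    - name: Make sure httpd is running
      service:
        name: httpd
        state: started             # does nothing if the service is already up

  handlers:
    - name: Restart httpd          # hypothetical handler name
      service:
        name: httpd
        state: restarted
```

Even if several tasks notify the same handler, it runs only once, at the end of the play, so the service is restarted at most one time per run and only when the configuration actually changed.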

Now I change my var file where I store the variables.

doc_root: "/var/www/harsh"
dvd_dir: "/dvd5"
http_port: 8083

Now I run my playbook again with new variables.

Now you can check on the virtual machine with IP 192.168.43.131, where I want to deploy the web server.

You can check the final output from the browser by visiting both port numbers, 8082 as well as 8083.

Arya Dhorajiya
