This article deals with integrating LVM (Logical Volume Management) with a Hadoop cluster. LVM provides elastic storage by optimally managing the disks attached to the cluster, so a partition's size is no longer a hard limit.

What is a Hadoop Cluster?

A Hadoop cluster is a collection of computers, known as nodes, that are networked together to perform parallel computations on big data sets.

Storage in Hadoop?

Is it possible to increase or distribute the storage a DataNode contributes to the Hadoop cluster?

What is LVM?

LVM is a tool for logical volume management which includes allocating disks, striping, mirroring, and resizing logical volumes. With LVM, a hard drive or set of hard drives is allocated to one or more physical volumes. LVM physical volumes can be placed on other block devices which might span two or more disks.

The physical volumes are combined into volume groups, with the exception of the /boot partition. The /boot partition cannot be on a logical volume because the boot loader cannot read it. If the root (/) partition is on a logical volume, create a separate /boot partition which is not part of a volume group.

Since a physical volume cannot span multiple drives, to span more than one drive, create one or more physical volumes per drive and combine them in a volume group.

The volume groups can be divided into logical volumes, which are assigned mount points, such as /home and /, and file system types, such as ext2 or ext3. When "partitions" reach their full capacity, free space from the volume group can be added to the logical volume to increase the size of the partition. When a new hard drive is added to the system, it can also be added to the volume group, so the logical volumes can grow further.

Creating LVM

Step 1: Create and attach external storage.

Step 2: Convert the storage into a physical volume. Since only physical volumes can contribute to a volume group, it is necessary to convert the storage into a PV first.

Step 3: Create a volume group and add the previously created PV into it.

Step 4: This volume group is the new storage. To use any storage we need to format it, create partitions in it, and mount it.

The partitions created this way are known as logical volumes, or LVs. Unlike static partitions, you can create as many LVs as you want, limited only by the capacity of the volume group, and resize them up or down on demand. Hence the name Logical Volume Management.

Integrating LVM with Hadoop!

Use the fdisk -l command to check the attached hard disks.
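For example, to list every disk the system can see:

fdisk -l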

Converting into a physical volume

Convert the attached storage into physical volumes using:

pvcreate /dev/sdb

pvcreate /dev/sdc

You can check the created PVs using the pvdisplay command; notice that they are not allocated to any volume group yet. To allocate them, create a volume group.
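For example, to inspect the two physical volumes created above:

pvdisplay /dev/sdb /dev/sdc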

Create a volume group

Use the command vgcreate <vgname> <path_to_physical_volume(s)>
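For example, using the disks prepared above (vgname is just a placeholder, matching the later commands in this article; any name works):

vgcreate vgname /dev/sdb /dev/sdc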

Now, running pvdisplay again, you can see the previously created physical volumes allocated to the VG.

Creating partitions

Create the LV with the command lvcreate --size <value> --name <lvname> <vgname>
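For example, carving a 10 GiB volume (the size is arbitrary) out of the volume group created above:

lvcreate --size 10G --name lvname vgname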

To use this LV we need to format it using the mkfs command. We will then mount the formatted LV on the folder created on the DataNode to share the storage.

Formatting and mounting

mkfs.ext4 /dev/vgname/lvname
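The heading also promises mounting; assuming the DataNode directory is the /datanode folder used later in this article, the mount step would be:

mount /dev/vgname/lvname /datanode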

Now, on the DataNode, change the configured storage directory to this folder. The DataNode then contributes the storage created by LVM.
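As a sketch, the DataNode directory is set in hdfs-site.xml; the property is dfs.data.dir on Hadoop 1.x (which matches the dfsadmin command below) and dfs.datanode.data.dir on Hadoop 2.x and later:

<property>
    <name>dfs.data.dir</name>
    <value>/datanode</value>
</property>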

The storage can be checked by running the Hadoop services on the DataNode and requesting a cluster report:

hadoop dfsadmin -report
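On newer Hadoop versions the equivalent command is:

hdfs dfsadmin -report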

Increasing the size of storage

lvextend --size <value> /dev/vgname/lvname
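For example, to grow the volume by an additional 5 GiB (the leading + means "extend by this much", while omitting it sets an absolute size):

lvextend --size +5G /dev/vgname/lvname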

The logical volume has been resized successfully, but the file system inside it must also be grown before the DataNode (mounted on the /datanode folder) can use the new space. ext4 supports online growing, so run the command below without unmounting.

resize2fs /dev/vgname/lvname
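You can verify the extra space with, for example:

df -h /datanode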

Are you reading? Cause I am writing :)