Use GlusterFS to replicate data inside LXC containers

When you're working with LXC containers on VPS or dedicated servers, you might have wondered how you could have HA containers and how to deal with shared data or storage. One possible solution is to use local storage on both ends and have those replicate to each other. In the following example, we are dealing with 2 LXC containers, each with its own storage. We'll use GlusterFS to replicate the data.

Install GlusterFS

Install GlusterFS on both "servers". If you want a more recent version of GlusterFS, you can follow the steps below; otherwise, a simple 'apt-get install glusterfs-server' will get you there:

wget -O - https://download.gluster.org/pub/gluster/glusterfs/3.12/rsa.pub | apt-key add -

DEBID=$(grep 'VERSION_ID=' /etc/os-release | cut -d '=' -f 2 | tr -d '"')
DEBVER=$(grep 'VERSION=' /etc/os-release | grep -Eo '[a-z]+')
echo "deb https://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/${DEBID}/apt ${DEBVER} main" > /etc/apt/sources.list.d/gluster.list

apt-get install apt-transport-https
apt-get update
apt-get install glusterfs-server

Check the version:

glusterfsd --version

Create volumes

Create the storage on both machines. In my case, I created an LVM partition for each machine. As this is a test environment with 2 VMs, both volumes are created from the same volume group; in a production environment, they would live on different servers:

lvcreate -L 2G -n gldata1 vg
Logical volume "gldata1" created

lvcreate -L 2G -n gldata2 vg
Logical volume "gldata2" created

Create a filesystem on both:

mkfs.ext3 /dev/vg/gldata1
mkfs.ext3 /dev/vg/gldata2
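Side note: ext3 is fine for a test setup like this, but the upstream Gluster documentation recommends XFS with 512-byte inodes for brick filesystems. If you prefer to follow that:

```shell
# Alternative to ext3: XFS with 512-byte inodes, the brick filesystem
# recommended by the upstream Gluster documentation.
mkfs.xfs -i size=512 /dev/vg/gldata1
mkfs.xfs -i size=512 /dev/vg/gldata2
```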

In this example, we first need to mount the volumes locally on the host machine and then pass them through to the containers, where they appear as local storage. In a production environment this will probably be different:

mount /dev/vg/gldata1 /srv/gldata1
mount /dev/vg/gldata2 /srv/gldata2
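To make these host-side mounts survive a reboot, they can go into /etc/fstab. A sketch of the two entries; the snippet writes to a temp file so it is harmless to run, the real target on the host is /etc/fstab:

```shell
# Example fstab entries for the brick mounts; adapt devices and paths.
# Written to a temp file for illustration -- append to /etc/fstab on
# the real host.
FSTAB=$(mktemp)
cat >> "$FSTAB" <<'EOF'
/dev/vg/gldata1  /srv/gldata1  ext3  defaults  0 2
/dev/vg/gldata2  /srv/gldata2  ext3  defaults  0 2
EOF
grep -c gldata "$FSTAB"
```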

Create the /data mountpoint in both containers:

mkdir /data

Again, since we are working with a host machine for the LXC containers, we need to pass /srv/gldata1 and /srv/gldata2 through to the containers so the logical volumes are available inside them. To do so, edit the config of the first container and add this line:

lxc.mount.entry = /srv/gldata1 /var/lib/lxc/machine1/rootfs/data/ none bind 0 0

Restart the container and try to write a temp file to /data. Do the same for the second container, which gets this line in its config:

lxc.mount.entry = /srv/gldata2 /var/lib/lxc/machine2/rootfs/data/ none bind 0 0

Gluster volumes

Next, see if gluster can reach the other machine. Do this from the first machine:

gluster peer probe <machine2>
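The probe only works if both machines can resolve each other's hostname. If there is no DNS for the containers, /etc/hosts entries do the job. The addresses below are hypothetical, substitute your own; the snippet writes to a temp file so it is harmless to run, the real target is /etc/hosts on BOTH machines:

```shell
# Hypothetical addresses -- replace with the containers' real IPs.
HOSTS=$(mktemp)
cat >> "$HOSTS" <<'EOF'
10.0.3.101  machine1.lan.mydomain  machine1
10.0.3.102  machine2.lan.mydomain  machine2
EOF
grep machine2 "$HOSTS"
```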

Check the status of the storage pool:

gluster peer status
Number of Peers: 1

Hostname: <machine2>
Uuid: 1d068a80-e113-419e-a059-20cf23f1b4ad
State: Peer in Cluster (Connected)

Create a GlusterFS volume:

gluster volume create testvol replica 2 transport tcp machine1.lan.mydomain:/data/testvol machine2.lan.mydomain:/data/testvol force

Error

volume create: testvol: failed: Staging failed on machine2.lan.mydomain. Error: Glusterfs is not supported on brick: machine2.lan.mydomain:/data/testvol. Setting extended attributes failed, reason: Operation not permitted.

This error occurred because of this line in the LXC container config:

lxc.cap.drop = sys_admin

After commenting out this line and restarting the container, the command succeeded.

gluster volume create testvol replica 2 transport tcp machine1.lan.mydomain:/data/testvol machine2.lan.mydomain:/data/testvol force

volume create: testvol: success: please start the volume to access data

Start the volume:

gluster volume start testvol
volume start: testvol: success

Check the connections with netstat. On machine 1:

[root@machine1]$ netstat -tap | grep glusterfsd
tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      9547/glusterfsd
tcp        0      0 machine1.lan.mydomain:49150 machine1.lan.mydomain:24007 ESTABLISHED 9547/glusterfsd
tcp        0      0 machine1.lan.mydomain:49152 machine2.lan.mydomain:49149 ESTABLISHED 9547/glusterfsd
tcp        0      0 machine1.lan.mydomain:49152 machine1.lan.mydomain:49149 ESTABLISHED 9547/glusterfsd

On machine 2:

[root@machine2]$ netstat -tap | grep glusterfsd
tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      2657/glusterfsd
tcp        0      0 machine2.lan.mydomain:49152 machine2.lan.mydomain:49146 ESTABLISHED 2657/glusterfsd
tcp        0      0 machine2.lan.mydomain:49150 machine2.lan.mydomain:24007 ESTABLISHED 2657/glusterfsd
tcp        0      0 machine2.lan.mydomain:49152 machine1.lan.mydomain:49146 ESTABLISHED 2657/glusterfsd

Again, on machine 1, check the status of the volume:

gluster volume info

Volume Name: testvol
Type: Replicate
Volume ID: df7531fa-2471-4aa2-bd4e-04b004ce44c7
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: machine1.lan.mydomain:/data/testvol
Brick2: machine2.lan.mydomain:/data/testvol
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on

Test replication

Now we need to test the replication. To do this, we mount the GlusterFS volume on machine 1 and write to it, then check whether the files are replicated on both machines.

First, mount the volume on machine 1:

cd /mnt
mkdir /mnt/vol
mount -t glusterfs machine1.lan.mydomain:/testvol vol

This failed with the following error:

Error

I [MSGID: 100030] [glusterfsd.c:2454:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.8.8 (args: /usr/sbin/glusterfs --volfile-server=machine1.lan.mydomain --volfile-id=/testvol /mnt/vol)

E [mount.c:341:gf_fuse_mount] 0-glusterfs-fuse: cannot open /dev/fuse (No such file or directory)

E [MSGID: 101019] [xlator.c:433:xlator_init] 0-fuse: Initialization of volume 'fuse' failed, review your volfile again

Since we are testing this from a container, fuse probably isn't available. Change the config of the LXC container to allow the fuse device. In my case, this line was already in the config:

...
#fuse
lxc.cgroup.devices.allow = c 10:229 rwm

After restarting the container (needed if the line wasn't there yet), create the fuse device if it doesn't exist:

mknod /dev/fuse c 10 229

ls -la /dev/fuse

crw-r--r--  1 root root  10, 229 jan 21 10:19 fuse

Next, we try to mount the glusterfs volume again:

mount -t glusterfs machine1.lan.mydomain:/testvol vol

Now the mount command succeeds and the GlusterFS volume is mounted on machine 1. We can proceed to testing the replication:

cd /mnt/vol
for i in `seq -w 1 100`; do cp -rp /var/log/messages /mnt/vol/copy-test-$i; done

To check whether the replication was successful, we look at the mounted volume and at the bricks on both machines.

On machine 1, mounted volume:

[root@machine1 /mnt/vol]$ ls -la
copy-test-001  copy-test-016  copy-test-031  copy-test-046  copy-test-061  copy-test-076  copy-test-091
copy-test-002  copy-test-017  copy-test-032  copy-test-047  copy-test-062  copy-test-077  copy-test-092
copy-test-003  copy-test-018  copy-test-033  copy-test-048  copy-test-063  copy-test-078  copy-test-093
copy-test-004  copy-test-019  copy-test-034  copy-test-049  copy-test-064  copy-test-079  copy-test-094
copy-test-005  copy-test-020  copy-test-035  copy-test-050  copy-test-065  copy-test-080  copy-test-095
copy-test-006  copy-test-021  copy-test-036  copy-test-051  copy-test-066  copy-test-081  copy-test-096
copy-test-007  copy-test-022  copy-test-037  copy-test-052  copy-test-067  copy-test-082  copy-test-097
copy-test-008  copy-test-023  copy-test-038  copy-test-053  copy-test-068  copy-test-083  copy-test-098
copy-test-009  copy-test-024  copy-test-039  copy-test-054  copy-test-069  copy-test-084  copy-test-099
copy-test-010  copy-test-025  copy-test-040  copy-test-055  copy-test-070  copy-test-085  copy-test-100
copy-test-011  copy-test-026  copy-test-041  copy-test-056  copy-test-071  copy-test-086  .trashcan/
copy-test-012  copy-test-027  copy-test-042  copy-test-057  copy-test-072  copy-test-087
copy-test-013  copy-test-028  copy-test-043  copy-test-058  copy-test-073  copy-test-088
copy-test-014  copy-test-029  copy-test-044  copy-test-059  copy-test-074  copy-test-089
copy-test-015  copy-test-030  copy-test-045  copy-test-060  copy-test-075  copy-test-090

The volume on machine 1:

[root@machine1]$ ls -la /data/testvol/
copy-test-001  copy-test-016  copy-test-031  copy-test-046  copy-test-061  copy-test-076  copy-test-091
copy-test-002  copy-test-017  copy-test-032  copy-test-047  copy-test-062  copy-test-077  copy-test-092
copy-test-003  copy-test-018  copy-test-033  copy-test-048  copy-test-063  copy-test-078  copy-test-093
copy-test-004  copy-test-019  copy-test-034  copy-test-049  copy-test-064  copy-test-079  copy-test-094
copy-test-005  copy-test-020  copy-test-035  copy-test-050  copy-test-065  copy-test-080  copy-test-095
copy-test-006  copy-test-021  copy-test-036  copy-test-051  copy-test-066  copy-test-081  copy-test-096
copy-test-007  copy-test-022  copy-test-037  copy-test-052  copy-test-067  copy-test-082  copy-test-097
copy-test-008  copy-test-023  copy-test-038  copy-test-053  copy-test-068  copy-test-083  copy-test-098
copy-test-009  copy-test-024  copy-test-039  copy-test-054  copy-test-069  copy-test-084  copy-test-099
copy-test-010  copy-test-025  copy-test-040  copy-test-055  copy-test-070  copy-test-085  copy-test-100
copy-test-011  copy-test-026  copy-test-041  copy-test-056  copy-test-071  copy-test-086  .glusterfs/
copy-test-012  copy-test-027  copy-test-042  copy-test-057  copy-test-072  copy-test-087  .trashcan/
copy-test-013  copy-test-028  copy-test-043  copy-test-058  copy-test-073  copy-test-088
copy-test-014  copy-test-029  copy-test-044  copy-test-059  copy-test-074  copy-test-089
copy-test-015  copy-test-030  copy-test-045  copy-test-060  copy-test-075  copy-test-090

The volume on machine 2:

[root@machine2]$ ls -la /data/testvol/
copy-test-001  copy-test-016  copy-test-031  copy-test-046  copy-test-061  copy-test-076  copy-test-091
copy-test-002  copy-test-017  copy-test-032  copy-test-047  copy-test-062  copy-test-077  copy-test-092
copy-test-003  copy-test-018  copy-test-033  copy-test-048  copy-test-063  copy-test-078  copy-test-093
copy-test-004  copy-test-019  copy-test-034  copy-test-049  copy-test-064  copy-test-079  copy-test-094
copy-test-005  copy-test-020  copy-test-035  copy-test-050  copy-test-065  copy-test-080  copy-test-095
copy-test-006  copy-test-021  copy-test-036  copy-test-051  copy-test-066  copy-test-081  copy-test-096
copy-test-007  copy-test-022  copy-test-037  copy-test-052  copy-test-067  copy-test-082  copy-test-097
copy-test-008  copy-test-023  copy-test-038  copy-test-053  copy-test-068  copy-test-083  copy-test-098
copy-test-009  copy-test-024  copy-test-039  copy-test-054  copy-test-069  copy-test-084  copy-test-099
copy-test-010  copy-test-025  copy-test-040  copy-test-055  copy-test-070  copy-test-085  copy-test-100
copy-test-011  copy-test-026  copy-test-041  copy-test-056  copy-test-071  copy-test-086  .glusterfs/
copy-test-012  copy-test-027  copy-test-042  copy-test-057  copy-test-072  copy-test-087  .trashcan/
copy-test-013  copy-test-028  copy-test-043  copy-test-058  copy-test-073  copy-test-088
copy-test-014  copy-test-029  copy-test-044  copy-test-059  copy-test-074  copy-test-089
copy-test-015  copy-test-030  copy-test-045  copy-test-060  copy-test-075  copy-test-090
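Rather than eyeballing three listings, the file counts can be compared. The snippet below runs the same pattern against a scratch directory so it is safe to try anywhere; on the real setup, run the `grep -c` line against /mnt/vol and against /data/testvol on both machines and check that all three counts are 100:

```shell
# Recreate the copy-test pattern in a scratch directory and count it.
DIR=$(mktemp -d)
for i in $(seq -w 1 100); do touch "$DIR/copy-test-$i"; done
ls "$DIR" | grep -c '^copy-test-'    # prints 100
```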

Seems like the replication works!

Tip

Do not attempt to write directly to /data/testvol on the machines; writes to a brick bypass GlusterFS and can leave the replicas inconsistent. Always write through the mounted volume.
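If the client mount should come back after a reboot, an fstab entry with `_netdev` (so the mount waits until the network is up) is one way to do it. A sketch, written to a temp file for illustration; the real target is /etc/fstab:

```shell
# Boot-time GlusterFS client mount; adapt server name and mountpoint.
FSTAB=$(mktemp)
echo 'machine1.lan.mydomain:/testvol  /mnt/vol  glusterfs  defaults,_netdev  0 0' >> "$FSTAB"
grep _netdev "$FSTAB"
```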

Access permissions

It's possible to set access permissions on the volume. For instance, to allow a certain IP to access the volume:

gluster volume set testvol auth.allow x.y.z.w
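auth.allow accepts a comma-separated list and `*` wildcards; the addresses below are illustrative only, substitute your own network. To verify the setting or revert to the default (allow everyone):

```shell
# Illustrative values -- substitute your own addresses/networks.
gluster volume set testvol auth.allow 192.168.1.10,10.0.3.*
# verify the option took effect:
gluster volume info testvol | grep auth.allow
# revert to the default (allow everyone):
gluster volume reset testvol auth.allow
```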