Using cStor from OpenEBS

Prereqs

Most of the instructions are from the cStor User Guide (install and setup). First, I installed the iSCSI tools on all my nodes:

sudo apt update
sudo apt install open-iscsi
sudo systemctl enable --now iscsid

Installing and configuring OpenEBS

Then I installed the operator:

kubectl apply -f https://openebs.github.io/charts/cstor-operator.yaml

After some time you should see the pods deployed (k is an alias for kubectl):

> k get pods -n openebs
NAME                                                              READY   STATUS    RESTARTS      AGE
cspc-operator-64c67c894c-wld8h                                    1/1     Running   0             111m
cvc-operator-5697fb984f-vqtqr                                     1/1     Running   0             111m
openebs-cstor-admission-server-78898d4d6-bc4kn                    1/1     Running   0             111m
openebs-cstor-csi-controller-0                                    6/6     Running   0             111m
openebs-cstor-csi-node-fw27r                                      2/2     Running   0             111m
openebs-cstor-csi-node-pjmw7                                      2/2     Running   0             111m
openebs-cstor-csi-node-sfgbv                                      2/2     Running   2 (86m ago)   111m
openebs-ndm-bdfqh                                                 1/1     Running   0             111m
openebs-ndm-cluster-exporter-b5f8f4745-64r2c                      1/1     Running   0             111m
openebs-ndm-node-exporter-4pz4b                                   1/1     Running   0             111m
openebs-ndm-node-exporter-7pfzt                                   1/1     Running   0             111m
openebs-ndm-node-exporter-8w5kk                                   1/1     Running   1 (86m ago)   111m
openebs-ndm-operator-7769f77f8b-pjdlt                             1/1     Running   0             111m
openebs-ndm-s5c4n                                                 1/1     Running   0             111m
openebs-ndm-xcbtq                                                 1/1     Running   0             70m

Checking the block devices, I did see mine:

> kubectl get bd -n openebs
NAME                                           NODENAME   SIZE             CLAIMSTATE   STATUS   AGE
blockdevice-029310430ed3a3c5d9f99f3b66f2cf57   na         236222135808     Unclaimed    Active   71m
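
Only a device that is Unclaimed and Active can go into a pool. As a small convenience, the filter below reads the table printed above and lists the usable devices with their size in GiB. The helper name unclaimed_bds is hypothetical (it is not part of OpenEBS), and it assumes the column order shown in this output.

```shell
# Hypothetical helper: list Unclaimed + Active block devices from
# `kubectl get bd` output, converting the SIZE column (bytes) to GiB.
# Assumes the column order NAME NODENAME SIZE CLAIMSTATE STATUS AGE.
unclaimed_bds() {
  awk '$4 == "Unclaimed" && $5 == "Active" {
         printf "%s %s %.1fGiB\n", $1, $2, $3 / (1024 * 1024 * 1024)
       }'
}

# Usage on a live cluster (not run here):
#   kubectl get bd -n openebs --no-headers | unclaimed_bds
```

The header row is skipped automatically because its fourth column is the literal CLAIMSTATE, so --no-headers is optional.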

Checking out the logs of the disk manager, I saw the following:

> k logs -n openebs $(k get pods -n openebs -l openebs.io/component-name=ndm --field-selector spec.nodeName=na -o name)
I1221 01:03:48.427471       9 addhandler.go:105] checking if device: /dev/sda1 can be uniquely identified
I1221 01:03:48.427476       9 uuid.go:72] device(/dev/sda1) is a partition, using partition UUID: 461f6b49-d051-4d50-972e-2f3a35773a25
I1221 01:03:48.427495       9 uuid.go:91] generated uuid: blockdevice-029310430ed3a3c5d9f99f3b66f2cf57 for device: /dev/sda1
I1221 01:03:48.427504       9 addhandler.go:129] uuid: blockdevice-029310430ed3a3c5d9f99f3b66f2cf57 has been generated for device: /dev/sda1
I1221 01:03:48.427508       9 addhandler.go:46] device: /dev/sda1 already exists in cache, the event was likely generated by a partition table re-read or a change in some of the devices was detected
I1221 01:03:48.430469       9 blockdevicestore.go:145] Got blockdevice object : blockdevice-029310430ed3a3c5d9f99f3b66f2cf57
I1221 01:03:48.430488       9 addhandler.go:234] creating resource for device: /dev/sda1 with uuid: blockdevice-029310430ed3a3c5d9f99f3b66f2cf57
I1221 01:03:48.434787       9 blockdevicestore.go:112] eventcode=ndm.blockdevice.update.success msg=Updated blockdevice object rname=blockdevice-029310430ed3a3c5d9f99f3b66f2cf57

Initially I was getting the following error during the pool creation:

> k apply -f cspc.yaml
Error from server (BadRequest): error when creating "cspc.yaml": admission webhook "admission-webhook.cstor.openebs.io" denied the request: invalid cspc specification: invalid pool spec: block device has file system {ext4}

I used to have an old LVM volume on that disk, and it looks like the leftover ext4 filesystem on it was detected, so I cleaned it up with the following commands:

> sudo lvchange -an /dev/vg01/lv001
> sudo lvremove /dev/vg01/lv001
> sudo vgchange -an vg01
> sudo vgremove vg01
> sudo pvremove /dev/sda

I also wiped the disk:

> sudo wipefs -af /dev/sda
> sudo shred -vzn 1 /dev/sda

Then I restarted the NDM pod, a new block device was created, and the pool creation succeeded:

> k delete -n openebs $(k get pods -n openebs -l openebs.io/component-name=ndm --field-selector spec.nodeName=na -o name)
> k get cspc -n openebs
NAME              HEALTHYINSTANCES   PROVISIONEDINSTANCES   DESIREDINSTANCES   AGE
cstor-disk-pool   1                  1                      1                  81m

You can also make sure the instance is up and healthy:

> k get cspi -n openebs
NAME                   HOSTNAME   FREE   CAPACITY     READONLY   PROVISIONEDREPLICAS   HEALTHYREPLICAS   STATUS   AGE
cstor-disk-pool-hjf2   na         211G   211010800k   false      1                     1                 ONLINE   82m

At this point a pool pod is also created for the instance; it manages the ZFS pool (zpool) on the disk:

> k get pod -n openebs -l openebs.io/cstor-pool-cluster=cstor-disk-pool
NAME                                    READY   STATUS    RESTARTS   AGE
cstor-disk-pool-hjf2-58fc768884-nmmfv   3/3     Running   0          85m
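
The cspc.yaml applied above is not shown in this post. Here is a minimal sketch of what a single-disk pool could look like, following the CStorPoolCluster examples in the OpenEBS docs. The node name na and the pool name cstor-disk-pool come from the output above. The block device name is a placeholder (use whatever kubectl get bd reports after the wipe), and the stripe RAID type is an assumption for a one-disk pool.

```shell
# Placeholder values: override them before generating the manifest.
BD_NAME="${BD_NAME:-blockdevice-REPLACE-ME}"   # from: kubectl get bd -n openebs
NODE_NAME="${NODE_NAME:-na}"                   # value of the kubernetes.io/hostname label

# Write a single-pool, single-disk CStorPoolCluster manifest.
cat > cspc.yaml <<EOF
apiVersion: cstor.openebs.io/v1
kind: CStorPoolCluster
metadata:
  name: cstor-disk-pool
  namespace: openebs
spec:
  pools:
    - nodeSelector:
        kubernetes.io/hostname: "${NODE_NAME}"
      dataRaidGroups:
        - blockDevices:
            - blockDeviceName: "${BD_NAME}"
      poolConfig:
        dataRaidGroupType: "stripe"   # assumption: one disk, no redundancy
EOF

# Apply on a live cluster (not run here):
#   kubectl apply -f cspc.yaml
```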

Using the cStor Disk Pool

Then, after creating a PVC, you will see a new zvol created inside the cstor-disk-pool pod:

> k logs -n openebs $(k get pod -n openebs -l openebs.io/cstor-pool-cluster=cstor-disk-pool -o name) -c cstor-pool
Disabling dumping core
sleeping for 2 sec
2021-12-21/01:10:31.043 disabled auto import (reading of zpool.cache)
physmem = 2038185 pages (7.78 GB)
2021-12-21/01:15:46.883 zvol cstor-f19b9fc2-5fb5-4fd8-886b-097195b45e67/pvc-9e6ab37a-5148-42fa-9395-22098bcb2722 status change: DEGRADED -> DEGRADED
2021-12-21/01:15:46.883 zvol cstor-f19b9fc2-5fb5-4fd8-886b-097195b45e67/pvc-9e6ab37a-5148-42fa-9395-22098bcb2722 rebuild status change: INIT -> INIT
2021-12-21/01:15:46.883 Instantiating zvol cstor-f19b9fc2-5fb5-4fd8-886b-097195b45e67/pvc-9bd32c9c-0d50-4606-a76e-4bfab76a8c3f
...
2021-12-21/01:29:39.012 [tgt 10.96.174.21:6060:11]: Connected
2021-12-21/01:29:39.013 [tgt 10.96.174.21:6060:11]: Handshake command for zvol pvc-9e6ab37a-5148-42fa-9395-22098bcb2722
2021-12-21/01:29:39.013 Volume:cstor-f19b9fc2-5fb5-4fd8-886b-097195b45e67/pvc-9e6ab37a-5148-42fa-9395-22098bcb2722 has zvol_guid:2765751481259197402
2021-12-21/01:29:39.013 IO sequence number:0 Degraded IO sequence number:0
2021-12-21/01:29:39.013 New data connection on fd 18
2021-12-21/01:29:39.014 ERROR fail on unavailable snapshot pvc-9e6ab37a-5148-42fa-9395-22098bcb2722@rebuild_snap
2021-12-21/01:29:39.014 Quorum is on, and rep factor 1
2021-12-21/01:29:39.014 zvol cstor-f19b9fc2-5fb5-4fd8-886b-097195b45e67/pvc-9e6ab37a-5148-42fa-9395-22098bcb2722 rebuild status change: INIT -> DONE
2021-12-21/01:29:39.014 zvol cstor-f19b9fc2-5fb5-4fd8-886b-097195b45e67/pvc-9e6ab37a-5148-42fa-9395-22098bcb2722 status change: DEGRADED -> HEALTHY
2021-12-21/01:29:39.014 Started ack sender for zvol cstor-f19b9fc2-5fb5-4fd8-886b-097195b45e67/pvc-9e6ab37a-5148-42fa-9395-22098bcb2722 fd: 18
2021-12-21/01:29:39.014 Data connection associated with zvol cstor-f19b9fc2-5fb5-4fd8-886b-097195b45e67/pvc-9e6ab37a-5148-42fa-9395-22098bcb2722 fd: 18
2021-12-21/01:29:40.023 [tgt 10.96.174.21:6060:11]: Replica status command for zvol pvc-9e6ab37a-5148-42fa-9395-22098bcb2722
2021-12-21/01:31:04.814 Waiting for refcount (2) to go down to zero on zvol:cstor-f19b9fc2-5fb5-4fd8-886b-097195b45e67/pvc-9bd32c9c-0d50-4606-a76e-4bfab76a8c3f
2021-12-21/01:31:04.814 Data connection for zvol cstor-f19b9fc2-5fb5-4fd8-886b-097195b45e67/pvc-9bd32c9c-0d50-4606-a76e-4bfab76a8c3f closed on fd: 17
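
The StorageClass and PVC behind this zvol aren't shown here. A minimal sketch, assuming the names cstor-csi-disk and demo-cstor-pvc (both hypothetical). The replicaCount of 1 matches the "rep factor 1" line in the log, and the 1Gi request is only an example size.

```shell
# Write a StorageClass that points at the pool, plus a PVC that uses it.
cat > cstor-sc-pvc.yaml <<'EOF'
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: cstor-csi-disk                # hypothetical name
provisioner: cstor.csi.openebs.io
allowVolumeExpansion: true
parameters:
  cas-type: cstor
  cstorPoolCluster: cstor-disk-pool   # must match the CSPC name
  replicaCount: "1"                   # cannot exceed the number of pool instances
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: demo-cstor-pvc                # hypothetical name
spec:
  storageClassName: cstor-csi-disk
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi                    # example size
EOF

# Apply on a live cluster (not run here):
#   kubectl apply -f cstor-sc-pvc.yaml
```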

After you deploy a pod or deployment that uses the PVC, a target pod is created for the volume (it runs the iSCSI target for the disk):

> k get pods -n openebs -l openebs.io/target=cstor-target
NAME                                                              READY   STATUS    RESTARTS      AGE
pvc-9e6ab37a-5148-42fa-9395-22098bcb2722-target-6d4d866b7bbs5cx   3/3     Running   1 (71m ago)   71m
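
The consumer can be as small as a single busybox pod. The sketch below assumes a PVC named demo-cstor-pvc (hypothetical); the pod name, image, and mount path are placeholders as well.

```shell
PVC_NAME="${PVC_NAME:-demo-cstor-pvc}"   # hypothetical PVC name

# Write a pod that mounts the PVC at /data and appends a timestamp to a file.
cat > cstor-demo-pod.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: cstor-demo
spec:
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "date >> /data/out.txt; sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: ${PVC_NAME}
EOF

# Apply on a live cluster (not run here):
#   kubectl apply -f cstor-demo-pod.yaml
```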

You will also see a volume created:

> k get cv -n openebs
NAME                                       CAPACITY   STATUS    AGE
pvc-9e6ab37a-5148-42fa-9395-22098bcb2722   1Gi        Healthy   85m

And a replica as well:

> k get cvr -n openebs
NAME                                                            ALLOCATED   USED    STATUS    AGE
pvc-9e6ab37a-5148-42fa-9395-22098bcb2722-cstor-disk-pool-hjf2   11.4M       33.0M   Healthy   85m

Checking the logs of the iSCSI target pod, you can see the connection made from the host:

> k logs -n openebs $(k get pods -n openebs -l openebs.io/target=cstor-target -o name) -c cstor-istgt
 * Starting enhanced syslogd rsyslogd
   ...done.
Disabling dumping core
2021-12-21/01:29:29.876 main              :3117: m#-177915968.7      : istgt:0.5.20121028:14:37:28:Sep 17 2021: starting
..
..
2021-12-21/01:35:33.677 worker            :5946: c#3.139805111351040.: con:3/25 [6b01a8c0:14612->10.96.174.21:3260,1]
2021-12-21/01:35:33.684 istgt_iscsi_op_log:2382: c#3.139805111351040.: Login from iqn.1993-08.org.debian:01:4efdaa48c143 (192.168.1.107) on iqn.2016-09.com.openebs.cstor:pvc-9e6ab37a-5148-42fa-9395-22098bcb2722 LU1 (10.96.174.21:3260,1), ISID=23d000003, TSIH=2, CID=0, HeaderDigest=off, DataDigest=off

2021-12-21/01:35:33.862 istgt_lu_disk_scsi:2521: mt#1.139805136484096: c#3 Vendor specific INQUIRY VPD page 0xc9

2021-12-21/01:35:34.103 istgt_remove_conn :7130: c#2.139805086185216.: remove_conn->initiator:192.168.1.107(iqn.1993-08.org.debian:01:4efdaa48c143) Target: 10.244.50.236(iqn.2016-09.com.openebs.cstor:pvc-9e6ab37a-5148-42fa-9395-22098bcb2722 LU1) conn:0x7f26f382f000:0 tsih:1 connections:0  IOPending=0
2021-12-21/01:35:38.675 istgt_remove_conn :7130: c#1.139805102958336.: remove_conn->initiator:192.168.1.107(iqn.1993-08.org.debian:01:4efdaa48c143) Target: 10.244.50.236(dummy LU0) conn:0x7f26f382b000:0 tsih:2 connections:0  IOPending=0
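
One detail in the worker line: the 6b01a8c0 in front of the source port looks like the initiator's IPv4 address printed as hex with the bytes reversed, since it decodes to the same 192.168.1.107 that appears in the Login line. That reading is an inference from this log, not something taken from the istgt documentation. A small helper (the name hex_to_ip is hypothetical) to decode it:

```shell
# Hypothetical helper: decode an 8-digit hex IPv4 address whose bytes are
# stored in reverse (little-endian) order, as the log above appears to print.
hex_to_ip() {
  # Split "6b01a8c0" into "6b 01 a8 c0" and load the bytes into $1..$4
  set -- $(printf '%s' "$1" | sed 's/../& /g')
  # Print the bytes last-to-first as decimal octets
  printf '%d.%d.%d.%d\n' "0x$4" "0x$3" "0x$2" "0x$1"
}

hex_to_ip 6b01a8c0   # prints 192.168.1.107
```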

If you log in to the node directly, you will see the iSCSI connection:

> sudo iscsiadm --mode node
10.96.174.21:3260,1 iqn.2016-09.com.openebs.cstor:pvc-9e6ab37a-5148-42fa-9395-22098bcb2722

You will also notice that it’s connecting to an internal service. Checking the services, I did see the one that exposes our target pod:

> k get svc -n openebs -l openebs.io/cas-type=cstor
NAME                                       TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                               AGE
pvc-9e6ab37a-5148-42fa-9395-22098bcb2722   ClusterIP   10.96.174.21   <none>        3260/TCP,7777/TCP,6060/TCP,9500/TCP   76m
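
To tie the two outputs together, the portal address that iscsiadm reports can be matched against the service list. A minimal sketch with a hypothetical helper, portal_ips, which assumes the "IP:port,tpgt IQN" line layout shown above and IPv4 portals only:

```shell
# Hypothetical helper: extract the portal IP from `iscsiadm --mode node`
# lines of the form "IP:PORT,TPGT IQN" (IPv4 only).
portal_ips() {
  awk '{ split($1, parts, ":"); print parts[1] }'
}

# Usage on a node with cluster access (not run here):
#   for ip in $(sudo iscsiadm --mode node | portal_ips); do
#     kubectl get svc -n openebs --no-headers | grep -wF "$ip"
#   done
```

The -wF flags make grep treat the address as a literal whole word, so 10.96.174.21 does not also match 10.96.174.210.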

Pretty nifty. Also, on the node you will see the disk presented at the OS level and mounted into the pod:

> sudo fdisk -l /dev/sdc
Disk /dev/sdc: 1 GiB, 1073741824 bytes, 2097152 sectors
Disk model: iscsi
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 32768 bytes / 1048576 bytes
> sudo lsblk /dev/sdc
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdc    8:32   0   1G  0 disk /var/lib/kubelet/pods/cdb5f2f5-559e-4d53-98ef-777897259986/volumes/kubernetes.io~csi/pvc

Architecture Diagram from OpenEBS

Btw there is a nice picture of how it all works together from cStor Overview:

and here is an overview of how all the parts work together:

Now if I have a container that writes a lot to an SQLite database, I no longer have to use the NFS provider and suffer terrible performance: