Using Rook Ceph
With the recent news about openEBS getting archived by the CNCF (OpenEBS: Lessons We Learned from Open Source), I decided it was time to try out another storage provider, and I settled on rook-ceph. I've heard so much about ceph in the past but never really tried it, so this time around I decided to give it a chance. I have a pretty small k8s cluster with just 3 VMs running on a Proxmox server, so I basically did the reverse of Migrate a K3S cluster storage from Rook to OpenEBS, with Velero.
Backup applications with PVCs
I've used velero in the past and it works quite well, so to be safe I backed up my applications:
velero backup create postges-backup -l app=postgres --snapshot-volumes --volume-snapshot-locations=default
velero backup create plex-backup -l app=plexserver --snapshot-volumes --volume-snapshot-locations=default
And I confirmed the backups were successful:
> velero backup get
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
postges-backup Completed 0 1 2024-07-13 08:33:18 -0600 MDT 29d default app=postgres
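To double-check that the volume snapshots actually made it into the backups, velero can also print the per-volume details (backup names from above):
velero backup describe postges-backup --details
velero backup describe plex-backup --details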
I also made app-level backups for some of the applications:
# postgres
pg_dumpall -h postgres.kar.int -U postgres > all_pg_dbs.sql
# plex
k cp plexserver-7f8c78b456-xr4hn:/config plex-config-backup
I then deleted the applications that were using those PVCs (I knew I was going to be switching storage classes, so I wanted to make sure there were no conflicting configurations or leftover objects). I use argocd to manage all of my applications, so I just logged into the UI and deleted them. But that's basically the same thing as:
kubectl delete deployment my_deployment
kubectl delete pvc my_pvc
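Before actually ripping openEBS out, it's also worth a quick check that nothing is still bound to its storage classes; something along these lines (the grep just assumes the class names contain "openebs"):
# look for any volumes still using openebs-backed storage classes
kubectl get pvc --all-namespaces
kubectl get pv -o wide | grep -i openebs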
Uninstall openEBS
If you installed openebs with helm, then it's as easy as:
helm uninstall openebs -n openebs
But if you are like me and installed it a while back, or maybe just didn't use helm, then you can follow the instructions from Uninstalling OpenEBS. Since I was also using cstor volumes, we also have to follow the instructions in cStor User Guide - Clean Up. You basically have to make sure to delete all the CRs that openEBS uses (a rough cleanup sketch follows the list):
spc
bdc
cvr
cstorvolume
bd
cspc
cspi
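A rough sketch of that cleanup, looping over the kinds listed above (I'm assuming everything lives in the openebs namespace; the cluster-scoped kinds simply ignore the -n flag):
# delete the OpenEBS/cStor custom resources, one kind at a time
for cr in spc bdc cvr cstorvolume bd cspc cspi; do
  kubectl delete "$cr" --all -n openebs
done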
After you are done, delete the openebs namespace. If all the CRs are gone, the ns will delete cleanly. If there are leftover resources, the namespace will be stuck in the Terminating state, and describing it will give you a hint as to which resources are still around:
kubectl describe ns openebs
After the ns and crds are gone you should be all set.
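If the namespace does get stuck in Terminating and describe isn't giving enough detail, a common trick is to enumerate every namespaced resource type and see what is still hanging around (this can take a while on larger clusters):
# list anything still living in the openebs namespace, across all resource types
kubectl api-resources --verbs=list --namespaced -o name \
  | xargs -n1 kubectl get -n openebs --ignore-not-found --show-kind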
Wiping the disks
After the uninstall I saw that the zfs file system was left behind on the disks:
> lsblk -f
NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
sdb
└─sdb1 zfs_member 5000 cstor-62d0ad14-57c9-4571-95e3-84dec193c047 1524832950104593316
And I had to wipe it twice:
> sudo wipefs -a /dev/sdb
/dev/sdb: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54
/dev/sdb: 8 bytes were erased at offset 0x18fffffe00 (gpt): 45 46 49 20 50 41 52 54
/dev/sdb: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/sdb: calling ioctl to re-read partition table: Success
After the first one there were still some remnants of zfs left:
> lsblk -f
NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
sdb zfs_member 5000 cstor-62d0ad14-57c9-4571-95e3-84dec193c047 10313661773127100217
Then I wiped it a second time:
> sudo wipefs -a /dev/sdb
/dev/sdb: 8 bytes were erased at offset 0x18fffbfc00 (zfs_member): 0c b1 ba 00 00 00 00 00
/dev/sdb: 8 bytes were erased at offset 0x18fffbf800 (zfs_member): 0c b1 ba 00 00 00 00 00
/dev/sdb: 8 bytes were erased at offset 0x18fffbf400 (zfs_member): 0c b1 ba 00 00 00 00 00
Then it was clean:
> lsblk -f
NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
sdb
A clean disk is a Ceph prerequisite: Rook will only create OSDs on raw devices or partitions that have no filesystem on them.
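If wipefs alone doesn't do it (zfs and ceph both leave labels near the end of the disk), the Rook cleanup docs suggest a more aggressive zap, roughly along these lines (destructive, and /dev/sdb is just my disk):
DISK=/dev/sdb
# wipe the partition tables and the first chunk of the disk
sudo sgdisk --zap-all "$DISK"
sudo dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
# have the kernel re-read the (now empty) partition table
sudo partprobe "$DISK"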
Installing rook-ceph
It looks like there are instructions for both plain manifest files and helm charts. I decided to use the helm chart with argoCD.
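I let argoCD render the charts, but for reference the plain helm equivalent is roughly two charts, the operator and the cluster (chart names from the Rook docs, values.yaml being the file I tweak below):
helm repo add rook-release https://charts.rook.io/release
# the operator
helm install rook-ceph rook-release/rook-ceph -n rook-ceph --create-namespace
# the CephCluster, pools and storage classes
helm install rook-ceph-cluster rook-release/rook-ceph-cluster -n rook-ceph -f values.yaml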
Troubleshooting Install Issues
After the cephcluster was deployed via helm, I saw it was unhealthy:
> k get cephclusters -A
NAMESPACE NAME DATADIRHOSTPATH MONCOUNT AGE PHASE MESSAGE HEALTH EXTERNAL FSID
rook-ceph rook-ceph /var/lib/rook 3 28m Progressing Configuring Ceph Mons
Describing the cluster:
> k describe cephclusters -n rook-ceph rook-ceph
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning ReconcileFailed 24m rook-ceph-cluster-controller failed to reconcile CephCluster "rook-ceph/rook-ceph". failed to reconcile cluster "rook-ceph": failed to configure local ceph cluster: failed to create cluster: failed to start ceph monitors: failed to assign pods to mons: failed to schedule mons
Looking at the pods:
> k get po -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-rbdplugin-kpngn 2/2 Running 1 (27m ago) 27m
csi-rbdplugin-lpc27 2/2 Running 0 27m
csi-rbdplugin-provisioner-7c6dcb4dff-l6rr4 5/5 Running 2 (26m ago) 27m
csi-rbdplugin-provisioner-7c6dcb4dff-v5cp7 5/5 Running 3 (25m ago) 27m
csi-rbdplugin-tbc5c 2/2 Running 1 (28m ago) 27m
rook-ceph-crashcollector-ma-f896477f4-hmx9j 1/1 Running 0 21m
rook-ceph-crashcollector-nc-7cbcc6fc4c-89fnq 1/1 Running 0 21m
rook-ceph-crashcollector-nd-694d688894-w4f9j 1/1 Running 0 21m
rook-ceph-exporter-ma-5c9558cfc9-8nzlg 0/1 Init:0/1 0 21m
rook-ceph-exporter-nc-845cd8bcdb-nqqvn 0/1 Init:0/1 0 21m
rook-ceph-exporter-nd-74f4d4894c-69tph 0/1 Init:0/1 0 21m
rook-ceph-mon-a-67865c4b7-w7trk 2/2 Running 0 22m
rook-ceph-mon-b-79664fccb5-qtbb4 2/2 Running 0 22m
rook-ceph-mon-c-fc5cd47bb-wc72d 2/2 Running 0 22m
rook-ceph-operator-7b786cb7fd-94bvb 1/1 Running 0 11h
rook-ceph-tools-7bddb946bd-66swh 1/1 Running 0 29m
The mon pods were actually up, and nothing crazy stood out in their logs. So using the ceph-tools pod, I checked out the ceph status:
> k exec -it -n rook-ceph rook-ceph-tools-7bddb946bd-66swh -- /bin/bash
bash-4.4$ ceph status
cluster:
id: 346c16bc-307c-4324-8f6d-5c1d782aadda
health: HEALTH_WARN
mons are allowing insecure global_id reclaim
clock skew detected on mon.b, mon.c
mons a,b are low on available space
services:
mon: 3 daemons, quorum a,b,c (age 24m)
mgr: no daemons active
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
I was surprised to see the low space warning. I then ran into this issue; it looks like by default the mons warn when the filesystem backing their data directory has less than 30% free space (the mon_data_avail_warn default). This was just a lab setup and I didn't have that much space, so I modified my values.yaml and added the following:
configOverride: |
  [global]
  mon_data_avail_warn = 10
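If you want to confirm the option took effect (or try it out before redeploying), the same setting can be applied live from the toolbox pod; the values.yaml override is what keeps it persistent:
# from the rook-ceph-tools pod
ceph config set mon mon_data_avail_warn 10
ceph config get mon mon_data_avail_warn
ceph health detail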
I also realized 3 mon pods might be too much for my lab, so I also modified that:
cephClusterSpec:
  crashCollector:
    disable: true
  mon:
    count: 1
  mgr:
    count: 1
There are some nice options defined in the rook/deploy/examples/cluster-test.yaml file if you are running a non-production cluster (I think this can save some resources, if you are willing to risk an outage).
Then I deleted and reinstalled using helm, and I saw all the pods were healthy:
> k get po -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-rbdplugin-2wn5l 2/2 Running 0 4h42m
csi-rbdplugin-48j9h 2/2 Running 2 (4h26m ago) 4h42m
csi-rbdplugin-824jg 2/2 Running 0 4h42m
csi-rbdplugin-provisioner-7c6dcb4dff-mn7s4 5/5 Running 0 4h22m
csi-rbdplugin-provisioner-7c6dcb4dff-v994d 5/5 Running 0 4h42m
rook-ceph-exporter-ma-5c9558cfc9-bhh45 1/1 Running 0 4h30m
rook-ceph-exporter-nc-845cd8bcdb-fl5p6 1/1 Running 0 4h22m
rook-ceph-exporter-nd-74f4d4894c-b54pk 1/1 Running 0 4h26m
rook-ceph-mgr-a-5b8f65d64b-9kpp9 2/2 Running 0 96m
rook-ceph-mon-a-7bbfff855b-fh8lg 2/2 Running 0 4h31m
rook-ceph-operator-7b786cb7fd-zp9lz 1/1 Running 0 4h26m
rook-ceph-osd-0-57cbdfc6b4-kcq2m 2/2 Running 0 4h30m
rook-ceph-osd-1-7897c855f9-xqrxx 2/2 Running 0 4h30m
rook-ceph-osd-2-56545b5ff-ptpsl 2/2 Running 0 4h26m
rook-ceph-osd-prepare-ma-2czsq 0/1 Completed 0 94m
rook-ceph-osd-prepare-nc-f8b5p 0/1 Completed 0 94m
rook-ceph-osd-prepare-nd-b6dfb 0/1 Completed 0 94m
rook-ceph-tools-7bddb946bd-nk7lv 1/1 Running 0 4h26m
As a last test I checked to make sure the health and disk space were correct:
> k exec -it -n rook-ceph rook-ceph-tools-7bddb946bd-nk7lv -- /bin/bash
bash-4.4$ ceph health
HEALTH_OK
bash-4.4$ ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 300 GiB 294 GiB 6.5 GiB 6.5 GiB 2.16
TOTAL 300 GiB 294 GiB 6.5 GiB 6.5 GiB 2.16
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
.rgw.root 1 32 0 B 0 0 B 0 93 GiB
ceph-blockpool 2 32 1.5 GiB 441 4.6 GiB 1.62 93 GiB
.mgr 3 1 577 KiB 2 1.7 MiB 0 93 GiB
Even though I have 1 mon, I decided to still keep the pool replication size at 3 so that data is replicated across multiple nodes/osds. As expected, I have a total of 300G of raw storage but only about 93G usable per pool, roughly a third, because of the 3-way replication.
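That replication factor lives in the block pool definition in the cluster chart's values; a minimal sketch of that section, using what I believe are the chart defaults (which is also where the ceph-blockpool name in the ceph df output comes from):
cephBlockPools:
  - name: ceph-blockpool
    spec:
      failureDomain: host
      replicated:
        size: 3
    storageClass:
      enabled: true
      name: ceph-block
      isDefault: true
      reclaimPolicy: Delete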
Disabling SSL on the ceph dashboard
Since it's just an internal deployment, I didn't really need to enable ssl on the dashboard. Initially I just disabled it, but it still didn't work. I ran into this old issue, and it looks like if you disable SSL you also have to specify the port. So I updated my values.yaml:
dashboard:
  enabled: true
  ssl: false
  port: 8080
  urlPrefix: /
I let argoCD reapply that, and then I was able to reach the dashboard without issues. To get the default admin password we can use the instructions from Ceph Dashboard and grab it from the secret:
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
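If you don't have an ingress or LoadBalancer in front of it yet, a quick way in is a port-forward to the dashboard service (the service name is the Rook default, and 8080 matches the port set above):
kubectl -n rook-ceph port-forward svc/rook-ceph-mgr-dashboard 8080:8080
# then browse to http://localhost:8080 and log in as admin with the password from the secret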
And I saw the dashboard with a healthy cluster.
Restoring the applications
I reinstalled the applications using argoCD. For the data, I just left the velero backups in the storage bucket for now and manually restored my postgres db:
# back everything up
pg_dumpall -h postgres.kar.int -U postgres > all_pg_dbs.sql
# restore
psql -h postgres.kar.int -U postgres -f all_pg_dbs.sql
And for my plex instance, I just backed up the Library directory and copied it back:
# backup
k cp plexserver-7f8c78b456-xr4hn:/config plex-config-backup
# restore
k cp plex-config-backup/Library plexserver-7f8c78b456-f72zl:/config/
I will play around with the restoration from velero for the other apps next.
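For reference, restoring one of those velero backups should be roughly a one-liner (backup names from earlier):
velero restore create --from-backup postges-backup
velero restore get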