Replies: 3 comments
-
|
Adding a concrete production use case where exactly this gap is biting us. We run a self-hosted K8s cluster on Talos v1.10.6 (6 bare-metal worker nodes, each with 3× 8TB NVMe SSD), with Rook-Ceph providing all storage today.
Our pain point: PostgreSQL (CloudNativePG) WAL fsync latency on Ceph RBD averages 30-100ms (we see peaks of 300ms on hot volumes). On a busy production cluster doing ~3000 commits/s, this is the dominant latency. The same NVMe, accessed locally, does <1ms fsync. Identical hardware - the storage layer is the entire difference.
What we want isn't "drop Ceph for LVM" - it's a mixed-tier architecture, which I suspect is fairly common at the "outgrew Ceph-only, still need RWX" scale:
What's keeping us hesitant about csi-driver-lvm / TopoLVM today:
Strongly +1 on landing this. Happy to test against a real production migration once anything is in master. |
-
|
+1 as well to this. I've struggled to set up a small amount of local storage via TopoLVM using |
-
|
I believe this is related to #12309 and #13529 and therefore seems to be resolved..? 🥳 |
-
Existing Issue: #12309
We've had some discussion about adding LVM capabilities to Talos, here
The first question to address is whether we need to introduce LVM in Talos at all.
To be clear, volumes in Talos can be defined as a disk/partition/directory/overlay mount. Talos handles static provisioning really well, and for any k8s use case we can use UserVolumes.
There are a few scenarios where the need for LVM arises:
We can have LVM in Talos using CSI drivers such as csi-driver-lvm or TopoLVM.
But there are multiple limitations that need to be addressed when using these: topolvm-limitations
I tried using csi-driver-lvm in my initial attempts.
csi-driver-lvm was installed without restricting it to worker nodes. The driver attempted pvcreate on the Talos system disk; CSI drivers will destroy the OS disk if allowed.
The driver then failed with: failed to mkdir "/etc/lvm/backup": read-only file system, as the CSI driver assumed /etc/lvm was writable.
This was fixed by setting lvm.hostWritePath=/var/lib/lvm, after which the driver started successfully and LVM provisioning worked as expected.
Later, setting up a pod and PVC with the proper storage class worked fine.
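The fix described above can be sketched as a Helm values override for csi-driver-lvm. This is an illustration under assumptions, not the exact configuration used in this test: lvm.hostWritePath is taken from the steps above, while the lvm.devicePattern value is a hypothetical glob that must be adjusted per node so that it never matches the Talos system disk.

```yaml
# values.yaml (hypothetical file name) for the csi-driver-lvm Helm chart
lvm:
  # Writable location for LVM state; /etc/lvm is read-only on Talos,
  # which is what caused the "/etc/lvm/backup" mkdir error.
  hostWritePath: /var/lib/lvm
  # Hypothetical pattern: limit the driver to dedicated data disks only.
  # Verify on every node that this does not match the Talos system disk.
  devicePattern: /dev/nvme[1-3]n1
```

Restricting the driver to worker nodes is a separate concern and depends on chart options not shown here.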
IMHO, this way works fine. We should work on LVM support in Talos if it simplifies the user experience and helps avoid the following issues:
If these points make a strong case for adding LVM support in Talos, we can further discuss relevant use cases and scenarios, and how such support could improve usability, as done in common config