Description | sds-elastic

Available in: EE

The module lifecycle stage: Experimental
The module has requirements for installation

The module is in the Experimental stage. The API, configuration, and custom resources may change without notice; do not use it for production workloads.

The sds-elastic module deploys and manages Rook Ceph in a Deckhouse Kubernetes Platform cluster, turning a set of nodes into a distributed Ceph-backed storage system. The module provisions block volumes (RBD) and shared filesystems (CephFS) backed by csi-ceph StorageClasses, without manual Rook deployment.

Management is split across three custom resources from the storage.deckhouse.io/v1alpha1 API group:

ElasticCluster (ec): declares the desired Ceph cluster, namely which nodes participate (storage.nodeSelector), which BlockDevice CRs back the OSDs (storage.blockDeviceSelector), and, optionally, which CIDRs are used for the public and cluster networks. The controller bootstraps a Rook CephCluster (mon/mgr/osd) from this declaration.
ElasticStorageClass (esc): declares a single Ceph pool plus the matching Kubernetes StorageClass, provisioned through the csi-ceph module. spec.replication (AvailabilityWithoutConsistency / ConsistencyAndAvailability / HighRedundancy) maps to a production-tested pool layout. The resource references its parent ElasticCluster by name (spec.clusterRef).
ElasticClusterCredential (ecc): an internal, cluster-scoped backup of the Ceph cluster identity (FSID, mon-secret, admin-secret), populated by the controller from the rook-ceph-mon Secret. Operators do not manage this CR directly; it exists so that the cluster identity survives re-creation of the d8-sds-elastic namespace.

The module deploys the Rook Ceph operator, the rook-ceph-operator-config ConfigMap, the full set of Ceph CRDs and the three storage.deckhouse.io CRDs listed above.

Main Features

Single-CR cluster bootstrap from an ElasticCluster selecting BlockDevice / Node CRs by labels.
LVM-based OSD layout: per matched BlockDevice the controller provisions one LVMVolumeGroup, one LVMLogicalVolume, and one local PersistentVolume bound to the helm-managed sds-elastic-osd StorageClass (provisioner kubernetes.io/no-provisioner, volumeBindingMode: WaitForFirstConsumer). Rook consumes those PVs as OSDs through storageClassDeviceSets.
Three replication strategies per ElasticStorageClass: AvailabilityWithoutConsistency (2 replicas, min_size=1, requireSafeReplicaSize=false), ConsistencyAndAvailability (3 replicas, min_size=2, default), and HighRedundancy (4 replicas, min_size=2, requireSafeReplicaSize=true; tolerates two simultaneous host failures with continued I/O and one extra failure as a recovery margin; requires at least 5 storage nodes). The ErasureCodedCompact mode is temporarily disabled and cannot be selected.
RBD and CephFS pools per ElasticStorageClass — one ESC produces one Ceph pool plus one csi-ceph CephStorageClass named after the ESC.
Automatic csi-ceph wiring: the controller maintains a single CephClusterConnection (1:1 with the parent ElasticCluster) and one CephStorageClass per ElasticStorageClass; the user does not edit these vendor CRs by hand.
Identity backup via ElasticClusterCredential: FSID and mon/admin secrets are mirrored from the rook-ceph-mon Secret so the cluster identity survives a namespace re-create.
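
The replication strategies listed above can be summarized as a small lookup table. The following Python sketch is illustrative only and is not the controller's code: the names PoolLayout, LAYOUTS, and layout_for are hypothetical, and requireSafeReplicaSize is left unset for ConsistencyAndAvailability because the text above does not state it. The failure count assumes the usual behavior of a replicated Ceph pool, which keeps serving I/O while at least min_size replicas remain.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PoolLayout:
    """Hypothetical summary of one spec.replication value."""
    replicas: int
    min_size: int
    # None where the description above does not state the value.
    require_safe_replica_size: Optional[bool]

    def failures_with_io(self) -> int:
        # Assumption: one replica per host (failureDomain=host); I/O continues
        # while at least min_size replicas are still available.
        return self.replicas - self.min_size


# Values copied from the feature list above.
LAYOUTS = {
    "AvailabilityWithoutConsistency": PoolLayout(2, 1, False),
    "ConsistencyAndAvailability": PoolLayout(3, 2, None),
    "HighRedundancy": PoolLayout(4, 2, True),
}

DEFAULT_REPLICATION = "ConsistencyAndAvailability"


def layout_for(replication: str = DEFAULT_REPLICATION) -> PoolLayout:
    """Return the layout for a strategy; unknown or disabled values are rejected."""
    if replication not in LAYOUTS:
        raise ValueError(f"unsupported replication strategy: {replication}")
    return LAYOUTS[replication]


if __name__ == "__main__":
    for name, layout in LAYOUTS.items():
        print(name, layout.replicas, layout.min_size, layout.failures_with_io())
```

Under these assumptions HighRedundancy yields 4 - 2 = 2 host failures with continued I/O, which agrees with the figure given above, and ErasureCodedCompact is rejected because it is absent from the table.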

System Requirements

Deckhouse Kubernetes Platform version 1.72 or later, with at least three nodes for Ceph daemons (mon, mgr, OSD).
Two non-overlapping IPv4 CIDRs reachable from every storage node:
- one for the Ceph public network (client traffic);
- one for the cluster network (replication and heartbeat).

  The same CIDR may be used for both if network separation is not needed. spec.network may also be omitted, in which case Rook listens on every host IP of the storage nodes (host networking).
The sds-node-configurator (≥ 0.6.8) module enabled. That module owns the BlockDevice and LVMVolumeGroup CRDs: ElasticCluster selects BlockDevice resources, and the controller creates LVMVolumeGroup resources on them.
The csi-ceph (≥ 0.5.26) module enabled. That module owns the CephClusterConnection and CephStorageClass CRDs that the sds-elastic controller writes to.
The snapshot-controller module enabled for VolumeSnapshot support (optional).
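
Before creating an ElasticCluster, the network requirement above can be checked locally. The sketch below uses only Python's standard ipaddress module; the function name check_ceph_networks and the example CIDRs are hypothetical, and the rule it encodes (two IPv4 networks that are either identical or non-overlapping) is a reading of the requirement above, not the module's own validation logic.

```python
import ipaddress


def check_ceph_networks(public_cidr: str, cluster_cidr: str) -> None:
    """Raise ValueError if the two CIDRs do not fit the stated requirement."""
    # strict=True rejects values with host bits set, such as 10.0.0.5/24.
    public = ipaddress.ip_network(public_cidr, strict=True)
    cluster = ipaddress.ip_network(cluster_cidr, strict=True)

    for role, net in (("public", public), ("cluster", cluster)):
        if net.version != 4:
            raise ValueError(f"{role} network {net} is not an IPv4 CIDR")

    # Using one CIDR for both roles is allowed; partial overlap is not.
    if public != cluster and public.overlaps(cluster):
        raise ValueError(f"public {public} and cluster {cluster} networks overlap")


if __name__ == "__main__":
    check_ceph_networks("10.10.0.0/24", "10.20.0.0/24")  # separate networks
    check_ceph_networks("10.10.0.0/24", "10.10.0.0/24")  # shared network
    print("ok")
```

Reachability from every storage node cannot be verified this way and still has to be confirmed on the nodes themselves.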

Limitations

One Ceph cluster per module instance. The controller always reconciles a single Rook CephCluster named ceph-cluster in the d8-sds-elastic namespace, regardless of how many ElasticCluster objects are created. Multiple ElasticCluster objects compete for the same backend, so create only one ElasticCluster per Kubernetes cluster.
The vendored Rook operator registers its CRs under the renamed API group internal.sdselastic.deckhouse.io (upstream uses ceph.rook.io); this isolates sds-elastic from a user-installed Rook on the same cluster.
Direct edits of Rook (internal.sdselastic.deckhouse.io) resources in the d8-sds-elastic namespace are rejected by a validating webhook. All changes must go through ElasticCluster / ElasticStorageClass.
RGW and S3 buckets are not supported in this release. The Rook ObjectBucketClaim (OBC) controller is disabled via ROOK_DISABLE_OBJECT_BUCKET_CLAIM=true, and the objectbucket.io CRDs are not vendored.
ElasticCluster.spec.network is immutable after creation (enforced by CEL in the CRD and mirrored by the validating webhook); changing public/cluster CIDRs on a live cluster invalidates mon endpoints and host-network bindings, so a network change requires deleting and re-creating the ElasticCluster.
ElasticCluster.spec.storage.nodeSelector and spec.storage.blockDeviceSelector are editable after creation. The validating webhook rejects narrowing edits that would orphan an already-adopted BlockDevice (the controller cannot safely release the LVG/LLV/PV plumbing without manual cleanup) and widening edits that would pull in a BlockDevice already owned by another ElasticCluster. See USAGE.md for the full ownership contract and the manual release procedure.
ElasticStorageClass.spec.{clusterRef,type,replication} are immutable after creation (enforced by CEL and the validating webhook). Replacing a pool requires creating a new ElasticStorageClass with a different name.
The name sds-elastic-osd is reserved for the helm-managed internal StorageClass; ElasticStorageClass resources with this metadata.name are rejected by the webhook.
ErasureCodedCompact is temporarily disabled: it is omitted from the spec.replication enum and rejected by the validating webhook, so it cannot be selected on any ElasticStorageClass.
replication: HighRedundancy requires at least 5 storage nodes (4 for the pool’s CRUSH placement at failureDomain=host and 5 to host a 5-mon quorum). The controller auto-promotes the underlying CephCluster to mon.count=5, mgr.count=3 while at least one HighRedundancy ESC is present, and keeps the promotion sticky after the trigger ESC is removed (silently weakening fault tolerance on a live cluster is unsafe). Operators can force a recompute by clearing ElasticCluster.status.cephTopology via the status subresource. The ESC validating webhook rejects CREATE of a HighRedundancy ESC when its parent ElasticCluster is absent, when fewer than 5 nodes match spec.storage.nodeSelector, or when the parent EC’s adopted BlockDevice resources span fewer than 4 distinct nodes. Applying the EC and the HighRedundancy ESC in the same kubectl apply is therefore rejected: the EC must be applied first, and its OSD-host set populated, before the ESC is created.
Vendor Rook and Ceph CRDs are bundled with the module and are not user-configurable.
ElasticCluster / ElasticStorageClass teardown is finalizer-driven: deleting them removes the controller-managed Rook CephCluster / CephClusterConnection and the backing pools / filesystems plus their csi-ceph CephStorageClasses. The per-device objects (LVMVolumeGroup / LVMLogicalVolume / local PersistentVolume) and the BlockDevice ownership label are intentionally preserved and must be cleaned up manually; full OwnerReferences-driven GC of those is not yet wired (planned, B20 in the backlog). See USAGE.md.
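
The three conditions under which a HighRedundancy ElasticStorageClass is rejected on CREATE can be restated as a pre-flight check. This Python sketch only mirrors the rules described above; it is not the webhook's implementation, and the function name, parameters, and message strings are hypothetical. The inputs are assumed to be gathered beforehand, for example from a listing of nodes and BlockDevice resources.

```python
from typing import Iterable, List

# Thresholds taken from the HighRedundancy limitation above.
MIN_MATCHING_NODES = 5  # nodes matching spec.storage.nodeSelector
MIN_OSD_HOSTS = 4       # distinct nodes holding adopted BlockDevices


def high_redundancy_rejections(
    parent_exists: bool,
    matching_nodes: Iterable[str],
    adopted_device_nodes: Iterable[str],
) -> List[str]:
    """Return the reasons a HighRedundancy ESC would be rejected (empty if none)."""
    if not parent_exists:
        # Without a parent ElasticCluster the other checks have nothing to inspect.
        return ["parent ElasticCluster is absent"]

    reasons = []
    node_count = len(set(matching_nodes))
    if node_count < MIN_MATCHING_NODES:
        reasons.append(
            f"{node_count} nodes match the selector, need {MIN_MATCHING_NODES}"
        )

    # adopted_device_nodes holds one node name per adopted BlockDevice;
    # duplicates collapse because only distinct hosts count.
    host_count = len(set(adopted_device_nodes))
    if host_count < MIN_OSD_HOSTS:
        reasons.append(
            f"adopted BlockDevices span {host_count} nodes, need {MIN_OSD_HOSTS}"
        )
    return reasons


if __name__ == "__main__":
    nodes = ["n1", "n2", "n3", "n4", "n5"]
    # EC just applied, nothing adopted yet: the ESC would be rejected.
    print(high_redundancy_rejections(True, nodes, []))
    # Devices adopted on four distinct hosts: no rejection reasons remain.
    print(high_redundancy_rejections(True, nodes, ["n1", "n1", "n2", "n3", "n4"]))
```

The first call in the usage example corresponds to applying the EC and the ESC together: the parent exists and enough nodes match, but no BlockDevice has been adopted yet, so the host-span condition fails.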
