The module lifecycle stage: General Availability
The module has requirements for installation
How do I work with GPU nodes?
Step-by-step procedure for adding a GPU node to the cluster
Starting with Deckhouse 1.75, if a NodeGroup contains the spec.gpu section, the gpu module automatically:
- configures containerd with default_runtime = "nvidia" (via NodeGroupConfiguration);
- applies the required system settings (including fixes for the NVIDIA Container Toolkit);
- deploys system components: NFD, GFD, NVIDIA Device Plugin, DCGM Exporter, and, if needed, MIG Manager.
Always specify the desired mode in spec.gpu.sharing (Exclusive, TimeSlicing, or MIG). Manual containerd configuration (via NodeGroupConfiguration, TOML, etc.) is not required and must not be combined with the automatic setup.
For the list of supported NVIDIA Container Toolkit platforms, see the official documentation.
To add a GPU node to the cluster, perform the following steps:
- Create a NodeGroup for GPU nodes.

  An example with TimeSlicing enabled (partitionCount: 4) and typical taint/label:

  apiVersion: deckhouse.io/v1
  kind: NodeGroup
  metadata:
    name: gpu
  spec:
    nodeType: CloudStatic # or Static/CloudEphemeral, depending on your infrastructure.
    gpu:
      sharing: TimeSlicing
      timeSlicing:
        partitionCount: 4
    nodeTemplate:
      labels:
        node-role/gpu: ""
      taints:
      - key: node-role
        value: gpu
        effect: NoSchedule

  If you use custom taint keys, ensure they are allowed in ModuleConfig global in the array .spec.settings.modules.placement.customTolerationKeys so workloads can add the corresponding tolerations.

  Full field schema: see NodeGroup CR documentation.
- Install the NVIDIA driver and NVIDIA Container Toolkit (nvidia-container-toolkit).

  Install both on the nodes, either manually or via a NodeGroupConfiguration. Below are NodeGroupConfiguration examples for the gpu NodeGroup.

  Ubuntu

  apiVersion: deckhouse.io/v1alpha1
  kind: NodeGroupConfiguration
  metadata:
    name: install-cuda.sh
  spec:
    bundles:
    - ubuntu-lts
    content: |
      #!/bin/bash
      # Add the NVIDIA Container Toolkit repository only once.
      if [ ! -f "/etc/apt/sources.list.d/nvidia-container-toolkit.list" ]; then
        distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
        curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | sudo apt-key add -
        curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
      fi
      bb-apt-install nvidia-container-toolkit nvidia-driver-535-server
      nvidia-ctk config --set nvidia-container-runtime.log-level=error --in-place
    nodeGroups:
    - gpu
    weight: 30

  CentOS

  apiVersion: deckhouse.io/v1alpha1
  kind: NodeGroupConfiguration
  metadata:
    name: install-cuda.sh
  spec:
    bundles:
    - centos
    content: |
      #!/bin/bash
      # Add the NVIDIA Container Toolkit repository only once.
      if [ ! -f "/etc/yum.repos.d/nvidia-container-toolkit.repo" ]; then
        # The assignment must be its own command so that $distribution is set before curl runs.
        distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
        curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
      fi
      bb-dnf-install nvidia-container-toolkit nvidia-driver
      nvidia-ctk config --set nvidia-container-runtime.log-level=error --in-place
    nodeGroups:
    - gpu
    weight: 30

  After these configurations are applied, bootstrap the nodes and reboot them so that the settings take effect and the drivers are installed.
- Verify installation on the node using the command:

  nvidia-smi

  Make sure the GPU is not used by third-party processes: before running user workloads, the Processes section of the nvidia-smi output must not contain any processes using the GPU.

  On nodes with a graphical environment, the GPU can be used, for example, by graphical session processes or a display manager: Xorg, gnome-shell, gdm, sddm, lightdm, and others. Such processes can occupy GPU memory and interfere with workloads, as well as with applying the MIG configuration.

  Expected healthy output (example):

  +---------------------------------------------------------------------------------------+
  | NVIDIA-SMI 535.247.01             Driver Version: 535.247.01   CUDA Version: 12.2     |
  |-----------------------------------------+----------------------+----------------------+
  | GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
  | Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
  |                                         |                      |               MIG M. |
  |=========================================+======================+======================|
  |   0  Tesla V100-PCIE-32GB           Off | 00000000:65:00.0 Off |                    0 |
  | N/A   32C    P0              35W / 250W |      0MiB / 32768MiB |      0%      Default |
  |                                         |                      |                  N/A |
  +-----------------------------------------+----------------------+----------------------+

- Verify infrastructure components in the cluster.

  Device Plugin mode: NVIDIA Pods in d8-nvidia-gpu:

  d8 k -n d8-nvidia-gpu get pod

  Expected healthy output (example):

  NAME                                  READY   STATUS    RESTARTS   AGE
  gpu-feature-discovery-80ceb7d-r842q   2/2     Running   0          2m53s
  nvidia-dcgm-exporter-w9v9h            1/1     Running   0          2m53s
  nvidia-dcgm-njqqb                     1/1     Running   0          2m53s
  nvidia-device-plugin-80ceb7d-8xt8g    2/2     Running   0          2m53s

  NFD Pods in d8-nvidia-gpu:

  d8 k -n d8-nvidia-gpu get pods | grep -E '^(NAME|node-feature-discovery)'

  Expected healthy output (example):

  NAME                                             READY   STATUS    RESTARTS   AGE
  node-feature-discovery-gc-6d845765df-45vpj       1/1     Running   0          3m6s
  node-feature-discovery-master-74696fd9d5-wkjk4   1/1     Running   0          3m6s
  node-feature-discovery-worker-5f4kv              1/1     Running   0          3m8s

  DRA mode: DRA Pods in d8-nvidia-gpu:

  d8 k -n d8-nvidia-gpu get pod

  Expected healthy output (example):

  NAME                             READY   STATUS    RESTARTS   AGE
  gpu-controller-7d9f8b6c4-xk2lp   2/2     Running   0          5m
  gpu-node-agent-q8tnz             1/1     Running   0          5m

  Resource exposure on the node:

  d8 k describe node <node-name>

  Output snippet (example):

  Capacity:
    cpu:             40
    memory:          263566308Ki
    nvidia.com/gpu:  4
  Allocatable:
    cpu:             39930m
    memory:          262648294441
    nvidia.com/gpu:  4

- Run functional tests.

  Option A. Invoke nvidia-smi from inside a container:

  apiVersion: batch/v1
  kind: Job
  metadata:
    name: nvidia-cuda-test
    namespace: default
  spec:
    completions: 1
    template:
      spec:
        restartPolicy: Never
        nodeSelector:
          node.deckhouse.io/group: gpu
        containers:
        - name: nvidia-cuda-test
          image: nvidia/cuda:11.6.2-base-ubuntu20.04
          imagePullPolicy: "IfNotPresent"
          command:
          - nvidia-smi

  Check the logs:

  d8 k logs job/nvidia-cuda-test

  Option B. CUDA sample (vectoradd):

  apiVersion: batch/v1
  kind: Job
  metadata:
    name: gpu-operator-test
    namespace: default
  spec:
    completions: 1
    template:
      spec:
        restartPolicy: Never
        nodeSelector:
          node.deckhouse.io/group: gpu
        containers:
        - name: gpu-operator-test
          image: nvidia/samples:vectoradd-cuda10.2
          imagePullPolicy: "IfNotPresent"
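If the NodeGroup carries the node-role=gpu:NoSchedule taint from the NodeGroup example, a workload needs a matching toleration to be scheduled on those nodes, and in Device Plugin mode it would normally request the nvidia.com/gpu resource explicitly. The following is a minimal sketch under those assumptions; the Job name gpu-request-test is hypothetical, and the taint key and value are taken from the NodeGroup example, so adjust them to your own configuration.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-request-test        # hypothetical name, not part of the module
  namespace: default
spec:
  completions: 1
  template:
    spec:
      restartPolicy: Never
      nodeSelector:
        node.deckhouse.io/group: gpu
      tolerations:
      # Assumes the taint from the NodeGroup example (key: node-role, value: gpu).
      - key: node-role
        operator: Equal
        value: gpu
        effect: NoSchedule
      containers:
      - name: gpu-request-test
        image: nvidia/cuda:11.6.2-base-ubuntu20.04
        imagePullPolicy: IfNotPresent
        command:
        - nvidia-smi
        resources:
          limits:
            nvidia.com/gpu: 1   # one GPU, or one time-sliced partition in TimeSlicing mode
```

With TimeSlicing and partitionCount: 4, up to four such Pods can share a single physical GPU, each still requesting nvidia.com/gpu: 1.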
How to switch to DRA mode?
Set dra.enabled: true in ModuleConfig:
apiVersion: deckhouse.io/v1alpha1
kind: ModuleConfig
metadata:
  name: gpu
spec:
  enabled: true
  version: 1
  settings:
    dra:
      enabled: true
Requirements: Kubernetes ≥ 1.34. The module automatically removes the Device Plugin stack from d8-nvidia-gpu and deploys the DRA stack into the same namespace. No manual cleanup is required.
To verify the DRA stack is healthy:
d8 k get module gpu -o jsonpath='{.status.phase}'
d8 k -n d8-nvidia-gpu get deploy,ds
Incompatible strategy detected auto in nvidia-device-plugin/gpu-feature-discovery logs
Errors like:
- Incompatible strategy detected auto
- failed to create resource manager: unsupported strategy auto
- invalid device discovery strategy
mean that the component cannot detect the NVML platform inside the container, typically because libnvidia-ml.so.* is not available or the NVIDIA Container Toolkit runtime is not in use.
What to check:
- nvidia-smi works on the node.
- NVIDIA Container Toolkit is installed (/usr/bin/nvidia-container-runtime exists).
- containerd is configured to use the nvidia runtime on GPU nodes (the gpu module does this after the driver/toolkit installation and a containerd restart/node reboot).
- After fixing, recreate nvidia-device-plugin-* and gpu-feature-discovery-* Pods in the d8-nvidia-gpu namespace.
How to monitor GPUs?
Deckhouse Kubernetes Platform automatically deploys DCGM Exporter; GPU metrics are scraped by Prometheus and available in Grafana.
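Because the metrics end up in Prometheus, they can also drive alerts. The following is a sketch only: it assumes the Deckhouse CustomPrometheusRules resource is available in the cluster and that the deployed DCGM Exporter exposes the DCGM_FI_DEV_GPU_TEMP metric (part of the exporter's upstream default set, not confirmed for this module). The rule name, the 85 °C threshold, and the 5-minute window are hypothetical values.

```yaml
apiVersion: deckhouse.io/v1
kind: CustomPrometheusRules
metadata:
  name: gpu-temperature          # hypothetical name
spec:
  groups:
  - name: gpu.rules
    rules:
    - alert: GPUHighTemperature  # hypothetical alert
      # Assumption: DCGM_FI_DEV_GPU_TEMP is exported; the threshold is illustrative.
      expr: DCGM_FI_DEV_GPU_TEMP > 85
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: GPU temperature has been above 85 °C for 5 minutes.
```

Check the metric names actually present in Prometheus before relying on a rule like this.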
Which GPU modes are supported?
- Exclusive: the node exposes the nvidia.com/gpu resource; each Pod receives an entire GPU.
- TimeSlicing: time-sharing a single GPU among multiple Pods (default partitionCount: 4); Pods still request nvidia.com/gpu.
- MIG (Multi-Instance GPU): hardware partitioning of supported GPUs into independent instances; with the all-1g.5gb profile the cluster exposes resources like nvidia.com/mig-1g.5gb.
See GPU module examples.
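In MIG mode a workload requests the MIG resource instead of nvidia.com/gpu. The following is a minimal sketch, assuming a NodeGroup named gpu configured with the all-1g.5gb profile, so that nvidia.com/mig-1g.5gb is exposed. The Pod name is hypothetical, and a toleration must be added if the nodes are tainted.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mig-slice-test           # hypothetical name
  namespace: default
spec:
  restartPolicy: Never
  nodeSelector:
    node.deckhouse.io/group: gpu # assumes the NodeGroup is named gpu
  containers:
  - name: mig-slice-test
    image: nvidia/cuda:11.6.2-base-ubuntu20.04
    command:
    - nvidia-smi
    - -L                         # lists the GPU/MIG devices visible to the container
    resources:
      limits:
        nvidia.com/mig-1g.5gb: 1 # one 1g.5gb MIG instance
```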
How to view available MIG profiles in the cluster?
Pre-defined profiles are stored in the mig-parted-config ConfigMap inside the d8-nvidia-gpu namespace and can be viewed with:
d8 k -n d8-nvidia-gpu get cm mig-parted-config -o json | jq -r '.data["config.yaml"]'
The mig-configs: section lists the GPU models (by PCI ID) and the MIG profiles each card supports (e.g., all-1g.5gb, all-2g.10gb, all-balanced). Select the profile that matches your accelerator and set its name in spec.gpu.mig.partedConfig of the NodeGroup.
How to define a custom MIG profile per GPU on a node?
Use partedConfig: custom and describe MIG partitioning per GPU index:
gpu:
  sharing: MIG
  mig:
    partedConfig: custom
    customConfigs:
    - index: 0
      slices:
      - profile: "1g.10gb"
        count: 7
    - index: 1
      slices:
      - profile: "2g.20gb"
        count: 3
What the module does:
- Generates a unique MIG config name for the NodeGroup and sets it in the nvidia.com/mig.config label.
- For GPUs listed in customConfigs, renders mig-enabled: true with the declared slices.
- For all unspecified indexes (all remaining GPUs on the node), renders mig-enabled: false, so those cards remain in full mode and do not override explicitly configured GPUs.
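For context, the fragment above sits under spec of the NodeGroup. The sketch below shows one possible complete resource, assuming a Static NodeGroup with the hypothetical name gpu-mig and 80 GB cards. It also assumes that several profiles may be combined on one GPU; the mix used for index 0 mirrors NVIDIA's balanced layout for such cards, and whether a given combination is valid depends on the GPU model.

```yaml
apiVersion: deckhouse.io/v1
kind: NodeGroup
metadata:
  name: gpu-mig                  # hypothetical name
spec:
  nodeType: Static               # assumption; use the type that matches your infrastructure
  gpu:
    sharing: MIG
    mig:
      partedConfig: custom
      customConfigs:
      # GPU 0: mixed layout (assumes multiple profiles per GPU are accepted).
      - index: 0
        slices:
        - profile: "1g.10gb"
          count: 2
        - profile: "2g.20gb"
          count: 1
        - profile: "3g.40gb"
          count: 1
      # GPU 1: uniform layout, as in the fragment above.
      - index: 1
        slices:
        - profile: "2g.20gb"
          count: 3
      # Any other GPU index is left out and therefore stays in full (non-MIG) mode.
```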
MIG profile does not activate — what to check?
- Check the GPU model. MIG is supported on the H100/A100/A30 models and is not supported on V100/T4. To check whether a model is supported, refer to the profile tables in the NVIDIA MIG guide.

- Ensure the GPU is not being used by OS processes or user applications. If a graphical environment, display manager, or other GPU-consuming processes are running on the node, applying the MIG configuration may fail or may not take effect until the GPU is released. Check this with the following command:

  nvidia-smi

- Check the NodeGroup configuration:

  gpu:
    sharing: MIG
    mig:
      partedConfig: all-1g.5gb

- Wait until nvidia-mig-manager completes the drain of the node and reconfigures the GPU. This process can take several minutes. While it is running, the node is tainted with mig-reconfigure. When the operation succeeds, that taint is removed.

- Track the progress via the nvidia.com/mig.config.state label on the node: pending, rebooting, success (or failed if something goes wrong).

- If nvidia.com/mig-* resources are still missing, check:

  d8 k -n d8-nvidia-gpu logs daemonset/nvidia-mig-manager
  nvidia-smi -L
Are AMD or Intel GPUs supported?
At this time, Deckhouse Kubernetes Platform automatically configures NVIDIA GPUs only. Support for AMD (ROCm) and Intel GPUs is being worked on and is planned for future releases.