Description | ai-models

Available in: EE

The module lifecycle stage: Experimental
The module has requirements for installation

The module adds an AI/ML model file catalog to DKP and mounts those files into Kubernetes applications. A user creates a Model or ClusterModel with a model source. The controller receives files, verifies the format, packages them as an internal OCI artifact, stores a local copy in DMCR, and exposes progress in status.

The module does not run an inference runtime and is not limited to LLMs. It works with model artifacts in supported formats such as Safetensors, GGUF, and Diffusers. The application receives a model file path and decides how to use it.

Use Cases

Create a Model from Hugging Face, Ollama, or a local upload and receive a model file directory inside a Pod.
Create a ClusterModel so several teams can use one prepared model by a stable name.
Attach a model to a Deployment, StatefulSet, DaemonSet, or CronJob with one top-level metadata annotation.
Deliver a prepared model through SharedPVC when the cluster has ReadWriteMany storage.
Deliver a large model through NodeCache when several workloads read it on the same nodes.
Expose cluster-wide models in the Ready phase through a public catalog and import a local copy in another cluster.
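
The third use case relies on a top-level metadata annotation, that is, an entry under the workload's own `metadata.annotations` rather than under `spec.template.metadata`. The sketch below illustrates only that placement. The annotation key and value are hypothetical placeholders, because this page does not state the real key; the workload name and namespace are also made up for the example.

```python
import copy
import json

# Hypothetical placeholder: the real annotation key is NOT given on this page.
PLACEHOLDER_KEY = "example.invalid/model"


def annotate_workload(manifest: dict, key: str, value: str) -> dict:
    """Return a copy of a workload manifest with one top-level annotation set."""
    result = copy.deepcopy(manifest)
    metadata = result.setdefault("metadata", {})
    annotations = dict(metadata.get("annotations") or {})
    annotations[key] = value
    metadata["annotations"] = annotations
    return result


# Hand-written Deployment fragment used only for illustration.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "demo-app", "namespace": "demo"},
    "spec": {"template": {"metadata": {"labels": {"app": "demo-app"}}}},
}

annotated = annotate_workload(deployment, PLACEHOLDER_KEY, "my-model")
# Only the workload's own metadata changes; the Pod template is left alone.
print(json.dumps(annotated["metadata"], indent=2))
```

The function copies the manifest instead of mutating it, which keeps the original usable when the same base manifest is annotated for several models.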

Roles

| Role | Responsibilities | Start Here |
| --- | --- | --- |
| Cluster administrator | Enable the module, configure object storage, choose delivery mode, expose distribution, grant catalog access, monitor runtime health. | Administration Guide |
| Namespace user | Create or import models, upload local files, attach models in the `Ready` phase to workloads. | User Guide |
| Application operator | Keep workload manifests in Git and consume models through annotations. | User Guide |

Resources

| Resource | Scope | Purpose |
| --- | --- | --- |
| `Model` | namespace | A model owned by one namespace. |
| `ClusterModel` | cluster | A shared model curated by a cluster administrator. |
| `ModelCatalogSource` | cluster | External `ClusterModel` catalog imported as local copies. |

Model supports private Hugging Face sources through a Secret in the same namespace. ClusterModel is intended for shared remote sources and does not reference namespaced Secrets. ModelCatalogSource references source credentials stored in the d8-system namespace. Public catalog access is authorized by Kubernetes RBAC in the publishing cluster.

What Happens After Model Creation

  flowchart LR
  User["User"] --> API["Model / ClusterModel"]
  API --> Controller["Controller"]
  Controller --> Worker["Preparation worker"]
  Worker --> Store["OCI artifact in DMCR"]
  Controller --> Status["status / conditions / metadata"]
  Workload["Annotated workload"] --> Delivery["Delivery controller"]
  Delivery --> SharedPVC["RWX PVC"]
  Delivery --> NodeCache["NodeCache"]

Users choose only the model source: a URL, an upload session, or an external catalog entry. The controller chooses the internal DMCR path, verifies the data, packages the source files as an OCI artifact, and writes the result into status. The digest appears in status.artifact.digest after verification.
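
As a rough illustration of how a script might read that status, the sketch below takes a Model or ClusterModel object that has already been fetched as JSON and returns the digest only when the phase is Ready. Only `status.phase` and `status.artifact.digest` come from this page; the helper name, the sample objects, and the all-zero digest are assumptions made for the example.

```python
from typing import Optional


def artifact_digest_if_ready(resource: dict) -> Optional[str]:
    """Return status.artifact.digest when status.phase is Ready, else None."""
    status = resource.get("status") or {}
    if status.get("phase") != "Ready":
        return None
    artifact = status.get("artifact") or {}
    return artifact.get("digest")


# Hand-written sample objects. The first has no status yet; the second
# carries a made-up digest value purely for illustration.
not_reported = {"kind": "Model", "metadata": {"name": "demo"}}
ready = {
    "kind": "Model",
    "metadata": {"name": "demo"},
    "status": {"phase": "Ready", "artifact": {"digest": "sha256:" + "0" * 64}},
}

print(artifact_digest_if_ready(not_reported))
print(artifact_digest_if_ready(ready))
```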

ModelPack is the module’s internal format for this OCI packaging. It is used for verification, replay after failure, cleanup, and repeatable delivery. This is not weight conversion: GGUF stays GGUF, and Safetensors stays Safetensors. Users do not choose ModelPack, a digest, a tag, or a registry path.

status.phase: Ready means the local model copy is verified and stored, and the controller can start workload delivery or catalog import. Workloads receive only the stable runtime contract:

model directory: /data/modelcache/models;
AI_MODELS_MODELS_DIR environment variable;
AI_MODELS_MODELS environment variable with model names, paths, digests, and families.
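
An application inside the Pod could consume this contract roughly as follows. This is a minimal sketch: it assumes `AI_MODELS_MODELS_DIR` holds a directory path and falls back to the documented directory when the variable is unset. The encoding of `AI_MODELS_MODELS` is not specified on this page, so the sketch prints its raw value instead of parsing it. The helper names are hypothetical.

```python
import os
from pathlib import Path
from typing import List, Mapping

DEFAULT_MODELS_DIR = "/data/modelcache/models"  # directory documented above


def resolve_models_dir(env: Mapping[str, str] = os.environ) -> Path:
    """Prefer AI_MODELS_MODELS_DIR, fall back to the documented directory."""
    return Path(env.get("AI_MODELS_MODELS_DIR") or DEFAULT_MODELS_DIR)


def list_model_entries(models_dir: Path) -> List[str]:
    """Return sorted entry names under the models directory, or [] if absent."""
    if not models_dir.is_dir():
        return []
    return sorted(entry.name for entry in models_dir.iterdir())


if __name__ == "__main__":
    models_dir = resolve_models_dir()
    print("models dir:", models_dir)
    print("entries:", list_model_entries(models_dir))
    # The format of AI_MODELS_MODELS is an open question here, so it is
    # shown verbatim rather than decoded.
    print("AI_MODELS_MODELS:", os.environ.get("AI_MODELS_MODELS", "<unset>"))
```

Passing the environment as a mapping keeps the helper easy to exercise outside a Pod, for example with a plain dictionary.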

Delivery And Distribution

Delivery attaches a model in the Ready phase to a workload inside the cluster.

Delivery modes:

SharedPVC is the default mode. The controller creates a ReadWriteMany PVC in the workload namespace, a materializer Job downloads the model into it, and Pods mount the model read-only. A local RWO PVC is not a separate delivery mode.
NodeCache is intended for an SDS-backed node-local cache. Selected nodes get a shared cache, and workloads receive a read-only CSI mount.

Distribution belongs to the catalog/import plane, not to workload delivery. It is used for DMZ, perimeter, or external verified catalog topologies: a publishing cluster exposes a list of ClusterModel objects in the Ready phase, a consuming cluster imports the selected model into its local DMCR, and only then uses normal delivery. Internal @sha256 values and OCI paths stay inside controller-owned copy workflows.

The public distribution surface is enabled with distribution.mode=PublicCatalog and uses the module public host from global Deckhouse settings. The publishing administrator grants access to a Kubernetes subject, usually a ServiceAccount, with ClusterRole d8:ai-models:distribution:reader. The consuming cluster stores that token in a d8-system Secret and describes the upstream with ModelCatalogSource.
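
One plausible way to grant that access is a standard Kubernetes ClusterRoleBinding. The sketch below builds such a manifest as JSON. The ClusterRole name is taken from the text above; the use of a ClusterRoleBinding (rather than another binding type), the binding name, and the ServiceAccount name and namespace are assumptions made for illustration.

```python
import json


def distribution_reader_binding(sa_name: str, sa_namespace: str) -> dict:
    """Build a ClusterRoleBinding that grants the catalog reader role to a ServiceAccount."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        # Hypothetical naming scheme for the binding itself.
        "metadata": {"name": f"ai-models-distribution-reader-{sa_namespace}-{sa_name}"},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": "d8:ai-models:distribution:reader",  # role named in the text
        },
        "subjects": [
            {"kind": "ServiceAccount", "name": sa_name, "namespace": sa_namespace}
        ],
    }


# Hypothetical ServiceAccount used by the consuming cluster.
binding = distribution_reader_binding("catalog-consumer", "catalog-clients")
print(json.dumps(binding, indent=2))
```

Because kubectl accepts JSON manifests, the printed output can be reviewed and then applied in the publishing cluster like any other manifest.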

Catalog import is recoverable after source-side failures. If a token expires, a CA certificate has to be corrected, or the source temporarily becomes not ready, the selected catalog revision and remote digest stay frozen, and the controller retries the import once the source is healthy again.
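
The retry behaviour can be pictured with the conceptual sketch below. It is not the controller's code: every name in it (`PinnedSelection`, `SourceUnavailable`, `import_with_retry`) is hypothetical. It shows only the idea that the revision and digest are chosen once and every retry reuses exactly that selection.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PinnedSelection:
    """Catalog revision and remote digest, fixed when the import is selected."""
    catalog_revision: str
    remote_digest: str


class SourceUnavailable(Exception):
    """Source-side failure such as an expired token or a source that is not ready."""


def import_with_retry(
    pinned: PinnedSelection,
    copy_once: Callable[[PinnedSelection], None],
    attempts: int = 5,
    delay_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry copy_once with the same pinned selection; return the attempt that succeeded."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            copy_once(pinned)
            return attempt
        except SourceUnavailable:
            if attempt == attempts:
                raise
            sleep(delay_seconds)
    raise AssertionError("unreachable")


# Demo: a fake source that fails twice and then succeeds.
seen = []


def flaky_copy(selection: PinnedSelection) -> None:
    seen.append(selection)
    if len(seen) < 3:
        raise SourceUnavailable("source not ready")


pinned = PinnedSelection(catalog_revision="rev-1", remote_digest="sha256:" + "a" * 64)
attempts_used = import_with_retry(pinned, flaky_copy)
print(attempts_used, len(set(seen)))
```

Making the selection an immutable value is the design point here: a retry cannot silently pick up a newer revision or a different digest.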

Components

| Component | Namespace | Purpose |
| --- | --- | --- |
| `ai-models-controller` | `d8-ai-models` | Manages `Model` / `ClusterModel` resources, upload sessions, delivery, and metrics. |
| `publish-worker` | `d8-ai-models` | Reads model sources and stores verified OCI artifacts in DMCR. |
| `upload-gateway` | `d8-ai-models` | Accepts direct file or archive uploads. |
| `DMCR` | `d8-ai-models` | `Deckhouse Model Container Registry`: the module’s internal OCI registry that stores prepared models on top of the configured object storage. |
| `node-cache-runtime` | selected nodes | Prepares node-local cache and CSI mounts for `NodeCache`. |

Documentation

User Guide — creating models and attaching them to workloads.
Administration Guide — enablement, storage, RBAC, monitoring, and operations.
Configuration — ModuleConfig settings.
Custom Resources — Model and ClusterModel.
Examples — ready-to-use YAML manifests.
FAQ — common questions and diagnostics.

Third-party components

List of third-party software used in the ai-models module:

AI Models 0.0.1
License: Apache License 2.0

Deckhouse module for AI/ML model registry and catalog services.
