Examples

Module lifecycle stage: Experimental
The module has installation requirements

Use this page as a catalog of copy-paste manifests. The user and administrator guides explain when to choose each scenario and how to diagnose failures.

Minimal ModuleConfig

apiVersion: deckhouse.io/v1alpha1
kind: ModuleConfig
metadata:
  name: ai-models
spec:
  enabled: true
  version: 1
  settings:
    artifacts:
      bucket: ai-models
      endpoint: https://s3.example.com
      region: us-east-1
      credentialsSecretName: ai-models-artifacts
      usePathStyle: true

ModuleConfig With Capacity Limit

apiVersion: deckhouse.io/v1alpha1
kind: ModuleConfig
metadata:
  name: ai-models
spec:
  enabled: true
  version: 1
  settings:
    artifacts:
      bucket: ai-models
      endpoint: https://s3.example.com
      region: us-east-1
      credentialsSecretName: ai-models-artifacts
      capacityLimit: 1Ti

SharedPVC Delivery

apiVersion: deckhouse.io/v1alpha1
kind: ModuleConfig
metadata:
  name: ai-models
spec:
  enabled: true
  version: 1
  settings:
    artifacts:
      bucket: ai-models
      endpoint: https://s3.example.com
      credentialsSecretName: ai-models-artifacts
    delivery:
      type: SharedPVC
      sharedPVCStorageClassName: rwx-storage-class

NodeCache Delivery

d8 k label node k8s-w3-gpu ai.deckhouse.io/model-cache=true
d8 k label blockdevice <block-device-name> ai.deckhouse.io/model-cache=true

apiVersion: deckhouse.io/v1alpha1
kind: ModuleConfig
metadata:
  name: ai-models
spec:
  enabled: true
  version: 1
  settings:
    artifacts:
      bucket: ai-models
      endpoint: https://s3.example.com
      credentialsSecretName: ai-models-artifacts
    delivery:
      type: NodeCache
      nodeCacheSize: 200Gi

Hugging Face Model

apiVersion: ai.deckhouse.io/v1alpha1
kind: Model
metadata:
  name: bge-m3
  namespace: ai-demo
spec:
  source:
    url: https://huggingface.co/BAAI/bge-m3

Private Hugging Face Model

apiVersion: v1
kind: Secret
metadata:
  name: hf-token
  namespace: ai-demo
type: Opaque
stringData:
  token: hf_xxx
---
apiVersion: ai.deckhouse.io/v1alpha1
kind: Model
metadata:
  name: private-model
  namespace: ai-demo
spec:
  source:
    url: https://huggingface.co/acme/private-model
    authSecretRef:
      name: hf-token

ClusterModel

apiVersion: ai.deckhouse.io/v1alpha1
kind: ClusterModel
metadata:
  name: gemma-small
spec:
  source:
    url: https://huggingface.co/google/gemma-3-4b-it

GGUF Model From Ollama

apiVersion: ai.deckhouse.io/v1alpha1
kind: ClusterModel
metadata:
  name: qwen-gguf
spec:
  source:
    url: https://ollama.com/library/qwen3.6:latest

Upload Model

apiVersion: ai.deckhouse.io/v1alpha1
kind: Model
metadata:
  name: uploaded-safetensors
  namespace: ai-demo
spec:
  source:
    upload: {}

# Wait until the Model is ready to accept an upload.
d8 k -n ai-demo wait --for=jsonpath='{.status.phase}'=WaitForUpload model/uploaded-safetensors
# Read the upload URL and token from the Secret referenced in the Model status.
UPLOAD_SECRET=$(d8 k -n ai-demo get model uploaded-safetensors -o jsonpath='{.status.upload.secretName}')
UPLOAD_URL=$(d8 k -n ai-demo get secret "$UPLOAD_SECRET" -o jsonpath='{.data.url}' | base64 -d)
UPLOAD_TOKEN=$(d8 k -n ai-demo get secret "$UPLOAD_SECRET" -o jsonpath='{.data.token}' | base64 -d)
# Upload the bundle, authenticating with the bearer token.
curl -fS --progress-bar -H "Authorization: Bearer ${UPLOAD_TOKEN}" -T ./model-bundle.zip "${UPLOAD_URL}?filename=model-bundle.zip" | cat

Deployment With Model

apiVersion: apps/v1
kind: Deployment
metadata:
  name: embedder
  namespace: ai-demo
  annotations:
    ai.deckhouse.io/model: bge-m3
spec:
  replicas: 2
  selector:
    matchLabels:
      app: embedder
  template:
    metadata:
      labels:
        app: embedder
    spec:
      containers:
        - name: embedder
          image: registry.example.com/embedder:latest

Deployment With ClusterModel

apiVersion: apps/v1
kind: Deployment
metadata:
  name: generator
  namespace: ai-demo
  annotations:
    ai.deckhouse.io/clustermodel: gemma-small
spec:
  selector:
    matchLabels:
      app: generator
  template:
    metadata:
      labels:
        app: generator
    spec:
      containers:
        - name: generator
          image: registry.example.com/generator:latest

Workload With Multiple Models

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rag-service
  namespace: ai-demo
  annotations:
    ai.deckhouse.io/clustermodel: gemma-small
    ai.deckhouse.io/model: bge-m3
spec:
  selector:
    matchLabels:
      app: rag-service
  template:
    metadata:
      labels:
        app: rag-service
    spec:
      containers:
        - name: rag-service
          image: registry.example.com/rag-service:latest

Inside the container, models are available under /data/modelcache/models/<model-name>.
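
As a minimal sketch of how a workload might consume that path, the Deployment With Model example can pass the model directory to the application through an environment variable. The variable name MODEL_DIR is hypothetical, and the example assumes the application reads it. The path follows the /data/modelcache/models/<model-name> pattern for the bge-m3 Model.

```yaml
# Sketch only: MODEL_DIR is a hypothetical variable name that the application is assumed to read.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: embedder
  namespace: ai-demo
  annotations:
    ai.deckhouse.io/model: bge-m3
spec:
  selector:
    matchLabels:
      app: embedder
  template:
    metadata:
      labels:
        app: embedder
    spec:
      containers:
        - name: embedder
          image: registry.example.com/embedder:latest
          env:
            # Path derived from /data/modelcache/models/<model-name>
            - name: MODEL_DIR
              value: /data/modelcache/models/bge-m3
```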

Perimeter Distribution Tier

Perimeter distribution is configured separately from workload delivery: it governs catalog publishing and import and is not a delivery.type value. Enable public catalog mode in the publishing tier:

apiVersion: deckhouse.io/v1alpha1
kind: ModuleConfig
metadata:
  name: ai-models
spec:
  enabled: true
  version: 1
  settings:
    artifacts:
      bucket: ai-models
      endpoint: https://s3.example.com
      credentialsSecretName: ai-models-artifacts
    distribution:
      mode: PublicCatalog

All ClusterModel objects in the Ready phase appear in the public catalog. In the publishing cluster, create a ServiceAccount for the consuming cluster and bind it to the distribution reader role:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: perimeter-a
  namespace: d8-ai-models
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ai-models-distribution-reader-perimeter-a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: d8:ai-models:distribution:reader
subjects:
  - kind: ServiceAccount
    name: perimeter-a
    namespace: d8-ai-models

Issue a token in the publishing cluster and pass it to the administrator of the consuming cluster:

d8 k -n d8-ai-models create token perimeter-a --duration=720h

On the consuming cluster, create the token and CA Secrets in d8-system, describe the external catalog source, and reference it from a Model:

apiVersion: v1
kind: Secret
metadata:
  name: ai-models-dmz-read
  namespace: d8-system
type: Opaque
stringData:
  token: "<publishing-cluster-service-account-token>"
---
apiVersion: v1
kind: Secret
metadata:
  name: ai-models-dmz-ca
  namespace: d8-system
type: Opaque
stringData:
  ca.crt: |
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
---
apiVersion: ai.deckhouse.io/v1alpha1
kind: ModelCatalogSource
metadata:
  name: dmz
spec:
  url: https://ai-models.dmz.example.com
  credentialsSecretName: ai-models-dmz-read
  caSecretName: ai-models-dmz-ca
---
apiVersion: ai.deckhouse.io/v1alpha1
kind: Model
metadata:
  name: qwen3-8b
  namespace: ai-demo
spec:
  source:
    catalog:
      sourceName: dmz
      name: qwen3-8b

If the cluster has only one ready ModelCatalogSource, sourceName can be omitted.
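
For illustration, the same Model with sourceName left out could look like the sketch below. It assumes that dmz is the only ready ModelCatalogSource in the consuming cluster.

```yaml
apiVersion: ai.deckhouse.io/v1alpha1
kind: Model
metadata:
  name: qwen3-8b
  namespace: ai-demo
spec:
  source:
    catalog:
      # sourceName omitted: assumes exactly one ready ModelCatalogSource exists
      name: qwen3-8b
```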

Workload delivery in the consuming cluster remains a regular delivery mode:

delivery:
  type: SharedPVC

Alternatively, use NodeCache when a node-local cache is required.
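
A sketch of the node-local variant of that fragment is shown here. The nodeCacheSize value is reused from the NodeCache Delivery example and is only a placeholder. The node and block device labels from that example are assumed to be in place already.

```yaml
delivery:
  type: NodeCache
  # Placeholder size taken from the NodeCache Delivery example; adjust it to your models
  nodeCacheSize: 200Gi
```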
