The module lifecycle stage: Preview.

The module has requirements for installation.

How to convert existing dashboards from GrafanaDashboardDefinition

To migrate from the old dashboard format (GrafanaDashboardDefinition) to the new formats (ObservabilityDashboard and ClusterObservabilityDashboard), adapt the manifests manually. Note the following differences:

  • spec.folder: This field has been removed. The folder is now specified using the metadata.deckhouse.io/category annotation.
  • Dashboard title: Previously taken from the title field in the JSON. It is now set via the metadata.deckhouse.io/title annotation; if the annotation is missing, the title field from the JSON is used.

Conversion example

Old format:

apiVersion: deckhouse.io/v1
kind: GrafanaDashboardDefinition
metadata:
  name: example-dashboard
spec:
  folder: "Apps"
  json: '{
    "title": "Example Dashboard",
    ...
  }'

New format (ObservabilityDashboard):

apiVersion: observability.deckhouse.io/v1alpha1
kind: ObservabilityDashboard
metadata:
  name: example-dashboard
  namespace: my-namespace
  annotations:
    metadata.deckhouse.io/category: "Apps"
    metadata.deckhouse.io/title: "Example Dashboard"
spec:
  definition: |
    {
      "title": "Example Dashboard",
      ...
    }

New format (ClusterObservabilityDashboard):

apiVersion: observability.deckhouse.io/v1alpha1
kind: ClusterObservabilityDashboard
metadata:
  name: example-dashboard
  annotations:
    metadata.deckhouse.io/category: "Apps"
    metadata.deckhouse.io/title: "Example Dashboard"
spec:
  definition: |
    {
      "title": "Example Dashboard",
      ...
    }

How to grant access to metrics and dashboards in a specific namespace

To grant access to metrics and dashboards in a specific namespace, you need to create a ClusterRole and RoleBinding that define the user’s permissions. Access to metrics and dashboards is granted separately:

  • Metrics — access is checked via the get permission on the metrics.observability.deckhouse.io resource.
  • Dashboards — access is checked via the following permissions on the observabilitydashboards.observability.deckhouse.io resource:
    • get — view dashboards;
    • create — create, update, and delete dashboards.

Example of ClusterRole and RoleBinding for read-only access

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: observability-viewer
rules:
  - apiGroups: ["observability.deckhouse.io"]
    resources: ["metrics", "observabilitydashboards"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: bind-observability-viewer
  namespace: my-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: observability-viewer
subjects:
  - kind: User
    name: user@example.com
    apiGroup: rbac.authorization.k8s.io
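After applying the manifests, you can sanity-check the binding with kubectl auth can-i (impersonating a user requires impersonation permissions in the cluster; the namespace and user below match the example above):

```shell
# Check whether the bound user can view dashboards in my-namespace.
# Expected answer is "yes" once the ClusterRole and RoleBinding are applied.
kubectl auth can-i get observabilitydashboards.observability.deckhouse.io \
  -n my-namespace --as user@example.com
```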

Example of ClusterRole and RoleBinding for read and write access

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: observability-editor
rules:
  - apiGroups: ["observability.deckhouse.io"]
    resources: ["metrics", "observabilitydashboards"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["observability.deckhouse.io"]
    resources: ["observabilitydashboards"]
    verbs: ["create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: bind-observability-editor
  namespace: my-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: observability-editor
subjects:
  - kind: User
    name: user@example.com
    apiGroup: rbac.authorization.k8s.io

How to grant access to system metrics and dashboards

To grant access to system metrics and dashboards, you need to create a ClusterRole and ClusterRoleBinding that define the user’s permissions. Access to metrics and dashboards is granted separately:

  • Metrics — access is checked via the get permission on the clustermetrics.observability.deckhouse.io resource.
  • Dashboards — access is checked via the following permissions on the clusterobservabilitydashboards.observability.deckhouse.io resource:
    • get — view dashboards;
    • create — create, update, and delete dashboards.

Example of ClusterRole and ClusterRoleBinding for read-only access

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: observability-cluster-viewer
rules:
  - apiGroups: ["observability.deckhouse.io"]
    resources: ["clustermetrics", "clusterobservabilitydashboards"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: bind-observability-cluster-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: observability-cluster-viewer
subjects:
  - kind: User
    name: user@example.com
    apiGroup: rbac.authorization.k8s.io

Example of ClusterRole and ClusterRoleBinding for read and write access

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: observability-cluster-editor
rules:
  - apiGroups: ["observability.deckhouse.io"]
    resources: ["clustermetrics", "clusterobservabilitydashboards"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["observability.deckhouse.io"]
    resources: ["clusterobservabilitydashboards"]
    verbs: ["create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: bind-observability-cluster-editor
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: observability-cluster-editor
subjects:
  - kind: User
    name: user@example.com
    apiGroup: rbac.authorization.k8s.io

How to grant full access to all metrics and dashboards

To grant full access to all metrics and dashboards in Deckhouse, create a ClusterRole with all necessary permissions and bind it to a user or group via ClusterRoleBinding.

Example of ClusterRole

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: observability-admin
rules:
  - apiGroups: ["observability.deckhouse.io"]
    resources:
      - metrics
      - clustermetrics
      - observabilitydashboards
      - clusterobservabilitydashboards
      - clusterobservabilitypropagateddashboards
    verbs: ["get", "list", "watch", "create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: bind-observability-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: observability-admin
subjects:
  - kind: User
    name: user@example.com
    apiGroup: rbac.authorization.k8s.io

You can also use the built-in cluster-admin role, but it should be used with caution, as it grants full access to all cluster resources.

How to grant access using RBAC 2.0

If the experimental role model is enabled, permissions are assigned using UserRole and ClusterUserRole resources.

Example of access to metrics and dashboards in a specific namespace

To grant a user access to the myapp namespace with permission to view metrics and dashboards, use the following manifest:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: myapp-developer
  namespace: myapp
subjects:
  - kind: User
    name: user@example.com
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: d8:use:role:user
  apiGroup: rbac.authorization.k8s.io

This example grants broader permissions beyond just access to dashboards and metrics. See the user-authz module documentation for details on this role.

How to provide external read access to project metrics

To provide external access to the project metrics, follow these steps:

  1. Enable external access to metrics. To do that, enable the spec.settings.externalMetricsAccess parameter in the observability module settings.

  2. Create a ServiceAccount for request authorization:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: metrics-access
      namespace: my-namespace
    ---
    apiVersion: v1
    kind: Secret
    metadata:
      name: metrics-access
      namespace: my-namespace
      annotations:
        kubernetes.io/service-account.name: metrics-access
    type: kubernetes.io/service-account-token
  3. Grant the created ServiceAccount read access to metrics using Role and RoleBinding resources:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      namespace: my-namespace
      name: metrics-access
    rules:
      - apiGroups: ["observability.deckhouse.io"]
        resources: ["metrics"]
        verbs: ["get", "watch", "list"]
      - apiGroups: [""]
        resources: ["namespaces"]
        verbs: ["get", "watch", "list"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: metrics-access
      namespace: my-namespace
    subjects:
      - kind: ServiceAccount
        name: metrics-access
        namespace: my-namespace
    roleRef:
      kind: Role
      name: metrics-access
      apiGroup: rbac.authorization.k8s.io
  4. Retrieve an authorization token for accessing metrics. When the ServiceAccount was created, a Secret containing its token was also generated. This token is stored as a Base64-encoded value. To extract and decode it, run the following command:

    d8 k -n my-namespace get secret metrics-access -ojsonpath='{ .data.token }' | base64 -d

    You will need this token in the next step when configuring the Grafana data source.

  5. Configure access to metrics in an external Grafana instance. Add a Prometheus data source with the following parameters:

    • Name: An arbitrary name for the data source.
    • URL: External metrics endpoint in the format https://observability.%publicDomainTemplate%/<prefix>, where:
      • %publicDomainTemplate%: Domain template of your cluster, defined in the global settings of Deckhouse Kubernetes Platform.
      • <prefix>: One of the following Prometheus prefixes:
        • /metrics/main: For the primary Prometheus instance.
        • /metrics/longterm: For the Prometheus Longterm instance.
    • HTTP Headers: Additional HTTP headers for authorization:
      • Header: Authorization
      • Value: Bearer <TOKEN_VALUE>, where <TOKEN_VALUE> is the token obtained from the Secret in the previous step.
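The token extracted in step 4 only becomes usable after decoding: the pipe through base64 -d in the command above converts the stored value into plain text. A standalone illustration with a placeholder value:

```shell
# .data.token in the Secret is base64-encoded; decode it before placing it
# in the Authorization header. 'c2FtcGxlLXRva2Vu' is a placeholder value.
echo 'c2FtcGxlLXRva2Vu' | base64 -d   # prints: sample-token
```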

How to write metrics from outside the cluster

To send metrics to the cluster from an external source, follow these steps:

  1. Enable external access to metrics. To do that, enable the spec.settings.externalMetricsAccess parameter in the observability module settings.

  2. Create a ServiceAccount for request authorization:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: metrics-access
      namespace: my-namespace
    ---
    apiVersion: v1
    kind: Secret
    metadata:
      name: metrics-access
      namespace: my-namespace
      annotations:
        kubernetes.io/service-account.name: metrics-access
    type: kubernetes.io/service-account-token
  3. Grant the created ServiceAccount permission to write metrics using Role and RoleBinding resources:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      namespace: my-namespace
      name: metrics-access
    rules:
      - apiGroups: ["observability.deckhouse.io"]
        resources: ["metrics"]
        verbs: ["create"]
      - apiGroups: [""]
        resources: ["namespaces"]
        verbs: ["get", "watch", "list"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: metrics-access
      namespace: my-namespace
    subjects:
      - kind: ServiceAccount
        name: metrics-access
        namespace: my-namespace
    roleRef:
      kind: Role
      name: metrics-access
      apiGroup: rbac.authorization.k8s.io
  4. Retrieve an authorization token for accessing metrics. When the ServiceAccount was created, a Secret containing its token was also generated. This token is stored as a Base64-encoded value. To extract and decode it, run the following command:

    d8 k -n my-namespace get secret metrics-access -ojsonpath='{ .data.token }' | base64 -d

    You will need this token in the next step when configuring remote write.

  5. Send metrics using Prometheus Remote-Write V1 or V2 messages:

    • URL: https://observability.%publicDomainTemplate%/api/v1/write, where %publicDomainTemplate% is the domain template of your cluster, defined in the global settings of Deckhouse Kubernetes Platform.
    • HTTP Headers:
      • Header: Authorization
      • Value: Bearer <TOKEN_VALUE>, where <TOKEN_VALUE> is the token obtained from the metrics-access Secret in the previous step.
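As an example, an external Prometheus instance can ship metrics to this endpoint with a remote_write block. This is a sketch with placeholder values, assuming your Prometheus version supports the authorization field (Prometheus 2.26+):

```yaml
# Fragment of prometheus.yml on the external Prometheus instance.
# observability.example.com stands in for observability.%publicDomainTemplate%;
# <TOKEN_VALUE> is the token obtained from the metrics-access Secret.
remote_write:
  - url: https://observability.example.com/api/v1/write
    authorization:
      type: Bearer
      credentials: <TOKEN_VALUE>
```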

How to provide external read access to cluster metrics

To provide external access to the cluster metrics, follow these steps:

  1. Enable external access to metrics. To do that, enable the spec.settings.externalMetricsAccess parameter in the observability module settings.

  2. Create a ServiceAccount for request authorization:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: cluster-metrics-access
      namespace: default
    ---
    apiVersion: v1
    kind: Secret
    metadata:
      name: cluster-metrics-access
      namespace: default
      annotations:
        kubernetes.io/service-account.name: cluster-metrics-access
    type: kubernetes.io/service-account-token
  3. Grant the created ServiceAccount read access to metrics using ClusterRole and ClusterRoleBinding resources:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: observability-cluster-metrics-viewer
    rules:
      - apiGroups: ["observability.deckhouse.io"]
        resources: ["clustermetrics"]
        verbs: ["get", "list", "watch"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: bind-observability-cluster-metrics-viewer
    subjects:
      - kind: ServiceAccount
        name: cluster-metrics-access
        namespace: default
    roleRef:
      kind: ClusterRole
      name: observability-cluster-metrics-viewer
      apiGroup: rbac.authorization.k8s.io
  4. Retrieve an authorization token for accessing metrics. When the ServiceAccount was created, a Secret containing its token was also generated. This token is stored as a Base64-encoded value. To extract and decode it, run the following command:

    d8 k -n default get secret cluster-metrics-access -ojsonpath='{ .data.token }' | base64 -d

    You will need this token in the next step when configuring the Grafana data source.

  5. Configure access to metrics in an external Grafana instance. Add a Prometheus data source with the following parameters:

    • Name: An arbitrary name for the data source.
    • URL: External metrics endpoint in the format https://observability.%publicDomainTemplate%/<prefix>, where:
      • %publicDomainTemplate%: Domain template of your cluster, defined in the global settings of Deckhouse Kubernetes Platform.
      • <prefix>: One of the following Prometheus prefixes:
        • /metrics/main: For the primary Prometheus instance.
        • /metrics/longterm: For the Prometheus Longterm instance.
    • HTTP Headers: Additional HTTP headers for authorization:
      • Header: Authorization
      • Value: Bearer <TOKEN_VALUE>, where <TOKEN_VALUE> is the token obtained from the Secret in the previous step.
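The data source URL and header from step 5 can be assembled as follows; DOMAIN and TOKEN are placeholders standing in for your cluster's resolved domain template and the token from step 4:

```shell
# Placeholder values; substitute your cluster's domain template and real token.
DOMAIN="example.com"
TOKEN="sample-token"
# Data source URL for the primary Prometheus instance:
URL="https://observability.${DOMAIN}/metrics/main"
# Authorization header value for the data source:
HEADER="Authorization: Bearer ${TOKEN}"
echo "${URL}"
echo "${HEADER}"
```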

What are DeadMansSwitch and PrometheusUnavailable alerts?

DeadMansSwitch

DeadMansSwitch is a heartbeat alert that fires continuously while Prometheus is healthy and the alerting pipeline is functional. It is sent to all configured notification channels by default (unless filtered out by label matchers in notification policies).

The DeadMansSwitch alert is hidden from kubectl get clusterobservabilityalerts (list/watch) output to avoid cluttering the alerts list. However, it can still be retrieved directly via kubectl get clusterobservabilityalert <name>.

DeadMansSwitch can be disabled via the deadMansSwitch.enabled setting in the observability ModuleConfig. When disabled, no DeadMansSwitch or PrometheusUnavailable alerts are generated.
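A sketch of what disabling the heartbeat might look like in the observability ModuleConfig; the deadMansSwitch.enabled path is taken from the setting named above, while the spec.version value is an assumption and may differ in your cluster:

```yaml
# Hypothetical ModuleConfig fragment disabling the DeadMansSwitch heartbeat.
# spec.version is assumed; check the actual settings version in your cluster.
apiVersion: deckhouse.io/v1alpha1
kind: ModuleConfig
metadata:
  name: observability
spec:
  version: 1
  settings:
    deadMansSwitch:
      enabled: false
```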

PrometheusUnavailable

PrometheusUnavailable (formerly MissingDeadMansSwitch) is an alert that is automatically generated when the DeadMansSwitch heartbeat has not been received for more than 2 minutes. This indicates that the entire alerting pipeline is not functional: Prometheus may be down, the connection between Prometheus and Alertmanager may be broken, or another issue is preventing alerts from being delivered.

PrometheusUnavailable is a cluster-scoped alert and is visible in both the UI and kubectl get clusterobservabilityalerts.