The module lifecycle stage: Preview
The module has installation requirements.
## How to convert existing dashboards from GrafanaDashboardDefinition
To migrate from the old dashboard format (GrafanaDashboardDefinition) to the new formats (ObservabilityDashboard, ClusterObservabilityDashboard), adapt the manifests manually. Note the following differences:
| Old format | New format |
|---|---|
| The folder is set via the `spec.folder` field | The `spec.folder` field is removed. The folder is now specified using the `observability.deckhouse.io/category` annotation |
| The dashboard title is taken from the `title` field in the JSON | The title is set via the `observability.deckhouse.io/title` annotation. If the annotation is missing, the `title` field from the JSON is used |
### Conversion example
Old format:

```yaml
apiVersion: deckhouse.io/v1
kind: GrafanaDashboardDefinition
metadata:
  name: example-dashboard
spec:
  folder: "Apps"
  json: '{
    "title": "Example Dashboard",
    ...
  }'
```

New format (ObservabilityDashboard):
```yaml
apiVersion: observability.deckhouse.io/v1alpha1
kind: ObservabilityDashboard
metadata:
  name: example-dashboard
  namespace: my-namespace
  annotations:
    metadata.deckhouse.io/category: "Apps"
    metadata.deckhouse.io/title: "Example Dashboard"
spec:
  definition: |
    {
      "title": "Example Dashboard",
      ...
    }
```

New format (ClusterObservabilityDashboard):
```yaml
apiVersion: observability.deckhouse.io/v1alpha1
kind: ClusterObservabilityDashboard
metadata:
  name: example-dashboard
  annotations:
    metadata.deckhouse.io/category: "Apps"
    metadata.deckhouse.io/title: "Example Dashboard"
spec:
  definition: |
    {
      "title": "Example Dashboard",
      ...
    }
```

## How to grant access to metrics and dashboards in a specific namespace
To grant access to metrics and dashboards in a specific namespace, create a ClusterRole and a RoleBinding that define the user's permissions. Access to metrics and access to dashboards are granted separately:
- Metrics: access is checked via the `get` permission on the `metrics.observability.deckhouse.io` resource.
- Dashboards: access is checked via the following permissions on the `observabilitydashboards.observability.deckhouse.io` resource:
  - `get`: view dashboards;
  - `create`: create, update, and delete dashboards.
### Example of ClusterRole and RoleBinding for read-only access
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: observability-viewer
rules:
  - apiGroups: ["observability.deckhouse.io"]
    resources: ["metrics", "observabilitydashboards"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: bind-observability-viewer
  namespace: my-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: observability-viewer
subjects:
  - kind: User
    name: user@example.com
    apiGroup: rbac.authorization.k8s.io
```

### Example of ClusterRole and RoleBinding for read and write access
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: observability-editor
rules:
  - apiGroups: ["observability.deckhouse.io"]
    resources: ["metrics", "observabilitydashboards"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["observability.deckhouse.io"]
    resources: ["observabilitydashboards"]
    verbs: ["create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: bind-observability-editor
  namespace: my-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: observability-editor
subjects:
  - kind: User
    name: user@example.com
    apiGroup: rbac.authorization.k8s.io
```

## How to grant access to system metrics and dashboards
To grant access to system metrics and dashboards, create a ClusterRole and a ClusterRoleBinding that define the user's permissions. Access to metrics and access to dashboards are granted separately:
- Metrics: access is checked via the `get` permission on the `clustermetrics.observability.deckhouse.io` resource.
- Dashboards: access is checked via the following permissions on the `clusterobservabilitydashboards.observability.deckhouse.io` resource:
  - `get`: view dashboards;
  - `create`: create, update, and delete dashboards.
### Example of ClusterRole and ClusterRoleBinding for read-only access
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: observability-cluster-viewer
rules:
  - apiGroups: ["observability.deckhouse.io"]
    resources: ["clustermetrics", "clusterobservabilitydashboards"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: bind-observability-cluster-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: observability-cluster-viewer
subjects:
  - kind: User
    name: user@example.com
    apiGroup: rbac.authorization.k8s.io
```

### Example of ClusterRole and ClusterRoleBinding for read and write access
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: observability-cluster-editor
rules:
  - apiGroups: ["observability.deckhouse.io"]
    resources: ["clustermetrics", "clusterobservabilitydashboards"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["observability.deckhouse.io"]
    resources: ["clusterobservabilitydashboards"]
    verbs: ["create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: bind-observability-cluster-editor
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: observability-cluster-editor
subjects:
  - kind: User
    name: user@example.com
    apiGroup: rbac.authorization.k8s.io
```

## How to grant full access to all metrics and dashboards
To grant full access to all metrics and dashboards in Deckhouse, create a ClusterRole with all necessary permissions and bind it to a user or group via ClusterRoleBinding.
### Example of ClusterRole and ClusterRoleBinding
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: observability-admin
rules:
  - apiGroups: ["observability.deckhouse.io"]
    resources:
      - metrics
      - clustermetrics
      - observabilitydashboards
      - clusterobservabilitydashboards
      - clusterobservabilitypropagateddashboards
    verbs: ["get", "list", "watch", "create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: bind-observability-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: observability-admin
subjects:
  - kind: User
    name: user@example.com
    apiGroup: rbac.authorization.k8s.io
```

You can also use the built-in `cluster-admin` role, but it should be used with caution, as it grants full access to all cluster resources.
## How to grant access using RBAC 2.0
If the experimental role model is enabled, permissions are assigned using UserRole and ClusterUserRole resources.
### Example of access to metrics and dashboards in a specific namespace
To grant a user access to the `myapp` namespace with permission to view metrics and dashboards, use the following manifest:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: myapp-developer
  namespace: myapp
subjects:
  - kind: User
    name: user@example.com
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: d8:use:role:user
  apiGroup: rbac.authorization.k8s.io
```

This example grants broader permissions than just access to dashboards and metrics. See the user-authz module documentation for details on this role.
## How to provide external read access to project metrics
To provide external access to the project metrics, follow these steps:
1. Enable external access to metrics. To do that, enable the `spec.settings.externalMetricsAccess` parameter in the `observability` module settings.

2. Create a ServiceAccount for request authorization:

   ```yaml
   apiVersion: v1
   kind: ServiceAccount
   metadata:
     name: metrics-access
     namespace: my-namespace
   ---
   apiVersion: v1
   kind: Secret
   metadata:
     name: metrics-access
     namespace: my-namespace
     annotations:
       kubernetes.io/service-account.name: metrics-access
   type: kubernetes.io/service-account-token
   ```

3. Grant read access to metrics to the created ServiceAccount using Role and RoleBinding resources:

   ```yaml
   apiVersion: rbac.authorization.k8s.io/v1
   kind: Role
   metadata:
     namespace: my-namespace
     name: metrics-access
   rules:
     - apiGroups: ["observability.deckhouse.io"]
       resources: ["metrics"]
       verbs: ["get", "watch", "list"]
     - apiGroups: [""]
       resources: ["namespaces"]
       verbs: ["get", "watch", "list"]
   ---
   apiVersion: rbac.authorization.k8s.io/v1
   kind: RoleBinding
   metadata:
     name: metrics-access
     namespace: my-namespace
   subjects:
     - kind: ServiceAccount
       name: metrics-access
       namespace: my-namespace
   roleRef:
     kind: Role
     name: metrics-access
     apiGroup: rbac.authorization.k8s.io
   ```

4. Retrieve an authorization token for accessing metrics. When the ServiceAccount was created, a Secret containing its token was also generated. The token is stored as a Base64-encoded value. To extract and decode it, run the following command:

   ```shell
   d8 k -n my-namespace get secret metrics-access -ojsonpath='{ .data.token }' | base64 -d
   ```

   You will need this token in the next step when configuring the Grafana data source.

5. Configure access to metrics in an external Grafana instance. Add a Prometheus data source with the following parameters:

   - Name: any name for the data source.
   - URL: the external metrics endpoint in the format `https://observability.%publicDomainTemplate%/<prefix>`, where:
     - `%publicDomainTemplate%`: the domain template of your cluster, defined in the global settings of Deckhouse Kubernetes Platform.
     - `<prefix>`: one of the following Prometheus prefixes:
       - `/metrics/main`: for the primary Prometheus instance;
       - `/metrics/longterm`: for the Prometheus Longterm instance.
   - HTTP Headers: additional HTTP headers for authorization:
     - Header: `Authorization`
     - Value: `Bearer <TOKEN_VALUE>`, where `<TOKEN_VALUE>` is the token obtained from the Secret in the previous step.
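If you provision Grafana data sources from files instead of configuring them in the UI, the same parameters can be expressed declaratively. A minimal sketch of a provisioning file, assuming the cluster's domain template resolves to `observability.example.com` (both the URL and the token are placeholders to replace with your own values):

```yaml
# Grafana data source provisioning sketch (e.g., provisioning/datasources/deckhouse.yaml).
# observability.example.com and <TOKEN_VALUE> are placeholders.
apiVersion: 1
datasources:
  - name: deckhouse-main
    type: prometheus
    access: proxy
    url: https://observability.example.com/metrics/main
    jsonData:
      httpHeaderName1: Authorization
    secureJsonData:
      httpHeaderValue1: Bearer <TOKEN_VALUE>
```

The `httpHeaderName1`/`httpHeaderValue1` pair is Grafana's provisioning equivalent of the custom HTTP header configured in the UI; keeping the value under `secureJsonData` prevents the token from being exposed via the Grafana API.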
## How to write metrics from outside the cluster
To write metrics from outside the cluster, follow these steps:
1. Enable external access to metrics. To do that, enable the `spec.settings.externalMetricsAccess` parameter in the `observability` module settings.

2. Create a ServiceAccount for request authorization:

   ```yaml
   apiVersion: v1
   kind: ServiceAccount
   metadata:
     name: metrics-access
     namespace: my-namespace
   ---
   apiVersion: v1
   kind: Secret
   metadata:
     name: metrics-access
     namespace: my-namespace
     annotations:
       kubernetes.io/service-account.name: metrics-access
   type: kubernetes.io/service-account-token
   ```

3. Add a Role and RoleBinding that grant the created ServiceAccount permission to write metrics:

   ```yaml
   apiVersion: rbac.authorization.k8s.io/v1
   kind: Role
   metadata:
     namespace: my-namespace
     name: metrics-access
   rules:
     - apiGroups: ["observability.deckhouse.io"]
       resources: ["metrics"]
       verbs: ["create"]
     - apiGroups: [""]
       resources: ["namespaces"]
       verbs: ["get", "watch", "list"]
   ---
   apiVersion: rbac.authorization.k8s.io/v1
   kind: RoleBinding
   metadata:
     name: metrics-access
     namespace: my-namespace
   subjects:
     - kind: ServiceAccount
       name: metrics-access
       namespace: my-namespace
   roleRef:
     kind: Role
     name: metrics-access
     apiGroup: rbac.authorization.k8s.io
   ```

4. Get the authorization token. A Secret containing an authorization token was created along with the ServiceAccount. The token is stored as a Base64-encoded value. To extract and decode it, run the following command:

   ```shell
   kubectl -n my-namespace get secret metrics-access -ojsonpath='{ .data.token }' | base64 -d
   ```

   You will need this token in the next step.

5. Send metrics using Prometheus Remote-Write v1 or v2 messages with the following parameters:

   - URL: `https://observability.%publicDomainTemplate%/api/v1/write`, where `%publicDomainTemplate%` is the domain template of your cluster (see the `publicDomainTemplate` global parameter).
   - HTTP Headers:
     - Header: `Authorization`
     - Value: `Bearer <TOKEN_VALUE>`, where `<TOKEN_VALUE>` is the token obtained from the `metrics-access` Secret in the previous step.
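If the sender is Prometheus itself (or an agent that understands its configuration format), the endpoint and token from the steps above map onto a standard `remote_write` section. A sketch with placeholder values (the real URL is built from your cluster's domain template):

```yaml
# prometheus.yml fragment; the URL and <TOKEN_VALUE> are placeholders.
remote_write:
  - url: https://observability.example.com/api/v1/write
    authorization:
      type: Bearer
      credentials: <TOKEN_VALUE>
```

The `authorization` block sets the same `Authorization: Bearer <TOKEN_VALUE>` header described above; Prometheus then handles the Remote-Write message encoding itself.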
## How to provide external read access to cluster metrics
To provide external access to the cluster metrics, follow these steps:
1. Enable external access to metrics. To do that, enable the `spec.settings.externalMetricsAccess` parameter in the `observability` module settings.

2. Create a ServiceAccount for request authorization:

   ```yaml
   apiVersion: v1
   kind: ServiceAccount
   metadata:
     name: cluster-metrics-access
     namespace: default
   ---
   apiVersion: v1
   kind: Secret
   metadata:
     name: cluster-metrics-access
     namespace: default
     annotations:
       kubernetes.io/service-account.name: cluster-metrics-access
   type: kubernetes.io/service-account-token
   ```

3. Grant read access to metrics to the created ServiceAccount using ClusterRole and ClusterRoleBinding resources:

   ```yaml
   apiVersion: rbac.authorization.k8s.io/v1
   kind: ClusterRole
   metadata:
     name: observability-cluster-metrics-viewer
   rules:
     - apiGroups: ["observability.deckhouse.io"]
       resources: ["clustermetrics"]
       verbs: ["get", "list", "watch"]
   ---
   apiVersion: rbac.authorization.k8s.io/v1
   kind: ClusterRoleBinding
   metadata:
     name: bind-observability-cluster-metrics-viewer
   subjects:
     - kind: ServiceAccount
       name: cluster-metrics-access
       namespace: default
   roleRef:
     kind: ClusterRole
     name: observability-cluster-metrics-viewer
     apiGroup: rbac.authorization.k8s.io
   ```

4. Retrieve an authorization token for accessing metrics. When the ServiceAccount was created, a Secret containing its token was also generated. The token is stored as a Base64-encoded value. To extract and decode it, run the following command:

   ```shell
   d8 k -n default get secret cluster-metrics-access -ojsonpath='{ .data.token }' | base64 -d
   ```

   You will need this token in the next step when configuring the Grafana data source.

5. Configure access to metrics in an external Grafana instance. Add a Prometheus data source with the following parameters:

   - Name: any name for the data source.
   - URL: the external metrics endpoint in the format `https://observability.%publicDomainTemplate%/<prefix>`, where:
     - `%publicDomainTemplate%`: the domain template of your cluster, defined in the global settings of Deckhouse Kubernetes Platform.
     - `<prefix>`: one of the following Prometheus prefixes:
       - `/metrics/main`: for the primary Prometheus instance;
       - `/metrics/longterm`: for the Prometheus Longterm instance.
   - HTTP Headers: additional HTTP headers for authorization:
     - Header: `Authorization`
     - Value: `Bearer <TOKEN_VALUE>`, where `<TOKEN_VALUE>` is the token obtained from the Secret in the previous step.
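All three external-access procedures above start by enabling `spec.settings.externalMetricsAccess`. In Deckhouse, module settings are changed through a ModuleConfig resource; a sketch, assuming the parameter is a boolean and the settings schema is at version 1 (check the module's settings reference for the actual schema):

```yaml
apiVersion: deckhouse.io/v1alpha1
kind: ModuleConfig
metadata:
  name: observability
spec:
  enabled: true
  settings:
    externalMetricsAccess: true  # assumed to be a boolean flag
  version: 1  # assumed settings schema version
```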
## What are DeadMansSwitch and PrometheusUnavailable alerts?
### DeadMansSwitch
DeadMansSwitch is a heartbeat alert that fires continuously while Prometheus is healthy and the alerting pipeline is functional. It is sent to all configured notification channels by default (unless filtered out by label matchers in notification policies).
The DeadMansSwitch alert is hidden from `kubectl get clusterobservabilityalerts` (list/watch) output to avoid cluttering the alerts list. However, it can still be retrieved directly via `kubectl get clusterobservabilityalert <name>`.

DeadMansSwitch can be disabled via the `deadMansSwitch.enabled` setting in the `observability` ModuleConfig. When disabled, no DeadMansSwitch or PrometheusUnavailable alerts are generated.
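The module's actual heartbeat rule is not shown in this document, but the DeadMansSwitch pattern is conventionally implemented as an alerting rule whose expression is always true, for example:

```yaml
# Illustration of the DeadMansSwitch pattern, not the module's actual rule:
# vector(1) always returns a sample, so the alert fires continuously
# as long as Prometheus and rule evaluation are healthy.
groups:
  - name: meta
    rules:
      - alert: DeadMansSwitch
        expr: vector(1)
        annotations:
          summary: Heartbeat alert that fires while the alerting pipeline works.
```

A downstream receiver then alarms when the heartbeat stops arriving, which is exactly the role PrometheusUnavailable plays here.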
### PrometheusUnavailable
PrometheusUnavailable (formerly MissingDeadMansSwitch) is an alert that is generated automatically when the DeadMansSwitch heartbeat has not been received for more than 2 minutes. This indicates that the entire alerting pipeline is not functional: Prometheus may be down, the connection between Prometheus and Alertmanager may be broken, or another issue may be preventing alerts from being delivered.
PrometheusUnavailable is a cluster-scoped alert and is visible both in the UI and in the `kubectl get clusterobservabilityalerts` output.