Available in editions: CE, BE, SE, SE+, EE
The module does not require any configuration; it works out of the box.
The module has 25 alerts.
The module is enabled by default in the following bundles: Default, Managed.
The module is disabled by default in the Minimal bundle.
Conversions
The module is configured using the ModuleConfig resource, the schema of which contains a version number. When you apply an old version of the ModuleConfig schema in a cluster, automatic transformations are performed. To update the ModuleConfig schema version manually, complete the following steps sequentially for each version:
- Updates from version 1 to 2:
Remove the .auth.password field. If the .auth object becomes empty after this change, delete it.
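As a hedged illustration of the update step above, the sketch below shows one hypothetical configuration in schema version 1 and the same configuration after a manual update to version 2. The password value and the group name are made-up placeholders, and the sketch assumes that allowedUserGroups is also accepted in version 1.

```yaml
# Before: schema version 1 (hypothetical settings).
apiVersion: deckhouse.io/v1alpha1
kind: ModuleConfig
metadata:
  name: prometheus
spec:
  version: 1
  settings:
    auth:
      password: example-password   # placeholder value; this field is removed in version 2
      allowedUserGroups:
        - admins                   # placeholder group name
---
# After: schema version 2. The .auth object stays because it is not empty.
apiVersion: deckhouse.io/v1alpha1
kind: ModuleConfig
metadata:
  name: prometheus
spec:
  version: 2
  settings:
    auth:
      allowedUserGroups:
        - admins
```

If password had been the only field under auth, the whole auth object would be deleted as well.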
Settings
The module is configured using the ModuleConfig custom resource named prometheus (learn more about setting up Deckhouse…).
Example of the ModuleConfig/prometheus resource for configuring the module:
apiVersion: deckhouse.io/v1alpha1
kind: ModuleConfig
metadata:
  name: prometheus
spec:
  version: 2
  enabled: true
  settings: # <-- Module parameters from the "Parameters" section below.
Parameters
Schema version: 2
- settings (object)
- settings.auth (object)
Options related to authentication or authorization in the application.
- settings.auth.allowedUserEmails (array of strings)
An array of emails of users that can access the module’s public web interfaces.
This parameter is used if the user-authn module is enabled or the externalAuthentication parameter is set.
- settings.auth.allowedUserGroups (array of strings)
An array of user groups that can access Grafana & Prometheus.
This parameter is used if the user-authn module is enabled or the externalAuthentication parameter is set.
Caution! Note that you must add those groups to the appropriate field in the DexProvider config if this module is used together with the user-authn one.
- settings.auth.externalAuthentication (object)
Parameters to enable external authentication based on the NGINX Ingress external-auth mechanism that uses the NGINX auth_request module.
External authentication is enabled automatically if the user-authn module is enabled.
- settings.auth.externalAuthentication.authSignInURL (string)
The URL to redirect the user for authentication (if the authentication service returned a non-200 HTTP response code).
Example:
authSignInURL: https://example.com/dex/sign_in
- settings.auth.externalAuthentication.authURL (string)
The URL of the authentication service. If the user is authenticated, the service should return an HTTP 200 response code.
Example:
authURL: https://example.com/dex/auth
- settings.auth.satisfyAny (boolean)
Enables single authentication.
If used together with the whitelistSourceRanges parameter, it authorizes all users from the listed networks without asking for a username and password.
Default:
false
Example:
satisfyAny: true
- settings.auth.whitelistSourceRanges (array of strings)
An array of CIDRs that are allowed to authenticate in Grafana & Prometheus.
Example:
whitelistSourceRanges:
  - 1.1.1.1/32
- settings.externalLabels (object)
The set of external labels to add to the metrics.
External labels can expand the environment variables of the config-reloader container, such as:
  - HOSTNAME/POD_NAME: contains the name of the pod (for example, prometheus-main-0, prometheus-main-1, etc.).
  - SHARD: contains the shard number.
Example:
externalLabels:
  prometheus_replica: "$(POD_NAME)"
  shard: "$(SHARD)"
  hostname: "$(HOSTNAME)"
- settings.grafana (object)
Grafana installation-related settings.
- settings.grafana.customPlugins (array of strings)
A list of custom Grafana plugins. Contains plugin names from the official repository.
Here is how you can add custom plugins (in this case, the clickhouse-datasource and flowcharting-panel plugins are used):
grafana:
  customPlugins:
    - agenty-flowcharting-panel
    - vertamedia-clickhouse-datasource
You can also install plugins from other sources by passing a link to the plugin zip archive in the format <url to plugin zip>;<plugin name>:
grafana:
  customPlugins:
    - http://10.241.32.16:3000/netsage-bumpchart-panel-1.1.1.zip;netsage-bumpchart-panel
Example:
customPlugins:
  - agenty-flowcharting-panel
  - vertamedia-clickhouse-datasource
- settings.grafana.enabled (boolean)
Enables Grafana deployment in the cluster.
Default:
true
Example:
enabled: false
- settings.grafana.useDarkTheme (boolean)
Makes the dark theme the default Grafana theme.
Default:
false
Example:
useDarkTheme: true
- settings.highAvailability (boolean)
Manually enable the high availability mode.
By default, Deckhouse automatically decides whether to enable the HA mode. See the Deckhouse documentation to learn more about the HA mode for modules.
Example:
highAvailability: true
- settings.https (object)
What certificate type to use with Grafana/Prometheus.
This parameter completely overrides the global.modules.https settings.
Examples:
https:
  mode: CustomCertificate
  customCertificate:
    secretName: foobar

https:
  mode: CertManager
  certManager:
    clusterIssuerName: letsencrypt
- settings.https.certManager (object)
- settings.https.certManager.clusterIssuerName (string)
What ClusterIssuer to use for Grafana/Prometheus.
Currently, letsencrypt, letsencrypt-staging, and selfsigned are available. You can also define your own.
Default:
letsencrypt
- settings.https.customCertificate (object)
- settings.https.customCertificate.secretName (string)
The name of the secret in the d8-system namespace to use with Grafana/Prometheus.
This secret must have the kubernetes.io/tls format.
Default:
false
- settings.https.mode (string)
The HTTPS usage mode:
  - Disabled: Grafana/Prometheus will work over HTTP only;
  - CertManager: Grafana/Prometheus will use HTTPS and get a certificate from the ClusterIssuer defined in the certManager.clusterIssuerName parameter;
  - CustomCertificate: Grafana/Prometheus will use HTTPS with the certificate from the d8-system namespace;
  - OnlyInURI: Grafana/Prometheus will work over HTTP (assuming that an external HTTPS load balancer in front of it terminates HTTPS traffic). All the links in the user-authn will be generated using the HTTPS scheme. The load balancer should provide a redirect from HTTP to HTTPS.
Default:
Disabled
Allowed values:
Disabled, CertManager, CustomCertificate, OnlyInURI
- settings.ingressClass (string)
The class of the Ingress controller used for Grafana/Prometheus.
An optional parameter. By default, the modules.ingressClass global value is used.
Pattern:
^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$
Example:
ingressClass: public
- settings.longtermMaxDiskSizeGigabytes (integer, deprecated)
Deprecated and will be removed. It has no effect.
- settings.longtermNodeSelector (object)
The same as the Pods’ spec.nodeSelector parameter in Kubernetes.
If the parameter is omitted or false, it will be determined automatically.
Example:
longtermNodeSelector:
  disktype: ssd
- settings.longtermPodAntiAffinity (string)
Defines the podAntiAffinity configuration for the Prometheus longterm instance relative to the Prometheus main instance.
  - Preferred: allows scheduling the Prometheus longterm instance alongside the Prometheus main instance if it is not possible to place them on different nodes.
  - Required: does not allow scheduling the Prometheus longterm instance on the same node as the Prometheus main instance.
Default:
Preferred
Allowed values:
Preferred, Required
- settings.longtermRetentionDays (integer)
How long (in days) to keep the data in the longterm Prometheus.
Setting this parameter to 0 will result in the longterm Prometheus not running in the cluster.
Default:
1095
- settings.longtermScrapeInterval (string)
Sets the interval at which the longterm Prometheus makes “data snapshots” of the main Prometheus.
Default:
5m
- settings.longtermStorageClass (string)
The name of the StorageClass to use for the longterm Prometheus.
If omitted, the StorageClass of the existing longterm Prometheus PVC is used. If there is no PVC yet, the StorageClass is chosen according to the global storageClass parameter setting.
The global storageClass parameter is only considered when the module is enabled. Changing the global storageClass parameter while the module is enabled will not trigger disk re-provisioning.
Warning. Specifying a value different from the one currently used (in the existing PVC) will result in disk re-provisioning and all data will be deleted.
Warning. When migrating Prometheus with local storage to other nodes, the pod will hang in the Pending state. In this case, it will be necessary to save the Prometheus database, delete the old PVC and restart the pod manually. Local storage refers to a StorageClass that is associated not with network storage, but with a local volume on a node (for example, a StorageClass created by the local-path-provider module).
If false is specified, emptyDir will be forced to be used.
Example:
longtermStorageClass: ceph-ssd
- settings.longtermTolerations (array of objects)
The same as the Pods’ spec.tolerations parameter in Kubernetes.
If the parameter is omitted or false, it will be determined automatically.
Example:
longtermTolerations:
  - key: key1
    operator: Equal
    value: value1
    effect: NoSchedule
- settings.longtermTolerations.effect (string)
- settings.longtermTolerations.key (string)
- settings.longtermTolerations.operator (string)
- settings.longtermTolerations.tolerationSeconds (integer)
- settings.longtermTolerations.value (string)
- settings.mainMaxDiskSizeGigabytes (integer, deprecated)
Deprecated and will be removed. It has no effect.
- settings.nodeSelector (object)
The same as the Pods’ spec.nodeSelector parameter in Kubernetes.
If the parameter is omitted or false, it will be determined automatically.
Example:
nodeSelector:
  disktype: ssd
- settings.retentionDays (integer)
How long (in days) to keep the data.
Default:
15
- settings.scrapeInterval (string)
Sets the interval for scraping metrics from targets.
The evaluation interval is always equal to scrapeInterval.
Default:
30s
Pattern:
^([\d]*y)?([\d]*w)?([\d]*d)?([\d]*h)?([\d]*m)?([\d]*s)?$
- settings.storageClass (string)
The name of the StorageClass to use for the main Prometheus.
If omitted, the StorageClass of the existing Prometheus PVC is used. If there is no PVC yet, the StorageClass is chosen according to the global storageClass parameter setting.
The global storageClass parameter is only considered when the module is enabled. Changing the global storageClass parameter while the module is enabled will not trigger disk re-provisioning.
Warning. Specifying a value different from the one currently used (in the existing PVC) will result in disk re-provisioning and all data will be deleted.
If false is specified, emptyDir will be forced to be used.
Examples:
storageClass: ceph-ssd
storageClass: 'false'
- settings.tolerations (array of objects)
The same as the Pods’ spec.tolerations parameter in Kubernetes.
If the parameter is omitted or false, it will be determined automatically.
Example:
tolerations:
  - key: key1
    operator: Equal
    value: value1
    effect: NoSchedule
- settings.tolerations.effect (string)
- settings.tolerations.key (string)
- settings.tolerations.operator (string)
- settings.tolerations.tolerationSeconds (integer)
- settings.tolerations.value (string)
- settings.vpa (object)
VPA settings for pods.
Default:
{"updateMode":"Initial"}
Examples:
vpa:
  updateMode: Initial
  longtermMaxCPU: '1'
  longtermMaxMemory: 1500Mi
  maxCPU: 1000m
  maxMemory: 1500Mi

vpa:
  updateMode: 'Off'
- settings.vpa.longtermMaxCPU
The maximum value that the VPA can set for the CPU requests for the longterm Prometheus Pods.
The default value is chosen automatically based on the maximum number of Pods that can be created in the cluster considering the current number of nodes and their settings. For more information, see the detect_vpa_max hook of the module.
Example:
longtermMaxCPU: 0.1
- settings.vpa.longtermMaxMemory
The maximum memory requests the VPA can set for the longterm Prometheus Pods.
The default value is chosen automatically based on the maximum number of Pods that can be created in the cluster considering the current number of nodes and their settings. For more information, see the detect_vpa_max hook of the module.
Example:
longtermMaxMemory: 4Mi
- settings.vpa.maxCPU
The maximum value that the VPA can set for the CPU requests for the main Prometheus Pods.
The default value is chosen automatically based on the maximum number of Pods that can be created in the cluster considering the current number of nodes and their settings. For more information, see the detect_vpa_max hook of the module.
Example:
maxCPU: '3'
- settings.vpa.maxMemory
The maximum memory requests the VPA can set for the main Prometheus Pods.
The default value is chosen automatically based on the maximum number of Pods that can be created in the cluster considering the current number of nodes and their settings. For more information, see the detect_vpa_max hook of the module.
Example:
maxMemory: 3Mi
- settings.vpa.updateMode (string)
The VPA usage mode.
Default:
Initial
Allowed values:
Initial, Auto, Off
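To show how several of the parameters above fit together, here is a minimal sketch of a complete ModuleConfig. It reuses the example values from the parameter descriptions (ceph-ssd, letsencrypt, disktype: ssd, key1/value1); these are illustrative assumptions, not recommendations, and must be replaced with values that exist in your cluster.

```yaml
apiVersion: deckhouse.io/v1alpha1
kind: ModuleConfig
metadata:
  name: prometheus
spec:
  version: 2
  enabled: true
  settings:
    scrapeInterval: 30s            # default value, shown explicitly
    retentionDays: 15              # default value, shown explicitly
    longtermRetentionDays: 1095    # 0 would disable the longterm Prometheus
    storageClass: ceph-ssd         # assumed StorageClass name
    longtermStorageClass: ceph-ssd # assumed StorageClass name
    https:
      mode: CertManager
      certManager:
        clusterIssuerName: letsencrypt
    nodeSelector:
      disktype: ssd                # assumed node label
    tolerations:
      - key: key1
        operator: Equal
        value: value1
        effect: NoSchedule
    vpa:
      updateMode: Initial
      maxCPU: 1000m
      maxMemory: 1500Mi
```

Keep in mind the warning from the storageClass description: setting a StorageClass that differs from the one in the existing PVC causes disk re-provisioning and data loss.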
Authentication
The user-authn module provides authentication by default. Alternatively, externalAuthentication can be configured (see the settings.auth.externalAuthentication parameter).
If both options are disabled, the module uses basic auth with the user admin and an auto-generated password.
Use kubectl to view the password:
kubectl -n d8-system exec svc/deckhouse-leader -c deckhouse -- deckhouse-controller module values prometheus -o json | jq '.internal.auth.password'
Delete the Secret to regenerate the password:
kubectl -n d8-monitoring delete secret/basic-auth
Note! The auth.password parameter is deprecated.
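For the externalAuthentication option mentioned in this section, a minimal sketch of the corresponding ModuleConfig could look as follows. The two URLs are the example values from the parameter descriptions, and the email address is a hypothetical placeholder.

```yaml
apiVersion: deckhouse.io/v1alpha1
kind: ModuleConfig
metadata:
  name: prometheus
spec:
  version: 2
  enabled: true
  settings:
    auth:
      externalAuthentication:
        # The service must return HTTP 200 for authenticated users.
        authURL: https://example.com/dex/auth
        # Users are redirected here when authURL returns a non-200 code.
        authSignInURL: https://example.com/dex/sign_in
      # Hypothetical address; only applies with user-authn or externalAuthentication.
      allowedUserEmails:
        - admin@example.com
```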
Notes
- retentionSize for the main and longterm Prometheus is calculated automatically; you cannot set this value manually!
  - The following calculation algorithm is used:
    - pvc_size * 0.85: if the PVC exists;
    - 10 GiB: if there is no PVC and the StorageClass supports resizing;
    - 25 GiB: if there is no PVC and the StorageClass does not support resizing.
  - If local-storage is used and you have to change the retentionSize, you need to manually change the size of the PV and PVC. Caution! Note that the value of the PVC’s .status.capacity.storage field is used for the calculation since it reflects the actual size of the PV in the case of manual resizing.
- 40 GiB: the size of the PersistentVolumeClaim created by default.
- You can change the size of Prometheus disks in the standard Kubernetes way (if the StorageClass permits this) by editing the .spec.resources.requests.storage field of the PersistentVolumeClaim resource.
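As a rough illustration of the resize described in these notes, the fragment below shows only the relevant part of a PersistentVolumeClaim after editing. The PVC name and the new size are hypothetical; look up the actual claim in the d8-monitoring namespace before changing anything.

```yaml
# Fragment of a PersistentVolumeClaim, not a complete manifest.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-main-db-prometheus-main-0   # hypothetical PVC name
  namespace: d8-monitoring
spec:
  resources:
    requests:
      storage: 60Gi   # hypothetical new size, raised from the 40 GiB default
```

Under the pvc_size * 0.85 rule, and assuming the volume is actually resized to 60 GiB, retentionSize would then be calculated as 60 GiB * 0.85 = 51 GiB.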