Available in editions: CE, BE, SE, SE+, EE
The extended-monitoring
module extends cluster monitoring capabilities with additional Prometheus exporters, which allow you to identify potential problems before they affect the operation of services.
Module features:
- Advanced Metrics Collection — collects additional metrics, and also includes ready-made alerts and dashboards that allow you to detect and diagnose incidents faster:
- collects and expounds metrics for free space and inodes on nodes, as well as for objects with a label
extended-monitoring.deckhouse.io/enabled =""
in the namespace; - automatically generates alerts when the thresholds are reached.
- collects and expounds metrics for free space and inodes on nodes, as well as for objects with a label
- Container image monitoring:
- adds metrics and sends alerts about unavailability of container images to registry for all types of workload (
Deployments
,StatefulSets
,DaemonSets
,CronJobs
); - helps to find out in advance about possible problems with launching or updating pods.
- adds metrics and sends alerts about unavailability of container images to registry for all types of workload (
- Cluster Events — collects Kubernetes events and displays them as metrics, which allows you to track the dynamics of changes and respond faster to incidents:
- Certificate control:
- scans the cluster’s Secrets and generates metrics about the expiration of x509 certificates;
- allows you not to miss critical moments and update certificates on time, avoiding application downtime due to expired certificates.