Monitoring
For each node group (NodeGroup resource), DKP exports availability metrics.
What information does Prometheus collect?
All node group metrics have the d8_node_group_ prefix in their name and carry a node_group_name label containing the name of the node group.
The following metrics are collected for each node group:
d8_node_group_ready: the number of nodes in the group that are in Ready status;
d8_node_group_nodes: the total number of nodes in the group (in any status);
d8_node_group_instances: the total number of instances in the group (in any status);
d8_node_group_desired: the desired (target) number of Machines objects in the group;
d8_node_group_min: the minimum number of instances in the group;
d8_node_group_max: the maximum number of instances in the group;
d8_node_group_up_to_date: the number of nodes in the group in up-to-date state;
d8_node_group_standby: the number of standby nodes in the group (see the standby parameter);
d8_node_group_has_errors: 1 if there are any errors in the node group.
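The metrics listed above can be combined into a simple health check for a node group. The following is a minimal sketch, not part of DKP: the helper names selector and summarize are hypothetical, the sample values are made up for illustration, and the code assumes the metric values have already been fetched from Prometheus by some other means.

```python
from typing import Dict, List

# Prefix shared by all node group metrics, as described above.
PREFIX = "d8_node_group_"


def selector(metric: str, node_group: str) -> str:
    """Build a PromQL selector for one metric of one node group.

    `metric` is the suffix after the common prefix, for example "ready".
    """
    return f'{PREFIX}{metric}{{node_group_name="{node_group}"}}'


def summarize(samples: Dict[str, float]) -> List[str]:
    """Return human-readable findings for a single node group.

    `samples` maps a metric suffix ("nodes", "ready", ...) to its current
    value. Missing metrics are treated as 0 (an assumption of this sketch).
    """
    findings: List[str] = []

    # d8_node_group_has_errors is 1 when the group has any errors.
    if samples.get("has_errors", 0) == 1:
        findings.append("node group reports errors")

    nodes = samples.get("nodes", 0)

    # Compare Ready nodes against the total number of nodes.
    ready = samples.get("ready", 0)
    if ready < nodes:
        findings.append(f"{int(nodes - ready)} node(s) not Ready")

    # Compare up-to-date nodes against the total number of nodes.
    up_to_date = samples.get("up_to_date", 0)
    if up_to_date < nodes:
        findings.append(f"{int(nodes - up_to_date)} node(s) not up to date")

    return findings
```

For a hypothetical group named worker, selector("ready", "worker") produces a string that can be pasted into the Prometheus query field, and summarize({"nodes": 5, "ready": 4, "up_to_date": 5, "has_errors": 0}) reports the single node that is not Ready.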