How Vertical Pod Autoscaler (VPA) works
The Vertical Pod Autoscaler (VPA) automates container resource management and can significantly improve application performance. VPA is especially useful when it’s difficult to estimate resource needs in advance. Depending on the operating mode, VPA either sets container resource requests based on actual usage data collected from the monitoring system or only provides recommendations without applying changes automatically.
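For example, recommendation-only behavior corresponds to `updatePolicy.updateMode: "Off"`. A minimal sketch of such a configuration, assuming a Deployment named `my-app` (a placeholder) in the same namespace:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app        # placeholder workload name used for illustration
  updatePolicy:
    updateMode: "Off"   # only publish recommendations; running pods are not modified
```

In this mode, recommendations appear in the VPA object’s status, and pod resources are never changed automatically.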
How VPA interacts with limits
VPA manages the container’s resource requests, but it does not manage limits unless explicitly configured to do so (see the sketch after the list below).
VPA calculates recommended values based on resource usage by the container. This behavior can affect the ratio between requests and limits:
- If requests and limits are equal, VPA will only update the requests, leaving limits unchanged.
- If limits are not specified, VPA will only update requests.
- If limits are set but not controlled by VPA, the ratio between requests and limits may shift.
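One way to make this choice explicit in the upstream `autoscaling.k8s.io/v1` API is the `controlledValues` field of `resourcePolicy.containerPolicies`. A sketch that restricts VPA to requests only (the Deployment name `my-app` is a placeholder):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app                      # placeholder workload name
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: "*"              # apply the policy to all containers in the pod
      controlledValues: RequestsOnly  # update requests and leave limits untouched
```

The following examples show how the requests-to-limits ratio behaves when limits are present.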
Example 1. In the cluster, we have:

- A VPA object:

  ```yaml
  apiVersion: autoscaling.k8s.io/v1
  kind: VerticalPodAutoscaler
  metadata:
    name: test2
  spec:
    targetRef:
      apiVersion: "apps/v1"
      kind: Deployment
      name: test2
    updatePolicy:
      updateMode: "Initial"
  ```

- A pod with specified resources:

  ```yaml
  resources:
    limits:
      cpu: 2
    requests:
      cpu: 1
  ```
If the container consumes 1 CPU, VPA will recommend 1.168 CPU. The requests-to-limits ratio here is 50% (the limit is twice the request), and VPA preserves this ratio when applying the recommendation (2 × 1168m = 2336m). When the pod is recreated, VPA will update the resources as follows:

```yaml
resources:
  limits:
    cpu: 2336m
  requests:
    cpu: 1168m
```
Example 2. The cluster contains:

- A VPA object:

  ```yaml
  apiVersion: autoscaling.k8s.io/v1
  kind: VerticalPodAutoscaler
  metadata:
    name: test2
  spec:
    targetRef:
      apiVersion: "apps/v1"
      kind: Deployment
      name: test2
    updatePolicy:
      updateMode: "Initial"
  ```

- A pod with specified resources:

  ```yaml
  resources:
    limits:
      cpu: 1
    requests:
      cpu: 750m
  ```
In this case, the requests-to-limits ratio is 75% (a 750m request against a 1 CPU limit). If VPA recommends 1.168 CPU, it preserves this ratio (1168m ÷ 0.75 ≈ 1557m) and adjusts the container’s resources to:

```yaml
resources:
  limits:
    cpu: 1557m
  requests:
    cpu: 1168m
```
If resources are not limited, VPA might assign excessively high resource values, which can cause issues.
To prevent this, you can:
- Use the `maxAllowed` parameter in the VPA specification:

  ```yaml
  apiVersion: autoscaling.k8s.io/v1
  kind: VerticalPodAutoscaler
  metadata:
    name: my-app-vpa
  spec:
    targetRef:
      apiVersion: "apps/v1"
      kind: Deployment
      name: my-app
    updatePolicy:
      updateMode: "Auto"
    resourcePolicy:
      containerPolicies:
      - containerName: hamster
        minAllowed:
          memory: 100Mi
          cpu: 120m
        maxAllowed:
          memory: 300Mi
          cpu: 350m
        mode: Auto
  ```

- Configure a `LimitRange` in the cluster:

  ```yaml
  apiVersion: v1
  kind: LimitRange
  metadata:
    name: my-app-limits
  spec:
    limits:
    - default:
        cpu: 2
        memory: 4Gi
      defaultRequest:
        cpu: 500m
        memory: 256Mi
      type: Container
  ```
Monitoring VPA in Grafana
To efficiently manage resources using the Vertical Pod Autoscaler (VPA), it is recommended to use Grafana dashboards. These dashboards allow you to track the current status of VPA, its configuration, and the percentage of pods on which it is active.
Grafana provides several levels of detail for VPA-related information. Key dashboards include:
- Main / Namespace — Displays general VPA usage per namespace.
- Main / Namespace / Controller — Provides VPA metrics for specific controllers.
- Main / Namespace / Controller / Pod — The most granular view, showing data for each individual pod.
Key columns to monitor:
- VPA type — Shows the current value of `updatePolicy.updateMode`, which defines the VPA operating mode. This field appears in the following dashboards:
  - Main / Namespace
  - Main / Namespace / Controller
  - Main / Namespace / Controller / Pod
- VPA % (Percentage of pods with VPA enabled) — Displays the percentage of pods within a namespace that have VPA enabled. This helps quickly assess how much of the cluster is covered by automatic resource scaling via VPA.
VPA configuration examples
- Example of a minimal `VerticalPodAutoscaler` resource:

  ```yaml
  apiVersion: autoscaling.k8s.io/v1
  kind: VerticalPodAutoscaler
  metadata:
    name: my-app-vpa
  spec:
    targetRef:
      apiVersion: "apps/v1"
      kind: StatefulSet
      name: my-app
  ```

- Example of a full `VerticalPodAutoscaler` resource:

  ```yaml
  apiVersion: autoscaling.k8s.io/v1
  kind: VerticalPodAutoscaler
  metadata:
    name: my-app-vpa
  spec:
    targetRef:
      apiVersion: "apps/v1"
      kind: Deployment
      name: my-app
    updatePolicy:
      updateMode: "Auto"
    resourcePolicy:
      containerPolicies:
      - containerName: hamster
        minAllowed:
          memory: 100Mi
          cpu: 120m
        maxAllowed:
          memory: 300Mi
          cpu: 350m
        mode: Auto
  ```