Vertical pod autoscaling

How Vertical Pod Autoscaler (VPA) works

The Vertical Pod Autoscaler (VPA) automates container resource management and can improve application performance by keeping resource requests in line with what containers actually use. VPA is especially useful when it’s difficult to estimate resource needs in advance. In an update mode that applies changes (the examples on this page use "Initial" and "Auto"), VPA sets requested resources based on actual usage data collected from the monitoring system. You can also configure it to only provide recommendations without applying changes automatically.
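
As a minimal sketch of the recommendation-only setup, the manifest below assumes the upstream VPA update mode "Off", in which recommendations are calculated but never applied to pods. The names my-app-vpa and my-app are placeholders borrowed from the configuration examples on this page; check which update modes your VPA version supports.

```yaml
# Sketch: recommendation-only VPA (assumes the upstream updateMode "Off").
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa        # placeholder name
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app          # placeholder workload
  updatePolicy:
    updateMode: "Off"     # compute recommendations only, never change pods
```

In this mode the recommendation is recorded in the status of the VPA object, so it can be reviewed before switching to a mode that applies changes.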

How VPA interacts with limits

VPA computes recommendations for the container’s resource requests. It does not choose limits independently: when VPA also controls limits, it scales them so that the original ratio between limits and requests is preserved.

VPA calculates recommended values based on resource usage by the container. What happens to limits depends on how the container’s resources were specified:

If requests and limits are equal, VPA updates both to the recommended value, so they stay equal.
If limits are not specified, VPA will only update requests.
If limits are set but not controlled by VPA, only requests change, so the ratio between requests and limits shifts.
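
The last case can be illustrated with the `controlledValues` field of the upstream `autoscaling.k8s.io/v1` API. This is a sketch that assumes your VPA version supports the field; the object and workload names are placeholders. With `RequestsOnly`, VPA changes requests and leaves the container's limits as written in the pod spec, which is why the ratio shifts.

```yaml
# Sketch: VPA that adjusts requests only (assumes controlledValues is supported).
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa                     # placeholder name
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app                       # placeholder workload
  updatePolicy:
    updateMode: "Initial"
  resourcePolicy:
    containerPolicies:
    - containerName: "*"               # default policy for all containers
      controlledValues: RequestsOnly   # limits stay as defined in the pod spec
```

The alternative value, `RequestsAndLimits`, makes VPA scale limits together with requests.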

Example 1. The cluster contains:

A VPA object:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: test2
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: test2
  updatePolicy:
    updateMode: "Initial"

A pod with specified resources:
```
resources:
  limits:
    cpu: 2
  requests:
    cpu: 1
```
If the container consumes 1 CPU, VPA will recommend 1.168 CPU. In this case, the limit is twice the request, that is, 100% higher than the request. VPA keeps this ratio, so the limit is scaled together with the request: 1168m × 2 = 2336m. When the pod is recreated, VPA will update the resources as follows:
```
resources:
  limits:
    cpu: 2336m
  requests:
    cpu: 1168m
```

Example 2. The cluster contains:

A VPA object:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: test2
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: test2
  updatePolicy:
    updateMode: "Initial"

A pod with specified resources:

resources:
  limits:
    cpu: 1
  requests:
    cpu: 750m

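In this case, the limit is 4/3 of the request: the limit is about 33% higher than the request, or, put differently, the request is 25% below the limit. If VPA recommends 1.168 CPU, the limit is scaled by the same ratio (1168m × 4/3 ≈ 1557m), and the container’s resources will be adjusted to: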

  resources:
    limits:
      cpu: 1557m
    requests:
      cpu: 1168m

If recommendations are not capped, VPA might assign excessively high resource values. For example, a request larger than any node can provide leaves the pod unschedulable.

To prevent this, you can:

Use the maxAllowed parameter in the VPA specification:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: hamster
      minAllowed:
        memory: 100Mi
        cpu: 120m
      maxAllowed:
        memory: 300Mi
        cpu: 350m
      mode: Auto

Configure a LimitRange in the cluster:

apiVersion: v1
kind: LimitRange
metadata:
  name: my-app-limits
spec:
  limits:
  - default:
      cpu: 2
      memory: 4Gi
    defaultRequest:
      cpu: 500m
      memory: 256Mi
    type: Container
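
The LimitRange above sets default values only. To put an explicit upper bound on container resources, a LimitRange can also declare `max` (and `min`) values. The sketch below assumes that your VPA version takes LimitRange constraints into account when applying recommendations, as upstream VPA does; the name and all values are illustrative.

```yaml
# Sketch: LimitRange with explicit bounds (hypothetical name, illustrative values).
apiVersion: v1
kind: LimitRange
metadata:
  name: my-app-bounds
spec:
  limits:
  - type: Container
    min:                  # smallest resources a container may request
      cpu: 100m
      memory: 128Mi
    max:                  # largest resources a container may be given
      cpu: 2
      memory: 4Gi
```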

Monitoring VPA in Grafana

Grafana dashboards help you manage resources with the Vertical Pod Autoscaler (VPA) efficiently. They let you track the current status of VPA, its configuration, and the percentage of pods on which it is active.

Grafana provides several levels of detail for VPA-related information. Key dashboards include:

Main / Namespace — Displays general VPA usage per namespace.
Main / Namespace / Controller — Provides VPA metrics for specific controllers.
Main / Namespace / Controller / Pod — The most granular view, showing data for each individual pod.

Key columns to monitor:

VPA type — Shows the current value of updatePolicy.updateMode, which defines the VPA operating mode. This field appears in the following dashboards:
- Main / Namespace
- Main / Namespace / Controller
- Main / Namespace / Controller / Pod
VPA % (Percentage of pods with VPA enabled) — Displays the percentage of pods within a namespace that have VPA enabled. This helps quickly assess how much of the cluster is covered by automatic resource scaling via VPA.

VPA configuration examples

Example of a minimal VerticalPodAutoscaler resource:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: StatefulSet
    name: my-app

Example of a full VerticalPodAutoscaler resource:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: hamster
      minAllowed:
        memory: 100Mi
        cpu: 120m
      maxAllowed:
        memory: 300Mi
        cpu: 350m
      mode: Auto
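
When a pod runs several containers, a separate policy per container is not always necessary. The sketch below assumes the upstream convention that `containerName: "*"` defines a default policy for containers without their own entry, and that `mode: "Off"` disables autoscaling for a single container. The container name sidecar is hypothetical; the bounds are copied from the full example.

```yaml
# Sketch: default policy for all containers, with one container opted out.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: "*"        # applies to containers without their own policy
      minAllowed:
        memory: 100Mi
        cpu: 120m
      maxAllowed:
        memory: 300Mi
        cpu: 350m
    - containerName: sidecar    # hypothetical container excluded from autoscaling
      mode: "Off"
```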
