Series Overview
This is Part 5 of the Kubernetes Autoscaling Complete Guide series:
- Part 1: Horizontal Pod Autoscaler - Application-level autoscaling theory
- Part 2: Cluster Autoscaling & Cloud Providers - Infrastructure-level autoscaling
- Part 3: Hands-On HPA Demo - Practical implementation with Apache-PHP
- Part 4: Monitoring, Alerting & Threshold Tuning - Production observability
- Part 5 (This Post): VPA & Resource Optimization - Right-sizing and cost optimization
While Horizontal Pod Autoscaler (HPA) scales the number of pod replicas, Vertical Pod Autoscaler (VPA) optimizes resource requests and limits for individual pods. This guide explores VPA architecture, implementation strategies, safe combination with HPA, and comprehensive resource optimization techniques.
The Resource Management Challenge
The Cost of Misconfigured Resources
OVER-PROVISIONED SCENARIO:
┌─────────────────────────────────────────────────────────────┐
│ Pod Resource Configuration │
│ │
│ Requested: 2 CPU, 4GB RAM │
│ Actual Usage: 0.3 CPU (15%), 800MB RAM (20%) │
│ │
│ Waste: 1.7 CPU (85%), 3.2GB RAM (80%) │
│ Monthly Cost: $120 │
│ Wasted Cost: $102/month per pod │
│ │
│ With 100 pods: $10,200/month wasted │
└─────────────────────────────────────────────────────────────┘
UNDER-PROVISIONED SCENARIO:
┌─────────────────────────────────────────────────────────────┐
│ Pod Resource Configuration │
│ │
│ Requested: 0.5 CPU, 512MB RAM │
│ Actual Usage: 0.8 CPU (160%), 1.2GB RAM (240%) │
│ │
│ Problems: │
│ • CPU throttling → slow response times │
│ • OOMKilled → pod restarts │
│ • Service degradation │
│ • Customer impact → lost revenue │
└─────────────────────────────────────────────────────────────┘
VPA OPTIMIZED:
┌─────────────────────────────────────────────────────────────┐
│ Pod Resource Configuration │
│ │
│ Requested: 0.4 CPU, 1GB RAM │
│ Actual Usage: 0.35 CPU (87%), 900MB RAM (90%) │
│ │
│ Result: │
│ • 80% cost savings vs over-provisioned │
│ • No throttling or OOM issues │
│ • Optimal resource utilization │
└─────────────────────────────────────────────────────────────┘
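The waste figures in these boxes follow from simple arithmetic. A minimal sketch using the illustrative numbers above (the $120/month pod price is an assumption, and CPU is treated as the dominant cost driver, as the box does):

```shell
# Sketch of the waste arithmetic above. All numbers are the
# illustrative figures from the boxes, not real billing data.
req_cpu=2.0; use_cpu=0.3          # cores requested vs. actually used
monthly_cost=120                  # assumed $/month for the full request
pods=100

# Treat the unused CPU fraction as the wasted share of the pod's cost.
waste=$(awk -v r=$req_cpu -v u=$use_cpu -v c=$monthly_cost \
  'BEGIN { printf "%.0f", c * (1 - u / r) }')

echo "wasted per pod: \$${waste}/month"
echo "fleet of $pods pods: \$$((waste * pods))/month"
```

Running it reproduces the box's figures: $102/month wasted per pod, $10,200/month across a 100-pod fleet.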
Business Impact
| Metric | Without VPA | With VPA | Impact |
|---|---|---|---|
| Resource Waste | 40-70% typical | 5-15% | 60%+ cost reduction |
| OOMKilled Events | Common | Rare | Better reliability |
| CPU Throttling | Frequent | Minimal | Improved performance |
| Manual Tuning Time | Hours/week | Automated | Operational efficiency |
| Right-sizing Accuracy | Guesswork | Data-driven | Precision optimization |
Understanding Vertical Pod Autoscaler
VPA Architecture
┌──────────────────────────────────────────────────────────────────────┐
│ VPA ARCHITECTURE │
│ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ VPA ADMISSION CONTROLLER │ │
│ │ │ │
│ │ • Intercepts pod creation requests │ │
│ │ • Injects resource requests/limits │ │
│ │ • Works at pod admission time │ │
│ └────────────────┬───────────────────────────────────────────────┘ │
│ │ │
│ ↓ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ VPA RECOMMENDER │ │
│ │ │ │
│ │ • Monitors pod resource usage (from Metrics Server) │ │
│ │ • Analyzes historical metrics │ │
│ │ • Calculates optimal resource requests │ │
│ │ • Stores recommendations in VPA objects │ │
│ └────────────────┬───────────────────────────────────────────────┘ │
│ │ │
│ ↓ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ VPA UPDATER │ │
│ │ │ │
│ │ • Checks if pods need resource updates │ │
│ │ • Evicts pods with outdated resource configs │ │
│ │ • Triggers pod recreation with new resources │ │
│ │ • Respects PodDisruptionBudgets │ │
│ └────────────────┬───────────────────────────────────────────────┘ │
│ │ │
│ ↓ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ KUBERNETES API & METRICS │ │
│ │ │ │
│ │ Metrics Server → VPA Recommender → VPA Object → Updater │ │
│ └────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
VPA vs HPA Comparison
| Aspect | VPA | HPA |
|---|---|---|
| Scaling Direction | Vertical (resources per pod) | Horizontal (number of pods) |
| What it Changes | CPU/memory requests & limits | Replica count |
| Pod Disruption | Yes (recreation required) | No (gradual) |
| Best For | Right-sizing, cost optimization | Traffic scaling, load handling |
| Stateful Apps | Suitable | Complex |
| Response Time | Minutes (pod restart) | Seconds to minutes |
| Use Case | Unknown resource needs | Known scaling patterns |
| Combine with Other | Can combine with HPA (carefully) | Can combine with VPA |
Part 1: Installing VPA
Prerequisites
# Ensure Metrics Server is installed
kubectl get deployment metrics-server -n kube-system

# Verify metrics are available
kubectl top nodes
kubectl top pods -A
Installation via Manifests
# Clone VPA repository
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler

# Install VPA components
./hack/vpa-up.sh

# Verify installation
kubectl get pods -n kube-system | grep vpa

# Expected output:
# vpa-admission-controller-xxx   1/1   Running   0   2m
# vpa-recommender-xxx            1/1   Running   0   2m
# vpa-updater-xxx                1/1   Running   0   2m

# Verify CRDs
kubectl get crd | grep verticalpodautoscaler

# Expected:
# verticalpodautoscalercheckpoints.autoscaling.k8s.io
# verticalpodautoscalers.autoscaling.k8s.io
Installation via Helm
# Add VPA Helm repository
helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm repo update

# Install VPA
helm install vpa fairwinds-stable/vpa \
  --namespace kube-system \
  --set recommender.enabled=true \
  --set updater.enabled=true \
  --set admissionController.enabled=true

# Verify installation
helm status vpa -n kube-system
kubectl get pods -n kube-system -l app.kubernetes.io/name=vpa
Installation via AWS CDK (EKS Integration)
Add to your CDK stack from Part 3:
import * as cdk from 'aws-cdk-lib';
import * as eks from 'aws-cdk-lib/aws-eks';
import { Construct } from 'constructs';

export class EksVpaStack extends cdk.Stack {
  constructor(scope: Construct, id: string, cluster: eks.Cluster, props?: cdk.StackProps) {
    super(scope, id, props);

    // Install VPA using Helm
    cluster.addHelmChart('VPA', {
      chart: 'vpa',
      repository: 'https://charts.fairwinds.com/stable',
      namespace: 'kube-system',
      release: 'vpa',
      version: '4.4.6', // Check for latest version

      values: {
        // Recommender configuration
        recommender: {
          enabled: true,
          extraArgs: {
            'v': '4', // Verbose logging
            'pod-recommendation-min-cpu-millicores': '25', // Minimum CPU recommendation
            'pod-recommendation-min-memory-mb': '100', // Minimum memory recommendation
            'recommendation-margin-fraction': '0.15', // 15% safety margin
            'storage': 'prometheus', // Optional: use Prometheus for history (pair with prometheus-address)
          },
          resources: {
            requests: { cpu: '200m', memory: '512Mi' },
            limits: { cpu: '500m', memory: '1Gi' },
          },
        },

        // Updater configuration
        updater: {
          enabled: true,
          extraArgs: {
            'min-replicas': '2', // Only update workloads with 2+ replicas
            'eviction-tolerance': '0.5', // At most 50% of pods may be evicted at once
          },
          resources: {
            requests: { cpu: '100m', memory: '256Mi' },
            limits: { cpu: '200m', memory: '512Mi' },
          },
        },

        // Admission Controller configuration
        admissionController: {
          enabled: true,
          generateCertificate: true,
          resources: {
            requests: { cpu: '100m', memory: '256Mi' },
            limits: { cpu: '200m', memory: '512Mi' },
          },
        },

        // Metrics Server dependency
        metrics: {
          enabled: false, // Assuming Metrics Server is already installed
        },
      },
    });

    // Output VPA status check command
    new cdk.CfnOutput(this, 'VPAStatusCommand', {
      value: 'kubectl get pods -n kube-system -l app.kubernetes.io/name=vpa',
      description: 'Command to check VPA pods status',
    });
  }
}
Part 2: VPA Update Modes
VPA supports four update modes that control how it applies recommendations:
Mode 1: Off (Recommendation Only)
Use Case: Testing VPA without impacting workloads
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa-off
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app

  updatePolicy:
    updateMode: "Off"  # Only generate recommendations

# VPA will NOT modify pods, only provide recommendations
# Check recommendations:
#   kubectl describe vpa my-app-vpa-off
Benefits:
- Safe exploration of VPA recommendations
- No disruption to running workloads
- Understand resource usage patterns
- Plan resource adjustments
Example Output:
kubectl describe vpa my-app-vpa-off

# Output shows recommendations:
Recommendation:
  Container Recommendations:
    Container Name:  my-app
    Lower Bound:
      Cpu:     150m
      Memory:  256Mi
    Target:
      Cpu:     300m   # Recommended request
      Memory:  512Mi  # Recommended request
    Uncapped Target:
      Cpu:     300m
      Memory:  512Mi
    Upper Bound:
      Cpu:     1
      Memory:  2Gi
Mode 2: Initial (Apply on Pod Creation Only)
Use Case: New deployments, gradual rollout
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa-initial
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app

  updatePolicy:
    updateMode: "Initial"  # Apply only when pods are created

  resourcePolicy:
    containerPolicies:
      - containerName: my-app
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 2
          memory: 4Gi
Behavior:
- VPA sets resource requests when pods are first created
- No changes to existing running pods
- Useful for new deployments or scaling events
- Safe for production workloads
When to Use:
- Initial deployment with unknown resource needs
- Canary deployments
- Blue/green deployments
- When combined with HPA (pods recreated during scale events)
Mode 3: Recreate (Apply by Restarting Pods)
Use Case: Production optimization with controlled disruption
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa-recreate
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app

  updatePolicy:
    updateMode: "Recreate"  # VPA will evict and recreate pods

  resourcePolicy:
    containerPolicies:
      - containerName: my-app
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 2
          memory: 4Gi
        controlledResources: ["cpu", "memory"]
        mode: Auto  # Auto = VPA manages this container (use Off to exclude it)

# PodDisruptionBudget to control eviction rate
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
  namespace: default
spec:
  minAvailable: 2  # At least 2 pods must remain available
  selector:
    matchLabels:
      app: my-app
Behavior:
- VPA evicts pods with outdated resource configuration
- Pods are recreated with new resource requests
- Respects PodDisruptionBudgets
- Gradual rollout to maintain availability
Important Considerations:
- Disruption: Pods will be restarted
- Stateful Apps: Handle with care (use PVCs, proper shutdown)
- PDBs Required: Prevent cascading failures
- Monitoring: Watch for elevated pod restart rates
Mode 4: Auto (Eviction-Based Today, In-Place Planned)
Status: "Auto" is a valid update mode today, but it currently behaves like Recreate — updates are still applied by evicting pods. True in-place updates depend on the Kubernetes in-place pod resize feature and are not yet used by VPA.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa-auto
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app

  updatePolicy:
    updateMode: "Auto"  # Today: equivalent to Recreate (pod eviction)

# Once VPA supports Kubernetes in-place resource updates,
# "Auto" is intended to apply new resources WITHOUT pod eviction
Expected Behavior (once in-place updates land):
- Update pod resources without restart
- Zero disruption
- Immediate application of new limits
Part 3: VPA Configuration Deep Dive
Basic VPA Configuration
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
  namespace: default
spec:
  # Target workload
  targetRef:
    apiVersion: apps/v1
    kind: Deployment  # Can be: Deployment, StatefulSet, DaemonSet, ReplicaSet
    name: my-app

  # Update policy
  updatePolicy:
    updateMode: "Auto"  # Off, Initial, Recreate, Auto

  # Resource policy (constraints and rules)
  resourcePolicy:
    containerPolicies:
      - containerName: '*'  # Apply to all containers, or specify a name
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 2
          memory: 4Gi
        controlledResources: ["cpu", "memory"]  # What VPA should manage

        # Per-container scaling mode
        mode: Auto  # Auto (VPA manages this container) or Off (exclude it)
Advanced VPA Configuration
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: advanced-vpa
  namespace: production
  labels:
    app: my-app
    environment: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app

  updatePolicy:
    updateMode: "Recreate"

    # Minimum number of replicas required
    minReplicas: 2  # Don't update if fewer than 2 replicas

  resourcePolicy:
    containerPolicies:

      # Application container
      - containerName: app
        minAllowed:
          cpu: 200m
          memory: 256Mi
        maxAllowed:
          cpu: 4
          memory: 8Gi
        controlledResources: ["cpu", "memory"]
        mode: Auto

        # What VPA sets: requests and limits, or requests only
        controlledValues: RequestsAndLimits  # or RequestsOnly

      # Sidecar container (different policy)
      - containerName: sidecar
        minAllowed:
          cpu: 50m
          memory: 64Mi
        maxAllowed:
          cpu: 500m
          memory: 512Mi
        controlledResources: ["cpu", "memory"]
        mode: Auto

  # Recommender configuration
  recommenders:
    - name: custom-recommender  # Use custom recommender if deployed
Resource Policy Options Explained
controlledResources
# Option 1: Manage both CPU and memory
controlledResources: ["cpu", "memory"]

# Option 2: CPU only
controlledResources: ["cpu"]

# Option 3: Memory only
controlledResources: ["memory"]
controlledValues
# Option 1: Manage both requests and limits (default)
controlledValues: RequestsAndLimits
# VPA sets both resource requests and limits
# Limit = Request * original limit/request ratio

# Option 2: Manage requests only
controlledValues: RequestsOnly
# VPA only sets resource requests
# Limits remain as defined in the pod spec
Example:
# Original pod spec:
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m      # 5x request
    memory: 512Mi  # 4x request

# With controlledValues: RequestsAndLimits
# VPA recommendation: 200m CPU, 256Mi memory
# VPA sets:
resources:
  requests:
    cpu: 200m
    memory: 256Mi
  limits:
    cpu: 1000m     # 5x request (ratio preserved)
    memory: 1Gi    # 4x request (ratio preserved)

# With controlledValues: RequestsOnly
# VPA sets:
resources:
  requests:
    cpu: 200m
    memory: 256Mi
  limits:
    cpu: 500m      # Original limit (unchanged)
    memory: 512Mi  # Original limit (unchanged)
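The ratio-preserving rule for RequestsAndLimits can be sketched in a couple of lines, using the CPU numbers from the example above:

```shell
# Sketch: with RequestsAndLimits, VPA keeps the original
# limit/request ratio when it raises the request.
orig_req=100; orig_lim=500   # millicores from the original pod spec (5x ratio)
new_req=200                  # VPA's recommended request

new_lim=$((new_req * orig_lim / orig_req))
echo "new limit: ${new_lim}m"
```

With a 5x ratio and a 200m recommended request, the computed limit is 1000m, matching the example.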
Part 4: Combining VPA with HPA
The Challenge
VPA and HPA can conflict when both try to manage the same workload:
Conflict Scenario:
┌────────────────────────────────────────────────────────────┐
│ Time: 10:00 - High CPU usage detected │
│ │
│ HPA: "CPU is high, scale from 3 to 6 pods" │
│ VPA: "CPU is high, increase CPU requests from 100m to 200m"│
│ │
│ Result: Both scale simultaneously │
│ • HPA adds 3 pods with old 100m requests │
│ • VPA tries to recreate all 6 pods with 200m requests │
│ • Cascading pod restarts │
│ • Service disruption │
└────────────────────────────────────────────────────────────┘
Safe Combination Strategies
Strategy 1: VPA for CPU, HPA for Custom Metrics
Recommendation: Most common and safest approach
# VPA configuration
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app

  updatePolicy:
    updateMode: "Initial"  # Only apply on new pods (from HPA scaling)

  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        minAllowed:
          cpu: 100m
          memory: 256Mi
        maxAllowed:
          cpu: 2
          memory: 4Gi
        # KEY: Only manage CPU
        controlledResources: ["cpu"]
        controlledValues: RequestsOnly

---
# HPA configuration
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app

  minReplicas: 2
  maxReplicas: 20

  # KEY: Use custom metrics, NOT CPU
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "1000"

    # Or use memory (since VPA manages CPU)
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
Why This Works:
- VPA optimizes CPU requests based on actual usage
- HPA scales replicas based on request rate or memory
- No conflict: they manage different dimensions
Strategy 2: VPA Off Mode + Manual Right-sizing
# VPA in recommendation-only mode
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa-readonly
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app

  updatePolicy:
    updateMode: "Off"  # Recommendations only

---
# HPA manages scaling
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app

  minReplicas: 3
  maxReplicas: 50

  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
Process:
- VPA generates recommendations
- Review recommendations weekly/monthly
- Manually update deployment resource requests
- HPA continues to scale horizontally
Benefits:
- Zero conflict
- Full control over resource changes
- Suitable for conservative environments
Strategy 3: Separate Workloads
Best Practice: Use VPA and HPA on different workloads
# VPA for stateful workloads (vertical scaling)
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: database-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: postgres
  updatePolicy:
    updateMode: "Recreate"
  resourcePolicy:
    containerPolicies:
      - containerName: postgres
        minAllowed:
          cpu: 1
          memory: 2Gi
        maxAllowed:
          cpu: 8
          memory: 32Gi

---
# HPA for stateless workloads (horizontal scaling)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 5
  maxReplicas: 100
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
Configuration Matrix
| VPA Mode | HPA Metric | Result | Recommendation |
|---|---|---|---|
| Off | CPU | ✅ Safe | VPA provides insights, HPA scales |
| Initial | Custom (requests/sec) | ✅ Safe | VPA right-sizes on scale events |
| Initial | Memory | ✅ Safe | Different resources managed |
| Recreate | CPU | ⚠️ Risky | Can cause thrashing |
| Recreate | Custom | ✅ Safe | VPA updates resources, HPA scales on different metric |
| Recreate | Memory | ⚠️ Moderate | Monitor closely |
Part 5: Production VPA Examples
Example 1: Stateless Web Application
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: production
spec:
  replicas: 5
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
          resources:
            requests:
              cpu: 100m      # Initial guess
              memory: 128Mi  # Initial guess
            limits:
              cpu: 500m
              memory: 512Mi

---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app

  updatePolicy:
    updateMode: "Recreate"

  resourcePolicy:
    containerPolicies:
      - containerName: nginx
        minAllowed:
          cpu: 50m
          memory: 64Mi
        maxAllowed:
          cpu: 1
          memory: 1Gi
        controlledResources: ["cpu", "memory"]
        controlledValues: RequestsAndLimits

---
# PDB to ensure availability during updates
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-app-pdb
  namespace: production
spec:
  minAvailable: 3  # Keep at least 3 pods running
  selector:
    matchLabels:
      app: web-app
Example 2: Stateful Database
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: databases
spec:
  serviceName: postgres
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          resources:
            requests:
              cpu: 2
              memory: 4Gi
            limits:
              cpu: 4
              memory: 16Gi
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data

  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi

---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: postgres-vpa
  namespace: databases
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: postgres

  updatePolicy:
    updateMode: "Initial"  # Safer for stateful apps

  resourcePolicy:
    containerPolicies:
      - containerName: postgres
        minAllowed:
          cpu: 1
          memory: 2Gi
        maxAllowed:
          cpu: 8
          memory: 32Gi
        controlledResources: ["cpu", "memory"]
        controlledValues: RequestsOnly  # Keep original limits

---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: postgres-pdb
  namespace: databases
spec:
  maxUnavailable: 1  # Only 1 pod can be down at a time
  selector:
    matchLabels:
      app: postgres
Example 3: Microservices with Different Profiles
# CPU-intensive service
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: image-processor-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: image-processor

  updatePolicy:
    updateMode: "Recreate"

  resourcePolicy:
    containerPolicies:
      - containerName: processor
        minAllowed:
          cpu: 500m      # Higher CPU baseline
          memory: 256Mi
        maxAllowed:
          cpu: 8         # Allow significant CPU growth
          memory: 2Gi
        controlledResources: ["cpu", "memory"]

---
# Memory-intensive service
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: cache-service-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cache-service

  updatePolicy:
    updateMode: "Recreate"

  resourcePolicy:
    containerPolicies:
      - containerName: redis
        minAllowed:
          cpu: 100m
          memory: 1Gi    # Higher memory baseline
        maxAllowed:
          cpu: 2
          memory: 16Gi   # Allow significant memory growth
        controlledResources: ["cpu", "memory"]

---
# Balanced service
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-service-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service

  updatePolicy:
    updateMode: "Initial"  # Apply on HPA scale events

  resourcePolicy:
    containerPolicies:
      - containerName: api
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 2
          memory: 2Gi
        controlledResources: ["cpu", "memory"]
Part 6: Resource Optimization Strategies
Strategy 1: Rightsizing Workflow
Phase 1: Discovery (Week 1)
# Step 1: Deploy VPA in "Off" mode for all deployments
for deployment in $(kubectl get deployments -n production -o name); do
  cat <<EOF | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: $(basename $deployment)-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: $(basename $deployment)
  updatePolicy:
    updateMode: "Off"
EOF
done

# Step 2: Wait for 7 days to collect data

# Step 3: Collect recommendations
kubectl get vpa -n production -o yaml > vpa-recommendations.yaml

# Step 4: Analyze recommendations
for vpa in $(kubectl get vpa -n production -o name); do
  echo "=== $vpa ==="
  kubectl describe $vpa -n production | grep -A 20 "Target:"
done
Phase 2: Analysis (Week 2)
# Generate resource optimization report
cat > analyze-vpa.sh <<'EOF'
#!/bin/bash

echo "VPA Recommendations Analysis"
echo "============================"
echo ""

for vpa in $(kubectl get vpa -n production -o name); do
  deployment=$(kubectl get $vpa -n production -o jsonpath='{.spec.targetRef.name}')

  echo "Deployment: $deployment"

  # Current requests
  current_cpu=$(kubectl get deployment $deployment -n production -o jsonpath='{.spec.template.spec.containers[0].resources.requests.cpu}')
  current_mem=$(kubectl get deployment $deployment -n production -o jsonpath='{.spec.template.spec.containers[0].resources.requests.memory}')

  # VPA recommendations
  target_cpu=$(kubectl get $vpa -n production -o jsonpath='{.status.recommendation.containerRecommendations[0].target.cpu}')
  target_mem=$(kubectl get $vpa -n production -o jsonpath='{.status.recommendation.containerRecommendations[0].target.memory}')

  echo "  Current: CPU=$current_cpu, Memory=$current_mem"
  echo "  Target:  CPU=$target_cpu, Memory=$target_mem"
  echo ""
done
EOF

chmod +x analyze-vpa.sh
./analyze-vpa.sh
Phase 3: Implementation (Week 3)
# Apply recommendations gradually
# Start with non-critical services

# 1. Apply the recommended requests (validate in a test environment first)
kubectl patch deployment my-app -n production -p '{
  "spec": {
    "template": {
      "spec": {
        "containers": [
          {
            "name": "my-app",
            "resources": {
              "requests": {
                "cpu": "300m",
                "memory": "512Mi"
              }
            }
          }
        ]
      }
    }
  }
}'

# 2. Monitor for issues (kubectl top has no --watch flag; use watch)
watch kubectl top pods -n production -l app=my-app

# 3. If stable, proceed with the remaining services
Strategy 2: Cluster-Wide Optimization
# Create VPA for all deployments using a script
apiVersion: v1
kind: ConfigMap
metadata:
  name: vpa-automation
  namespace: kube-system
data:
  create-vpas.sh: |
    #!/bin/bash

    # Create VPA for all deployments in specific namespaces
    NAMESPACES="production staging development"

    for ns in $NAMESPACES; do
      for deployment in $(kubectl get deployments -n $ns -o name); do
        deployment_name=$(basename $deployment)

        cat <<EOF | kubectl apply -f -
    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: ${deployment_name}-vpa
      namespace: $ns
      labels:
        managed-by: vpa-automation
    spec:
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: ${deployment_name}
      updatePolicy:
        updateMode: "Initial"  # Safe default
      resourcePolicy:
        containerPolicies:
          - containerName: '*'
            minAllowed:
              cpu: 50m
              memory: 64Mi
            maxAllowed:
              cpu: 4
              memory: 8Gi
    EOF
      done
    done

    echo "VPA objects created for all deployments"

---
# CronJob to run automation weekly
apiVersion: batch/v1
kind: CronJob
metadata:
  name: vpa-optimizer
  namespace: kube-system
spec:
  schedule: "0 2 * * 0"  # Every Sunday at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: vpa-automation
          containers:
            - name: optimizer
              image: bitnami/kubectl:latest
              command:
                - /bin/bash
                - -c
                - |
                  # Generate cost savings report
                  echo "Weekly VPA Optimization Report"
                  echo "=============================="

                  for ns in production staging; do
                    echo ""
                    echo "Namespace: $ns"
                    echo "---"

                    for vpa in $(kubectl get vpa -n $ns -o name); do
                      deployment=$(kubectl get $vpa -n $ns -o jsonpath='{.spec.targetRef.name}')

                      # Calculate potential savings
                      # (This is simplified; a real calculation would be more complex)

                      echo "  $deployment: Review recommendations"
                    done
                  done
          restartPolicy: OnFailure
Strategy 3: Cost Attribution & Showback
# Prometheus rules for cost tracking
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: resource-cost-tracking
  namespace: monitoring
spec:
  groups:
    - name: resource-costs
      interval: 5m
      rules:

        # CPU cost per namespace ($/hour)
        - record: namespace:cpu_cost:sum
          expr: |
            sum(
              kube_pod_container_resource_requests{resource="cpu", unit="core"}
              * 0.04  # $0.04 per CPU hour
            ) by (namespace)

        # Memory cost per namespace ($/hour)
        - record: namespace:memory_cost:sum
          expr: |
            sum(
              kube_pod_container_resource_requests{resource="memory", unit="byte"}
              / (1024*1024*1024)  # Convert to GB
              * 0.005             # $0.005 per GB hour
            ) by (namespace)

        # Total cost per namespace
        - record: namespace:total_cost:sum
          expr: |
            namespace:cpu_cost:sum + namespace:memory_cost:sum

        # VPA optimization potential: summed requests minus per-pod
        # recommendations (rough; scale recommendations by replica
        # count for an accurate figure)
        - record: namespace:vpa_savings_potential:sum
          expr: |
            (
              sum(kube_pod_container_resource_requests{resource="cpu"}) by (namespace)
              -
              sum(kube_verticalpodautoscaler_status_recommendation_containerrecommendations_target{resource="cpu"}) by (namespace)
            )
            * 0.04  # CPU price

---
# Grafana dashboard for cost tracking (ConfigMap)
apiVersion: v1
kind: ConfigMap
metadata:
  name: cost-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: "1"
data:
  cost-dashboard.json: |
    {
      "dashboard": {
        "title": "Kubernetes Cost & VPA Savings",
        "panels": [
          {
            "title": "Monthly Cost by Namespace",
            "targets": [
              {
                "expr": "namespace:total_cost:sum * 730",
                "legendFormat": "{{ namespace }}"
              }
            ]
          },
          {
            "title": "VPA Potential Savings",
            "targets": [
              {
                "expr": "namespace:vpa_savings_potential:sum * 730",
                "legendFormat": "{{ namespace }}"
              }
            ]
          }
        ]
      }
    }
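The price constants in these rules ($0.04 per CPU-hour, $0.005 per GB-hour) are assumed figures, not real cloud prices. A quick sketch of what they imply per month, using the same 730-hours-per-month factor the dashboards multiply by:

```shell
# Sanity-check the hypothetical price constants used in the rules above.
hours_per_month=730
cpu_month=$(awk -v h=$hours_per_month 'BEGIN { printf "%.2f", 0.04 * h }')
mem_month=$(awk -v h=$hours_per_month 'BEGIN { printf "%.2f", 0.005 * h }')

echo "1 CPU core  ~= \$${cpu_month}/month"
echo "1 GB memory ~= \$${mem_month}/month"
```

So one over-provisioned core costs roughly $29/month under these assumptions; substitute your provider's actual rates before using the dashboards for showback.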
Part 7: Monitoring VPA
VPA Metrics
# ServiceMonitor for VPA components
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: vpa-metrics
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: vpa
  endpoints:
  - port: metrics
    interval: 30s

---
# PrometheusRule for VPA alerts
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: vpa-alerts
  namespace: monitoring
spec:
  groups:
  - name: vpa-health
    interval: 30s
    rules:

    # VPA recommender not running
    - alert: VPARecommenderDown
      expr: up{job="vpa-recommender"} == 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "VPA Recommender is down"
        description: "VPA Recommender has been down for 5 minutes"

    # VPA updater not running
    - alert: VPAUpdaterDown
      expr: up{job="vpa-updater"} == 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "VPA Updater is down"
        description: "VPA Updater has been down for 5 minutes"

    # Large discrepancy between current and recommended
    - alert: VPARecommendationMismatch
      expr: |
        (
          kube_pod_container_resource_requests{resource="cpu"}
          /
          kube_verticalpodautoscaler_status_recommendation_containerrecommendations_target{resource="cpu"}
        ) > 2 or
        (
          kube_pod_container_resource_requests{resource="cpu"}
          /
          kube_verticalpodautoscaler_status_recommendation_containerrecommendations_target{resource="cpu"}
        ) < 0.5
      for: 1h
      labels:
        severity: warning
      annotations:
        summary: "Pod resources deviate significantly from VPA recommendation"
        description: "Pod {{ $labels.pod }} in {{ $labels.namespace }} has CPU requests more than 2x above or below the VPA target"

    # OOMKilled pods that VPA should have prevented
    - alert: OOMKilledDespiteVPA
      expr: |
        increase(kube_pod_container_status_terminated_reason{reason="OOMKilled"}[1h]) > 0
        and on(pod, namespace)
        kube_verticalpodautoscaler_spec_updatepolicy_updatemode{update_mode!="Off"} == 1
      labels:
        severity: warning
      annotations:
        summary: "Pod OOMKilled despite VPA enabled"
        description: "Pod {{ $labels.pod }} was OOMKilled even though VPA is active. Review VPA maxAllowed settings."
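The VPARecommendationMismatch rule flags pods whose CPU requests drift outside a 0.5x–2x band around the VPA target. The same check, as a plain-Python sketch (the function name and default thresholds mirror the alert expression and are not part of any VPA API):

```python
def recommendation_mismatch(request_cores: float, target_cores: float,
                            upper: float = 2.0, lower: float = 0.5) -> bool:
    """Mirror the VPARecommendationMismatch alert: true when the actual
    request is more than 2x above or below the VPA target."""
    if target_cores <= 0:
        return False  # no recommendation yet; nothing to compare against
    ratio = request_cores / target_cores
    return ratio > upper or ratio < lower

# A pod requesting 2 CPU against a 0.4 CPU target is 5x over-provisioned:
print(recommendation_mismatch(2.0, 0.4))  # True
# A pod requesting 0.5 CPU against the same target is within the band:
print(recommendation_mismatch(0.5, 0.4))  # False
```

The one-hour `for:` clause in the alert serves the same purpose as debouncing here: a brief spike past the band should not page anyone.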
Grafana Dashboard for VPA
# Dashboard showing VPA effectiveness
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: vpa-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: "1"
data:
  vpa-overview.json: |
    {
      "dashboard": {
        "title": "VPA Overview",
        "panels": [
          {
            "title": "VPA Recommendations vs Actual",
            "type": "graph",
            "targets": [
              {
                "expr": "kube_pod_container_resource_requests{resource='cpu'}",
                "legendFormat": "Actual - {{ pod }}"
              },
              {
                "expr": "kube_verticalpodautoscaler_status_recommendation_containerrecommendations_target{resource='cpu'}",
                "legendFormat": "VPA Target - {{ target_name }}"
              }
            ]
          },
          {
            "title": "VPA Update Events",
            "type": "table",
            "targets": [
              {
                "expr": "changes(kube_pod_container_resource_requests[1h])",
                "format": "table"
              }
            ]
          },
          {
            "title": "Cost Savings from VPA",
            "type": "stat",
            "targets": [
              {
                "expr": "sum(namespace:vpa_savings_potential:sum) * 730"
              }
            ]
          }
        ]
      }
    }
EOF
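The `* 730` factor in the savings panel converts an hourly savings rate into a monthly figure (24 h x ~30.4 days ≈ 730 hours per month). A minimal sketch of the same arithmetic, using the over-provisioned pod from the introduction; the per-unit prices are illustrative assumptions, not real cloud rates:

```python
HOURS_PER_MONTH = 730  # the same factor the dashboard multiplies by

def monthly_savings(requested_cpu, used_cpu, requested_gb, used_gb,
                    cpu_price_hr=0.03, gb_price_hr=0.005):
    """Monthly cost of the gap between requested and actually used
    resources. Prices are hypothetical per-hour unit costs."""
    hourly_waste = (max(requested_cpu - used_cpu, 0) * cpu_price_hr
                    + max(requested_gb - used_gb, 0) * gb_price_hr)
    return hourly_waste * HOURS_PER_MONTH

# The over-provisioned pod from the intro: 2 CPU / 4 GB requested,
# 0.3 CPU / 0.8 GB actually used.
print(round(monthly_savings(2.0, 0.3, 4.0, 0.8), 2))  # 48.91
```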
Part 8: Troubleshooting VPA
Common Issues
Issue 1: VPA Not Generating Recommendations
Symptoms:
kubectl describe vpa my-app-vpa

# Shows:
# Recommendation: <none>
Diagnosis:
# Check VPA recommender logs
kubectl logs -n kube-system deployment/vpa-recommender

# Check if Metrics Server is working
kubectl top pods -n default

# Verify VPA can access metrics
kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods
Solutions:
# 1. Ensure Metrics Server is installed
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# 2. Wait for sufficient data collection (minimum 24 hours)

# 3. Verify the pod has resource requests defined
kubectl get deployment my-app -o yaml | grep -A 5 resources

# 4. Restart the VPA recommender
kubectl rollout restart deployment/vpa-recommender -n kube-system
Issue 2: VPA Causing Excessive Pod Restarts
Symptoms:
- Frequent pod evictions
- Service disruption
- High pod restart counts
Diagnosis:
# Check pod eviction events
kubectl get events --field-selector reason=Evicted -n production

# View VPA updater logs
kubectl logs -n kube-system deployment/vpa-updater

# Check PodDisruptionBudget
kubectl get pdb -n production
Solutions:
# 1. Add/update PodDisruptionBudget
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2  # Ensure minimum availability
  selector:
    matchLabels:
      app: my-app

---
# 2. Change VPA update mode
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Initial"  # Less disruptive: applied only at pod creation

---
# 3. Increase minReplicas
spec:
  updatePolicy:
    minReplicas: 3  # Don't evict for updates if fewer than 3 replicas
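The PodDisruptionBudget works here because the VPA updater evicts pods through the eviction API, which respects PDBs. A toy sketch of that budget check (simplified; the real eviction logic also accounts for unhealthy pods):

```python
def eviction_allowed(healthy_pods: int, min_available: int) -> bool:
    """A VPA-driven eviction is permitted only if the pod count after
    evicting one pod still satisfies the PDB's minAvailable."""
    return healthy_pods - 1 >= min_available

# With minAvailable: 2, a 3-replica deployment can lose one pod at a time...
print(eviction_allowed(3, 2))  # True
# ...but a 2-replica deployment cannot be disrupted at all:
print(eviction_allowed(2, 2))  # False
```

This is why the checklist below insists on PDBs before enabling "Recreate" mode: without one, the updater may evict every replica of an undersized service at once.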
Issue 3: VPA and HPA Conflict
Symptoms:
- Thrashing (rapid scale up/down)
- Unexpected pod restarts
- Resource request fluctuations
Diagnosis:
# Check both VPA and HPA status
kubectl get vpa,hpa -n production

# View scaling events
kubectl get events --sort-by='.lastTimestamp' | grep -E 'Scaled|Evicted'

# Check if both manage same resources
kubectl describe vpa my-app-vpa | grep controlledResources
kubectl describe hpa my-app-hpa | grep metrics
Solutions:
# Option 1: VPA for CPU, HPA for custom metrics
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      controlledResources: ["cpu"]  # VPA manages CPU only

---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second  # HPA uses custom metric
      target:
        type: AverageValue
        averageValue: "1000"

---
# Option 2: Use VPA in "Off" mode
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"  # Recommendations only
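Option 1 works because the two autoscalers never act on the same signal. A small sketch of that invariant, checking a VPA's `controlledResources` against the resources an HPA scales on (a hypothetical helper, not part of either API):

```python
def scaling_conflict(vpa_resources: set[str],
                     hpa_metric_resources: set[str]) -> set[str]:
    """Resources managed by both autoscalers; a non-empty result
    signals thrashing risk."""
    return vpa_resources & hpa_metric_resources

# VPA on CPU only, HPA on a custom metric: no overlap, safe to combine.
print(scaling_conflict({"cpu"}, set()))              # set()
# VPA on CPU+memory, HPA on CPU utilization: conflict on "cpu".
print(scaling_conflict({"cpu", "memory"}, {"cpu"}))  # {'cpu'}
```

A pre-deployment check like this can run in CI against rendered manifests to catch the conflict before it reaches a cluster.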
Part 9: Best Practices
Production Checklist
✅ Before Enabling VPA:
- Metrics Server installed and verified
- Baseline metrics collected (minimum 7 days)
- PodDisruptionBudgets configured
- Resource limits defined in pod specs
- Monitoring and alerting in place
✅ VPA Configuration:
- Start with “Off” mode for analysis
- Set appropriate min/max bounds
- Use “Initial” mode for safety
- Configure PDBs for “Recreate” mode
- Test in non-production first
✅ When Combining VPA + HPA:
- VPA manages different resources than HPA
- Use “Initial” update mode
- Monitor for conflicts
- Document the strategy
✅ Monitoring:
- Track VPA recommendations vs actual
- Alert on excessive evictions
- Monitor OOMKilled events
- Track cost savings
Deployment Patterns
Pattern 1: Gradual Rollout
1# Week 1: Analysis only
2kubectl apply -f vpa-off-mode.yaml
3
4# Week 2: Apply to test environment
5kubectl apply -f vpa-initial-mode-test.yaml
6
7# Week 3: Apply to production (low-risk services)
8kubectl apply -f vpa-initial-mode-prod.yaml
9
10# Week 4: Expand to more services
11kubectl apply -f vpa-recreate-mode-prod.yaml
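Week 1's analysis-only rollout can be bootstrapped by generating one Off-mode VPA per deployment. A sketch using simple string templating; the deployment names are placeholders, and in practice you would enumerate them with `kubectl get deployments`:

```python
# Hypothetical manifest template for the week-1, Off-mode rollout.
TEMPLATE = """apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: {name}-vpa
  namespace: {namespace}
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {name}
  updatePolicy:
    updateMode: "Off"  # recommendations only, no evictions
"""

def off_mode_vpas(deployments, namespace="production"):
    """Render one Off-mode VPA manifest per deployment, joined with '---'."""
    return "---\n".join(TEMPLATE.format(name=d, namespace=namespace)
                        for d in deployments)

# Hypothetical deployment names:
print(off_mode_vpas(["checkout", "cart"]))
```

Pipe the output to `kubectl apply -f -` once it has been reviewed.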
Pattern 2: Service Tiers
# Tier 1: Critical services - VPA Off mode
# (manual review required)

# Tier 2: Important services - VPA Initial mode
# (recommendations applied only at pod creation)

# Tier 3: Standard services - VPA Recreate mode
# (automatic updates with PDB protection)
Key Takeaways
VPA Value Proposition
- Cost Optimization: 40-70% reduction in wasted resources
- Performance: Right-sized pods perform better
- Automation: Reduces manual resource tuning effort
- Reliability: Prevents OOMKilled events
When to Use VPA
✅ Good Fit:
- Unknown resource requirements
- Variable workload patterns
- Stateful applications
- Long-running services
- Cost optimization initiatives
❌ Not Recommended:
- Short-lived jobs (insufficient data)
- Rapidly fluctuating workloads (churning recommendations cause frequent restarts)
- Critical services without PDBs
- When combined with HPA on same metric
VPA Mode Selection Guide
| Scenario | Recommended Mode | Rationale |
|---|---|---|
| Initial deployment | Off → Initial | Learn first, then apply |
| Stateless apps | Recreate | Safe with PDBs |
| Stateful apps | Initial | Minimize disruption |
| Critical services | Off | Manual control |
| With HPA | Initial + Custom HPA metrics | Avoid conflicts |
| Testing | Off | No impact |
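To keep the choice consistent across teams, the mode-selection table can be encoded as a simple lookup. A sketch; the scenario keys are this post's wording, not a VPA concept, and "Off -> Initial" means start in Off mode and graduate to Initial once recommendations look sane:

```python
# Mapping taken directly from the mode-selection table above.
MODE_GUIDE = {
    "initial deployment": "Off -> Initial",
    "stateless": "Recreate",
    "stateful": "Initial",
    "critical": "Off",
    "with hpa": "Initial",
    "testing": "Off",
}

def recommended_mode(scenario: str) -> str:
    """Look up the suggested VPA updateMode, defaulting to the safe choice."""
    return MODE_GUIDE.get(scenario.lower(), "Off")

print(recommended_mode("Stateless"))  # Recreate
print(recommended_mode("unknown"))    # Off
```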
Related Topics
Autoscaling Series
- Part 1: Horizontal Pod Autoscaler - HPA theory and approaches
- Part 2: Cluster Autoscaling - Node-level autoscaling
- Part 3: Hands-On HPA Demo - Practical implementation
- Part 4: Monitoring & Alerting - Observability
Conclusion
Vertical Pod Autoscaler is a powerful tool for resource optimization in Kubernetes, enabling:
- Automated Right-Sizing: Data-driven resource allocation
- Cost Reduction: Eliminate over-provisioning waste
- Performance Improvement: Prevent throttling and OOMKills
- Operational Efficiency: Reduce manual tuning effort
Implementation Roadmap
Month 1: Foundation
- Install VPA components
- Deploy in “Off” mode cluster-wide
- Collect baseline recommendations
Month 2: Testing
- Enable “Initial” mode in test environment
- Validate recommendations
- Establish monitoring
Month 3: Production
- Gradual rollout to production
- Start with non-critical services
- Expand based on success
Month 4: Optimization
- Fine-tune min/max bounds
- Combine with HPA where appropriate
- Measure cost savings
Next Steps
- Install VPA: Follow installation guide for your platform
- Start Small: Enable “Off” mode for a few deployments
- Analyze Data: Review recommendations after 7 days
- Implement Gradually: Move to “Initial” or “Recreate” mode
- Monitor & Iterate: Track savings and adjust
VPA transforms resource management from guesswork to data-driven optimization, delivering significant cost savings while improving application reliability. Combined with HPA and Cluster Autoscaler, it completes the Kubernetes autoscaling toolkit.
Happy optimizing! 💰📊