Series Overview
This is Part 3 of the Kubernetes Autoscaling Complete Guide series:
- Part 1: Horizontal Pod Autoscaler - Application-level autoscaling theory and approaches
- Part 2: Cluster Autoscaling & Cloud Providers - Infrastructure-level autoscaling strategies
- Part 3 (This Post): Hands-On HPA Demo - Practical implementation with Apache-PHP application
After understanding the theory and strategies of Kubernetes autoscaling in Parts 1 and 2, it’s time to get hands-on. This tutorial walks through a complete end-to-end implementation of Horizontal Pod Autoscaler using a simple Apache-PHP application, demonstrating CPU-based autoscaling in action.
We’ll provision an EKS cluster using AWS CDK (TypeScript), deploy a sample PHP application with Kubernetes manifests, configure HPA, and observe the autoscaling behavior under load.
What We’ll Build
┌─────────────────────────────────────────────────────────────────┐
│ DEMO ARCHITECTURE │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ AWS Infrastructure (CDK) │ │
│ │ │ │
│ │ VPC → EKS Cluster → Managed Node Group │ │
│ │ ↓ ↓ ↓ │ │
│ │ 3 AZs v1.28 t3.medium (1-5 nodes) │ │
│ └──────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Kubernetes Workload (YAML) │ │
│ │ │ │
│ │ Deployment (apache-php) → Service → HPA │ │
│ │ ↓ ↓ ↓ │ │
│ │ Initial: 1 pod ClusterIP Min: 1, Max: 10 │ │
│ │ Image: registry.k8s.io/ Port: 80 Target: 50% CPU │ │
│ │ hpa-example │ │
│ └──────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Load Testing & Monitoring │ │
│ │ │ │
│ │ Load Generator → Observability → Scaling Events │ │
│ │ ↓ ↓ ↓ │ │
│ │ BusyBox Pod kubectl top HPA Metrics │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Prerequisites
Before starting, ensure you have:
# Required tools
- AWS CLI v2.x
- Node.js v18+ and npm
- kubectl v1.28+
- AWS CDK v2.x
- Docker (optional, for local testing)

# AWS credentials configured
aws configure

# Verify installations
aws --version
node --version
kubectl version --client
cdk --version
Project Structure
hpa-demo/
├── cdk/ # AWS CDK Infrastructure
│ ├── bin/
│ │ └── eks-hpa-demo.ts # CDK app entry point
│ ├── lib/
│ │ └── eks-hpa-demo-stack.ts # EKS cluster stack
│ ├── package.json
│ ├── tsconfig.json
│ └── cdk.json
├── k8s/ # Kubernetes Manifests
│ ├── deployment.yaml # Apache-PHP deployment
│ ├── service.yaml # ClusterIP service
│ ├── hpa.yaml # HorizontalPodAutoscaler
│ └── load-generator.yaml # Load testing pod
├── scripts/
│ ├── deploy.sh # Deployment automation
│ └── cleanup.sh # Resource cleanup
└── README.md
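The two scripts/ entries are convenience wrappers that are not shown elsewhere in this post; here is a minimal sketch of what deploy.sh could look like, assuming the cluster name and region configured in the stack below:

#!/usr/bin/env bash
# scripts/deploy.sh -- one-shot provision and deploy (illustrative sketch)
set -euo pipefail

# 1. Provision the EKS infrastructure
(cd cdk && cdk deploy --require-approval never)

# 2. Point kubectl at the new cluster
aws eks update-kubeconfig --name hpa-demo-cluster --region us-west-2

# 3. Deploy the workload and autoscaler
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml
kubectl apply -f k8s/hpa.yaml

kubectl get hpa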
Part 1: Infrastructure Setup with AWS CDK
Step 1: Initialize CDK Project
# Create project directory
mkdir hpa-demo && cd hpa-demo
mkdir cdk && cd cdk

# Initialize CDK project
cdk init app --language=typescript

# CDK v2 ships all AWS constructs in aws-cdk-lib (installed by cdk init);
# no per-service packages are required
npm install aws-cdk-lib constructs
Step 2: Create EKS Stack
Create lib/eks-hpa-demo-stack.ts:
import * as cdk from 'aws-cdk-lib';
import * as eks from 'aws-cdk-lib/aws-eks';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as iam from 'aws-cdk-lib/aws-iam';
import { Construct } from 'constructs';

export class EksHpaDemoStack extends cdk.Stack {
  public readonly cluster: eks.Cluster;

  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Create VPC for EKS cluster
    const vpc = new ec2.Vpc(this, 'EksHpaVpc', {
      maxAzs: 3,
      natGateways: 1, // Cost optimization: use 1 NAT gateway
      subnetConfiguration: [
        {
          cidrMask: 24,
          name: 'public',
          subnetType: ec2.SubnetType.PUBLIC,
        },
        {
          cidrMask: 24,
          name: 'private',
          subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS,
        },
      ],
    });

    // IAM role for EKS cluster
    const clusterRole = new iam.Role(this, 'EksClusterRole', {
      assumedBy: new iam.ServicePrincipal('eks.amazonaws.com'),
      managedPolicies: [
        iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonEKSClusterPolicy'),
      ],
    });

    // Create EKS cluster
    this.cluster = new eks.Cluster(this, 'EksHpaCluster', {
      version: eks.KubernetesVersion.V1_28,
      clusterName: 'hpa-demo-cluster',
      vpc: vpc,
      defaultCapacity: 0, // We'll add a managed node group separately
      role: clusterRole,
      endpointAccess: eks.EndpointAccess.PUBLIC_AND_PRIVATE,

      // Enable cluster logging
      clusterLogging: [
        eks.ClusterLoggingTypes.API,
        eks.ClusterLoggingTypes.AUDIT,
        eks.ClusterLoggingTypes.AUTHENTICATOR,
      ],
    });

    // Add managed node group
    const nodeGroup = this.cluster.addNodegroupCapacity('hpa-demo-nodes', {
      instanceTypes: [
        ec2.InstanceType.of(ec2.InstanceClass.T3, ec2.InstanceSize.MEDIUM),
      ],
      minSize: 1,
      maxSize: 5,
      desiredSize: 2,

      // ON_DEMAND keeps demo behavior predictable; switch to SPOT for cost savings
      capacityType: eks.CapacityType.ON_DEMAND,

      diskSize: 20,

      // Node labels
      labels: {
        'workload-type': 'general',
        'demo': 'hpa',
      },

      // Enable SSH access (optional)
      // remoteAccess: {
      //   sshKeyName: 'your-key-name',
      // },
    });

    // Install Metrics Server via Helm (required for HPA).
    // The chart creates its own ServiceAccount and RBAC objects,
    // so no separate manifest is needed.
    this.cluster.addHelmChart('MetricsServer', {
      chart: 'metrics-server',
      repository: 'https://kubernetes-sigs.github.io/metrics-server/',
      namespace: 'kube-system',
      values: {
        args: [
          '--cert-dir=/tmp',
          '--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname',
          '--kubelet-use-node-status-port',
          '--metric-resolution=15s',
        ],
      },
    });

    // Output cluster details
    new cdk.CfnOutput(this, 'ClusterName', {
      value: this.cluster.clusterName,
      description: 'EKS Cluster Name',
    });

    new cdk.CfnOutput(this, 'ClusterArn', {
      value: this.cluster.clusterArn,
      description: 'EKS Cluster ARN',
    });

    new cdk.CfnOutput(this, 'KubectlRole', {
      value: this.cluster.kubectlRole?.roleArn || 'N/A',
      description: 'IAM Role for kubectl access',
    });

    new cdk.CfnOutput(this, 'ConfigCommand', {
      value: `aws eks update-kubeconfig --name ${this.cluster.clusterName} --region ${this.region}`,
      description: 'Command to configure kubectl',
    });

    // Output for accessing cluster
    new cdk.CfnOutput(this, 'NodeGroupName', {
      value: nodeGroup.nodegroupName,
      description: 'EKS Node Group Name',
    });
  }
}
Step 3: CDK App Entry Point
Create bin/eks-hpa-demo.ts:
#!/usr/bin/env node
import 'source-map-support/register';
import * as cdk from 'aws-cdk-lib';
import { EksHpaDemoStack } from '../lib/eks-hpa-demo-stack';

const app = new cdk.App();

new EksHpaDemoStack(app, 'EksHpaDemoStack', {
  env: {
    account: process.env.CDK_DEFAULT_ACCOUNT,
    region: process.env.CDK_DEFAULT_REGION || 'us-west-2',
  },
  description: 'EKS cluster for HPA demo with Apache-PHP application',
  tags: {
    Project: 'HPA-Demo',
    Environment: 'Development',
    ManagedBy: 'CDK',
  },
});
Step 4: CDK Configuration
Update cdk.json:
{
  "app": "npx ts-node --prefer-ts-exts bin/eks-hpa-demo.ts",
  "watch": {
    "include": [
      "**"
    ],
    "exclude": [
      "README.md",
      "cdk*.json",
      "**/*.d.ts",
      "**/*.js",
      "tsconfig.json",
      "package*.json",
      "yarn.lock",
      "node_modules",
      "test"
    ]
  },
  "context": {
    "@aws-cdk/aws-lambda:recognizeLayerVersion": true,
    "@aws-cdk/core:checkSecretUsage": true,
    "@aws-cdk/core:target-partitions": [
      "aws",
      "aws-cn"
    ],
    "@aws-cdk-containers/ecs-service-extensions:enableDefaultLogDriver": true,
    "@aws-cdk/aws-ec2:uniqueImdsv2TemplateName": true,
    "@aws-cdk/aws-ecs:arnFormatIncludesClusterName": true,
    "@aws-cdk/aws-iam:minimizePolicies": true,
    "@aws-cdk/core:validateSnapshotRemovalPolicy": true,
    "@aws-cdk/aws-codepipeline:crossAccountKeyAliasStackSafeResourceName": true,
    "@aws-cdk/aws-s3:createDefaultLoggingPolicy": true,
    "@aws-cdk/aws-sns-subscriptions:restrictSqsDescryption": true,
    "@aws-cdk/aws-apigateway:disableCloudWatchRole": true,
    "@aws-cdk/core:enablePartitionLiterals": true,
    "@aws-cdk/aws-events:eventsTargetQueueSameAccount": true,
    "@aws-cdk/aws-iam:standardizedServicePrincipals": true,
    "@aws-cdk/aws-ecs:disableExplicitDeploymentControllerForCircuitBreaker": true,
    "@aws-cdk/aws-iam:importedRoleStackSafeDefaultPolicyName": true,
    "@aws-cdk/aws-s3:serverAccessLogsUseBucketPolicy": true,
    "@aws-cdk/aws-route53-patters:useCertificate": true,
    "@aws-cdk/customresources:installLatestAwsSdkDefault": false,
    "@aws-cdk/aws-rds:databaseProxyUniqueResourceName": true,
    "@aws-cdk/aws-codedeploy:removeAlarmsFromDeploymentGroup": true,
    "@aws-cdk/aws-apigateway:authorizerChangeDeploymentLogicalId": true,
    "@aws-cdk/aws-ec2:launchTemplateDefaultUserData": true,
    "@aws-cdk/aws-secretsmanager:useAttachedSecretResourcePolicyForSecretTargetAttachments": true,
    "@aws-cdk/aws-redshift:columnId": true,
    "@aws-cdk/aws-stepfunctions-tasks:enableEmrServicePolicyV2": true,
    "@aws-cdk/aws-ec2:restrictDefaultSecurityGroup": true,
    "@aws-cdk/aws-apigateway:requestValidatorUniqueId": true,
    "@aws-cdk/aws-kms:aliasNameRef": true,
    "@aws-cdk/aws-autoscaling:generateLaunchTemplateInsteadOfLaunchConfig": true,
    "@aws-cdk/core:includePrefixInUniqueNameGeneration": true,
    "@aws-cdk/aws-efs:denyAnonymousAccess": true,
    "@aws-cdk/aws-opensearchservice:enableOpensearchMultiAzWithStandby": true,
    "@aws-cdk/aws-lambda-nodejs:useLatestRuntimeVersion": true,
    "@aws-cdk/aws-efs:mountTargetOrderInsensitiveLogicalId": true,
    "@aws-cdk/aws-rds:auroraClusterChangeScopeOfInstanceParameterGroupWithEachParameters": true,
    "@aws-cdk/aws-appsync:useArnForSourceApiAssociationIdentifier": true,
    "@aws-cdk/aws-rds:preventRenderingDeprecatedCredentials": true,
    "@aws-cdk/aws-codepipeline-actions:useNewDefaultBranchForCodeCommitSource": true,
    "@aws-cdk/aws-cloudwatch-actions:changeLambdaPermissionLogicalIdForLambdaAction": true,
    "@aws-cdk/aws-codepipeline:crossAccountKeysDefaultValueToFalse": true,
    "@aws-cdk/aws-codepipeline:defaultPipelineTypeToV2": true,
    "@aws-cdk/aws-kms:reduceCrossAccountRegionPolicyScope": true,
    "@aws-cdk/aws-eks:nodegroupNameAttribute": true,
    "@aws-cdk/aws-ec2:ebsDefaultGp3Volume": true,
    "@aws-cdk/aws-ecs:removeDefaultDeploymentAlarm": true,
    "@aws-cdk/custom-resources:logApiResponseDataPropertyTrueDefault": false
  }
}
Step 5: Deploy Infrastructure
# Bootstrap CDK (first time only)
cdk bootstrap

# Review CloudFormation template
cdk synth

# Deploy the stack
cdk deploy

# This will:
# 1. Create VPC with public/private subnets
# 2. Provision EKS cluster (takes ~15 minutes)
# 3. Create managed node group
# 4. Install Metrics Server
# 5. Output kubectl configuration command
Step 6: Configure kubectl
# Update kubeconfig (use the output from CDK deploy)
aws eks update-kubeconfig --name hpa-demo-cluster --region us-west-2

# Verify cluster access
kubectl get nodes

# Expected output:
# NAME                                       STATUS   ROLES    AGE   VERSION
# ip-10-0-1-xxx.us-west-2.compute.internal   Ready    <none>   5m    v1.28.x
# ip-10-0-2-xxx.us-west-2.compute.internal   Ready    <none>   5m    v1.28.x

# Verify Metrics Server is running
kubectl get pods -n kube-system | grep metrics-server

# Expected output:
# metrics-server-xxx   1/1   Running   0   5m
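HPA reads pod metrics through the metrics.k8s.io aggregated API, so it is worth confirming that the API is actually registered, not just that the pod is running. A quick check:

# The aggregated metrics API should report Available=True
kubectl get apiservice v1beta1.metrics.k8s.io

# If it is available, node metrics should return data within a minute or so
kubectl top nodes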
Part 2: Application Deployment with Kubernetes YAML
Now let’s deploy the Apache-PHP application with HPA configuration.
Step 1: Create Kubernetes Manifests Directory
cd ..
mkdir k8s
cd k8s
Step 2: Deployment Manifest
Create deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
  namespace: default
  labels:
    app: php-apache
    demo: hpa
spec:
  # Initial replica count (HPA will manage this)
  replicas: 1

  selector:
    matchLabels:
      app: php-apache

  template:
    metadata:
      labels:
        app: php-apache
        demo: hpa
      annotations:
        # Prometheus scraping (optional)
        prometheus.io/scrape: "true"
        prometheus.io/port: "80"
        prometheus.io/path: "/metrics"

    spec:
      containers:
      - name: php-apache
        # Official HPA example image from Kubernetes
        image: registry.k8s.io/hpa-example:latest
        imagePullPolicy: Always

        ports:
        - containerPort: 80
          name: http
          protocol: TCP

        # Resource requests and limits (CRITICAL for HPA)
        resources:
          requests:
            # HPA calculates based on these requests
            cpu: 200m        # 200 millicores = 0.2 CPU
            memory: 128Mi
          limits:
            cpu: 500m        # Max 0.5 CPU
            memory: 256Mi

        # Liveness probe
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 15
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3

        # Readiness probe
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3

        # Environment variables (optional)
        env:
        - name: APP_ENV
          value: "production"
        - name: LOG_LEVEL
          value: "info"

      # Termination grace period
      terminationGracePeriodSeconds: 30

      # Security context
      securityContext:
        runAsNonRoot: false  # Apache needs root
        fsGroup: 33          # www-data group
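Before applying, you can optionally ask the API server to validate the manifest without persisting anything; this catches schema errors and admission failures early:

# Server-side dry run: full validation and admission, no object created
kubectl apply -f deployment.yaml --dry-run=server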
Step 3: Service Manifest
Create service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  namespace: default
  labels:
    app: php-apache
    demo: hpa
spec:
  type: ClusterIP  # Internal service only

  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
    name: http

  selector:
    app: php-apache

  # Session affinity (optional)
  sessionAffinity: None
Step 4: HorizontalPodAutoscaler Manifest
Create hpa.yaml:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache-hpa
  namespace: default
  labels:
    app: php-apache
    demo: hpa
spec:
  # Target deployment
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache

  # Replica limits
  minReplicas: 1
  maxReplicas: 10

  # Scaling metrics
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50  # Target 50% CPU utilization

  # Scaling behavior (optional but recommended)
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # Wait 5 minutes before scaling down
      policies:
      - type: Percent
        value: 50                      # Scale down max 50% of pods at a time
        periodSeconds: 60              # Every minute
      - type: Pods
        value: 2                       # Or max 2 pods per minute
        periodSeconds: 60
      selectPolicy: Min                # Choose the more conservative policy

    scaleUp:
      stabilizationWindowSeconds: 0    # Immediate scale-up
      policies:
      - type: Percent
        value: 100                     # Double the pods
        periodSeconds: 15              # Every 15 seconds
      - type: Pods
        value: 4                       # Or add max 4 pods
        periodSeconds: 15
      selectPolicy: Max                # Choose the more aggressive policy
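Once applied, you can confirm that the behavior stanza was accepted and watch the controller's replica calculation directly (jq assumed to be installed):

# Show the configured scaling behavior
kubectl get hpa php-apache-hpa -o json | jq '.spec.behavior'

# Current vs. desired replicas as seen by the controller
kubectl get hpa php-apache-hpa -o jsonpath='{.status.currentReplicas} -> {.status.desiredReplicas}{"\n"}'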
Step 5: Load Generator Manifest
Create load-generator.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: load-generator
  namespace: default
  labels:
    app: load-generator
    demo: hpa
spec:
  containers:
  - name: busybox
    image: busybox:1.36
    command:
    - /bin/sh
    - -c
    - |
      echo "Starting load generation..."
      echo "Target: http://php-apache.default.svc.cluster.local"
      echo "Stop with: kubectl delete pod load-generator"
      while true; do
        wget -q -O- http://php-apache.default.svc.cluster.local
        sleep 0.01  # up to ~100 requests/second (actual rate depends on response latency)
      done
    resources:
      requests:
        cpu: 100m
        memory: 64Mi
      limits:
        cpu: 200m
        memory: 128Mi
  restartPolicy: Never
Step 6: Deploy to Kubernetes
# Deploy all resources
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl apply -f hpa.yaml

# Verify deployment
kubectl get deployments
kubectl get pods
kubectl get svc
kubectl get hpa

# Expected output:
# NAME         READY   UP-TO-DATE   AVAILABLE   AGE
# php-apache   1/1     1            1           30s

# NAME                          READY   STATUS    RESTARTS   AGE
# php-apache-xxxxxxxxxx-xxxxx   1/1     Running   0          30s

# NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
# php-apache   ClusterIP   10.100.xx.xx   <none>        80/TCP    30s

# NAME             REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
# php-apache-hpa   Deployment/php-apache   0%/50%    1         10        1          30s
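Optionally, block until the rollout completes before moving on to the load test:

# Wait for the deployment to finish rolling out (fails after 60s)
kubectl rollout status deployment/php-apache --timeout=60s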
Part 3: Testing HPA in Action
Step 1: Monitor Initial State
# Terminal 1: Watch HPA status
watch -n 1 kubectl get hpa php-apache-hpa

# Terminal 2: Watch pod status
watch -n 1 kubectl get pods -l app=php-apache

# Terminal 3: Monitor resource usage
watch -n 1 kubectl top pods -l app=php-apache
Step 2: Generate Load
# Apply load generator
kubectl apply -f load-generator.yaml

# Monitor load generator logs
kubectl logs -f load-generator

# You should see continuous HTTP requests being made
Step 3: Observe Autoscaling Behavior
Timeline of Events:
Time CPU Usage Replicas HPA Action
----- --------- -------- -----------
0:00 5% 1 Normal operation
1:00 65% 1 CPU exceeds target (50%)
1:30 80% 2 Scale up to 2 pods
2:00 70% 3 Scale up to 3 pods
2:30 55% 4 Scale up to 4 pods
3:00 45% 4 Stable (below target)
Watch HPA metrics:
# Detailed HPA status
kubectl describe hpa php-apache-hpa

# Output shows:
# - Current CPU utilization
# - Desired replicas calculation
# - Scaling events
# - Conditions

# Example output:
# Name:         php-apache-hpa
# Namespace:    default
# Reference:    Deployment/php-apache
# Metrics:      ( current / target )
#   resource cpu on pods (as a percentage of request): 65% (130m) / 50%
# Min replicas: 1
# Max replicas: 10
# Deployment pods: 3 current / 4 desired
# Events:
#   Type    Reason             Age  From                       Message
#   ----    ------             ---  ----                       -------
#   Normal  SuccessfulRescale  2m   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
#   Normal  SuccessfulRescale  1m   horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target
Step 4: Monitor Metrics
# View pod CPU/memory usage
kubectl top pods

# Expected output during load:
# NAME                          CPU(cores)   MEMORY(bytes)
# php-apache-xxxxxxxxxx-xxxxx   130m         45Mi
# php-apache-xxxxxxxxxx-yyyyy   125m         43Mi
# php-apache-xxxxxxxxxx-zzzzz   128m         44Mi

# View node resource usage
kubectl top nodes

# Check HPA metrics from API
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods" | jq .
Step 5: Stop Load and Observe Scale-Down
# Delete load generator
kubectl delete pod load-generator

# Watch HPA scale down (takes 5 minutes due to stabilizationWindow)
watch kubectl get hpa php-apache-hpa

# Timeline after load stops:
# 0:00  CPU drops to ~5%
# 5:00  HPA starts scale-down
# 5:30  Replicas reduced to 2
# 6:00  Replicas reduced to 1 (minReplicas)
Part 4: Advanced Scenarios
Scenario 1: Adjust HPA Target
# Edit HPA to change target utilization
kubectl edit hpa php-apache-hpa

# Change averageUtilization from 50 to 30
# This will cause more aggressive scaling

# Or apply an updated manifest
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 2      # Changed from 1
  maxReplicas: 15     # Changed from 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 30  # Changed from 50
EOF
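Note that kubectl apply removes fields omitted from the submitted manifest, so the heredoc above also drops the behavior block configured earlier. For a one-off change that leaves the rest of the spec untouched, a JSON patch is the safer sketch:

# Patch only the CPU target; everything else in the spec is preserved
kubectl patch hpa php-apache-hpa --type='json' \
  -p='[{"op":"replace","path":"/spec/metrics/0/resource/target/averageUtilization","value":30}]'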
Scenario 2: Memory-Based Autoscaling
Update hpa.yaml to include memory metrics. With multiple metrics configured, the HPA computes a desired replica count for each metric independently and scales to the largest of them:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10

  metrics:
  # CPU-based scaling
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

  # Memory-based scaling
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70

  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Pods
        value: 1
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Pods
        value: 2
        periodSeconds: 15
Scenario 3: Multiple Load Generators
Create heavier load with multiple generators:
# Start 3 load generators
for i in 1 2 3; do
  kubectl run load-generator-$i --image=busybox:1.36 --restart=Never -- /bin/sh -c "while true; do wget -q -O- http://php-apache; done"
done

# Watch rapid scaling
kubectl get hpa -w

# Clean up (kubectl run labels each pod run=<pod-name>, so delete by name)
for i in 1 2 3; do kubectl delete pod load-generator-$i; done
Scenario 4: Test HPA Limits
# Set very high load to test maxReplicas
# (single quotes so $(seq ...) runs inside the pod; busybox sh lacks {1..100})
kubectl run mega-load --image=busybox:1.36 --restart=Never -- /bin/sh -c 'for i in $(seq 1 100); do (while true; do wget -q -O- http://php-apache; done) & done; wait'

# HPA will scale up to maxReplicas: 10
kubectl get pods -l app=php-apache

# Verify HPA hits the ceiling (ScalingLimited condition with reason TooManyReplicas)
kubectl describe hpa php-apache-hpa | grep -A 1 ScalingLimited
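The grep above depends on describe's human-readable output; for a scriptable check, query the same condition from the status directly (jq assumed):

# ScalingLimited becomes True when the HPA is pinned at minReplicas or maxReplicas
kubectl get hpa php-apache-hpa -o json \
  | jq '.status.conditions[] | select(.type=="ScalingLimited")'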
Part 5: Monitoring and Troubleshooting
View HPA Events
# Get scaling events
kubectl get events --field-selector involvedObject.name=php-apache-hpa --sort-by='.lastTimestamp'

# Example events:
# 5m   Normal   SuccessfulRescale   HPA   New size: 3; reason: cpu resource utilization above target
# 2m   Normal   SuccessfulRescale   HPA   New size: 5; reason: cpu resource utilization above target
# 1m   Normal   SuccessfulRescale   HPA   New size: 4; reason: All metrics below target
Check HPA Conditions
# View HPA conditions
kubectl get hpa php-apache-hpa -o yaml | grep -A 10 conditions

# Healthy HPA shows:
# - AbleToScale: True
# - ScalingActive: True
# - ScalingLimited: False (unless at min/max)
Troubleshooting Common Issues
Issue 1: HPA shows <unknown> for targets
# Check if Metrics Server is running
kubectl get pods -n kube-system | grep metrics-server

# Check Metrics Server logs
kubectl logs -n kube-system deployment/metrics-server

# Verify metrics are available
kubectl top pods

# If Metrics Server is not installed:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Issue 2: HPA not scaling despite high CPU
# Verify resource requests are set
kubectl get deployment php-apache -o yaml | grep -A 5 resources

# Resource requests MUST be defined for HPA to work
# If missing, add them to deployment.yaml and reapply

# Check the HPA calculation
kubectl describe hpa php-apache-hpa
# Look for: "unable to compute replica count"
Issue 3: Pods not starting (Insufficient resources)
# Check node resources
kubectl describe nodes | grep -A 5 "Allocated resources"

# Check pending pods
kubectl get pods | grep Pending

# Describe pending pod
kubectl describe pod <pod-name>
# Look for: "0/2 nodes are available: insufficient cpu"

# Solution: Cluster Autoscaler will add nodes, or manually scale the node group
Part 6: Cleanup
Delete Kubernetes Resources
# Delete HPA
kubectl delete hpa php-apache-hpa

# Delete service
kubectl delete svc php-apache

# Delete deployment
kubectl delete deployment php-apache

# Delete load generator (if still running)
kubectl delete pod load-generator

# Or delete everything at once
kubectl delete -f k8s/
Destroy CDK Infrastructure
cd cdk

# Destroy the stack
cdk destroy

# Confirm when prompted
# This will:
# - Delete EKS cluster
# - Remove node group
# - Delete VPC and subnets
# - Clean up all AWS resources

# Note: Cluster deletion takes ~10 minutes
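The scripts/cleanup.sh referenced in the project structure can wrap both of the steps above; a minimal sketch:

#!/usr/bin/env bash
# scripts/cleanup.sh -- tear down workload, then infrastructure (illustrative sketch)
set -euo pipefail

# Delete Kubernetes resources first, ignoring anything already gone
kubectl delete -f k8s/ --ignore-not-found

# Then destroy the CDK stack without an interactive prompt
(cd cdk && cdk destroy --force)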
Verify Cleanup
# Verify no EKS clusters remain
aws eks list-clusters --region us-west-2

# Verify no running EC2 instances
aws ec2 describe-instances --filters "Name=tag:Project,Values=HPA-Demo" --query "Reservations[].Instances[].InstanceId"

# Check CloudFormation stacks
aws cloudformation list-stacks --stack-status-filter DELETE_IN_PROGRESS DELETE_COMPLETE
Key Takeaways
HPA Formula
The HPA controller uses this formula to calculate desired replicas:
desiredReplicas = ceil[currentReplicas * (currentMetricValue / targetMetricValue)]
Example:
- Current replicas: 2
- Current CPU: 120m per pod (60% of 200m request)
- Target CPU: 50%
desiredReplicas = ceil[2 * (60% / 50%)] = ceil[2 * 1.2] = ceil[2.4] = 3
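The same arithmetic in a small shell function, useful for sanity-checking expected replica counts (illustrative only; the real controller also applies a ~10% tolerance and the stabilization windows configured above):

# Integer ceiling of currentReplicas * currentUtil / targetUtil
desired_replicas() {
  local current=$1 util=$2 target=$3
  echo $(( (current * util + target - 1) / target ))
}

desired_replicas 2 60 50   # prints 3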
Best Practices Demonstrated
Resource Requests are Mandatory
- HPA calculates based on percentage of requested resources
- Without requests, HPA cannot function
Conservative Scale-Down
- 5-minute stabilization window prevents flapping
- Gradual scale-down (50% or 2 pods max per minute)
Aggressive Scale-Up
- Immediate response to load spikes (0s stabilization)
- Fast scale-up (100% or 4 pods per 15 seconds)
Realistic Limits
- minReplicas: 1 (for demo; use 2+ in production)
- maxReplicas: 10 (adjust based on cluster capacity)
Combined with Readiness Probes
- New pods only receive traffic when ready
- Prevents cascading failures during scale-up
Monitoring Checklist
# Essential commands for HPA monitoring
kubectl get hpa                     # Quick status
kubectl describe hpa <name>         # Detailed info
kubectl top pods                    # Resource usage
kubectl get events --watch          # Real-time events
kubectl logs -f deployment/<name>   # Application logs
Related Topics
For more autoscaling knowledge, explore the series:
Autoscaling Series
- Part 1: Horizontal Pod Autoscaler - Theory and approaches
- Part 2: Cluster Autoscaling - Node-level autoscaling
Kubernetes Fundamentals
- Kubernetes Complete Guide (Part 1): Introduction - Architecture and concepts
- Kubernetes Complete Guide (Part 3): Advanced Features - Production practices
Production Kubernetes
- Building Production Kubernetes Platform on AWS EKS - Full platform architecture
Conclusion
This hands-on tutorial demonstrated a complete end-to-end HPA implementation:
- Infrastructure as Code: Provisioned EKS cluster with AWS CDK in TypeScript
- Application Deployment: Used Kubernetes YAML manifests for declarative deployment
- HPA Configuration: Configured CPU-based autoscaling with behavioral controls
- Load Testing: Observed real-time scaling under load
- Production Patterns: Demonstrated best practices for stable autoscaling
What You Learned
- ✅ Setting up EKS cluster with CDK
- ✅ Deploying applications with resource requests
- ✅ Configuring HPA with scaling behaviors
- ✅ Load testing autoscaling behavior
- ✅ Monitoring and troubleshooting HPA
- ✅ Cleanup and cost management
Next Steps
- Experiment with different targets: Try 30%, 70% CPU utilization
- Add memory metrics: Implement multi-metric autoscaling
- Integrate custom metrics: Use Prometheus Adapter for application-specific metrics
- Deploy to production: Apply these patterns to real applications
- Combine with Cluster Autoscaler: See Part 2 for node-level autoscaling
The patterns demonstrated here form the foundation for production-grade Kubernetes autoscaling. Start simple with CPU-based HPA, then progressively adopt advanced techniques as your needs grow.
Happy autoscaling! 🚀