Kubernetes Autoscaling Complete Guide (Part 3): Hands-On HPA Demo with Apache-PHP

Series Overview

This is Part 3 of the Kubernetes Autoscaling Complete Guide series.

After covering the theory and strategies of Kubernetes autoscaling in Parts 1 and 2, it’s time to get hands-on. This tutorial walks through a complete end-to-end implementation of the Horizontal Pod Autoscaler (HPA) using a simple Apache-PHP application, demonstrating CPU-based autoscaling in action.

We’ll provision an EKS cluster using AWS CDK (TypeScript), deploy a sample PHP application with Kubernetes manifests, configure HPA, and observe the autoscaling behavior under load.

What We’ll Build

┌─────────────────────────────────────────────────────────────────┐
│                    DEMO ARCHITECTURE                           │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │              AWS Infrastructure (CDK)                    │  │
│  │                                                           │  │
│  │  VPC → EKS Cluster → Managed Node Group                 │  │
│  │   ↓        ↓               ↓                             │  │
│  │  3 AZs   v1.28      t3.medium (1-5 nodes)               │  │
│  └──────────────────────────────────────────────────────────┘  │
│                          ↓                                      │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │         Kubernetes Workload (YAML)                       │  │
│  │                                                           │  │
│  │  Deployment (apache-php) → Service → HPA                │  │
│  │       ↓                        ↓          ↓               │  │
│  │  Initial: 1 pod          ClusterIP    Min: 1, Max: 10   │  │
│  Image: registry.k8s.io/ Port: 80     Target: 50% CPU   │
│  │         hpa-example                                       │  │
│  └──────────────────────────────────────────────────────────┘  │
│                          ↓                                      │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │              Load Testing & Monitoring                   │  │
│  │                                                           │  │
│  │  Load Generator → Observability → Scaling Events        │  │
│  │       ↓                 ↓                ↓                │  │
│  │  BusyBox Pod      kubectl top      HPA Metrics          │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

Prerequisites

Before starting, ensure you have:

# Required tools
- AWS CLI v2.x
- Node.js v18+ and npm
- kubectl v1.28+
- AWS CDK v2.x
- Docker (optional, for local testing)

# AWS credentials configured
aws configure

# Verify installations
aws --version
node --version
kubectl version --client
cdk --version

Project Structure

hpa-demo/
├── cdk/                          # AWS CDK Infrastructure
│   ├── bin/
│   │   └── eks-hpa-demo.ts      # CDK app entry point
│   ├── lib/
│   │   └── eks-hpa-demo-stack.ts # EKS cluster stack
│   ├── package.json
│   ├── tsconfig.json
│   └── cdk.json
├── k8s/                          # Kubernetes Manifests
│   ├── deployment.yaml          # Apache-PHP deployment
│   ├── service.yaml             # ClusterIP service
│   ├── hpa.yaml                 # HorizontalPodAutoscaler
│   └── load-generator.yaml      # Load testing pod
├── scripts/
│   ├── deploy.sh                # Deployment automation
│   └── cleanup.sh               # Resource cleanup
└── README.md

Part 1: Infrastructure Setup with AWS CDK

Step 1: Initialize CDK Project

# Create project directory
mkdir hpa-demo && cd hpa-demo
mkdir cdk && cd cdk

# Initialize CDK project
cdk init app --language=typescript

# Install dependencies (CDK v2 ships all service constructs in aws-cdk-lib,
# which cdk init already installs; the per-service @aws-cdk/* packages are v1-only)
npm install aws-cdk-lib constructs

Step 2: Create EKS Stack

Create lib/eks-hpa-demo-stack.ts:

import * as cdk from 'aws-cdk-lib';
import * as eks from 'aws-cdk-lib/aws-eks';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as iam from 'aws-cdk-lib/aws-iam';
import { Construct } from 'constructs';

export class EksHpaDemoStack extends cdk.Stack {
  public readonly cluster: eks.Cluster;

  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Create VPC for EKS cluster
    const vpc = new ec2.Vpc(this, 'EksHpaVpc', {
      maxAzs: 3,
      natGateways: 1, // Cost optimization: use 1 NAT gateway
      subnetConfiguration: [
        {
          cidrMask: 24,
          name: 'public',
          subnetType: ec2.SubnetType.PUBLIC,
        },
        {
          cidrMask: 24,
          name: 'private',
          subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS,
        },
      ],
    });

    // IAM role for EKS cluster
    const clusterRole = new iam.Role(this, 'EksClusterRole', {
      assumedBy: new iam.ServicePrincipal('eks.amazonaws.com'),
      managedPolicies: [
        iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonEKSClusterPolicy'),
      ],
    });

    // Create EKS cluster
    this.cluster = new eks.Cluster(this, 'EksHpaCluster', {
      version: eks.KubernetesVersion.V1_28,
      clusterName: 'hpa-demo-cluster',
      vpc: vpc,
      defaultCapacity: 0, // We'll add a managed node group separately
      role: clusterRole,
      endpointAccess: eks.EndpointAccess.PUBLIC_AND_PRIVATE,

      // Enable control-plane logging
      clusterLogging: [
        eks.ClusterLoggingTypes.API,
        eks.ClusterLoggingTypes.AUDIT,
        eks.ClusterLoggingTypes.AUTHENTICATOR,
      ],
    });

    // Add managed node group
    const nodeGroup = this.cluster.addNodegroupCapacity('hpa-demo-nodes', {
      instanceTypes: [
        ec2.InstanceType.of(ec2.InstanceClass.T3, ec2.InstanceSize.MEDIUM),
      ],
      minSize: 1,
      maxSize: 5,
      desiredSize: 2,

      // On-demand capacity; switch to CapacityType.SPOT for cost savings
      capacityType: eks.CapacityType.ON_DEMAND,

      diskSize: 20,

      // Node labels
      labels: {
        'workload-type': 'general',
        'demo': 'hpa',
      },

      // Enable SSH access (optional)
      // remoteAccess: {
      //   sshKeyName: 'your-key-name',
      // },
    });

    // Install Metrics Server via Helm (required for HPA). The chart manages
    // its own ServiceAccount and RBAC, so no extra manifests are needed.
    this.cluster.addHelmChart('MetricsServer', {
      chart: 'metrics-server',
      repository: 'https://kubernetes-sigs.github.io/metrics-server/',
      namespace: 'kube-system',
      values: {
        args: [
          '--cert-dir=/tmp',
          '--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname',
          '--kubelet-use-node-status-port',
          '--metric-resolution=15s',
        ],
      },
    });

    // Output cluster details
    new cdk.CfnOutput(this, 'ClusterName', {
      value: this.cluster.clusterName,
      description: 'EKS Cluster Name',
    });

    new cdk.CfnOutput(this, 'ClusterArn', {
      value: this.cluster.clusterArn,
      description: 'EKS Cluster ARN',
    });

    new cdk.CfnOutput(this, 'KubectlRole', {
      value: this.cluster.kubectlRole?.roleArn || 'N/A',
      description: 'IAM Role for kubectl access',
    });

    new cdk.CfnOutput(this, 'ConfigCommand', {
      value: `aws eks update-kubeconfig --name ${this.cluster.clusterName} --region ${this.region}`,
      description: 'Command to configure kubectl',
    });

    // Output for accessing cluster
    new cdk.CfnOutput(this, 'NodeGroupName', {
      value: nodeGroup.nodegroupName,
      description: 'EKS Node Group Name',
    });
  }
}

Step 3: CDK App Entry Point

Create bin/eks-hpa-demo.ts:

#!/usr/bin/env node
import 'source-map-support/register';
import * as cdk from 'aws-cdk-lib';
import { EksHpaDemoStack } from '../lib/eks-hpa-demo-stack';

const app = new cdk.App();

new EksHpaDemoStack(app, 'EksHpaDemoStack', {
  env: {
    account: process.env.CDK_DEFAULT_ACCOUNT,
    region: process.env.CDK_DEFAULT_REGION || 'us-west-2',
  },
  description: 'EKS cluster for HPA demo with Apache-PHP application',
  tags: {
    Project: 'HPA-Demo',
    Environment: 'Development',
    ManagedBy: 'CDK',
  },
});

Step 4: CDK Configuration

Update cdk.json:

{
  "app": "npx ts-node --prefer-ts-exts bin/eks-hpa-demo.ts",
  "watch": {
    "include": [
      "**"
    ],
    "exclude": [
      "README.md",
      "cdk*.json",
      "**/*.d.ts",
      "**/*.js",
      "tsconfig.json",
      "package*.json",
      "yarn.lock",
      "node_modules",
      "test"
    ]
  },
  "context": {
    "@aws-cdk/aws-lambda:recognizeLayerVersion": true,
    "@aws-cdk/core:checkSecretUsage": true,
    "@aws-cdk/core:target-partitions": [
      "aws",
      "aws-cn"
    ],
    "@aws-cdk-containers/ecs-service-extensions:enableDefaultLogDriver": true,
    "@aws-cdk/aws-ec2:uniqueImdsv2TemplateName": true,
    "@aws-cdk/aws-ecs:arnFormatIncludesClusterName": true,
    "@aws-cdk/aws-iam:minimizePolicies": true,
    "@aws-cdk/core:validateSnapshotRemovalPolicy": true,
    "@aws-cdk/aws-codepipeline:crossAccountKeyAliasStackSafeResourceName": true,
    "@aws-cdk/aws-s3:createDefaultLoggingPolicy": true,
    "@aws-cdk/aws-sns-subscriptions:restrictSqsDescryption": true,
    "@aws-cdk/aws-apigateway:disableCloudWatchRole": true,
    "@aws-cdk/core:enablePartitionLiterals": true,
    "@aws-cdk/aws-events:eventsTargetQueueSameAccount": true,
    "@aws-cdk/aws-iam:standardizedServicePrincipals": true,
    "@aws-cdk/aws-ecs:disableExplicitDeploymentControllerForCircuitBreaker": true,
    "@aws-cdk/aws-iam:importedRoleStackSafeDefaultPolicyName": true,
    "@aws-cdk/aws-s3:serverAccessLogsUseBucketPolicy": true,
    "@aws-cdk/aws-route53-patters:useCertificate": true,
    "@aws-cdk/customresources:installLatestAwsSdkDefault": false,
    "@aws-cdk/aws-rds:databaseProxyUniqueResourceName": true,
    "@aws-cdk/aws-codedeploy:removeAlarmsFromDeploymentGroup": true,
    "@aws-cdk/aws-apigateway:authorizerChangeDeploymentLogicalId": true,
    "@aws-cdk/aws-ec2:launchTemplateDefaultUserData": true,
    "@aws-cdk/aws-secretsmanager:useAttachedSecretResourcePolicyForSecretTargetAttachments": true,
    "@aws-cdk/aws-redshift:columnId": true,
    "@aws-cdk/aws-stepfunctions-tasks:enableEmrServicePolicyV2": true,
    "@aws-cdk/aws-ec2:restrictDefaultSecurityGroup": true,
    "@aws-cdk/aws-apigateway:requestValidatorUniqueId": true,
    "@aws-cdk/aws-kms:aliasNameRef": true,
    "@aws-cdk/aws-autoscaling:generateLaunchTemplateInsteadOfLaunchConfig": true,
    "@aws-cdk/core:includePrefixInUniqueNameGeneration": true,
    "@aws-cdk/aws-efs:denyAnonymousAccess": true,
    "@aws-cdk/aws-opensearchservice:enableOpensearchMultiAzWithStandby": true,
    "@aws-cdk/aws-lambda-nodejs:useLatestRuntimeVersion": true,
    "@aws-cdk/aws-efs:mountTargetOrderInsensitiveLogicalId": true,
    "@aws-cdk/aws-rds:auroraClusterChangeScopeOfInstanceParameterGroupWithEachParameters": true,
    "@aws-cdk/aws-appsync:useArnForSourceApiAssociationIdentifier": true,
    "@aws-cdk/aws-rds:preventRenderingDeprecatedCredentials": true,
    "@aws-cdk/aws-codepipeline-actions:useNewDefaultBranchForCodeCommitSource": true,
    "@aws-cdk/aws-cloudwatch-actions:changeLambdaPermissionLogicalIdForLambdaAction": true,
    "@aws-cdk/aws-codepipeline:crossAccountKeysDefaultValueToFalse": true,
    "@aws-cdk/aws-codepipeline:defaultPipelineTypeToV2": true,
    "@aws-cdk/aws-kms:reduceCrossAccountRegionPolicyScope": true,
    "@aws-cdk/aws-eks:nodegroupNameAttribute": true,
    "@aws-cdk/aws-ec2:ebsDefaultGp3Volume": true,
    "@aws-cdk/aws-ecs:removeDefaultDeploymentAlarm": true,
    "@aws-cdk/custom-resources:logApiResponseDataPropertyTrueDefault": false
  }
}

Step 5: Deploy Infrastructure

# Bootstrap CDK (first time only)
cdk bootstrap

# Review CloudFormation template
cdk synth

# Deploy the stack
cdk deploy

# This will:
# 1. Create VPC with public/private subnets
# 2. Provision EKS cluster (takes ~15 minutes)
# 3. Create managed node group
# 4. Install Metrics Server
# 5. Output kubectl configuration command

Step 6: Configure kubectl

# Update kubeconfig (use the output from CDK deploy)
aws eks update-kubeconfig --name hpa-demo-cluster --region us-west-2

# Verify cluster access
kubectl get nodes

# Expected output:
# NAME                          STATUS   ROLES    AGE   VERSION
# ip-10-0-1-xxx.ec2.internal    Ready    <none>   5m    v1.28.x
# ip-10-0-2-xxx.ec2.internal    Ready    <none>   5m    v1.28.x

# Verify Metrics Server is running
kubectl get pods -n kube-system | grep metrics-server

# Expected output:
# metrics-server-xxx   1/1     Running   0          5m

Part 2: Application Deployment with Kubernetes YAML

Now let’s deploy the Apache-PHP application with HPA configuration.

Step 1: Create Kubernetes Manifests Directory

cd ..
mkdir k8s
cd k8s

Step 2: Deployment Manifest

Create deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
  namespace: default
  labels:
    app: php-apache
    demo: hpa
spec:
  # Initial replica count (HPA will manage this)
  replicas: 1

  selector:
    matchLabels:
      app: php-apache

  template:
    metadata:
      labels:
        app: php-apache
        demo: hpa
      annotations:
        # Prometheus scraping (optional)
        prometheus.io/scrape: "true"
        prometheus.io/port: "80"
        prometheus.io/path: "/metrics"

    spec:
      containers:
      - name: php-apache
        # Official HPA example image from Kubernetes
        image: registry.k8s.io/hpa-example:latest
        imagePullPolicy: Always

        ports:
        - containerPort: 80
          name: http
          protocol: TCP

        # Resource requests and limits (CRITICAL for HPA)
        resources:
          requests:
            # HPA calculates based on these requests
            cpu: 200m      # 200 millicores = 0.2 CPU
            memory: 128Mi
          limits:
            cpu: 500m      # Max 0.5 CPU
            memory: 256Mi

        # Liveness probe
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 15
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3

        # Readiness probe
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3

        # Environment variables (optional)
        env:
        - name: APP_ENV
          value: "production"
        - name: LOG_LEVEL
          value: "info"

      # Termination grace period
      terminationGracePeriodSeconds: 30

      # Security context
      securityContext:
        runAsNonRoot: false  # Apache needs root
        fsGroup: 33          # www-data group
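
Those request values are the denominator in every HPA calculation: utilization is measured against the CPU request, not the limit. As a quick sanity check, here is a tiny TypeScript sketch of the percentage HPA derives (130m of usage against the 200m request above works out to 65%):

// Utilization as HPA sees it: actual usage divided by the request, not the limit.
function cpuUtilizationPercent(usageMillicores: number, requestMillicores: number): number {
  return Math.round((usageMillicores / requestMillicores) * 100);
}

console.log(cpuUtilizationPercent(130, 200)); // => 65 (% of the 200m request)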

Step 3: Service Manifest

Create service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: php-apache
  namespace: default
  labels:
    app: php-apache
    demo: hpa
spec:
  type: ClusterIP  # Internal service only

  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
    name: http

  selector:
    app: php-apache

  # Session affinity (optional)
  sessionAffinity: None

Step 4: HorizontalPodAutoscaler Manifest

Create hpa.yaml:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache-hpa
  namespace: default
  labels:
    app: php-apache
    demo: hpa
spec:
  # Target deployment
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache

  # Replica limits
  minReplicas: 1
  maxReplicas: 10

  # Scaling metrics
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50  # Target 50% CPU utilization

  # Scaling behavior (optional but recommended)
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # Wait 5 minutes before scaling down
      policies:
      - type: Percent
        value: 50          # Scale down max 50% of pods at a time
        periodSeconds: 60  # Every minute
      - type: Pods
        value: 2           # Or max 2 pods per minute
        periodSeconds: 60
      selectPolicy: Min    # Choose the more conservative policy

    scaleUp:
      stabilizationWindowSeconds: 0      # Immediate scale up
      policies:
      - type: Percent
        value: 100         # Double the pods
        periodSeconds: 15  # Every 15 seconds
      - type: Pods
        value: 4           # Or add max 4 pods
        periodSeconds: 15
      selectPolicy: Max    # Choose the more aggressive policy
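
The two selectPolicy settings deserve a closer look. Each policy computes how many pods it would allow to be added (or removed) in its period, and selectPolicy picks between them. Here is a simplified single-step TypeScript sketch of that arbitration (the real controller evaluates each policy over its own periodSeconds window):

// Illustrative sketch of selectPolicy arbitration between scaling policies.
type Policy = { type: 'Percent' | 'Pods'; value: number };

function allowedScaleUp(current: number, policies: Policy[], select: 'Min' | 'Max'): number {
  const changes = policies.map(p =>
    p.type === 'Percent' ? Math.ceil(current * (p.value / 100)) : p.value
  );
  const change = select === 'Max' ? Math.max(...changes) : Math.min(...changes);
  return current + change;
}

// With the scaleUp policies above (100% or 4 pods, selectPolicy: Max):
console.log(allowedScaleUp(3, [{ type: 'Percent', value: 100 }, { type: 'Pods', value: 4 }], 'Max'));
// => 7: adding 4 pods beats doubling from 3 to 6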

Step 5: Load Generator Manifest

Create load-generator.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: load-generator
  namespace: default
  labels:
    app: load-generator
    demo: hpa
spec:
  containers:
  - name: busybox
    image: busybox:1.36
    command:
    - /bin/sh
    - -c
    - |
      echo "Starting load generation..."
      echo "Target: http://php-apache.default.svc.cluster.local"
      echo "Press Ctrl+C to stop"
      while true; do
        wget -q -O- http://php-apache.default.svc.cluster.local
        sleep 0.01  # aim for ~100 requests/second (actual rate depends on response time)
      done
    resources:
      requests:
        cpu: 100m
        memory: 64Mi
      limits:
        cpu: 200m
        memory: 128Mi
  restartPolicy: Never
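
If you prefer to drive load from your workstation instead of an in-cluster pod, a minimal Node.js/TypeScript loop works too. This is a hypothetical alternative that assumes kubectl port-forward svc/php-apache 8080:80 is running in another terminal and Node 18+ (for the global fetch):

// Local load generator (sketch). Stop with Ctrl+C.
const target = 'http://localhost:8080/'; // assumed port-forward address

async function generateLoad(): Promise<void> {
  for (;;) {
    try {
      const res = await fetch(target);
      await res.text(); // drain the body so connections can be reused
    } catch {
      // ignore transient errors while pods are being replaced
    }
  }
}

generateLoad();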

Step 6: Deploy to Kubernetes

# Deploy all resources
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl apply -f hpa.yaml

# Verify deployment
kubectl get deployments
kubectl get pods
kubectl get svc
kubectl get hpa

# Expected output:
# NAME         READY   UP-TO-DATE   AVAILABLE   AGE
# php-apache   1/1     1            1           30s

# NAME                          READY   STATUS    RESTARTS   AGE
# php-apache-xxxxxxxxxx-xxxxx   1/1     Running   0          30s

# NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
# php-apache   ClusterIP   10.100.xx.xx    <none>        80/TCP    30s

# NAME             REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
# php-apache-hpa   Deployment/php-apache   0%/50%    1         10        1          30s

Part 3: Testing HPA in Action

Step 1: Monitor Initial State

# Terminal 1: Watch HPA status
watch -n 1 kubectl get hpa php-apache-hpa

# Terminal 2: Watch pod status
watch -n 1 kubectl get pods -l app=php-apache

# Terminal 3: Monitor resource usage
watch -n 1 kubectl top pods -l app=php-apache

Step 2: Generate Load

# Apply load generator
kubectl apply -f load-generator.yaml

# Monitor load generator logs
kubectl logs -f load-generator

# You should see continuous HTTP requests being made

Step 3: Observe Autoscaling Behavior

Timeline of Events:

Time    CPU Usage    Replicas    HPA Action
-----   ---------    --------    -----------
0:00    5%           1           Normal operation
1:00    65%          1           CPU exceeds target (50%)
1:30    80%          2           Scale up to 2 pods
2:00    70%          3           Scale up to 3 pods
2:30    55%          4           Scale up to 4 pods
3:00    45%          4           Stable (below target)

Watch HPA metrics:

# Detailed HPA status
kubectl describe hpa php-apache-hpa

# Output shows:
# - Current CPU utilization
# - Desired replicas calculation
# - Scaling events
# - Conditions

# Example output:
# Name:                                                  php-apache-hpa
# Namespace:                                             default
# Reference:                                             Deployment/php-apache
# Metrics:                                               ( current / target )
#   resource cpu on pods  (as a percentage of request):  65% (130m) / 50%
# Min replicas:                                          1
# Max replicas:                                          10
# Deployment pods:                                       3 current / 4 desired
# Events:
#   Type     Reason             Age   From                       Message
#   ----     ------             ----  ----                       -------
#   Normal   SuccessfulRescale  2m    horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
#   Normal   SuccessfulRescale  1m    horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target

Step 4: Monitor Metrics

# View pod CPU/memory usage
kubectl top pods

# Expected output during load:
# NAME                          CPU(cores)   MEMORY(bytes)
# php-apache-xxxxxxxxxx-xxxxx   130m         45Mi
# php-apache-xxxxxxxxxx-yyyyy   125m         43Mi
# php-apache-xxxxxxxxxx-zzzzz   128m         44Mi

# View node resource usage
kubectl top nodes

# Check HPA metrics from the Metrics API
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods" | jq .
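
The same Metrics API is scriptable. As a sketch, the following TypeScript reads per-container usage through kubectl proxy (assumed to be running on localhost:8001; Node 18+ for fetch):

// List pod metrics from the Metrics API (metrics.k8s.io/v1beta1).
interface PodMetrics {
  metadata: { name: string };
  containers: { name: string; usage: { cpu: string; memory: string } }[];
}

async function listPodMetrics(namespace = 'default'): Promise<void> {
  const url = `http://localhost:8001/apis/metrics.k8s.io/v1beta1/namespaces/${namespace}/pods`;
  const body = (await (await fetch(url)).json()) as { items: PodMetrics[] };
  for (const pod of body.items) {
    for (const c of pod.containers) {
      console.log(`${pod.metadata.name}/${c.name}: cpu=${c.usage.cpu} memory=${c.usage.memory}`);
    }
  }
}

listPodMetrics().catch(console.error);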

Step 5: Stop Load and Observe Scale-Down

# Delete load generator
kubectl delete pod load-generator

# Watch HPA scale down (takes ~5 minutes due to the stabilization window)
watch kubectl get hpa php-apache-hpa

# Timeline after load stops:
# 0:00    CPU drops to ~5%
# 5:00    HPA starts scale-down
# 5:30    Replicas reduced to 2
# 6:00    Replicas reduced to 1 (minReplicas)

Part 4: Advanced Scenarios

Scenario 1: Adjust HPA Target

# Edit HPA to change target utilization
kubectl edit hpa php-apache-hpa

# Change averageUtilization from 50% to 30%
# This will cause more aggressive scaling

# Or apply an updated manifest
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 2      # Changed from 1
  maxReplicas: 15     # Changed from 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 30  # Changed from 50%
EOF

Scenario 2: Memory-Based Autoscaling

Update hpa.yaml to include memory metrics:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10

  metrics:
  # CPU-based scaling
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

  # Memory-based scaling
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70

  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Pods
        value: 1
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Pods
        value: 2
        periodSeconds: 15
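
With several metrics configured, the HPA computes a desired replica count for each metric independently and scales to the largest of them. A small TypeScript sketch of that arbitration:

// Each metric yields its own desired count; the HPA takes the maximum.
function desiredFor(current: number, utilization: number, target: number): number {
  return Math.ceil(current * (utilization / target));
}

const current = 4;
const byCpu = desiredFor(current, 60, 50);    // ceil(4.8)  => 5
const byMemory = desiredFor(current, 90, 70); // ceil(5.14) => 6
console.log(Math.max(byCpu, byMemory));       // HPA scales to 6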

Scenario 3: Multiple Load Generators

Create heavier load with multiple generators:

# Launch 3 load generators
for i in {1..3}; do
  kubectl run load-generator-$i --image=busybox:1.36 --restart=Never -- /bin/sh -c "while true; do wget -q -O- http://php-apache; done"
done

# Watch rapid scaling
kubectl get hpa -w

# Clean up (kubectl run labels each pod run=<its own name>, so delete by name)
for i in {1..3}; do kubectl delete pod load-generator-$i; done

Scenario 4: Test HPA Limits

# Set very high load to test maxReplicas
# (busybox sh has no {1..100} brace expansion, so use seq; single quotes keep
# the command substitution from expanding on your local shell)
kubectl run mega-load --image=busybox:1.36 --restart=Never -- /bin/sh -c 'for i in $(seq 1 100); do (while true; do wget -q -O- http://php-apache; done) & done; wait'

# HPA will scale up to maxReplicas: 10
kubectl get pods -l app=php-apache

# Verify HPA hit the ceiling (the ScalingLimited condition turns True)
kubectl describe hpa php-apache-hpa | grep "ScalingLimited"

Part 5: Monitoring and Troubleshooting

View HPA Events

# Get scaling events
kubectl get events --field-selector involvedObject.name=php-apache-hpa --sort-by='.lastTimestamp'

# Example events:
# 5m   Normal  SuccessfulRescale  HPA  New size: 3; reason: cpu resource utilization above target
# 2m   Normal  SuccessfulRescale  HPA  New size: 5; reason: cpu resource utilization above target
# 1m   Normal  SuccessfulRescale  HPA  New size: 4; reason: All metrics below target

Check HPA Conditions

# View HPA conditions
kubectl get hpa php-apache-hpa -o yaml | grep -A 10 conditions

# Healthy HPA shows:
# - AbleToScale: True
# - ScalingActive: True
# - ScalingLimited: False (unless at min/max)

Troubleshooting Common Issues

Issue 1: HPA shows <unknown> for targets

# Check if Metrics Server is running
kubectl get pods -n kube-system | grep metrics-server

# Check Metrics Server logs
kubectl logs -n kube-system deployment/metrics-server

# Verify metrics are available
kubectl top pods

# If Metrics Server is not installed:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Issue 2: HPA not scaling despite high CPU

# Verify resource requests are set
kubectl get deployment php-apache -o yaml | grep -A 5 resources

# Resource requests MUST be defined for HPA to work
# If missing, add them to deployment.yaml and reapply

# Check HPA calculation
kubectl describe hpa php-apache-hpa
# Look for: "unable to compute replica count"

Issue 3: Pods not starting (Insufficient resources)

# Check node resources
kubectl describe nodes | grep -A 5 "Allocated resources"

# Check pending pods
kubectl get pods | grep Pending

# Describe pending pod
kubectl describe pod <pod-name>
# Look for: "0/2 nodes are available: insufficient cpu"

# Solution: Cluster Autoscaler will add nodes, or manually scale the node group

Part 6: Cleanup

Delete Kubernetes Resources

# Delete HPA
kubectl delete hpa php-apache-hpa

# Delete service
kubectl delete svc php-apache

# Delete deployment
kubectl delete deployment php-apache

# Delete load generator (if still running)
kubectl delete pod load-generator

# Or delete all at once
kubectl delete -f k8s/

Destroy CDK Infrastructure

cd cdk

# Destroy the stack
cdk destroy

# Confirm when prompted
# This will:
# - Delete EKS cluster
# - Remove node group
# - Delete VPC and subnets
# - Clean up all AWS resources

# Note: Cluster deletion takes ~10 minutes

Verify Cleanup

# Verify no EKS clusters
aws eks list-clusters --region us-west-2

# Verify no running EC2 instances
aws ec2 describe-instances --filters "Name=tag:Project,Values=HPA-Demo" --query "Reservations[].Instances[].InstanceId"

# Check CloudFormation stacks
aws cloudformation list-stacks --stack-status-filter DELETE_IN_PROGRESS DELETE_COMPLETE

Key Takeaways

HPA Formula

The HPA controller uses this formula to calculate desired replicas:

desiredReplicas = ceil[currentReplicas * (currentMetricValue / targetMetricValue)]

Example:
- Current replicas: 2
- Current CPU: 120m per pod (60% of 200m request)
- Target CPU: 50%

desiredReplicas = ceil[2 * (60% / 50%)] = ceil[2 * 1.2] = ceil[2.4] = 3
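
To make the arithmetic concrete, here is a minimal TypeScript sketch of the same calculation. It is illustrative only: the real controller also applies a tolerance band (roughly 10%) and the configured behavior policies before acting.

// HPA replica calculation, clamped to the configured bounds.
function desiredReplicas(
  currentReplicas: number,
  currentUtilization: number, // e.g. 60 (= 60% of requested CPU)
  targetUtilization: number,  // e.g. 50
  minReplicas: number,
  maxReplicas: number,
): number {
  const raw = currentReplicas * (currentUtilization / targetUtilization);
  return Math.min(maxReplicas, Math.max(minReplicas, Math.ceil(raw)));
}

console.log(desiredReplicas(2, 60, 50, 1, 10)); // => 3, matching the example above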

Best Practices Demonstrated

  1. Resource Requests are Mandatory

    • HPA calculates based on percentage of requested resources
    • Without requests, HPA cannot function
  2. Conservative Scale-Down

    • 5-minute stabilization window prevents flapping
    • Gradual scale-down (50% or 2 pods max per minute)
  3. Aggressive Scale-Up

    • Immediate response to load spikes (0s stabilization)
    • Fast scale-up (100% or 4 pods per 15 seconds)
  4. Realistic Limits

    • minReplicas: 1 (for demo; use 2+ in production)
    • maxReplicas: 10 (adjust based on cluster capacity)
  5. Combined with Readiness Probes

    • New pods only receive traffic when ready
    • Prevents cascading failures during scale-up

Monitoring Checklist

# Essential commands for HPA monitoring
kubectl get hpa                    # Quick status
kubectl describe hpa <name>        # Detailed info
kubectl top pods                   # Resource usage
kubectl get events --watch         # Real-time events
kubectl logs -f deployment/<name>  # Application logs

For more autoscaling knowledge, explore the related series:

  • Autoscaling Series
  • Kubernetes Fundamentals
  • Production Kubernetes

Conclusion

This hands-on tutorial demonstrated a complete end-to-end HPA implementation:

  1. Infrastructure as Code: Provisioned EKS cluster with AWS CDK in TypeScript
  2. Application Deployment: Used Kubernetes YAML manifests for declarative deployment
  3. HPA Configuration: Configured CPU-based autoscaling with behavioral controls
  4. Load Testing: Observed real-time scaling under load
  5. Production Patterns: Demonstrated best practices for stable autoscaling

What You Learned

  • ✅ Setting up EKS cluster with CDK
  • ✅ Deploying applications with resource requests
  • ✅ Configuring HPA with scaling behaviors
  • ✅ Load testing autoscaling behavior
  • ✅ Monitoring and troubleshooting HPA
  • ✅ Cleanup and cost management

Next Steps

  1. Experiment with different targets: Try 30%, 70% CPU utilization
  2. Add memory metrics: Implement multi-metric autoscaling
  3. Integrate custom metrics: Use Prometheus Adapter for application-specific metrics
  4. Deploy to production: Apply these patterns to real applications
  5. Combine with Cluster Autoscaler: See Part 2 for node-level autoscaling

The patterns demonstrated here form the foundation for production-grade Kubernetes autoscaling. Start simple with CPU-based HPA, then progressively adopt advanced techniques as your needs grow.

Happy autoscaling! 🚀