Multidimensional Pod Autoscaling: How It Works
If you run Kubernetes at scale, you’ve probably seen this: you turn on Vertical Pod Autoscaler (VPA) to rightsize requests and your utilization shoots up (great!) – then your Horizontal Pod Autoscaler (HPA) reads that higher utilization as “we’re under pressure” and scales out replicas (not great). Net effect: costs go up right when you were trying to save.
Today we’re launching Phase 1 of Multidimensional Pod Autoscaling (MPA) in nOps Container Rightsizing: automatic HPA threshold alignment that keeps horizontal scaling in sync with your dynamically rightsized containers. It works with your existing HPAs, requires no per-workload tinkering, and rolls back cleanly.
This feature is intended for platform engineers, SREs, and FinOps leaders who want VPA savings and predictable HPA behavior – without babysitting dozens of YAMLs.
What you’ll learn in this post
- Why VPA + HPA can fight each other (and how that burns money)
- How nOps now auto-aligns HPA thresholds with rightsized requests
- Exactly how it works (flow, YAML, and commands)
- DIY option for teams that prefer manual control
- Limitations, GitOps notes (ArgoCD), and what’s next for MPA
The Problem: VPA makes pods efficient; HPA misreads that as “scale out”
When you introduce a Vertical Pod Autoscaler (VPA), the goal is simple: lower requests to better match real demand, pushing utilization closer to 100% and improving efficiency.
However, HPA uses utilization as a signal. If the HPA target is 50% and utilization suddenly jumps from 40% to 85% because VPA reduced requests, HPA interprets this as “over target → add replicas,” even if absolute CPU or memory usage hasn’t actually increased.
The Result: surprise replica growth, noisy oscillations, and higher costs right after a VPA rollout.
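To make that arithmetic concrete (illustrative numbers): a pod using 400m of CPU against a 1000m request reports 40% utilization; if VPA trims the request to 470m, the same 400m of usage now reports ~85%, crossing a 50% target even though demand never changed.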
A few practical gotchas tend to surface when combining VPA and HPA in production:
- Multi-container pods (sidecars) inflate or dilute pod-level signals.
- Per-container HPAs (ContainerResource) behave differently than pod-level (Resource).
- GitOps controllers (e.g., ArgoCD autosync) revert any tuning you try to apply.
The nOps Approach: Multidimensional Pod Autoscaling, Phase 1
We extend our Container Rightsizing (VPA-based) engine to observe the same metric HPA uses and recompute HPA targets to match the new, lower requests.
Here’s the core idea:
New HPA Threshold (%) = ceil( Maximum Usage / Rightsized Request * 100 )
- Pod-level metrics (Resource): we sum the max usage and the rightsized requests across the containers HPA considers.
- Per-container metrics (ContainerResource): we compute the target for the named container, using only that container’s usage and request.
When rightsizing updates requests, nOps automatically writes the corresponding HPA target (e.g., from 50% → 93%) so HPA only scales when true demand grows—not when you’ve simply gotten more efficient.
If you disable rightsizing, we restore the original HPA targets from recorded annotations.
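Expressed as code, the recalculation is just this (a minimal Python sketch; the function name, container names, and numbers are illustrative, not the actual nOps implementation):

import math

def new_hpa_target(max_usage, rightsized_request, container=None):
    # max_usage / rightsized_request: dicts of container name -> CPU millicores (or memory bytes)
    # container=None   -> pod-level Resource metric: sum across the containers HPA considers
    # container="app"  -> per-container ContainerResource metric: use only that container
    if container is None:
        usage = sum(max_usage.values())
        request = sum(rightsized_request.values())
    else:
        usage = max_usage[container]
        request = rightsized_request[container]
    return math.ceil(usage / request * 100)

print(new_hpa_target({"app": 400, "sidecar": 50}, {"app": 430, "sidecar": 60}))         # pod-level -> 92
print(new_hpa_target({"app": 400, "sidecar": 50}, {"app": 430, "sidecar": 60}, "app"))  # per-container -> 94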
What's New: HPA Threshold Alignment (CPU & Memory, Utilization target)
We’ve introduced HPA Threshold Alignment for CPU and memory utilization targets — a change designed to make VPAs and HPAs work together more intelligently.
The benefits include:
- Prevents cost inflation: avoids accidental scale-outs caused by efficient requests.
- Saves engineering time: no more patching HPAs across dozens of services.
- Improves stability: reduces ping-pong between VPA savings and HPA scale-outs.
Scope: CPU/Memory Utilization only. AverageValue, custom/external metrics (e.g., queue length), and non-resource signals are not adjusted.
How It Works
- Detect eligible HPAs. nOps finds HPAs that scale the workload on CPU or memory utilization (pod-level or per-container).
- Analyze usage vs. requests. For each workload (and container, if applicable), nOps compares historical max usage to the rightsized request.
- Recalculate the HPA threshold. New Target (%) = ceil(Max Usage / Rightsized Request * 100)
- Apply the target automatically. We update the HPA’s averageUtilization and store the original in annotations for rollback.
- Revert on disable. Turn off Container Rightsizing and nOps restores the original thresholds.
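You can check what HPA currently sees at any time; for example, assuming a pod-level CPU metric at index 0 on an HPA named web-hpa:

kubectl get hpa web-hpa -n default \
  -o jsonpath='{.spec.metrics[0].resource.target.averageUtilization}'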
How To Turn It On (Helm)
helm upgrade --install nops-agent nops/nops-agent \
--namespace nops-system --create-namespace \
... \
--set vpa.recommender.extraArgs.enable-hpa-adjustment=true
Already using Container Rightsizing? This simply adds HPA alignment on top – no app changes required.
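To double-check that the flag reached the recommender after the upgrade, a quick (illustrative) grep of the rendered args works; exact deployment names depend on the chart:

kubectl -n nops-system get deployments -o yaml | grep 'enable-hpa-adjustment'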
A quick look at HPA YAMLs
Pod-level Resource metric (CPU):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50 # nOps may update this (e.g., -> 93)
Per-container ContainerResource metric (CPU):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
spec:
  metrics:
  - type: ContainerResource
    containerResource:
      name: cpu
      container: app
      target:
        type: Utilization
        averageUtilization: 60 # nOps may update this for container "app"
DIY (Manual) Option
Prefer to manage thresholds yourself (or only for selected services)? Apply the same formula, New Target (%) = ceil(Max Usage / Rightsized Request * 100), and patch the HPA directly.
Patch a pod-level Resource metric:
kubectl patch hpa web-hpa -n default --type='json' \
-p='[
{"op":"replace","path":"/spec/metrics/0/resource/target/averageUtilization","value":180}
]'
Patch a per-container ContainerResource metric:
kubectl patch hpa web-hpa -n default --type='json' \
-p='[
{"op":"replace","path":"/spec/metrics/0/containerResource/target/averageUtilization","value":180}
]'
Adjust the /spec/metrics/<index> to match the metric you’re targeting. For memory, change name: cpu → memory.
Where to get inputs:
- Maximum Usage: Container Rightsizing details modal (red line, last 30 days).
- Rightsized Request: Container Rightsizing dashboard (recommended CPU/memory).
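For example (illustrative numbers): a 30-day maximum CPU usage of 450m against a rightsized request of 250m gives ceil(450 / 250 * 100) = 180% – the value used in the patches above.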
Real-world examples
API service (Deployment) with CPU HPA @ 50%
VPA roughly halves requests, so reported utilization jumps to ~95%. Without alignment, HPA doubles replicas.
With nOps: HPA target recalculated to ~93%. Replicas remain stable. Cost drops; SLOs unchanged.
Pipeline worker with per-container HPA
App container is the signal; sidecar shouldn’t influence scaling.
With nOps: We adjust only the app container’s target (ContainerResource), preserving intended behavior.
Mixed sidecars
Pod-level HPA sums signals across containers with non-zero requests.
nOps uses the same accounting so targets match what HPA truly sees.
GitOps with ArgoCD (autosync)
If HPAs are managed by ArgoCD autosync, Argo will revert nOps’ edits.
- Option 1: Disable autosync for HPAs you want nOps to manage.
- Option 2: Add ignoreDifferences for the target fields:
spec:
  ignoreDifferences:
  - group: autoscaling
    kind: HorizontalPodAutoscaler
    jsonPointers:
    - /spec/metrics/0/resource/target/averageUtilization
    - /spec/metrics/0/containerResource/target/averageUtilization
    # add more indices if you have multiple metrics
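Note that ignoreDifferences only affects diffing; with automated sync enabled, ArgoCD can still overwrite those fields on sync unless the Application also sets the RespectIgnoreDifferences sync option, for example:

spec:
  syncPolicy:
    automated: {}
    syncOptions:
    - RespectIgnoreDifferences=true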
Why this improves on “VPA-only” or “HPA-only”
Improvements include:
- Fewer surprise scale-outs: HPA responds to true demand changes, not just better efficiency.
- Predictable stability: Less oscillation = steadier replicas and latency.
- Cluster-wide savings: Lower requests → tighter bin-packing → fewer nodes → lower bill – without replica inflation.
Operational notes & best practices
Keep in mind:
- Supported signals: CPU/Memory Utilization targets only. AverageValue and custom/external metrics are out of scope for Phase 1.
- Restore anytime: We annotate original targets and can revert when you disable rightsizing.
- Multi-container math: For pod-level metrics, we compute against the same set HPA uses (containers with non-zero requests). For per-container metrics, we compute per the named container.
- GitOps guardrails: If a controller owns the HPA, configure it to tolerate the fields nOps touches.
Current limitations (Phase 1)
Here are the current limitations of Phase 1:
- No automatic alignment for AverageValue, custom, or external metrics.
- We don’t create or remove metrics – only adjust existing CPU/Memory Utilization targets.
- For extreme outliers, you may still want SLO-aware buffers (e.g., cap targets ≤ 200–300%).
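If you implement such a buffer in your own tooling, it is a one-line guard on top of the same formula (illustrative, continuing the Python sketch above):

capped_target = min(math.ceil(usage / request * 100), 300)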
What’s Next (towards full MPA)
In Phase 2, nOps will suggest and (optionally) create and manage HPAs for workloads that don’t have one.
- Why: Many services are rightsized but never gain elastic savings.
- How: Detect eligible Deployments/StatefulSets → propose CPU/Memory Utilization targets using ceil(max_usage / rightsized_request * 100) → set sane min/max replicas and stabilization → keep targets aligned as requests evolve.
- Guardrails: Dry-run previews, GitOps-friendly YAML export, namespace allow-lists, rollback.
- Scope: CPU/Memory Utilization metrics (no custom metrics yet).
Together, Phases 1 and 2 evolve Compute Copilot into a Multidimensional Pod Autoscaler: rightsized containers, the right HPA targets, and the right signals – continuously aligned on autopilot.
The Bottom Line
You shouldn’t have to choose between VPA savings and HPA stability. With Multidimensional Pod Autoscaling (Phase 1), nOps keeps them in lockstep: rightsized requests and right-sized HPA targets—on autopilot.
How to Get Started
If you're already on nOps...
Enable HPA alignment in your nOps K8S Helm deployment:
helm upgrade --install nops-agent nops/nops-agent \
--namespace nops-system --create-namespace \
... \
--set vpa.recommender.extraArgs.enable-hpa-adjustment=true
Need help? Ping your CSM or visit the Help Center.
If you’re new to nOps…
nOps was recently ranked #1 with five stars in G2’s cloud cost management category, and we optimize $2+ billion in cloud spend for our customers.
Join our customers using nOps to understand your cloud costs and leverage automation with complete confidence by booking a demo with one of our AWS experts.
