GCP Flex CUDs: How Flexible Committed Use Discounts Work and How To Optimize Them
Most FinOps teams we talk to have the same complaint about GCP committed use discounts: the traditional resource-based model punishes you for changing your mind. Migrate from N2 to N4 instances and your CUD stops applying. Same thing if you shift a workload to a different region. It’s a discount that works great — right up until your infrastructure evolves, which is roughly always.
In late 2022 Google rolled out Flexible CUDs, a spend-based commitment model that follows your dollars instead of your resource footprint. You commit to a minimum hourly spend, and the discount applies across VM families, regions, and now even GKE and Cloud Run.
But choosing and managing the right discounts can quickly get complex — and the difference between a well-managed CUD portfolio and a poorly-managed one can easily run into six figures annually. Overcommit and you’re writing checks for compute you never used. Undercommit and you’re paying on-demand rates you didn’t have to.
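To see how sharp that overcommit/undercommit tradeoff is, here is a minimal sketch. All numbers are hypothetical (a simulated week of hourly spend and an assumed 46% three-year discount); real bills have more moving parts.

```python
DISCOUNT = 0.46  # assumed 3-year Flex CUD rate for general-purpose compute

def total_cost(hourly_usage, commit):
    """Total bill for a period, given hourly on-demand-equivalent spend
    and an hourly dollar commitment."""
    committed_rate = 1 - DISCOUNT          # pay 54 cents per committed dollar
    cost = 0.0
    for usage in hourly_usage:
        cost += commit * committed_rate    # commitment fee is owed every hour
        cost += max(0.0, usage - commit)   # overflow billed at on-demand rates
    return cost

# Simulated week: $70/hr baseline with $100/hr peaks for 48 hours
usage = [70.0] * 120 + [100.0] * 48

for commit in (0, 70, 85, 100):
    print(f"commit ${commit:>3}/hr -> total ${total_cost(usage, commit):,.0f}")
```

With these illustrative numbers, committing at the $70/hour baseline beats both no commitment and committing at the $100/hour peak; the peak hours simply fall back to on-demand pricing.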
This guide covers how Google Cloud Flexible CUDs actually work under the hood, where they outperform (and underperform) resource-based CUDs, the optimization strategies that separate good FinOps practice from great, and what automated commitment management looks like in 2026.
What Are GCP Flex CUDs?
Compute Flexible Committed Use Discounts (Flex CUDs) are Google’s spend-based commitment instrument for Compute Engine, GKE, and Cloud Run usage. The elevator pitch: instead of promising Google a specific number of vCPUs and GB of memory in one particular region (that’s what resource-based CUDs do), you promise them a dollar amount per hour. Google gives you a flat-rate discount in return, and that discount roams wherever your spend goes.
Compute Flexible CUDs are fully transferable across machine families and regions, and apply across all eligible usage in the same billing account.
Here’s what that looks like in practice:
- You commit to a dollar amount per hour, not a specific resource footprint
- Discounts apply automatically to all eligible usage under the same billing account
- No upfront payment — commitment fees are billed monthly
- Cannot be canceled once purchased — the commitment runs for the full term
The discount rates are flat and predictable — no complicated tiering:
| Commitment Term | General Purpose & Compute-Optimized | Memory-Optimized (M1–M4) | HPC (H3, H4D) |
|---|---|---|---|
| 1 year | Up to 28% off | No 1-year discount | Up to 17% off |
| 3 years | Up to 46% off | Up to 63% off | Up to 38% off |
These rates apply across eligible Compute Engine, GKE (both Standard and Autopilot), and Cloud Run workloads — a single Flex CUD covers all three services.
How Flex CUDs Differ From Resource-Based CUDs
Here’s the side-by-side on compute flexible commitments vs resource-based commitments.
| Feature | Resource-Based CUDs | Flex CUDs |
|---|---|---|
| Commitment type | Specific vCPUs, memory, GPUs, local SSDs (hardware commitments) | Dollar amount per hour |
| Scope | Single region, single project (can be shared) | Entire billing account — any region, any project |
| Machine family | Locked to one family (e.g., N2 in us-central1) | Applies across all eligible families (N1, N2, N4, C2, C4, E2, etc.) |
| Maximum discount | Up to 55% (70% for memory-optimized) | Up to 46% (63% for memory-optimized, 3-year) |
| Eligible services | Compute Engine only | Compute Engine + GKE + Cloud Run |
| Best for | Rock-stable workloads with predictable, unchanging resource needs | Dynamic environments, multi-region deployments, teams adopting new machine series |
The tradeoff boils down to this: resource-based CUDs squeeze out a few more percentage points on per-unit price, but Flex CUDs don’t break when your infrastructure shifts. And infrastructure always shifts.
Forecasting and managing CUDs with the native tools in the Google Cloud Console requires a lot of stitching together — BigQuery exports, manual analysis, and guesswork — which matches what we hear on sales calls. A CFO we spoke with recently described the pain: “It takes a lot of man hours and a lot of understanding how to do this laddering.” The time spent wrangling CUD portfolios manually is time not spent building product.
Which Services and Resources Qualify for Flex CUDs?
Google keeps expanding the eligibility list, which is good news — but it also requires constant attention and adjustment. Here’s where things stand in 2026:
Compute Engine
- General purpose: N1, N2, N2D, N4, N4D, N4A, C3, C3D, C4, C4A, C4D, E2
- Compute-optimized: C2, C2D, H3, H4D
- Memory-optimized: M1, M2, M3, M4
- Storage-optimized: Z3
- Includes all machine types, sole-tenant node types, and sole-tenancy premiums
- Local SSD disks
Google Kubernetes Engine (GKE)
- GKE Standard workloads
- GKE Autopilot Pod workload vCPU, memory, and ephemeral storage
- Does not cover cluster management fees
Cloud Run
- Cloud Run services with instance-based billing: 28% (1-yr) / 46% (3-yr)
- Cloud Run jobs and worker pools: 28% (1-yr) / 46% (3-yr)
- Cloud Run services with request-based billing: 17% (1-yr and 3-yr)
- Cloud Run functions: 17% (1-yr and 3-yr)
What matters most here is the cross-service coverage. Say you move a workload from a fleet of Compute Engine VMs to GKE Autopilot pods — pretty standard modernization move. With resource-based CUDs, your old commitments sit there burning money on resources you’re no longer running. With Flex CUDs, the same dollar commitment covers the new Autopilot spend without any adjustment.
Google Killed the Credit Model — Here's What Changed (2026)
If you bought spend-based CUDs before mid-2025, you lived through Google’s credit-based billing model. It worked like this: commit to $100/hour of on-demand spend, pay $54/hour (46% off), and receive a $100 credit that offsets usage. It worked on paper, but explaining it to your CFO was a nightmare.
Google fixed this. As of January 2026, all customers are on the multiprice model:
- No more credit offsets — discounts apply directly to SKU prices
- Discounted prices appear on your bill, making savings immediately visible
- Expanded coverage now includes memory-optimized VMs, HPC machine series, and additional Cloud Run SKUs
- Simpler FinOps reporting — your committed rate is what you actually pay
The new model lines up better with how AWS shows Savings Plan discounts, which makes multi-cloud FinOps reporting less painful.
5 Strategies That Separate Good CUD Management From Great
Managing a portfolio of flexible CUDs across a real production environment introduces a lot of complexity. The most useful strategies include:
1. Stack Resource-Based and Flex CUDs Together
Don’t pick one — use both. The teams getting the best effective savings rate run a layered approach:
Resource-based CUDs for your most stable workloads — the VMs that haven’t changed region or family in 12+ months. These give you the deepest per-unit discounts (up to 55%, or 70% for memory-optimized).
Flex CUDs for everything else — workloads that shift across regions, teams migrating to newer machine series (N2 → N4, for instance), and any spend on GKE Autopilot or Cloud Run.
Google’s billing engine applies resource-based CUDs first, then applies Flex CUDs to remaining eligible spend. This means you never “double-discount,” and the layering works automatically.
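As a rough illustration of that layering, here is a hypothetical accounting sketch — not Google’s actual billing engine, and all rates and usage figures are assumptions:

```python
def effective_cost(usage_by_family, resource_cuds, flex_commit):
    """usage_by_family: on-demand $ per hour keyed by machine family.
    resource_cuds: {family: (covered_dollars, discount)} resource-based CUDs.
    flex_commit: hourly $ commitment at an assumed 46% Flex discount."""
    FLEX_DISCOUNT = 0.46
    cost = 0.0
    remaining = 0.0
    for family, spend in usage_by_family.items():
        covered, discount = resource_cuds.get(family, (0.0, 0.0))
        covered = min(covered, spend)
        cost += covered * (1 - discount)       # resource-based CUD applies first
        remaining += spend - covered           # leftover is eligible for Flex
    flex_covered = min(flex_commit, remaining)
    cost += flex_commit * (1 - FLEX_DISCOUNT)  # Flex fee owed regardless of use
    cost += remaining - flex_covered           # any overflow billed on-demand
    return cost

usage = {"n2": 50.0, "n4": 30.0, "e2": 20.0}   # hourly on-demand spend
cuds = {"n2": (40.0, 0.55)}                    # 3-yr resource CUD on stable N2s
print(f"${effective_cost(usage, cuds, 40.0):.2f}/hr")
```

The key property: the resource-based tranche absorbs matching usage first, so the Flex commitment only needs to be sized for the residual, more changeable spend.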
2. Buy Small and Often, Not Big and Rarely
Bulk commitment purchasing can introduce a lot of risk and inflexibility. The chances are high that your workloads will look very different in just a few months. If actual usage drops 20% after a bulk purchase, you eat that waste for the entire term.
Better approach: make smaller, more frequent purchases (weekly or even daily) based on trailing usage. Each purchase covers your most recent baseline. As older commitments roll off, you naturally adjust. Less forecasting pressure, more actual savings.
We hear this constantly in sales conversations. A VP of Engineering put it this way: “We are doing consolidations, we will be shutting down some of our databases…what our usage looks like in six months is going to be completely different.” That uncertainty is exactly why incremental purchasing outperforms bulk — you’re not betting your budget on a single forecast.
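One way to sketch the incremental approach — purely illustrative; the percentile and trailing window are tunable assumptions, and real tooling would also account for commitments rolling off:

```python
def next_purchase(trailing_hourly_usage, active_commit, percentile=0.2):
    """Size the next small commitment tranche from a conservative (roughly
    20th-percentile) trailing baseline, minus coverage already active, so
    the new tranche is almost always fully utilized."""
    usage = sorted(trailing_hourly_usage)
    baseline = usage[int(len(usage) * percentile)]
    return max(0.0, baseline - active_commit)   # only top up the gap

# Trailing week of hourly on-demand-equivalent spend (hypothetical)
week = [70, 72, 68, 75, 90, 71, 69, 73] * 21

print(next_purchase(week, active_commit=60.0))
```

Run weekly (or daily), this buys a $9/hour top-up against the illustrative data above; if active coverage already exceeds the baseline, it buys nothing and waits for older commitments to expire.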
3. Use Flex CUDs as Your Migration Safety Net
Here’s a scenario that plays out at every company upgrading infrastructure: you’re running N2 instances with resource-based CUDs. Engineering wants to move to N4 for the 30-40% price-performance improvement. But your CUD commitment is pinned to N2 — every instance you migrate is an instance you’re paying on-demand rates for, while the old CUD sits there burning money on nothing.
Flexible CUDs eliminate this entire category of problem. The commitment follows your spend, not your machine family. Swap N2 for N4, move from us-central1 to us-east1, shift from bare VMs to GKE Autopilot — the discount still applies. Google highlighted exactly this pattern at Cloud Next ’25, showing how Shopify combines GKE compute classes with Flex CUDs so they can safely adopt new hardware with automatic fallback to previous generations. This is particularly key as cloud providers like Google Cloud accelerate the release of new compute types.
4. Stop Ignoring Non-Compute Spend
A surprising number of FinOps teams have solid Compute Engine CUD coverage but zero coverage on GKE Autopilot or Cloud Run. Since Flex CUDs now blanket all three under a single commitment, every uncovered dollar of managed Kubernetes or serverless spend is pure on-demand waste. If your team is modernizing (moving from VMs to containers to serverless), your CUD strategy needs to keep pace.
5. Automate or Fall Behind
Even with the right strategy written down in a Confluence doc, manual commitment management falls apart at scale. Usage drifts daily. Commitments continually expire. The gap between “we know what we should buy” and “we actually bought it on time” is where most teams bleed savings.
A C-Suite executive at a recent prospect put it bluntly: “You would have to be monitoring it on an hourly basis. Otherwise you’re leaving potential savings on the table.”
Automated commitment management closes that gap. Instead of quarterly spreadsheet reviews, an automated system like nOps analyzes usage hourly, purchases incremental commitments, and rebalances as workloads shift — without waiting for a human to click “buy.”
Automate your Google Cloud CUDs with nOps
nOps Commitment Management automates the entire rate optimization lifecycle across GCP (CUD/SUD), AWS and Azure (RI/SP). For Google Cloud specifically, nOps:
- Continuously monitors usage patterns across Compute Engine, GKE, and Cloud Run to identify the optimal blend of resource-based and flexible CUDs
- Purchases commitments incrementally based on real usage data — not annual forecasts — reducing waste, risk, and the length of commitment windows
- Automatically rebalances as workloads shift between regions, machine families, or services
- Provides unified visibility across multicloud commitment portfolios, so multi-cloud teams manage all their savings from a single platform
- Calculates and maximizes effective savings rate so you can measure actual realized savings versus theoretical maximums
Curious what that looks like in your environment? Book a free savings analysis to find out.
Flex CUDs vs. AWS Savings Plans: The Multi-Cloud View
If you’re running both GCP and AWS (and odds are, you are), you need to understand how these mechanisms compare. They’re structurally similar, but the devil is in the details.
| Feature | Google Cloud Flex CUDs | AWS Compute Savings Plans |
|---|---|---|
| Commitment basis | Dollar amount per hour | Dollar amount per hour |
| Cross-region | Yes — billing account-wide | Yes — account-wide |
| Cross-instance family | Yes | Yes |
| Cross-service | Yes (Compute Engine + GKE + Cloud Run) | Yes (EC2 + Fargate + Lambda) |
| Max discount (1-yr) | 28% | ~29% (varies by region/family) |
| Max discount (3-yr) | 46% | ~52% (varies; up to 72% with all upfront RI) |
| Upfront payment options | No upfront (monthly billing only) | No upfront, partial upfront, or all upfront |
| Cancellation | Not allowed | Not allowed |
| Marketplace resale | Not available | Available for some RIs (not Savings Plans) |
Structurally, Flex CUDs are Google Cloud’s answer to AWS Compute Savings Plans. AWS offers deeper maximum discounts if you’re willing to pay upfront and choose longer terms (all-upfront 3-year RIs can hit 72%), but Google Cloud’s model is simpler — fewer knobs to turn, broader per-commitment coverage.
The real challenge for multi-cloud teams isn’t picking the right instrument on each provider individually. It’s managing them together as workloads shift between clouds. A commitment you purchased on GCP last quarter might be problematic this quarter if a workload migrated to AWS (or vice versa). Multicloud commitment management solutions can help simplify and reduce manual work if this occurs.
Google Cloud CUD Pitfalls
Some pitfalls of using committed use discounts (CUDs) include:
Committing to Your Peak, Not Your Baseline
Flex CUDs charge you the commitment fee every hour regardless of usage. If you commit to $100/hour but regularly use only $70/hour, you’re wasting $30/hour — that’s $262,800 a year, or $788,400 over a 3-year term.
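The arithmetic is worth spelling out, because unused commitment compounds quietly (illustrative numbers):

```python
HOURS_PER_YEAR = 24 * 365          # 8,760

commit = 100.0                     # $/hr committed
actual = 70.0                      # $/hr actually used
unused = commit - actual           # $30/hr of commitment never consumed

print(unused * HOURS_PER_YEAR)     # 262800.0 wasted per year
print(unused * HOURS_PER_YEAR * 3) # 788400.0 over a 3-year term
```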
Not Watching Expiration Dates
If you don’t track expirations and purchase renewals before coverage lapses, you’ll have stretches of on-demand pricing eating into your savings. Flexera found this is one of the most common CUD management failures across GCP customers.
Not Continually Monitoring Active Commitments
Usage changes. Teams rightsize. Workloads migrate. A CUD portfolio that was perfectly sized six months ago can be 30% misaligned today. An engineering manager we spoke with captured the tension well: “Finance prefers the somewhat lower discount to be worth it just for the lower risk… based off of what can change just in the next year.”
You're Paying On-Demand for Databases and Serverless
Plenty of orgs have strong compute CUD coverage but zero coverage on Cloud SQL, BigQuery, or Cloud Run. These services have their own commitment models — BigQuery now offers spend-based CUDs, and Cloud SQL offers resource-based CUDs — though they operate separately from Flex CUDs and don’t share the same cross-service flexibility.
Do Google Cloud Flexible CUDs Better with nOps
In 2026, “good enough” means you’re likely leaving money on the table. The target: capture 90th-percentile savings rates while keeping average commitment lock-in under six months.
With nOps, you can automate your Google Cloud commitment management with:
• Savings-first pricing model: nOps offers a free savings analysis, so you can see exactly how much you could save. Pricing is based on a portion of realized savings, which reduces downside risk.
• Maximize savings on autopilot: nOps continuously adjusts commitments every hour to match real usage, helping customers capture incremental savings that slower optimization approaches can miss. That hourly adjustment is a major reason nOps can drive up to 20% more savings than competing solutions.
• Eliminate commitment risk: nOps shortens commitment windows from years to a fraction of the time, helping customers access the same discounts with far less risk.
We’ve talked to companies that can save millions on their cloud bills by switching to nOps from competitors. There’s no risk to book a free savings analysis to find out if nOps can help you get more value out of your cloud investments.
nOps manages $3B+ in cloud spend and was recently rated #1 in G2’s Cloud Cost Management category.
Frequently Asked Questions
Let’s dive into a few FAQs about commitment costs, eligible resources, and how to purchase commitments.
Can I cancel a GCP Flexible CUD early?
No. Google Cloud does not allow early termination of CUD contracts. Once purchased, you’re committed for the full 1-year or 3-year term. This is why incremental purchasing and accurate baseline analysis are critical.
Do Flexible CUDs stack with Sustained Use Discounts?
Google applies Sustained Use Discounts (SUDs) automatically to on-demand usage. CUD-covered usage doesn’t receive additional SUDs, since the CUD rate already reflects the committed discount. However, any usage beyond your commitment still benefits from SUDs.
How do Flexible CUDs apply across projects?
Flex CUDs apply at the Cloud Billing account level, covering all projects linked to that billing account. Resource-based CUDs, by contrast, default to the purchasing project but can be configured to share across the billing account.
What happens when I adopt a new machine series?
This is one of Flex CUDs’ biggest advantages. Because they’re spend-based, migrating from (say) N2 to N4 instances doesn’t affect your coverage. Spend-based commitments follow your dollars, not your machine family.
Should I choose 1-year or 3-year Flex CUDs?
It depends on your infrastructure stability. Three-year terms for flexible CUDs offer significantly higher discounts (46% vs. 28% for general purpose), but they carry more risk if your workloads change. Many FinOps teams use a mix of 1-year and 3-year flexible CUDs: shorter terms for dynamic workloads and longer terms for their most stable baseline.
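A rough way to frame that decision is to compare total commitment fees against the workload’s expected lifetime. This is an illustrative sketch only — fees per committed dollar, ignoring on-demand fallback, partial utilization, and future price changes:

```python
import math

RATE_1YR = 1 - 0.28   # pay 72 cents per committed dollar on a 1-year term
RATE_3YR = 1 - 0.46   # pay 54 cents per committed dollar on a 3-year term

def fees(lifetime_months):
    """Fee units (rate x months) per committed dollar for each strategy."""
    one_yr = RATE_1YR * 12 * math.ceil(lifetime_months / 12)  # renew yearly
    three_yr = RATE_3YR * 36                                  # locked 36 months
    return one_yr, three_yr

for months in (12, 24, 30, 36):
    one, three = fees(months)
    best = "3-year" if three < one else "1-year"
    print(f"{months} months: 1-yr {one:.2f} vs 3-yr {three:.2f} -> {best}")
```

Under these assumptions, the 3-year term only wins when the workload outlives roughly 27 months of commitment; anything shorter and renewed 1-year terms come out cheaper.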
Do Flexible CUDs cover software licensing costs?
Flexible CUDs apply to underlying compute usage (vCPU, memory, and eligible services like GKE and Cloud Run). Software license commitments are separate: you can purchase license commitments for applicable premium operating system (OS) licenses.
Last Updated: March 27, 2026, Commitment Management