Boost Your AWS Infrastructure Performance — and Lower Your AWS Costs — by Detecting Low Network Utilization

CPU utilization gets the most attention in rightsizing discussions. An instance running at 5% CPU is obviously oversized. But CPU alone misses a category of waste that network metrics expose more clearly: instances that are not just oversized, but entirely idle — receiving no connections, processing no requests, and generating no outbound traffic.

AWS Compute Optimizer now explicitly identifies EC2 Auto Scaling groups that “demonstrate consistently low CPU and network usage throughout the lookback period as idle,” recommending they be scaled down. This signals a shift in how AWS itself defines waste: network activity, alongside CPU, is a primary indicator of whether compute resources are delivering value.

This guide covers how to use low network utilization as a diagnostic signal for idle and oversized EC2 instances, how to detect it using native AWS tools, and how to translate that signal into rightsizing and termination decisions.

What Low Network Utilization Tells You About an Instance

Every EC2 instance that serves a purpose communicates with something. Web servers receive HTTP requests and return responses. Application servers pull messages from queues, call databases, and post results. Batch workers fetch input data, process it, and write output. Even monitoring agents generate outbound telemetry traffic.

When an instance shows near-zero network I/O over a sustained period (14+ days), one of three things is true:

The instance is genuinely idle. It was provisioned for a project that ended, a service that was decommissioned, or a test that completed. Nobody remembered to terminate it. This is the most common scenario — and the easiest to act on.

The instance handles infrequent scheduled work. It activates briefly for weekly batch jobs, monthly reports, or periodic data exports, then sits idle between runs. The network metric over 14 days may show 13.5 days of zero activity with a brief spike. This requires a different optimization path (scheduling or serverless migration) rather than termination.

The instance is over-provisioned for its actual workload. It runs a lightweight service that generates minimal network traffic — perhaps a configuration server, a DNS relay, or an internal tool with single-digit daily users. The instance type was selected for a predicted load that never materialized. This is a rightsizing candidate, not a termination candidate.

Network utilization distinguishes between these scenarios more reliably than CPU alone. An instance running a lightweight cron daemon may show 8-12% CPU (keeping it above typical “idle” thresholds) while generating zero network activity — revealing that nothing external depends on it.

How to Detect Low Network Utilization

Four native AWS tools surface this signal, each at a different level of detail:

CloudWatch NetworkIn and NetworkOut Metrics

Amazon CloudWatch tracks two primary network metrics for every EC2 instance:

`NetworkIn` — bytes received by the instance across all network interfaces
`NetworkOut` — bytes sent by the instance across all network interfaces

These metrics report at 5-minute intervals by default (1-minute with detailed monitoring enabled). To identify idle instances, query the maximum value of each metric over a 14-day window.

If the maximum `NetworkIn` + `NetworkOut` across any 5-minute period in two weeks stays below 1 MB, the instance effectively received and sent nothing meaningful during that entire period. It is either idle or running a purely local process that does not communicate externally.
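The check described above can be sketched in a few lines of Python. This is a minimal sketch, not a definitive implementation: it assumes the datapoints have already been fetched and are shaped like the `Datapoints` list returned by CloudWatch's `GetMetricStatistics` API with the `Maximum` statistic. The function names, the sample data, and the treatment of an empty window are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

IDLE_THRESHOLD_BYTES = 1_000_000  # 1 MB per 5-minute period, the threshold used in the text


def peak_combined_bytes(network_in, network_out):
    """Return the largest NetworkIn + NetworkOut total seen in any single period.

    Each argument is a list of datapoints shaped like CloudWatch output:
    {"Timestamp": datetime, "Maximum": float}. Periods are matched by timestamp.
    """
    totals = {}
    for point in list(network_in) + list(network_out):
        ts = point["Timestamp"]
        totals[ts] = totals.get(ts, 0.0) + point["Maximum"]
    return max(totals.values(), default=0.0)


def is_network_idle(network_in, network_out, threshold=IDLE_THRESHOLD_BYTES):
    """True when no period in the window reached the threshold.

    Assumption: an empty window (no datapoints) also counts as idle here,
    although a stopped instance would look the same.
    """
    return peak_combined_bytes(network_in, network_out) < threshold


# Hypothetical sample: 14 days of 5-minute periods with ~2 KB of background chatter.
start = datetime(2025, 1, 1, tzinfo=timezone.utc)
periods = 14 * 24 * 12
quiet_in = [
    {"Timestamp": start + timedelta(minutes=5 * i), "Maximum": 1_500.0}
    for i in range(periods)
]
quiet_out = [
    {"Timestamp": start + timedelta(minutes=5 * i), "Maximum": 500.0}
    for i in range(periods)
]

print(is_network_idle(quiet_in, quiet_out))
```

Summing per period before taking the maximum matters: adding the separate maxima of `NetworkIn` and `NetworkOut` would overstate the peak whenever the two metrics spike at different times.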

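For rightsizing decisions (rather than termination), the relevant threshold is different. Look for instances where peak network utilization stays below 5% of the instance type’s rated bandwidth. A `c5.xlarge` is rated for up to 10 Gbps, so the 5% line sits at 500 Mbps. If peak throughput never exceeds 50 Mbps (0.5% of the rated figure), the network is nowhere near a constraint, and the workload could likely run on a `t3.medium`, which is rated for up to 5 Gbps, at roughly 70% lower cost, provided its 2 vCPUs and 4 GiB of memory are also sufficient. Both ratings are burst figures; sustained baselines are lower, so check them for steady workloads.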
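The arithmetic behind this comparison is easy to get wrong, because CloudWatch reports bytes per period while bandwidth ratings are in bits per second. The sketch below shows the conversion. The function names and the sample figure are assumptions for illustration, and a 5-minute average will understate short bursts inside that window.

```python
def peak_mbps(max_bytes_per_period, period_seconds=300):
    """Convert a per-period byte count into average megabits per second."""
    return max_bytes_per_period * 8 / period_seconds / 1_000_000


def bandwidth_utilization_pct(max_bytes_per_period, rated_gbps, period_seconds=300):
    """Express the busiest period as a percentage of the rated bandwidth."""
    mbps = peak_mbps(max_bytes_per_period, period_seconds)
    return mbps * 100 / (rated_gbps * 1000)


# Hypothetical busiest 5-minute period: 1.875 GB transferred in total.
busiest_period_bytes = 1_875_000_000

print(peak_mbps(busiest_period_bytes))                      # average Mbps in that period
print(bandwidth_utilization_pct(busiest_period_bytes, 10))  # percent of a 10 Gbps rating
```

With these sample numbers the busiest period averages 50 Mbps, which is 0.5% of a 10 Gbps rating.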

AWS Trusted Advisor Low Utilization Check

AWS Trusted Advisor’s cost optimization checks include an “Amazon EC2 instances with low utilization” check that examines both CPU and network metrics. An instance is flagged when:

Average CPU utilization is 10% or less AND
Network I/O is 5 MB or less on 4 or more days during the previous 14 days

This combined threshold is conservative — it catches only the most clearly idle instances. Many teams find that raising the network threshold to 50 MB or extending the observation window to 30 days identifies significantly more waste.
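To see how much the choice of threshold changes the result, the rule can be expressed as a small function with adjustable parameters. This is a sketch under stated assumptions, not Trusted Advisor's implementation: it reads the rule as "both conditions hold on the same day", and the `daily_stats` shape and the field names `cpu_pct` and `network_mb` are hypothetical.

```python
def low_utilization_days(daily_stats, cpu_pct_max=10.0, network_mb_max=5.0):
    """Count the days on which both CPU and network stayed at or under the limits."""
    return sum(
        1
        for day in daily_stats
        if day["cpu_pct"] <= cpu_pct_max and day["network_mb"] <= network_mb_max
    )


def is_flagged(daily_stats, min_days=4, **thresholds):
    """Flag the instance when enough days in the window were low-utilization days."""
    return low_utilization_days(daily_stats, **thresholds) >= min_days


# Hypothetical 14-day window: 10 busy days, then 4 quiet days at 20 MB each.
window = [{"cpu_pct": 35.0, "network_mb": 900.0}] * 10 + [
    {"cpu_pct": 6.0, "network_mb": 20.0}
] * 4

print(is_flagged(window))                       # default 5 MB network limit
print(is_flagged(window, network_mb_max=50.0))  # relaxed 50 MB network limit
```

In this sample the quiet days carry 20 MB of traffic each, so the default 5 MB limit does not flag the instance while the relaxed 50 MB limit does.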

Trusted Advisor provides estimated monthly savings per flagged instance, making it straightforward to prioritize: start with the largest instances showing the lowest utilization.

AWS Compute Optimizer Idle Detection

AWS Compute Optimizer takes a more sophisticated approach. Rather than applying fixed thresholds, it analyzes utilization patterns over a 14-day lookback period (extendable to 93 days with the paid enhanced infrastructure metrics feature) and classifies instances as:

Over-provisioned — running below optimal utilization; smaller instance types recommended
Under-provisioned — hitting capacity limits; larger instance types recommended
Idle — consistently low CPU AND network usage throughout the entire lookback period

The January 2025 expansion to Auto Scaling groups is particularly relevant. Previously, Compute Optimizer only flagged individual instances. Now it identifies entire Auto Scaling groups with consistently low CPU and network usage — catching cases where a minimum capacity of 2-3 instances is maintained for a service nobody uses.

VPC Flow Logs for Connection-Level Analysis

When CloudWatch metrics show borderline utilization, VPC Flow Logs provide deeper context. Flow Logs record individual network connections — source IP, destination IP, port, protocol, and byte count — giving you visibility into exactly what is communicating with an instance and how often.

This level of detail answers questions that aggregate metrics cannot:

Is the traffic coming from a monitoring system (health checks) or from actual users/services?
Is one client responsible for all the traffic, or is it distributed?
Are the connections inbound (something depends on this instance) or outbound (the instance depends on something else)?

An instance showing 10 MB/day of network traffic might appear “in use” based on CloudWatch alone. Flow Logs might reveal that 100% of that traffic is health checks from an ALB target group — the instance is registered as a target but receiving no actual application traffic.
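The health-check scenario can be checked by totalling inbound bytes per source address. The sketch below assumes flow log records in the default (version 2) space-separated format and counts only accepted inbound flows. The IP addresses, the ENI ID, and the function names are hypothetical, and in practice the list of load balancer node addresses would have to be looked up separately.

```python
from collections import Counter

# Field order of the default VPC Flow Logs record format (version 2).
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def inbound_bytes_by_source(lines, instance_ip):
    """Total accepted inbound bytes per source address for one instance."""
    totals = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # skip malformed or custom-format records
        record = dict(zip(FIELDS, parts))
        if record["log_status"] != "OK" or record["action"] != "ACCEPT":
            continue  # NODATA/SKIPDATA records carry "-" placeholders
        if record["dstaddr"] == instance_ip:
            totals[record["srcaddr"]] += int(record["bytes"])
    return totals


def share_from(totals, sources):
    """Fraction of all counted bytes that came from the given sources."""
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return sum(totals[source] for source in sources) / total


# Hypothetical records: two load balancer nodes probing instance 10.0.3.25.
sample = [
    "2 123456789012 eni-0abc1234 10.0.1.10 10.0.3.25 41000 8080 6 5 400 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc1234 10.0.2.10 10.0.3.25 41002 8080 6 5 400 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc1234 10.0.3.25 10.0.1.10 8080 41000 6 5 900 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc1234 - - - - - - - 1700000060 1700000120 - NODATA",
]

totals = inbound_bytes_by_source(sample, "10.0.3.25")
print(share_from(totals, ["10.0.1.10", "10.0.2.10"]))
```

A share at or near 1.0 from load balancer addresses suggests the instance receives health checks and nothing else.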

Why Network Signals Matter More Than CPU for Idle Detection

CPU utilization can be misleading as a sole indicator of value. Several common scenarios produce moderate CPU usage on instances that deliver zero business value:

Monitoring agents and log shippers. CloudWatch Agent, Datadog, New Relic, or Fluentd processes consume 2-8% CPU continuously, even on instances doing nothing else. On small instance types, or stacked with other background processes, this baseline CPU noise can keep an instance above the 10% ceiling that Trusted Advisor uses for its low-utilization flag.

Cron jobs that run locally. A log rotation script, a certificate renewal check, or a health check endpoint that only validates the instance itself — all generate CPU activity without external network communication. The instance is “busy” maintaining itself while producing nothing for the organization.

Zombie processes from failed deployments. A container daemon, application runtime, or database engine that started but never received connections. It consumes CPU for garbage collection, background threads, or periodic internal tasks — but never serves a request.

Network utilization cuts through this noise. If nothing connects to an instance and it sends nothing outbound (beyond minimal DNS lookups and NTP synchronization), the instance is not participating in the infrastructure. The CPU activity is self-referential — maintaining the instance’s own existence rather than serving external consumers.

The practical threshold: instances with less than 5 MB/day combined NetworkIn + NetworkOut over a 14-day window should be investigated. At that level, the only traffic is likely AWS metadata queries, DNS resolution, and monitoring heartbeats — not application workload.

Comparison: Detection Methods for Idle EC2 Instances

Let’s sum it up in a quick table:

Method	What It Detects	Lookback Period	Network Signal Used	Limitations
CloudWatch Metrics (manual)	Custom thresholds per instance	Configurable (14–90 days)	NetworkIn/Out raw bytes	Requires per-account queries and manual threshold setting
AWS Trusted Advisor	CPU ≤ 10% and network ≤ 5 MB on 4+ days	14 days	Combined 5 MB daily threshold	Conservative; misses moderate-but-idle instances
AWS Compute Optimizer	Holistic idle classification using ML	14–93 days	CPU + network patterns combined	Requires opt-in; recommendations delayed 24–48 hours
VPC Flow Logs	Connection-level traffic patterns	Custom retention	Individual flows: source, destination, bytes	Higher cost; requires analysis tooling
Cost optimization platforms (nOps, etc.)	Cross-account idle detection with context	Continuous	Multiple signals aggregated	Third-party integration required

Actionable Steps: What to Do When You Find Low Network Utilization

Here is a quick framework for how to take action on low network utilization:

Decision Framework

Not every instance with low network utilization should be terminated. Use this decision tree:

Step 1: Confirm the instance is genuinely idle (not scheduled).

Check CloudTrail for recent API calls originating from the instance’s IAM role. Check CloudWatch for any periodic spikes in CPU or network (batch jobs that run weekly/monthly). If you find periodic activity, the instance is a scheduling optimization candidate — not a termination candidate.

Step 2: Identify what depends on the instance.

Check ALB/NLB target group registrations. Check Route 53 records pointing to the instance’s IP. Check security group rules that reference the instance’s security group or IP address. If nothing references it, proceed to termination. If references exist but traffic is zero, investigate whether the referencing resources are themselves orphaned.

Step 3: Act based on the scenario.

Scenario	Network Signal	Action
Truly idle, no dependencies	< 1 MB/day for 14+ days	Snapshot EBS, then terminate the instance
Idle between scheduled runs	Zero most days, with periodic spikes	Convert to scheduled start/stop or migrate to Lambda/Fargate
Over-provisioned for workload	Consistent traffic, but < 5% of bandwidth capacity	Rightsize to a smaller instance type
Health-check-only traffic	Traffic only from ALB/monitoring, with zero app traffic	Remove from target group, verify no impact, then terminate
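The scenario table can be encoded as a function so the same rules are applied consistently across an audit. This is a minimal sketch: the 1 MB/day and 5% figures come from the table, while the parameter names, the 80% cutoff for "zero most days", and the fallback results are assumptions.

```python
def recommend_action(daily_mb, peak_bandwidth_pct, has_dependencies,
                     health_check_only, quiet_mb=1.0, mostly_quiet_fraction=0.8):
    """Map network signals for one instance to an action from the scenario table.

    daily_mb: combined NetworkIn + NetworkOut per day, in MB, over the window.
    peak_bandwidth_pct: busiest period as a percentage of rated bandwidth.
    """
    if health_check_only:
        return "remove from target group, verify no impact, then terminate"

    quiet_days = sum(1 for mb in daily_mb if mb < quiet_mb)

    if daily_mb and quiet_days == len(daily_mb):
        if has_dependencies:
            # Step 2: references exist but traffic is zero.
            return "investigate referencing resources before acting"
        return "snapshot EBS, then terminate"

    # Assumption: "zero most days" means at least 80% of days under the quiet limit.
    if daily_mb and quiet_days / len(daily_mb) >= mostly_quiet_fraction:
        return "convert to scheduled start/stop or migrate to Lambda/Fargate"

    if peak_bandwidth_pct < 5.0:
        return "rightsize to a smaller instance type"

    return "no action"


# Hypothetical instance: silent for 13 days, then one 800 MB batch run.
print(recommend_action([0.0] * 13 + [800.0], peak_bandwidth_pct=2.0,
                       has_dependencies=False, health_check_only=False))
```

The health-check case is tested first because such an instance shows steady traffic and would otherwise be mistaken for a rightsizing candidate.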

Step 4: Validate before permanent termination.

Stop the instance (do not terminate) for 7 days. Monitor for alerts, broken dependencies, or team reports of missing functionality. If nothing surfaces after 7 days, terminate and release associated resources (EBS volumes, Elastic IPs, ENIs).

Implementing Automated Detection

For ongoing detection rather than one-time audits, create CloudWatch Alarms that trigger when instances sustain low network activity:

Set a CloudWatch Alarm on `NetworkIn` + `NetworkOut` with a threshold of 1 MB over a 24-hour period using the `Sum` statistic (an alarm watches a single time series, so combine the two metrics with a metric math expression)
Use the `INSUFFICIENT_DATA` state (which occurs when no data points exist) as an additional idle signal: an instance that stops reporting metrics entirely has been stopped, terminated, or disconnected
Route alarm notifications to a cost optimization Slack channel or ticketing system for human review
Tag instances with `idle-candidate: true` via Lambda when alarms trigger, creating a reviewable queue that accumulates over time
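The alarm and tagging steps above can be sketched as two helpers that build request parameters. They assume the dictionaries are passed to boto3's `put_metric_alarm` (CloudWatch) and `create_tags` (EC2). The alarm name, metric IDs, instance ID, and helper names are hypothetical, and the 24-hour period should be verified against current CloudWatch alarm limits before use.

```python
def idle_alarm_params(instance_id, threshold_bytes=1_000_000, period_seconds=86_400):
    """Build keyword arguments for a CloudWatch alarm on NetworkIn + NetworkOut."""
    dimensions = [{"Name": "InstanceId", "Value": instance_id}]

    def metric(metric_id, name):
        # Input series for the math expression; not evaluated by the alarm directly.
        return {
            "Id": metric_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/EC2",
                    "MetricName": name,
                    "Dimensions": dimensions,
                },
                "Period": period_seconds,
                "Stat": "Sum",
            },
            "ReturnData": False,
        }

    return {
        "AlarmName": f"idle-network-{instance_id}",  # hypothetical naming scheme
        "Metrics": [
            metric("net_in", "NetworkIn"),
            metric("net_out", "NetworkOut"),
            {
                "Id": "total",
                "Expression": "net_in + net_out",
                "Label": "NetworkIn + NetworkOut",
                "ReturnData": True,  # the alarm evaluates this combined series
            },
        ],
        "ComparisonOperator": "LessThanThreshold",
        "Threshold": float(threshold_bytes),
        "EvaluationPeriods": 1,
        # Keep missing data visible so the alarm can enter INSUFFICIENT_DATA.
        "TreatMissingData": "missing",
    }


def idle_tag_params(instance_id):
    """Build keyword arguments for tagging an instance as an idle candidate."""
    return {
        "Resources": [instance_id],
        "Tags": [{"Key": "idle-candidate", "Value": "true"}],
    }


params = idle_alarm_params("i-0123456789abcdef0")  # hypothetical instance ID
print(params["AlarmName"])

# With credentials configured, the dictionaries would be used as:
#   boto3.client("cloudwatch").put_metric_alarm(**params)
#   boto3.client("ec2").create_tags(**idle_tag_params("i-0123456789abcdef0"))
```

Building the parameters separately from the API call keeps the thresholds easy to review and test without touching an AWS account.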

The tagging approach is particularly effective because it separates detection from action. Engineers can review tagged instances in batches during dedicated cost optimization sessions rather than responding to each individual alert. After two consecutive audit cycles (typically monthly) where a tagged instance remains idle, the confidence level is high enough for termination.

For organizations with multiple accounts, AWS Organizations combined with Compute Optimizer’s organization-level view aggregates idle recommendations across all member accounts — surfacing the total waste picture rather than requiring per-account analysis.

Eliminate waste and pay less with nOps

Across the strategies in this guide, low network utilization is a powerful signal for uncovering idle or oversized resources. But eliminating idle infrastructure is only one part of a broader cloud cost strategy. For many teams, commitment optimization remains one of the largest savings levers — and nOps focuses on maximizing that lever automatically, increasing your effective savings rate without adding operational overhead. And, we only get paid after delivering you measurable savings.

In 2026, “good enough” means you’re likely leaving money on the table. We’ve talked to companies that can save millions on their cloud bills by switching to nOps from competitors.

Booking a free savings analysis is risk-free and shows whether nOps can help you get more value out of your cloud investments.

nOps manages $4B+ in cloud spend and was recently rated #1 in G2’s Cloud Cost Management category.

Last Updated: February 28, 2025, Right Sizing