Boost Your AWS Infrastructure Performance — and Lower Your AWS Costs — by Detecting Low Network Utilization
CPU utilization gets the most attention in rightsizing discussions. An instance running at 5% CPU is obviously oversized. But CPU alone misses a category of waste that network metrics expose more clearly: instances that are not just oversized, but entirely idle — receiving no connections, processing no requests, and generating no outbound traffic.
AWS Compute Optimizer now explicitly identifies EC2 Auto Scaling groups that “demonstrate consistently low CPU and network usage throughout the lookback period as idle,” recommending they be scaled down. This signals a shift in how AWS itself defines waste: network activity, alongside CPU, is a primary indicator of whether compute resources are delivering value.
This guide covers how to use low network utilization as a diagnostic signal for idle and oversized EC2 instances, how to detect it using native AWS tools, and how to translate that signal into rightsizing and termination decisions.
What Low Network Utilization Tells You About an Instance
Every EC2 instance that serves a purpose communicates with something. Web servers receive HTTP requests and return responses. Application servers pull messages from queues, call databases, and post results. Batch workers fetch input data, process it, and write output. Even monitoring agents generate outbound telemetry traffic.
When an instance shows near-zero network I/O over a sustained period (14+ days), one of three things is true:
The instance is genuinely idle. It was provisioned for a project that ended, a service that was decommissioned, or a test that completed. Nobody remembered to terminate it. This is the most common scenario — and the easiest to act on.
The instance handles infrequent scheduled work. It activates briefly for weekly batch jobs, monthly reports, or periodic data exports, then sits idle between runs. The network metric over 14 days may show 13.5 days of zero activity with a brief spike. This requires a different optimization path (scheduling or serverless migration) rather than termination.
The instance is over-provisioned for its actual workload. It runs a lightweight service that generates minimal network traffic — perhaps a configuration server, a DNS relay, or an internal tool with single-digit daily users. The instance type was selected for a predicted load that never materialized. This is a rightsizing candidate, not a termination candidate.
Network utilization distinguishes between these scenarios more reliably than CPU alone. An instance running a lightweight cron daemon may show 8-12% CPU (keeping it above typical “idle” thresholds) while generating zero network activity — revealing that nothing external depends on it.
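As a rough illustration of how the three scenarios could be told apart from daily traffic totals, here is a minimal sketch. The function name, the 1 MB "active day" cutoff, and the one-quarter rule for spotting scheduled work are all assumptions chosen for the example, not AWS-defined values.

```python
from typing import Sequence

MB = 1_000_000  # decimal megabytes; an assumption for this sketch


def classify_network_profile(
    daily_bytes: Sequence[int],
    active_threshold: int = 1 * MB,
    min_days: int = 14,
) -> str:
    """Classify an instance from per-day NetworkIn + NetworkOut totals."""
    if len(daily_bytes) < min_days:
        return "insufficient-data"
    active_days = [b for b in daily_bytes if b >= active_threshold]
    if not active_days:
        return "idle"
    # A handful of active days surrounded by silence suggests scheduled work.
    if len(active_days) <= len(daily_bytes) // 4:
        return "scheduled"
    # Traffic on most days: compare against instance capacity before rightsizing.
    return "steady"
```

A result of `"steady"` only says the instance is in regular use; whether it is over-provisioned still depends on comparing its peak throughput with the instance type's capacity.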
How to Detect Low Network Utilization
Here are the key strategies:
CloudWatch NetworkIn and NetworkOut Metrics
Amazon CloudWatch tracks two primary network metrics for every EC2 instance:
- `NetworkIn` — bytes received by the instance across all network interfaces
- `NetworkOut` — bytes sent by the instance across all network interfaces
These metrics report at 5-minute intervals by default (1-minute with detailed monitoring enabled). To identify idle instances, query the maximum value of each metric over a 14-day window.
If the maximum `NetworkIn` + `NetworkOut` across any 5-minute period in two weeks stays below 1 MB, the instance effectively received and sent nothing meaningful during that entire period. It is either idle or running a purely local process that does not communicate externally.
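The check described above might look something like the following sketch. It assumes a boto3 CloudWatch client is passed in (for example `boto3.client("cloudwatch")`); the function names and the decimal 1 MB constant are illustrative. Hourly buckets with the `Maximum` statistic are used so that a 14-day window fits in one `get_metric_statistics` call.

```python
from datetime import datetime, timedelta, timezone

IDLE_BYTES = 1_000_000  # the 1 MB threshold from the text (decimal MB assumed)


def peak_network_bytes(cloudwatch, instance_id, days=14, now=None):
    """Return (peak NetworkIn, peak NetworkOut) for any sample in the window."""
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    peaks = []
    for metric in ("NetworkIn", "NetworkOut"):
        resp = cloudwatch.get_metric_statistics(
            Namespace="AWS/EC2",
            MetricName=metric,
            Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
            StartTime=start,
            EndTime=end,
            Period=3600,  # hourly buckets; Maximum keeps the largest sample in each
            Statistics=["Maximum"],
        )
        values = [dp["Maximum"] for dp in resp["Datapoints"]]
        peaks.append(max(values, default=0.0))
    return peaks[0], peaks[1]


def looks_idle(cloudwatch, instance_id, threshold=IDLE_BYTES):
    """True when even the busiest period stayed under the threshold."""
    peak_in, peak_out = peak_network_bytes(cloudwatch, instance_id)
    # The sum of the two peaks is an upper bound on any single period's total.
    return peak_in + peak_out < threshold
```

Note that an instance with no datapoints at all (for example, one that was stopped for the whole window) also comes back as idle here, so it is worth checking instance state separately.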
For rightsizing decisions (rather than termination), the relevant threshold is different. Look for instances where peak network utilization stays below 5% of the instance type’s rated bandwidth. A `c5.xlarge` is rated for up to 10 Gbps, which is a burst figure rather than a sustained baseline. If peak throughput never exceeds 50 Mbps (0.5% of that rating), the instance could likely run the same workload on a `t3.medium` with up to 5 Gbps of burst capability at roughly 70% lower cost, provided the smaller vCPU and memory allocation also fits the workload.
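CloudWatch reports bytes per period, while bandwidth is quoted in bits per second, so the comparison needs a unit conversion. The sketch below shows that arithmetic; the function names are illustrative, and the bandwidth figure you pass in is whatever rating you choose to compare against.

```python
def throughput_mbps(bytes_in_period: float, period_seconds: int = 300) -> float:
    """Convert a per-period byte count into average megabits per second."""
    return bytes_in_period * 8 / period_seconds / 1_000_000


def bandwidth_utilization(
    peak_bytes: float, rated_gbps: float, period_seconds: int = 300
) -> float:
    """Peak throughput as a fraction of the bandwidth figure supplied."""
    return throughput_mbps(peak_bytes, period_seconds) / (rated_gbps * 1000)


def is_rightsizing_candidate(
    peak_bytes: float,
    rated_gbps: float,
    threshold: float = 0.05,
    period_seconds: int = 300,
) -> bool:
    """True when peak utilization stays under the threshold (5% by default)."""
    return bandwidth_utilization(peak_bytes, rated_gbps, period_seconds) < threshold
```

For example, a peak of 1,875,000,000 bytes in a 5-minute period works out to 50 Mbps, which is 0.5% of a 10 Gbps figure.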
AWS Trusted Advisor Low Utilization Check
AWS Trusted Advisor’s cost optimization checks include an “Amazon EC2 instances with low utilization” check that examines both CPU and network metrics. An instance is flagged when:
- Average CPU utilization is 10% or less AND
- Network I/O is 5 MB or less on 4 or more days during the previous 14 days
This combined threshold is conservative — it catches only the most clearly idle instances. Many teams find that raising the network threshold to 50 MB or extending the observation window to 30 days identifies significantly more waste.
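To experiment with looser thresholds, the rule can be approximated in a few lines. This is a sketch of the rule as described above, not Trusted Advisor's actual implementation; it assumes a day counts only when both the CPU and network conditions hold on that day, and the names are made up for the example.

```python
from typing import Iterable, Tuple

MB = 1_000_000  # decimal megabytes; an assumption for this sketch


def low_utilization_days(
    daily: Iterable[Tuple[float, int]],
    cpu_max_pct: float = 10.0,
    network_max_bytes: int = 5 * MB,
) -> int:
    """Count days where both CPU and network I/O are at or below the limits.

    Each item is (average CPU percent, NetworkIn + NetworkOut bytes) for a day.
    """
    return sum(
        1
        for cpu_pct, net_bytes in daily
        if cpu_pct <= cpu_max_pct and net_bytes <= network_max_bytes
    )


def is_flagged(daily, min_days: int = 4, **thresholds) -> bool:
    """Apply the '4 or more low days' rule with adjustable thresholds."""
    return low_utilization_days(daily, **thresholds) >= min_days
```

Passing `network_max_bytes=50 * MB` reproduces the looser variant mentioned above; feeding in 30 days of data instead of 14 extends the observation window.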
Trusted Advisor provides estimated monthly savings per flagged instance, making it straightforward to prioritize: start with the largest instances showing the lowest utilization.
AWS Compute Optimizer Idle Detection
AWS Compute Optimizer takes a more sophisticated approach. Rather than applying fixed thresholds, it analyzes utilization patterns over a 14-day lookback period (extendable to 93 days with the paid enhanced infrastructure metrics feature) and classifies instances as:
- Over-provisioned — running below optimal utilization; smaller instance types recommended
- Under-provisioned — hitting capacity limits; larger instance types recommended
- Idle — consistently low CPU AND network usage throughout the entire lookback period
The January 2025 expansion to Auto Scaling groups is particularly relevant. Previously, Compute Optimizer only flagged individual instances. Now it identifies entire Auto Scaling groups with consistently low CPU and network usage — catching cases where a minimum capacity of 2-3 instances is maintained for a service nobody uses.
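For teams that want these findings in a script rather than the console, a sketch along the following lines could group instance recommendations by finding. It assumes a boto3 Compute Optimizer client is passed in (for example `boto3.client("compute-optimizer")`) and that the account has opted in; the helper name is illustrative. It covers the per-instance rightsizing findings only.

```python
from collections import defaultdict


def instances_by_finding(compute_optimizer):
    """Group instance ARNs by Compute Optimizer finding, following pagination."""
    grouped = defaultdict(list)
    token = None
    while True:
        kwargs = {"nextToken": token} if token else {}
        page = compute_optimizer.get_ec2_instance_recommendations(**kwargs)
        for rec in page.get("instanceRecommendations", []):
            grouped[rec["finding"]].append(rec["instanceArn"])
        token = page.get("nextToken")
        if not token:
            return dict(grouped)
```

The resulting dictionary is keyed by the finding string returned by the API, so the over-provisioned entries can be pulled out and sorted by instance size for review.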
VPC Flow Logs for Connection-Level Analysis
When CloudWatch metrics show borderline utilization, VPC Flow Logs provide deeper context. Flow Logs record individual network connections — source IP, destination IP, port, protocol, and byte count — giving you visibility into exactly what is communicating with an instance and how often.
This level of detail answers questions that aggregate metrics cannot:
- Is the traffic coming from a monitoring system (health checks) or from actual users/services?
- Is one client responsible for all the traffic, or is it distributed?
- Are the connections inbound (something depends on this instance) or outbound (the instance depends on something else)?
An instance showing 10 MB/day of network traffic might appear “in use” based on CloudWatch alone. Flow Logs might reveal that 100% of that traffic is health checks from an ALB target group — the instance is registered as a target but receiving no actual application traffic.
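A small parser is enough to check how much inbound traffic comes from health checkers. The sketch below assumes the default (version 2) Flow Logs record format, and it assumes you already know the private IPs of the load balancer nodes or monitoring hosts; both the function names and the sample IPs are hypothetical.

```python
from collections import Counter

# Field order of the default (version 2) VPC Flow Logs record format.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()


def bytes_by_source(lines, instance_ip):
    """Sum accepted inbound bytes to instance_ip, keyed by source address."""
    totals = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # header row or custom-format record
        rec = dict(zip(FIELDS, parts))
        # NODATA/SKIPDATA records carry "-" placeholders instead of numbers.
        if rec["log_status"] != "OK" or rec["action"] != "ACCEPT":
            continue
        if rec["dstaddr"] == instance_ip:
            totals[rec["srcaddr"]] += int(rec["bytes"])
    return totals


def health_check_share(lines, instance_ip, health_check_ips):
    """Fraction of inbound bytes that came from known health-check sources."""
    totals = bytes_by_source(lines, instance_ip)
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return sum(totals[ip] for ip in health_check_ips) / total
```

A share close to 1.0 suggests the instance is registered somewhere but serving no real clients; `bytes_by_source` on its own also answers whether one client or many are responsible for the traffic.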
Why Network Signals Matter More Than CPU for Idle Detection
CPU utilization can be misleading as a sole indicator of value. Several common scenarios produce moderate CPU usage on instances that deliver zero business value:
Monitoring agents and log shippers. CloudWatch Agent, Datadog, New Relic, or Fluentd processes consume 2-8% CPU continuously, even on instances doing nothing else. Stacked on top of other background processes, this baseline CPU noise can push an otherwise idle instance above the 10% CPU threshold that Trusted Advisor uses for its low-utilization flag.
Cron jobs that run locally. A log rotation script, a certificate renewal check, or a health check endpoint that only validates the instance itself — all generate CPU activity without external network communication. The instance is “busy” maintaining itself while producing nothing for the organization.
Zombie processes from failed deployments. A container daemon, application runtime, or database engine that started but never received connections. It consumes CPU for garbage collection, background threads, or periodic internal tasks — but never serves a request.
Network utilization cuts through this noise. If nothing connects to an instance and it sends nothing outbound (beyond minimal DNS lookups and NTP synchronization), the instance is not participating in the infrastructure. The CPU activity is self-referential — maintaining the instance’s own existence rather than serving external consumers.
The practical threshold: instances with less than 5 MB/day combined NetworkIn + NetworkOut over a 14-day window should be investigated. At that level, the only traffic is likely AWS metadata queries, DNS resolution, and monitoring heartbeats — not application workload.
Comparison: Detection Methods for Idle EC2 Instances
Let’s sum it up in a quick table:
| Method | What It Detects | Lookback Period | Network Signal Used | Limitations |
| --- | --- | --- | --- | --- |
| CloudWatch Metrics (manual) | Custom thresholds per instance | Configurable (14–90 days) | NetworkIn/Out raw bytes | Requires per-account queries and manual threshold setting |
| AWS Trusted Advisor | CPU ≤ 10% and network ≤ 5 MB on 4+ days | 14 days | Combined 5 MB daily threshold | Conservative; misses moderate-but-idle instances |
| AWS Compute Optimizer | Holistic idle classification using ML | 14–93 days | CPU + network patterns combined | Requires opt-in; recommendations delayed 24–48 hours |
| VPC Flow Logs | Connection-level traffic patterns | Custom retention | Individual flows: source, destination, bytes | Higher cost; requires analysis tooling |
| Cost optimization platforms (nOps, etc.) | Cross-account idle detection with context | Continuous | Multiple signals aggregated | Third-party integration required |
Actionable Steps: What to Do When You Find Low Network Utilization
Here is a quick framework for how to take action on low network utilization:
Decision Framework
Not every instance with low network utilization should be terminated. Use this decision tree:
Step 1: Confirm the instance is genuinely idle (not scheduled).
Check CloudTrail for recent API calls originating from the instance’s IAM role. Check CloudWatch for any periodic spikes in CPU or network (batch jobs that run weekly/monthly). If you find periodic activity, the instance is a scheduling optimization candidate — not a termination candidate.
Step 2: Identify what depends on the instance.
Check ALB/NLB target group registrations. Check Route 53 records pointing to the instance’s IP. Check security group rules referencing the instance. If nothing references it, proceed to termination. If references exist but traffic is zero, investigate whether the referencing resources are themselves orphaned.
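The target group part of this check can be scripted. The following is a minimal sketch assuming a boto3 ELBv2 client is passed in (for example `boto3.client("elbv2")`); the function name is illustrative, only the first page of target groups is read, and targets registered by IP address rather than instance ID will not match.

```python
def registered_target_groups(elbv2, instance_id):
    """Return ARNs of target groups that list instance_id as a target."""
    matches = []
    # First page only; accounts with many target groups need pagination.
    for group in elbv2.describe_target_groups()["TargetGroups"]:
        arn = group["TargetGroupArn"]
        health = elbv2.describe_target_health(TargetGroupArn=arn)
        targets = {d["Target"]["Id"] for d in health["TargetHealthDescriptions"]}
        if instance_id in targets:
            matches.append(arn)
    return matches
```

An empty list means no load balancer target group references the instance by ID, which still leaves the Route 53 and security group checks to do by other means.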
Step 3: Act based on the scenario.
| Scenario | Network Signal | Action |
| --- | --- | --- |
| Truly idle, no dependencies | < 1 MB/day for 14+ days | Snapshot EBS, then terminate the instance |
| Idle between scheduled runs | Zero most days, with periodic spikes | Convert to scheduled start/stop or migrate to Lambda/Fargate |
| Over-provisioned for workload | Consistent traffic, but < 5% of bandwidth capacity | Rightsize to a smaller instance type |
| Health-check-only traffic | Traffic only from ALB/monitoring, with zero app traffic | Remove from target group, verify no impact, then terminate |
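If you are triaging many instances, the table can be encoded as a small lookup. This sketch is one possible reading of it: the `Observation` fields, the order in which the rules are checked, and the fallback result are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    daily_mb: float               # typical combined NetworkIn + NetworkOut per day
    periodic_spikes: bool         # occasional bursts between quiet days
    health_check_only: bool       # all traffic comes from ALB/monitoring
    bandwidth_utilization: float  # peak throughput as a fraction of capacity
    has_dependencies: bool        # something still references the instance


def recommend_action(obs: Observation) -> str:
    """Map an observation to the matching row of the scenario table."""
    if obs.health_check_only:
        return "remove from target group, verify no impact, then terminate"
    if obs.periodic_spikes:
        return "convert to scheduled start/stop or migrate to Lambda/Fargate"
    if obs.daily_mb < 1 and not obs.has_dependencies:
        return "snapshot EBS, then terminate"
    if obs.bandwidth_utilization < 0.05:
        return "rightsize to a smaller instance type"
    return "no action: investigate further"
```

The output is a suggestion for human review, not something to act on automatically; the validation step still applies before anything is terminated.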
Step 4: Validate before permanent termination.
Stop the instance (do not terminate) for 7 days. Monitor for alerts, broken dependencies, or team reports of missing functionality. If nothing surfaces after 7 days, terminate and release associated resources (EBS volumes, Elastic IPs, ENIs).
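One way to keep track of the 7-day hold is to record the stop time in a tag. The sketch below assumes a boto3 EC2 client is passed in (for example `boto3.client("ec2")`); the tag key `idle-review:stopped-at` and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

STOPPED_AT_TAG = "idle-review:stopped-at"  # hypothetical tag key


def stop_for_validation(ec2, instance_id, now=None):
    """Tag the instance with the stop time, then stop (not terminate) it."""
    now = now or datetime.now(timezone.utc)
    ec2.create_tags(
        Resources=[instance_id],
        Tags=[{"Key": STOPPED_AT_TAG, "Value": now.isoformat()}],
    )
    ec2.stop_instances(InstanceIds=[instance_id])
    return now


def ready_to_terminate(tags, now=None, hold_days=7):
    """True once the hold period has elapsed since the recorded stop time.

    `tags` is a plain {key: value} dict built from the instance's tags.
    """
    stopped_at = tags.get(STOPPED_AT_TAG)
    if stopped_at is None:
        return False
    now = now or datetime.now(timezone.utc)
    return now - datetime.fromisoformat(stopped_at) >= timedelta(days=hold_days)
```

Termination itself is deliberately left out of the sketch so that the final step stays a manual decision after the hold period.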
Implementing Automated Detection
For ongoing detection rather than one-time audits, create CloudWatch Alarms that trigger when instances sustain low network activity:
- Set a CloudWatch Alarm on the sum of `NetworkIn` and `NetworkOut` (combined with a metric math expression, since an alarm watches a single metric or expression) with a threshold of 1 MB over a 24-hour period using the `Sum` statistic
- Use the `INSUFFICIENT_DATA` state (which occurs when no data points exist) as an additional idle signal: an instance that stops reporting metrics entirely has been stopped, terminated, or disconnected
- Route alarm notifications to a cost optimization Slack channel or ticketing system for human review
- Tag instances with `idle-candidate: true` via Lambda when alarms trigger, creating a reviewable queue that accumulates over time
The tagging approach is particularly effective because it separates detection from action. Engineers can review tagged instances in batches during dedicated cost optimization sessions rather than responding to each individual alert. After two consecutive audit cycles (typically monthly) where a tagged instance remains idle, the confidence level is high enough for termination.
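An alarm along these lines could be defined with a metric math expression that adds the two metrics together. The sketch below only builds the parameters for CloudWatch's `put_metric_alarm` call; the alarm name pattern, metric IDs, and the SNS topic ARN you pass in are placeholders chosen for the example.

```python
def idle_alarm_params(instance_id, sns_topic_arn, threshold_bytes=1_000_000):
    """Build put_metric_alarm parameters for a 24-hour low-network alarm."""
    dimensions = [{"Name": "InstanceId", "Value": instance_id}]

    def daily_sum(metric_id, metric_name):
        # One input series: the 24-hour Sum of a single EC2 network metric.
        return {
            "Id": metric_id,
            "ReturnData": False,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/EC2",
                    "MetricName": metric_name,
                    "Dimensions": dimensions,
                },
                "Period": 86400,
                "Stat": "Sum",
            },
        }

    return {
        "AlarmName": f"idle-network-{instance_id}",
        "Metrics": [
            daily_sum("net_in", "NetworkIn"),
            daily_sum("net_out", "NetworkOut"),
            {
                "Id": "total",
                "Expression": "net_in + net_out",
                "Label": "NetworkIn + NetworkOut",
                "ReturnData": True,  # the alarm evaluates this expression
            },
        ],
        "ComparisonOperator": "LessThanThreshold",
        "Threshold": float(threshold_bytes),
        "EvaluationPeriods": 1,
        # "missing" lets the alarm fall into INSUFFICIENT_DATA when no data arrives.
        "TreatMissingData": "missing",
        "AlarmActions": [sns_topic_arn],
    }
```

With a boto3 CloudWatch client this would be applied as `cloudwatch.put_metric_alarm(**idle_alarm_params(instance_id, topic_arn))`, and the SNS topic can fan out to chat, ticketing, or the tagging Lambda.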
For organizations with multiple accounts, AWS Organizations combined with Compute Optimizer’s organization-level view aggregates idle recommendations across all member accounts — surfacing the total waste picture rather than requiring per-account analysis.
Eliminate waste and pay less with nOps
Across the strategies in this guide, low network utilization is a powerful signal for uncovering idle or oversized resources. But eliminating idle infrastructure is only one part of a broader cloud cost strategy. For many teams, commitment optimization remains one of the largest savings levers — and nOps focuses on maximizing that lever automatically, increasing your effective savings rate without adding operational overhead. And, we only get paid after delivering you measurable savings.
In 2026, “good enough” means you’re likely leaving money on the table. We’ve talked to companies that can save millions on their cloud bills by switching to nOps from competitors.
Book a free savings analysis, at no risk, to find out whether nOps can help you get more value out of your cloud investments.
nOps manages $4B+ in cloud spend and was recently rated #1 in G2’s Cloud Cost Management category.