Auditing your AWS bill: a repeatable monthly routine
A 30-minute monthly ritual that catches waste before it compounds.
For two years, my AWS bill audit was an annual panic, triggered when finance asked why spend had crept up 40%. The problem with auditing once a year is that by then the cause is buried under twelve months of changes. Switching to a tight monthly routine turned a forensic investigation into a 30-minute checklist, and it has caught real money every month.
This is the exact routine I run on the first business day of each month. It is deliberately repeatable so it survives me being busy.
Step 1: Diff month-over-month by service
I start in Cost Explorer grouped by service, comparing the last full month to the one before. I am not looking at absolute size; I am looking at deltas. A service that jumped is the lead to follow.
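# Two full months side by side (End is exclusive); Amount is returned as a string, hence to_number()
aws ce get-cost-and-usage \
--time-period Start=2026-04-01,End=2026-06-01 \
--granularity MONTHLY \
--metrics UnblendedCost \
--group-by Type=DIMENSION,Key=SERVICE \
--query 'ResultsByTime[].{Month:TimePeriod.Start,Services:Groups[?to_number(Metrics.UnblendedCost.Amount) > `500`].{Service:Keys[0],USD:Metrics.UnblendedCost.Amount}}'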
Anything that moved more than a few hundred dollars gets a note. Most months one or two services explain the entire change.
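To make the delta step concrete, here is a minimal sketch that compares two per-service exports and prints only the services that moved by more than a threshold. The function name `cost_deltas`, the file layout (one `service<TAB>amount` row per service), and the $300 default threshold are assumptions for illustration, not part of the routine above.

```shell
# cost_deltas PREV_TSV CURR_TSV [THRESHOLD]   (hypothetical helper)
# Each file holds one "service<TAB>amount" row per service, for example from:
#   aws ce get-cost-and-usage ... --output text \
#     --query 'ResultsByTime[0].Groups[].[Keys[0],Metrics.UnblendedCost.Amount]'
# Prints "service<TAB>previous<TAB>current<TAB>delta", largest increase first,
# for every service that moved by more than THRESHOLD in either direction.
cost_deltas() {
  awk -F'\t' -v limit="${3:-300}" '
    FILENAME == ARGV[1] { prev[$1] = $2; seen[$1] = 1; next }  # previous month
    { curr[$1] = $2; seen[$1] = 1 }                             # current month
    END {
      for (svc in seen) {
        delta = curr[svc] - prev[svc]   # a service missing from one month counts as 0
        if (delta > limit || delta < -limit)
          printf "%s\t%.2f\t%.2f\t%.2f\n", svc, prev[svc], curr[svc], delta
      }
    }
  ' "$1" "$2" | LC_ALL=C sort -t "$(printf '\t')" -k4,4nr
}
```

Services that appear in only one of the two months still show up, which is useful: a brand-new service or one that dropped to zero is often the most interesting line.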
Step 2: Hunt the usual waste
Before going deep on the deltas, I sweep for the recurring offenders. These accumulate quietly:
- Unattached EBS volumes and old snapshots still billing after instances are gone.
- Idle load balancers and NAT gateways from torn-down environments; a NAT gateway alone is ~$32/month plus data processing.
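- Old EBS gp2 volumes that should be gp3 (about 20% cheaper per GiB, with a 3,000 IOPS baseline at any volume size).
- Orphaned Elastic IPs. Since February 2024 every public IPv4 address bills by the hour whether or not it is associated, so an unattached one is paying for nothing.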
- Over-provisioned RDS instances flagged by Compute Optimizer.
# Unattached EBS volumes and their size
aws ec2 describe-volumes \
--filters Name=status,Values=available \
--query 'Volumes[].{ID:VolumeId,GiB:Size,AZ:AvailabilityZone}' \
--output table
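To put a dollar figure on the sweep, a small sketch like the following can total the volumes and estimate their monthly cost. The helper name `ebs_monthly_cost` is hypothetical, and the default rate of $0.08 per GiB-month is an assumption based on gp3 pricing in us-east-1; pass your own region's rate as the first argument.

```shell
# ebs_monthly_cost [RATE_PER_GIB]   (hypothetical helper)
# Reads "volume-id<TAB>size-in-GiB" rows on stdin, for example from:
#   aws ec2 describe-volumes --filters Name=status,Values=available \
#     --query 'Volumes[].[VolumeId,Size]' --output text
# Prints the volume count, total GiB, and estimated monthly storage cost.
# The default rate (0.08) is an assumption, not a quoted price for your region.
ebs_monthly_cost() {
  awk -F'\t' -v rate="${1:-0.08}" '
    NF >= 2 { count++; gib += $2 }
    END { printf "%d volumes, %d GiB, ~$%.2f/month\n", count, gib, gib * rate }
  '
}
```

The same helper gives a rough gp2-to-gp3 saving if you feed it your gp2 volumes and pass the per-GiB price difference (about 0.02 at the assumed rates) instead of the full rate.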
Step 3: Check commitment coverage
Savings Plans and Reserved Instances are where the largest savings hide, and where over-commitment hides the largest waste. I check two numbers in Cost Explorer:
| Metric | Target | If off |
|---|---|---|
| Coverage | ~80-90% of steady-state compute | Buy more commitment for the stable base |
| Utilization | > 95% | Over-committed; let some expire |
Cover the floor, not the ceiling. Commit only to the baseline you are confident persists for the term, and leave spiky or experimental capacity on-demand. An unused Savings Plan is pure loss.
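The two numbers in the table are simple ratios, and a worked sketch shows how they differ. Coverage is covered spend divided by all eligible spend (covered plus on-demand); utilization is used commitment divided by total commitment. The helper name `commitment_check` and its argument order are assumptions; the inputs correspond to what `aws ce get-savings-plans-coverage` and `aws ce get-savings-plans-utilization` report, and the 80% and 95% thresholds are taken from the table above.

```shell
# commitment_check COVERED ON_DEMAND COMMITTED USED   (hypothetical helper)
# COVERED   = spend covered by commitments in the period
# ON_DEMAND = eligible spend billed at on-demand rates
# COMMITTED = total commitment purchased for the period
# USED      = portion of that commitment actually consumed
commitment_check() {
  awk -v covered="$1" -v ondemand="$2" -v committed="$3" -v used="$4" 'BEGIN {
    if (covered + ondemand <= 0 || committed <= 0) {
      print "no eligible spend or no commitment to evaluate"
      exit
    }
    coverage = 100 * covered / (covered + ondemand)
    utilization = 100 * used / committed
    printf "coverage %.1f%%, utilization %.1f%%\n", coverage, utilization
    ok = 1
    if (utilization < 95) { print "utilization below 95%: over-committed, let some expire"; ok = 0 }
    if (coverage < 80) { print "coverage below 80%: room to commit more of the stable base"; ok = 0 }
    if (ok) print "within target"
  }'
}
```

For example, $7,000 covered against $3,000 on-demand is 70% coverage, even if the commitment itself is 98% utilized. The two metrics can point in opposite directions, which is why they are checked separately.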
Step 4: Tag accountability and anomalies
I group the same month by cost allocation tag (team, environment) so each delta has an owner I can ask. Untagged spend above a threshold is itself a finding: it means something was provisioned outside our IaC.
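One way to measure the untagged share is sketched below. When Cost Explorer groups by a tag, each group key has the form `team$value`, and spend without the tag comes back with an empty value (`team$`). The helper name `untagged_share` is hypothetical, and `team` stands in for whichever cost allocation tag you have activated.

```shell
# untagged_share   (hypothetical helper)
# Reads "tagkey$value<TAB>amount" rows on stdin, for example from:
#   aws ce get-cost-and-usage --time-period Start=2026-05-01,End=2026-06-01 \
#     --granularity MONTHLY --metrics UnblendedCost \
#     --group-by Type=TAG,Key=team --output text \
#     --query 'ResultsByTime[0].Groups[].[Keys[0],Metrics.UnblendedCost.Amount]'
# Rows whose key ends in "$" carry no tag value and are counted as untagged.
untagged_share() {
  awk -F'\t' '
    { total += $2 }
    $1 ~ /\$$/ { untagged += $2 }
    END {
      pct = (total > 0) ? 100 * untagged / total : 0
      printf "untagged $%.2f of $%.2f (%.1f%%)\n", untagged, total, pct
    }
  '
}
```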
I also confirm AWS Cost Anomaly Detection is configured with the AWS services monitor, which evaluates each service separately, and an alert subscription that delivers to SNS. That makes the next month's audit shorter, because genuine spikes page me when they happen rather than waiting for the monthly review.
aws ce get-anomalies \
--date-interval StartDate=2026-05-01,EndDate=2026-06-01 \
--query 'Anomalies[].{Service:RootCauses[0].Service,Impact:Impact.TotalImpact}'
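A busy month can produce several anomalies for the same service, so it can help to roll them up before reading them. The sketch below sums impact per service; the helper name `anomaly_totals` is hypothetical, and it assumes the anomalies were exported as tab-separated `service<TAB>impact` rows.

```shell
# anomaly_totals   (hypothetical helper)
# Reads "service<TAB>impact" rows on stdin, for example from:
#   aws ce get-anomalies --date-interval StartDate=2026-05-01,EndDate=2026-06-01 \
#     --query 'Anomalies[].[RootCauses[0].Service,Impact.TotalImpact]' --output text
# Prints one "service<TAB>total-impact" row per service, largest first.
anomaly_totals() {
  awk -F'\t' '
    { impact[$1] += $2 }
    END { for (svc in impact) printf "%s\t%.2f\n", svc, impact[svc] }
  ' | LC_ALL=C sort -t "$(printf '\t')" -k2,2nr
}
```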
Step 5: Write down what changed
The routine ends with three lines in a running doc: what moved, why, and what I did about it. This log is what makes the audit cumulative instead of repetitive. When finance asks about a trend, the answer is already written.
Takeaways
- Audit monthly, not annually; deltas are traceable while the change is recent.
- Lead with month-over-month service deltas, then sweep for recurring waste like unattached EBS, idle NAT gateways, and gp2 volumes.
- Track Savings Plan coverage and utilization separately, and commit only to the steady-state floor.
- Use cost allocation tags to assign every delta an owner, and keep a running log so the audit compounds.