Stage III of AWS Cost Efficiency: Optimizing EC2 usage

By J.R. Storment on July 21, 2014

The FinOps Journey:

(EDITORIAL NOTE – Annotated in 2019 to show this post’s role on the path to the FinOps cloud operating model.)

For AWS infrastructure, EC2 optimization has always been a crucial step in cloud cost management, but it’s not limited just to EC2 — or just to AWS. Optimization has grown even more important as cloud infrastructures have grown, so much so that it’s an entire phase in the cloud operating model of FinOps. In FinOps, the Optimize phase includes actions applicable to EC2 (such as removing underutilized resources, automating resources or rightsizing instances) and broader optimization actions, including RI purchasing or discount optimization.

To find out more about FinOps and optimization, check out FinOps: A New Approach to Cloud Financial Management.


Welcome back to the Five Stages of AWS Cost Efficiency. Today, we’ll be talking about Stage III: optimizing EC2 usage. Before diving into this stage, make sure you have basic cost visibility in place and have implemented cost allocation and chargeback. Ready? Let’s get started.

It’s time to stop treating the cloud like a datacenter. There are 168 hours in a week, and 108 of them are nights and weekends, even if you count a 12-hour working window on each of the five weekdays. In spite of this, it’s far too common for companies to over-provision resources and leave everything on all the time. The goal of this stage is to avoid that pitfall by identifying which resources can be turned off, sized down, or autoscaled back during non-peak hours. In doing so, you’ll also make great progress in determining which instances you should buy reservations for in Stage IV.

Here are the ABCs of optimizing EC2 usage efficiency:

A) Let tags guide your way

You should have already implemented tags in Stage II: Cost Allocation and Chargeback. If so, well done: tagging is crucial to this stage. Without tags, you won’t know with confidence what each of your instances is doing, or which instances you can turn off.

For the purposes of Stage III, use a Role tag to tell you whether an instance is part of your web, app or database tier. Use a Name tag to concatenate data about the service, node, or cluster it belongs to, or apply discrete tag keys for each of those for even more granularity.
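As a rough illustration of the two tagging options above, the sketch below builds a Role tag plus a concatenated Name tag, and optionally adds discrete keys. The role values, the "-" separator, and the Service/Cluster/Node key names are assumptions made for this example, not a convention defined by this post.

```python
# Hypothetical tag builder: the roles, key names, and "-" separator are
# illustrative assumptions, not a convention prescribed by the post.
VALID_ROLES = {"web", "app", "db"}


def build_tags(role, service, cluster, node, discrete=False):
    """Return a tag dict with a Role tag and a concatenated Name tag."""
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    tags = {"Role": role, "Name": "-".join([service, cluster, node])}
    if discrete:
        # Optional discrete keys for finer-grained reporting.
        tags.update({"Service": service, "Cluster": cluster, "Node": node})
    return tags


print(build_tags("web", "checkout", "east1", "03", discrete=True))
```

Rejecting unknown roles up front keeps later role-based reports from silently fragmenting across misspelled values.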

These tags buy you two distinct efficiency wins:

This is so important that some of our customers implement tag-or-terminate rules that automatically shut down instances not tagged within 24 hours.
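A tag-or-terminate rule of this kind could be sketched as follows. This is a minimal example over plain dictionaries, not a description of any customer’s actual implementation: the required tag keys, the record layout, and the function name are all hypothetical, and the sketch only reports which instances would be affected rather than shutting anything down.

```python
from datetime import datetime, timedelta, timezone

# Assumptions for illustration: which tags are required, and the record shape.
REQUIRED_TAGS = {"Role", "Name"}
GRACE_PERIOD = timedelta(hours=24)


def untagged_past_grace(instances, now):
    """Return IDs of instances still missing a required tag after 24 hours."""
    flagged = []
    for inst in instances:
        missing = REQUIRED_TAGS - set(inst["tags"])
        age = now - inst["launch_time"]
        if missing and age > GRACE_PERIOD:
            flagged.append(inst["id"])
    return flagged


now = datetime(2014, 7, 21, 12, 0, tzinfo=timezone.utc)
instances = [
    {"id": "i-old-untagged", "launch_time": now - timedelta(hours=30), "tags": {}},
    {"id": "i-new-untagged", "launch_time": now - timedelta(hours=2), "tags": {}},
    {"id": "i-old-tagged", "launch_time": now - timedelta(days=9),
     "tags": {"Role": "web", "Name": "checkout-east1-03"}},
]
print(untagged_past_grace(instances, now))
```

Only the first instance is flagged: it is both older than the grace period and missing required tags.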

B) Look for underutilized instances

One of the quickest efficiency wins is simply to turn off underused instances. Start by looking for a combination of low CPU, low bandwidth, and low disk I/O. As mentioned in the previous section, though, what counts as low will vary by instance role: your database servers will likely use more I/O than your web workers. Here are some of the metrics you should look at:
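To make the role-dependent check concrete, here is a minimal sketch that flags an instance only when CPU, bandwidth, and disk I/O are all below the limits for its role. The threshold numbers, field names, and units are placeholders chosen for this example; real limits should come from your own utilization profiles.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    instance_id: str
    role: str
    avg_cpu_pct: float
    avg_network_mbps: float
    avg_disk_iops: float


# Placeholder limits per role (assumptions, not recommended values).
THRESHOLDS = {
    "web": {"cpu": 10.0, "net": 5.0, "iops": 50.0},
    "app": {"cpu": 10.0, "net": 5.0, "iops": 100.0},
    "db": {"cpu": 10.0, "net": 5.0, "iops": 500.0},
}


def is_underutilized(u: Usage) -> bool:
    """True only if every metric is below its role-specific limit."""
    limits = THRESHOLDS.get(u.role)
    if limits is None:
        return False  # Unknown role: do not flag without a profile.
    return (
        u.avg_cpu_pct < limits["cpu"]
        and u.avg_network_mbps < limits["net"]
        and u.avg_disk_iops < limits["iops"]
    )


fleet = [
    Usage("i-web-1", "web", 3.0, 1.0, 300.0),
    Usage("i-db-1", "db", 3.0, 1.0, 300.0),
]
print([u.instance_id for u in fleet if is_underutilized(u)])
```

The two sample instances have identical metrics, yet only the database server is flagged, because 300 IOPS is low for the assumed db profile but not for the web profile.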

C) Turn off the lights at night

Roughly 65% of the hours in a month are nights and weekends. Unless you have around-the-clock offshore dev teams or are relying exclusively on ephemeral storage, it’s likely you can turn some of your non-production resources off some of the time.
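The nights-and-weekends share depends on how long you assume the working day to be. The short calculation below makes that explicit for a five-day week; the workday lengths tried are assumptions for illustration.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def off_hours_share(workday_hours: int, workdays: int = 5) -> float:
    """Fraction of the week that falls outside working hours."""
    working = workday_hours * workdays
    return (HOURS_PER_WEEK - working) / HOURS_PER_WEEK


for hours in (8, 10, 12):
    share = off_hours_share(hours)
    print(f"{hours}-hour workday: {share:.1%} of the week is off-hours")
```

With a 12-hour window on each weekday, 108 of the 168 hours in a week (about 64%) are off-hours; shorter working days push the share higher.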

You’ll want to report on hours of the day, filtered to non-production resources via tags, linked accounts or security groups. Here’s a sample Cloudability usage report to get you started.
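If you want to prototype such an hour-of-day view from your own data, a minimal sketch might look like this. It assumes you already have hourly samples as dictionaries with an instance ID, an hour of day, and an environment value taken from a tag; that record layout and the set of non-production values are assumptions, and this is not how the Cloudability report is built.

```python
from collections import defaultdict

NON_PROD_ENVS = frozenset({"dev", "test", "stage"})  # assumed tag values


def running_count_by_hour(records, non_prod_envs=NON_PROD_ENVS):
    """Count distinct non-production instances seen in each hour of the day."""
    seen = defaultdict(set)
    for rec in records:
        if rec["env"] in non_prod_envs:
            seen[rec["hour"]].add(rec["instance_id"])
    return {hour: len(ids) for hour, ids in sorted(seen.items())}


samples = [
    {"instance_id": "i-1", "hour": 2, "env": "dev"},
    {"instance_id": "i-2", "hour": 2, "env": "prod"},
    {"instance_id": "i-1", "hour": 14, "env": "dev"},
    {"instance_id": "i-3", "hour": 14, "env": "stage"},
]
print(running_count_by_hour(samples))
```

A flat count across all 24 hours is the signal to look for: it suggests non-production resources are never being switched off.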

Chances are, you’ll find your instance count at 2am to be the same as 2pm. Using an orchestration tool like Puppet or Chef, explore autoscaling some of your resources down after your office closes for the day and turning them back on before it opens.
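The scheduling decision itself is simple enough to sketch independently of any particular tool. The example below only decides whether an instance should be running at a given moment; the office hours, the "prod" environment value, and the function name are assumptions, and wiring the result into Puppet, Chef, or an autoscaling policy is left out.

```python
from datetime import datetime, time

# Assumed office hours for illustration (local time, Monday to Friday).
OFFICE_OPEN = time(7, 0)
OFFICE_CLOSE = time(19, 0)


def should_be_running(env: str, when: datetime) -> bool:
    """Production always runs; everything else runs only in office hours."""
    if env == "prod":
        return True
    is_weekday = when.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and OFFICE_OPEN <= when.time() < OFFICE_CLOSE


monday_2pm = datetime(2014, 7, 21, 14, 0)
monday_2am = datetime(2014, 7, 21, 2, 0)
print(should_be_running("dev", monday_2pm), should_be_running("dev", monday_2am))
```

Treating production as always on keeps the rule conservative: only resources explicitly marked as non-production are ever candidates for being switched off.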

Get started

You can ensure comprehensive completion of Stage III by following these steps:

– Define role-specific utilization SLAs based on profiles

– Generate an underutilized instance report based on CPU, bandwidth, disk I/O and days alive using Cloudability Usage Analytics

– Identify test/dev/stage resources that don’t need to be running 24/7

– Implement API access to Usage data for Ops / Eng Dashboards

– Provide the reports to each product team

Most companies try to skip this stage and go straight to buying Reserved Instances. But by optimizing your EC2 usage first, you’ll put yourself in a select group who can make the most efficient RI buys possible, and save more money along the way. Your finance team will thank you.

To get started optimizing your usage, log in or start a free 14-day trial today.


For more information about the Five Stages of AWS Cost Efficiency, check out these blog posts:

Overview: The Five Stages of AWS Cost Efficiency

Stage I: Basic cost visibility

Stage II: Cost allocation and chargeback

Stage III: Optimizing EC2 usage

Stage IV: Developing a Reserved Instance purchasing strategy

Stage V: Understanding the business value of increasing cloud spend
