The Ops Team’s cloud cost battle station
From the engineers in your AWS console to the executives overseeing your growth and trajectory, building and growing a complex cloud infrastructure involves stakeholders all across your organization. Ensuring that those stakeholders are equipped to make informed and deliberate decisions about your cloud requires getting the right data in front of them—but when it comes to building the perfect dashboard of cost and usage KPIs, every team will have different needs.
Earlier this month, we showed you a dashboard that we’ve seen serve as an excellent starting point for finance teams. Today, we’ll drill into a different dashboard: Ops. Read on to see how each widget on this example dashboard can help answer questions and drive decisions central to your Ops Team’s priorities.
The KPIs: Infrastructure breakdowns and usage patterns
As an Ops Team, your day-to-day priorities will likely consist of staying on top of your infrastructure makeup and usage trends. That’s why your ideal dashboard will pay close attention to service breakdowns, hourly instance counts, and underutilized instances.
All services usage hours, last 30 days and previous period: With this widget, you can tell at a glance how your total usage over the past 30 days compares to the month before—and whether it has increased, stayed the same, or gone down. This is useful for identifying unplanned usage upticks or downticks, and ensuring that your usage hours are trending in the way you’d like.
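To make the comparison concrete, here is a minimal sketch of the arithmetic behind a period-over-period widget like this one. The function name, the 2% "flat" tolerance, and the sample totals are all assumptions for illustration; this is not how Cloudability computes the figure.

```python
def usage_trend(current_hours, previous_hours, tolerance=0.02):
    """Compare total usage hours across two periods.

    Returns (fractional_change, label). The change is None when the
    previous period has no usage to compare against.
    `tolerance` is a hypothetical band inside which usage counts as flat.
    """
    if previous_hours == 0:
        return None, "no baseline"
    change = (current_hours - previous_hours) / previous_hours
    if change > tolerance:
        label = "increased"
    elif change < -tolerance:
        label = "decreased"
    else:
        label = "flat"
    return change, label


# Made-up totals: 13,200 usage hours in the last 30 days vs. 12,000 before.
change, label = usage_trend(current_hours=13200, previous_hours=12000)
print(f"{change:+.1%} ({label})")  # prints "+10.0% (increased)"
```

Picking a tolerance keeps small day-to-day noise from being reported as a trend; the right value depends on how steady your usage normally is.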
EC2 usage hours by instance family, last 7 days: This widget shows how your EC2 usage from the past week breaks down by instance family. Use this chart to identify which instance families are accruing the most usage; those families are the first places to look for over-provisioning.
EC2 hourly elasticity by day, last 7 days: Are you taking advantage of elasticity, or running your cloud like a datacenter? Check this widget to see whether things are being turned off and scaled down during nights, weekends, and periods of light traffic. If not, you’ve got some processes to put in place.
EC2 average running instances per hour, last 30 days: Do you average the same number of running instances per hour on Sundays as you do on Mondays? That might indicate that some dev or test instances are being left on over the weekend, and chances are they aren’t being used. Compare your running instance averages from day to day, and dig into opportunities to scale back.
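If you want to run the Sunday-versus-Monday check on your own data, a rough sketch might look like the following. It assumes you already have hourly samples of running instance counts as (timestamp, count) pairs; the function name and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def average_by_weekday(samples):
    """Average the running instance count per hour for each weekday.

    `samples` is an iterable of (datetime, running_instance_count) pairs,
    one per hour. Weekdays with no samples are left out of the result.
    """
    totals = defaultdict(lambda: [0, 0])  # weekday -> [sum, sample count]
    for timestamp, count in samples:
        day = WEEKDAYS[timestamp.weekday()]
        totals[day][0] += count
        totals[day][1] += 1
    return {day: total / n for day, (total, n) in totals.items()}


# Made-up hourly samples: 2024-01-07 is a Sunday, 2024-01-08 a Monday.
samples = [
    (datetime(2024, 1, 7, 3), 40),
    (datetime(2024, 1, 7, 15), 42),
    (datetime(2024, 1, 8, 3), 44),
    (datetime(2024, 1, 8, 15), 46),
]
averages = average_by_weekday(samples)
print(averages)  # {'Sunday': 41.0, 'Monday': 45.0}
```

In this made-up data, Sunday runs at roughly 91% of Monday's average, which is the kind of small weekend dip that suggests instances are being left on.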
All services usage hours by instance type, last 30 days: This widget can help you drill even deeper into the makeup of your infrastructure by displaying your past month’s usage broken down by specific instance type. Referencing this chart can quickly tell you which instance types account for most of your usage. They might be good candidates for Reserved Instances!
All services usage hours by region, last 30 days: You can reference this widget to visualize how your infrastructure is spread across AWS regions, or to see whether it is concentrated in just one.
Heavily underutilized instances, last 7 days: Customize your definition of “heavily underutilized,” then see which of your instances are barely being used, and can likely be switched off.
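As an illustration of what a custom definition could look like, here is a minimal sketch that flags instances whose average and peak CPU both stay under thresholds you choose. The field names, the 5% and 20% defaults, and the instance IDs are assumptions for the example, not the widget's actual criteria.

```python
from dataclasses import dataclass


@dataclass
class InstanceStats:
    """Hypothetical 7-day utilization summary for one instance."""
    instance_id: str
    avg_cpu_percent: float
    max_cpu_percent: float


def heavily_underutilized(instances, avg_cpu_below=5.0, max_cpu_below=20.0):
    """Return instances whose average AND peak CPU are under the thresholds.

    Checking the peak as well as the average avoids flagging instances
    that sit idle most of the week but do real work in short bursts.
    """
    return [
        inst for inst in instances
        if inst.avg_cpu_percent < avg_cpu_below
        and inst.max_cpu_percent < max_cpu_below
    ]


# Made-up fleet data for illustration.
fleet = [
    InstanceStats("i-aaa111", avg_cpu_percent=1.2, max_cpu_percent=4.0),
    InstanceStats("i-bbb222", avg_cpu_percent=3.5, max_cpu_percent=85.0),
    InstanceStats("i-ccc333", avg_cpu_percent=42.0, max_cpu_percent=97.0),
]
flagged = [inst.instance_id for inst in heavily_underutilized(fleet)]
print(flagged)  # ['i-aaa111']
```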
Underutilized compute and general purpose instances, last 7 days: Whereas the previous widget identifies instances that are seeing hardly any usage at all, this widget takes a closer look at general purpose and compute optimized instances to find those that are seeing some usage, but not enough to warrant their size. Any instances that you see here are likely candidates for downsizing, or for a move to an instance family better suited to their workload.
EC2 legacy instances, last 30 days, vs. non-legacy instances, last 30 days: Chances are, you can save some money and gain some compute power by migrating from legacy instances to non-legacy instances. See how much of your infrastructure consists of legacy instances and track your migration efforts by keeping these two tables side by side.
Instance distribution by days alive: This widget lets you visualize the age of your fleet and how long your instances have been running, whether for one day or for years. This can be useful for getting an idea of the general age of your infrastructure, and also enables you to pinpoint how many of the servers you’re currently using were turned on at any given point in time.
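For a sense of how such a distribution can be built, here is a small sketch that sorts instances into age buckets by launch date. The bucket boundaries, the function name, and the dates are assumptions chosen for illustration.

```python
from bisect import bisect_right
from collections import Counter
from datetime import date

# Hypothetical bucket boundaries, in days alive.
BUCKET_EDGES = [7, 30, 90, 365]
BUCKET_LABELS = ["< 7 days", "7-29 days", "30-89 days",
                 "90-364 days", "365+ days"]


def days_alive_distribution(launch_dates, today):
    """Count instances per age bucket, given each instance's launch date."""
    counts = Counter()
    for launched in launch_dates:
        days_alive = (today - launched).days
        # bisect_right puts an age equal to an edge into the older bucket,
        # so an instance alive exactly 7 days lands in "7-29 days".
        counts[BUCKET_LABELS[bisect_right(BUCKET_EDGES, days_alive)]] += 1
    return {label: counts[label] for label in BUCKET_LABELS}


# Made-up launch dates for three instances.
launches = [date(2024, 5, 30), date(2024, 5, 1), date(2022, 1, 1)]
distribution = days_alive_distribution(launches, today=date(2024, 6, 1))
for label, count in distribution.items():
    print(f"{label:>12}: {count}")
```

A large count in the oldest bucket is not a problem in itself, but it is a useful prompt to check whether those long-lived instances are still on the instance types and sizes you would choose today.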
Customize your station
Setting up a cloud cost battle station for your own Ops Team is the key to ensuring that your infrastructure gets managed effectively and cost-efficiently. And getting one set up is easy to do. Simply log in to Cloudability and try building as many of the above widgets as you’d like, customized to your needs and populated with your own cost data. Want to get started with this dashboard right away? Shoot us an in-app message to have our Customer Success team add it to your account for you.
Don’t have a Cloudability account yet? Sign up for a free 14-day trial to get started building your own cloud cost battle station today.