re:Invent recap: How Cloudability helps HGST run massive workloads on AWS
Several great sessions at this year’s re:Invent conference showed cost management best practices in action, but few offered as much insight into the unique cost management demands of massive projects as “MegaRun: Behind the 156,000 Core HPC Run on AWS & On-demand Clusters.” The session covered the HPC run that broke industry records for size, scale, and power in order to build superpowered sources of solar energy.
In the session, speakers from Cycle Computing and the University of Southern California described how they planned, built, and managed a distributed queue designed to handle tens of millions of jobs. Its purpose was to generate and analyze over 200,000 unique molecule models to find the perfect fit for a new solar technology. They then introduced David Hinz from HGST, who described his own company’s experience working with Cycle Computing to tackle massive workloads.
Why take HPC to AWS?
HGST, a leading hard disk drive vendor, uses high performance computing to develop drives with higher areal density and better cooling and power efficiency.
They didn’t always do their HPC on AWS. But as David says, “in the manufacturing realm it’s all about how you produce products faster, and how you produce products for less money.”
By migrating to the cloud over the last fifteen months, David says, HGST has been able to change how their workflow functions. Rather than be forced to limit throughput according to a set amount of computing capacity, he says, “we’re getting more compute than [the scientists] can handle—because they can’t post-process the data and analyze at the rate it’s being generated.”
The right steps along the way
Cloud adoption isn’t as simple as checking off an item on a to-do list; it’s an ongoing process with multiple iterative steps toward a cost- and usage-optimized infrastructure. David describes HGST’s AWS evolution as a five-stage process. After starting with a first proof of concept, HGST progressed to their first HPC production cluster, then worked to optimize their workloads and flexibility.
Tools for success
Once HGST could boast several C3 deployments and four running production clusters, it was time for the final two steps of their process: lowering their costs and focusing on business metrics.
“In a manufacturing environment producing products, we are very focused on making sure we don’t go over our budgets, [that] we’re effectively using our compute and effectively using our dollars,” David says.
Cloudability helps HGST do just that.
Cloudability, David says, helps HGST to “understand how the workloads are running across the cloud space, and how we can effectively optimize that environment” with such features as Budget Alerts, Cost Allocation Reports, the Reserved Instance Planner, and more.
Using the visibility provided by Cloudability, HGST was able to completely change their environment to incorporate RIs and Spot Instances. “You can save anywhere from 11% to 30% by shaping your compute, changing your environment, and making sure you’re using the right compute,” David says. “This is a very powerful tool for our teams to be cost-effective as they move forward.”
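To put that range in concrete terms, here is a minimal sketch of the arithmetic. It simply applies the 11% and 30% figures from David's quote to a monthly bill. The function name `savings_range` and the $100,000 spend are hypothetical, chosen for illustration; they are not HGST's numbers or part of any Cloudability tool.

```python
def savings_range(monthly_spend, low_pct=11, high_pct=30):
    """Return (low, high) estimated monthly savings.

    The default percentages come from the 11%-30% range quoted in the
    session; real savings depend on the workload and instance mix.
    """
    if monthly_spend < 0:
        raise ValueError("monthly_spend must be non-negative")
    return (monthly_spend * low_pct / 100, monthly_spend * high_pct / 100)


# Hypothetical monthly AWS bill, for illustration only.
example_spend = 100_000
low, high = savings_range(example_spend)
print(f"Estimated savings: ${low:,.0f} to ${high:,.0f} per month")
```

Where a given team lands in that range would depend on how much of its usage is steady enough for RIs and how much can tolerate Spot interruptions.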
With the visibility and insight provided by Cloudability, David knows that the HPC workloads run by HGST are as cost-efficient as they are powerful. But the journey doesn’t stop here.
“We have more workloads that are going to be coming up every day,” David says, “that we’re going to be migrating from on-premise to the cloud.” And by iterating on the steps that have brought him here, David knows he can continue to maintain a lean, cost-efficient infrastructure along the way.
“Using Cloudability [has] been key for us,” he says, “in being successful.”
Want to learn more about how HGST keeps their costs optimized as they do big things? Watch the full session or sign up for a free 14-day trial of Cloudability Pro to try out the tools for yourself today.