6 steps we took to regain control of our Azure spend at SquaredUp

In this write-up of our special webinar, we’ll give you an insider’s look at how we took control of our Azure costs, and how you can do the same. We’ll take you through how we identified the problems and implemented lasting solutions to keep things under control. You’ll hear from the key stakeholders involved: Finance, Management, IT, and Engineering.

 

Background: Our Azure spend

At SquaredUp we use Azure extensively – for our internal applications, development tools, test environments, automation suites, customer demo systems, and the list goes on. Over time, we found more and more ways we could benefit from Azure, and it’s been great for the business. But as time went on, costs began snowballing, and we knew we needed to take action.

 

Understanding the Problem

Kirstie, Finance Controller:

“Over the course of a year, Azure costs became the largest IT software spend we have. By the beginning of 2020 we were spending over $5,000 per year per technical member of staff, and the cost was still rising. For SquaredUp, that adds up to a large six-figure sum per year – a big impact for a company our size.”

If this trend continued, we were going to be way over budget.

Simon, Business Platform Manager:

“So, after Kirstie’s warning, we started digging into Azure Cost Management in the Azure Portal. The good thing was that the way we organise our subscriptions meant we could easily see spend by department. But the problem was that I had no visibility into the details of what each department was doing in Azure, or whether their spend was justified.

I spoke to each department head, but was alarmed to find out that they also didn’t know! They didn’t even have a good sense of what they were spending in total, let alone how it broke down. In theory that information was there; in practice they weren’t finding it.

We had thousands of resources of different types, and we couldn’t easily tell whether these resources were actually being used.  It often wasn’t even clear who to ask - because amazingly you can’t see in the portal who created a resource.”

 

Getting on top of it – some quick wins

So both Finance and IT lacked the necessary visibility over our Azure spend. How did we go about fixing these problems?

 

Step 1: Identifying quick wins – big tickets and zombie resources

Adam, VP Customer Success (formerly Technical Services Manager):

“Around the period when we were diagnosing our Azure spend issues, we had just released a new version of our dashboarding product, SquaredUp for Azure v4.5 – and handily, it included new visualizations for costs.  So, as step 1 of our ‘6 key steps to regain control over our Azure spend’, I made a couple of dashboards.

I used the Treemap and Sunburst visualizations to identify the resources with the highest spend overall in each department. Also, by comparing cost trends with live performance metrics using Top N Line Graph visualizations, I was able to spot zombie resources in each department that had little or no usage – but were costing a lot of money.

In our case the big costs came from a few very high-spec VMs, quite a few zombie VMs, a couple of large Log Analytics workspaces, and some premium settings on certain resources such as IPS rules. We also use a lot of platform as a service, for example many storage accounts and app services, but those seemed in general to be more under control.”
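For readers who want to try the same idea outside a dashboard, the cost-versus-usage comparison Adam describes can be sketched in a few lines of Python. This is only an illustration: the `ResourceStats` record, the `find_zombies` function, the thresholds, and the sample figures are all assumptions made up for this example, not values or tooling from the webinar. In practice the inputs would come from your own cost export and performance metrics.

```python
from dataclasses import dataclass


@dataclass
class ResourceStats:
    """Hypothetical record combining cost and usage for one resource."""
    name: str
    monthly_cost: float      # spend over the period, in your billing currency
    avg_cpu_percent: float   # average utilisation over the same period


def find_zombies(resources, min_cost=100.0, max_cpu=2.0):
    """Return resources that cost a lot but show little or no usage.

    Both thresholds are illustrative assumptions; tune them to your estate.
    """
    zombies = [
        r for r in resources
        if r.monthly_cost >= min_cost and r.avg_cpu_percent <= max_cpu
    ]
    # Most expensive first, so the biggest potential savings are reviewed first.
    return sorted(zombies, key=lambda r: r.monthly_cost, reverse=True)


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    sample = [
        ResourceStats("build-vm-01", 450.0, 0.4),
        ResourceStats("demo-vm-02", 900.0, 45.0),
        ResourceStats("test-vm-03", 120.0, 1.5),
        ResourceStats("tiny-vm-04", 15.0, 0.0),
    ]
    for r in find_zombies(sample):
        print(f"{r.name}: cost {r.monthly_cost:.2f}, avg CPU {r.avg_cpu_percent}%")
```

Requiring both conditions matters: a cheap idle resource is not worth chasing first, and an expensive busy one may be fully justified.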

 

Step 2: Drilling in – and turning things off

Simon, Business Platform Manager:

"The next step was to drill into the data and turn things off. I assigned one of my Systems Engineers, Shas, to look into the quick wins that Adam’s dashboards identified.  Some were very obvious – for example we had a lot of VMs that had been provisioned and were clearly no longer being used – we could check that with one click in the SquaredUp dashboard, and with another two clicks, stop them running.   Others were harder, and involved tracking down the individuals in each team who were using the resources in question. One of these individuals was Dave, one of our Senior Software Engineers."

 

Dave, Senior Software Engineer:

“The Azure Portal doesn't show the cost of resources clearly when provisioning, especially not with resources like Logic Apps or Log Analytics workspaces. When our Systems Engineer approached me about the cost of some of the more expensive resources, I soon realised I could tweak a few settings and turn off unnecessary features that I'd been testing, and quickly save a chunk of the cost without sacrificing any performance.”

 

After a few days we had already identified a lot of savings and things were beginning to get under control.  But continuing to chase every individual resource in the long tail of less expensive resources was going to take too long. We needed to start empowering the teams.

 

Staying on top of it – automating, tagging, empowering the teams

We implemented 4 key things to ensure our costs stayed under control, and to empower the different teams to stay on top of the long tail of costs for smaller resources.  These 4 key things make up steps 3 to 6:

 

Step 3: VM power automation

Adam, VP Customer Success:

“We already have scripts that turn non-production VMs on and off based on a schedule by storing settings in tags against those VMs. We were able to extend control of this functionality via Slack and MS PowerApps. Not everyone has access to the Azure Portal, but they can still control their own server if they need to via simple Slack slash commands.”
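To make the tag-driven schedule idea concrete, here is a minimal Python sketch of how a script might decide whether a VM should be running. The tag format (`Mon-Fri 08:00-18:00`) and the function names are assumptions invented for this example; the webinar does not describe the actual tag layout or scripts we use. A real script would read the tag from each VM and then start or stop it accordingly.

```python
from datetime import datetime, time

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def parse_schedule(tag_value):
    """Parse a hypothetical schedule tag such as 'Mon-Fri 08:00-18:00'.

    Returns (set of weekday numbers, start time, end time).
    """
    day_part, hour_part = tag_value.split()
    first_day, last_day = day_part.split("-")
    start_text, end_text = hour_part.split("-")
    days = set(range(DAYS.index(first_day), DAYS.index(last_day) + 1))
    return days, time.fromisoformat(start_text), time.fromisoformat(end_text)


def should_be_running(tag_value, now):
    """True if `now` falls inside the window described by the schedule tag."""
    days, start, end = parse_schedule(tag_value)
    # datetime.weekday() uses Monday = 0, matching the DAYS list above.
    return now.weekday() in days and start <= now.time() < end


if __name__ == "__main__":
    schedule = "Mon-Fri 08:00-18:00"
    print(should_be_running(schedule, datetime.now()))
```

Keeping the schedule in a tag means the setting travels with the VM, so any front end (a script, a Slack command, a PowerApp) can read or change it without a separate configuration store.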

 

Step 4: VM owner tagging

Adam, VP Customer Success:

"Shas wrote an Azure Logic App that tagged all newly created resources with two owner values (the person who created it and the team responsible for it) and this helped us to scope dashboards for cost/monitoring/status to quite specific areas. These tags could also play a future role in automating clean-up e.g. destroy everything at a certain age, or even a Slack command to destroy everything at will where it’s owned by the Slack user. This solved the problem we had of not knowing who to go to when looking into a potential saving."

(Leave your email in the form at the bottom of the page if you'd like the Azure Logic App.)
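As a rough illustration of what owner tags make possible, the Python sketch below groups cost by a tag and picks clean-up candidates by age or owner. It is not the Logic App mentioned above. The tag names (`owner`, `team`), the dictionary layout, and both function names are assumptions chosen for this example, and the sample data is made up.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def cost_by_tag(resources, tag_name):
    """Sum cost per value of a tag; untagged resources are grouped together."""
    totals = defaultdict(float)
    for res in resources:
        totals[res["tags"].get(tag_name, "untagged")] += res["cost"]
    return dict(totals)


def cleanup_candidates(resources, now, min_age_days=None, owner=None):
    """Return names of resources matching the given owner and/or minimum age."""
    selected = []
    for res in resources:
        if owner is not None and res["tags"].get("owner") != owner:
            continue
        if min_age_days is not None:
            if now - res["created"] < timedelta(days=min_age_days):
                continue
        selected.append(res["name"])
    return selected


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    resources = [
        {"name": "vm-old", "cost": 300.0, "created": datetime(2020, 1, 1),
         "tags": {"owner": "alice", "team": "engineering"}},
        {"name": "vm-new", "cost": 50.0, "created": datetime(2020, 5, 25),
         "tags": {"owner": "alice", "team": "engineering"}},
        {"name": "sa-demo", "cost": 20.0, "created": datetime(2020, 2, 1),
         "tags": {"owner": "bob", "team": "customer-success"}},
    ]
    now = datetime(2020, 6, 1)
    print(cost_by_tag(resources, "team"))
    print(cleanup_candidates(resources, now, min_age_days=90))
    print(cleanup_candidates(resources, now, owner="alice"))
```

The functions only select and report; anything destructive is best kept as a separate, reviewed step.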



"In our SquaredUp dashboard we can now pull up a sunburst or treemap visualization of the costs of resources created by any particular user or team. We can also immediately see in the perspective who created the resource when drilling down anywhere in the SquaredUp dashboards, so we always know who to talk to about a cost issue."

 

Step 5: Cost tiles in every dashboard

Simon, Business Platform Manager:

"We started putting cost tiles into the dashboards we use to monitor each of our main services, so that the operational teams would start thinking about costs whenever they looked into the performance of their services."

 

Step 6: Wallboards

Simon, Business Platform Manager:

"We added a full cost dashboard into the wallboards of the main teams using Azure showing the trend in costs for their overall area and how it broke down. So wherever they sit in the office the teams can see prominently displayed not just their performance metrics but the costs that go with them and if these are trending up or down. People don’t like to see the trend going up, and feel much more empowered to investigate why when it does."

 

To recap, the 6 steps are:

  1. Identifying quick wins using Sunburst, Treemap and Top ‘N’ Line graph visualizations
  2. Drilling in – and turning things off
  3. VM power automation
  4. VM owner tagging
  5. Cost tiles in dashboards
  6. Wallboards everywhere

 

Where have these 6 steps gotten us?

In the period of time you see on the chart below, we’ve doubled the size of our engineering team and created new products and new releases, but managed to keep our costs really stable. The main reason is that the development team understands the impact of what they’re doing and can easily see it every day.

 

Q&A

Why wasn’t Azure Cost Management good enough to figure out what’s going on?

Adam: Scoping dashboards across multiple tenants is tricky in Azure Cost Management, especially when trying to get an idea of what a user or team is doing across two different tenants. We’re multi-tenant and we needed a one-stop shop for good dashboards that cover everything.

Find out more about the difference between Azure Cost Management and the SquaredUp Cost Tile here.

 

From the graphs, it looks like SquaredUp’s VM usage has dropped a lot. Any insights there?

Adam: Using this process we identified how the most expensive resources were being used, and there were a few zombies we could get rid of.

We also consolidated some low-usage machines, previously used by individual users, into shared team machines.

Comparing performance alongside cost helped us figure out whether we were making the best use of our resources. In some cases they were simply sized too large.

 

Can you share the Azure tagging Logic App?

Adam: Our Systems Engineer Shas will be writing it up soon, and it’ll be on our blog in the next couple of weeks.

Leave your email below to get notified when it's published.


For more tips on how to stay on top of Azure costs, take a look at Adam’s ‘10 tips for managing Azure costs’, based on his 10 years’ experience of administering Azure environments.

Find out more about how SquaredUp for Azure can help you get a grip on your Azure costs or download the 30-day free trial of SquaredUp for Azure to try it out yourself.