February 07, 2018
We're delighted to welcome back Martin Ehrnst, author of the popular 'How to get your colleagues engaged with SCOM' article, for another special guest blog.
Working at Intility, one of Norway's leading Managed Service Providers and managing a SCOM deployment covering well over 3000 on-prem servers, Martin's recently been wrestling with the challenges of also monitoring resources running in Microsoft Azure, a hybrid cloud scenario familiar to many.
With so many customers facing similar monitoring challenges in the hybrid cloud world and with the future monitoring landscape still looking unclear - frankly, not helped by a lot of churn from Microsoft in this area - we think it's really valuable that Martin's shared his own experiences so far and hopefully the blog provides some useful insights for your own journey ahead.
More broadly speaking, if you are struggling to understand the role of SCOM in hybrid cloud scenarios, or perhaps concerned about the speed of SCOM discovery when rapidly provisioning IaaS resources in Azure or AWS, you'll like our new resource - The Definitive Guide to Monitoring the Hybrid Cloud with Microsoft SCOM.
For now though, take it away Martin...
Welcome to the continuing saga on how to monitor your customers' Azure tenants as a service provider. Previously we have covered how to authenticate against Microsoft CSP, how to use the Azure Resource Health API with PowerShell, and more.
This post is all about connecting the dots. We are far from finished, but things are moving in this project and, at the time of writing, we have two separate projects going. The first one is focused on creating a single pane of glass for all our customers’ workflows. This involves custom coding and management pack development for SCOM. The second one, which this post will cover, is how we have designed each customer tenant and how we plan to use built-in Azure monitoring functionality.
Working for a service provider, we need to construct Azure tenants taking into account that we are going to manage cloud resources, so using many cloud features makes a lot of sense. The challenge is that we always have to think about how we can integrate with an existing deployment and work with monitoring solutions on premises.
When we first started this project we looked into what had been done before, and most of the examples we found either wouldn’t scale to our requirements or used OMS/Log Analytics only. We wanted to use our SCOM environment for alert handling, dashboards and platform health, as SCOM is already integrated with customer portals, CMDBs and more. We will discuss that further later in this blog post.
Things are moving very fast in Azure: we changed our initial customer tenant setup twice before we found a structure that we believe to be future proof. When a customer signs up for an Azure subscription, we populate their tenant with a default monitoring resource group and an OMS/Log Analytics (LA) workspace. Along with the default LA workspace we add the Azure Activity Log, Web Apps and Office 365 solutions as standard.
For “bread and butter” type Azure resources, such as compute and web apps, we set up the same type of monitoring regime we provide for on-premises resources, but we use alerts in Azure Monitor. This approach works well for Azure resources which do not have existing, custom Log Analytics solutions and searches to provide health state. This means that VMs deployed using our custom ARM template will also include Azure Monitor alerts such as “CPU Usage % above 95” and “Web app response time above X”. In conjunction with Azure Monitor we use Azure Resource Health, which provides health state data regardless of resource type, and custom alerts in Azure Monitor or Log Analytics.
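To make the idea of a threshold alert such as “CPU Usage % above 95” concrete, here is a minimal sketch of how such a rule could be evaluated. It is not how Azure Monitor is implemented and it is not our ARM template: the `ThresholdRule` class, the `is_firing` function, the metric name and the five-sample window are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class ThresholdRule:
    """A simplified metric alert rule (hypothetical shape, not an Azure schema)."""
    name: str
    metric: str
    threshold: float
    window: int  # number of most recent samples to average; assumed to be >= 1


def is_firing(rule: ThresholdRule, samples: Sequence[float]) -> bool:
    """Return True when the average of the last `window` samples exceeds the threshold."""
    recent = samples[-rule.window:]
    if not recent:
        return False  # no data at all: do not fire
    return mean(recent) > rule.threshold


# Illustrative rule; the metric name and window are assumptions for this sketch.
cpu_rule = ThresholdRule(
    name="CPU Usage % above 95",
    metric="Percentage CPU",
    threshold=95.0,
    window=5,
)

# The last five samples average above 95, so the rule fires.
print(is_firing(cpu_rule, [40, 60, 97, 98, 99, 96, 97]))
# The last five samples average well below 95, so it does not.
print(is_firing(cpu_rule, [97, 98, 40, 50, 60, 70, 80]))
```

Averaging over a window rather than reacting to a single sample is one common way to avoid alerting on short spikes; a real rule would take its window and aggregation from the alert definition.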
Below is a (not so detailed) illustration of our default tenant.
We use System Center Operations Manager (SCOM) as our main monitoring platform for operating systems and applications. As SCOM is already integrated with our ticketing system, CMDB and other internal tools, it seems reasonable to provide insight into applications and workloads running in Azure on the same monitoring platform. That way we can provide a single pane of glass for on-premises, hybrid and cloud-only workloads.
To get monitoring data into our on-prem SCOM we looked into two major options:
The official Azure Management Pack from Microsoft. With the official MP, the discovery process for adding new tenants cannot be automated: it relies on a GUI where you sign in to the tenant, and so on. Neither does it provide any “umbrella” functionality for companies enrolled in the CSP program.
Daniele Grandini’s Azure/OMS management pack. Daniele’s management packs provide insight into Log Analytics, Azure Backup and Automation, but rely on the official Microsoft MP for initial discovery. They focus on the solutions within the “Monitoring + management” (formerly known as OMS) space in Azure. Since many of the alerting features from OMS/Log Analytics are moving to Azure Monitor, I reached out to Daniele and asked if he had looked into creating a management pack for that. He had looked at it a little, but was also concerned about the rapid changes. Unfortunately, because this MP is bound to the initial discovery from the official Azure MP, it inherits the same limitation, and a service provider managing several hundred tenants (and growing) cannot have that. I hope to be able to help Daniele with the upcoming Azure Monitor MP.
Here’s where our problems started. I wanted to discover all our managed tenants automatically, so, taking advantage of being a CSP, we set out to create our own management pack(s). I have created one management pack for the CSP platform that integrates with the Partner Center API (see example in this blog post) to do the initial discovery. Tenants and subscriptions are populated as objects in SCOM. Further, using a Partner Center Managed Application we can pre-consent access to all managed tenants. That means we can use this application’s credentials to authenticate against each of our managed tenants, bypassing the limitation within the official management pack. All resources are then created as objects with a hosting relationship to resource group, subscription and tenant. Basic monitoring is done through the Azure Resource Health API.
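As a rough illustration of the structure described above (not the management pack itself, which I cannot share), the sketch below models a discovered resource with its hosting chain and translates a Resource Health availability state into a SCOM-style health state. The `Resource` class, the sample tenant and resource names, and the state mapping are all assumptions made for this example.

```python
from dataclasses import dataclass

# Availability states reported by Azure Resource Health, mapped to SCOM-style
# health states. The mapping itself is an assumption made for this sketch.
AVAILABILITY_TO_HEALTH = {
    "Available": "Healthy",
    "Degraded": "Warning",
    "Unavailable": "Critical",
    "Unknown": "Uninitialized",
}


@dataclass(frozen=True)
class Resource:
    """One discovered Azure resource and the objects that host it."""
    tenant: str
    subscription: str
    resource_group: str
    name: str
    availability_state: str

    @property
    def hosting_path(self) -> str:
        """Tenant > subscription > resource group > resource, mirroring the hosting chain."""
        return " > ".join(
            [self.tenant, self.subscription, self.resource_group, self.name]
        )

    @property
    def health(self) -> str:
        # Any unexpected state is treated as not monitored rather than healthy.
        return AVAILABILITY_TO_HEALTH.get(self.availability_state, "Uninitialized")


# Hypothetical sample data standing in for the result of a discovery run.
resources = [
    Resource("contoso", "sub-prod", "rg-web", "vm-web01", "Available"),
    Resource("contoso", "sub-prod", "rg-web", "app-shop", "Degraded"),
    Resource("fabrikam", "sub-main", "rg-data", "sql-01", "Unavailable"),
]

for r in resources:
    print(f"{r.hosting_path}: {r.health}")
```

Falling back to "Uninitialized" for anything unrecognized is a deliberate choice in this sketch: a state we cannot interpret should not be shown as healthy.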
Below is a diagram showing the structure of our CSP Management Pack:
Credentials used to authenticate against Partner Center and the Azure tenants are provided through SCOM RunAs accounts.
Our next step in SCOM and Azure integration is to create an Azure Monitor Management Pack that references the CSP Management Pack. This will provide the enriched monitoring offered by Azure Monitor. Due to many recent changes to the Azure Monitor platform I have decided to wait and see where we end up. At the time of writing Azure Monitor has two new alert features in preview and none of their APIs are officially documented. I will come back with examples when I have something tangible.
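Purely as an illustration of authenticating against each tenant with one application's credentials, the sketch below builds (but does not send) a token request per tenant. It assumes the OAuth 2.0 client credentials grant against the Azure AD v1 token endpoint, which may not be the exact flow our management pack uses. The `build_token_request` helper and the placeholder tenant IDs and credentials are hypothetical; in SCOM the client ID and secret would come from the RunAs account.

```python
from typing import Tuple
from urllib.parse import urlencode

# Resource identifier for Azure Resource Manager.
ARM_RESOURCE = "https://management.azure.com/"


def build_token_request(
    tenant_id: str, client_id: str, client_secret: str
) -> Tuple[str, bytes]:
    """Return (url, form_body) for a client credentials token request for one tenant.

    Nothing is sent here; the caller would POST the body to the URL.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "resource": ARM_RESOURCE,
        }
    ).encode("ascii")
    return url, body


# Placeholder values only; real values come from discovery and the RunAs account.
managed_tenants = ["tenant-a.example", "tenant-b.example"]

for tenant in managed_tenants:
    url, body = build_token_request(tenant, "my-app-id", "my-app-secret")
    print(url)
```

The point of the sketch is that only the tenant part of the URL changes between customers, so the same credentials can be reused in a loop over every discovered tenant.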
To provide effective monitoring as a service provider for customers which span on-prem and cloud environments, we recommend the following:
All service providers do their monitoring differently, but hopefully you have gotten some ideas on how you can do yours. Our solution is far from finished, but I feel we have a structure that is future proof (the modern type of future). Hopefully we can share the SCOM Management Packs later, but feel free to contact me about specifics. Just remember I cannot share the MP itself at this point in time.
Until further notice, this will be the closing post on how you can do Azure Monitoring as a service provider.
Martin works as a Systems Engineer for Intility, one of Norway's leading enterprise cloud providers, and has extensive experience with System Center, Azure and Windows Server products.