Azure monitoring, connecting the dots
Welcome to the continuing saga on how to monitor your customers’ Azure tenants as a service provider. Previously we covered how to authenticate against Microsoft CSP, how to use the Azure Resource Health API with PowerShell, and more.
This post is all about connecting the dots. We are far from finished, but things are moving in this project and, at the time of writing, we have two separate projects going.
The first one is focused on creating a single pane of glass for all our customers’ workflows. This involves custom coding and management pack development for SCOM. The second one, which this post will cover, is how we have designed each customer tenant and how we plan to use built-in Azure monitoring functionality.
Customer tenant setup
Working for a service provider, we need to design Azure tenants knowing that we are going to manage cloud resources, so using many cloud features makes a lot of sense. The challenge is that we always have to think about how to integrate with an existing deployment and work with on-premises monitoring solutions.
When we first started this project we looked into what had been done before, and most of the examples we found either would not scale to our requirements or used OMS/Log Analytics only. We wanted to use our SCOM environment for alert handling, dashboards and platform health, as SCOM is already integrated with customer portals, CMDBs and more. We will discuss that in more detail later in this blog post.
Things are moving very fast in Azure. We changed our initial customer tenant setup twice before we found a structure we believe is future friendly.
When a customer signs up for an Azure subscription, we populate their tenant with a default monitoring resource group and an OMS/Log Analytics (LA) workspace. Along with the default LA workspace we add the Azure Activity Log, Web Apps and Office 365 solutions as standard.
For “bread and butter” Azure resources, such as compute and web apps, we set up the same type of monitoring regime we provide for on-premises resources, but we use alerts in Azure Monitor. This approach works well for Azure resources that do not have existing custom Log Analytics solutions and searches to provide health state. This means that VMs deployed using our custom ARM template also include Monitor alerts such as “CPU Usage % above 95” and “Web app response time above x”. In conjunction with Azure Monitor we use Azure Resource Health, which provides health state data regardless of resource type and of any custom alerts in Monitor or Log Analytics.
Below is a (not so detailed) illustration of our default tenant.
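To make the threshold idea concrete, here is a minimal Python sketch of how an alert like “CPU Usage % above 95” could be evaluated over a window of samples. It does not call Azure Monitor. `MetricAlertRule` and `is_fired` are hypothetical names, and averaging over the window is an assumption, since the post does not say which aggregation our alerts use.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class MetricAlertRule:
    """Hypothetical stand-in for a metric alert; not an Azure SDK type."""
    name: str
    metric: str
    threshold: float


def is_fired(rule: MetricAlertRule, samples: Sequence[float]) -> bool:
    """Fire when the average of the samples in the window exceeds the threshold."""
    if not samples:
        return False  # no data in the window: do not alert
    return mean(samples) > rule.threshold


# Assumed rule definition, modelled on the alert name used in the post.
cpu_rule = MetricAlertRule(
    name="CPU Usage % above 95",
    metric="Percentage CPU",
    threshold=95.0,
)

print(is_fired(cpu_rule, [97.0, 99.0, 96.5]))  # True: the average is 97.5
print(is_fired(cpu_rule, [40.0, 99.0, 50.0]))  # False: the average is 63.0
```

Averaging over a window means a single spike does not raise an alert, which is usually what you want for CPU.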
SCOM and Azure Integration
We use System Center Operations Manager (SCOM) as our main monitoring platform for operating systems and applications. As SCOM is already integrated with our ticketing system, CMDB and other internal tools, it seems reasonable to provide insight into applications and workloads running in Azure on the same monitoring platform. That way we can ideally provide a single pane of glass into on-premises, hybrid and cloud-only workloads.
Azure Management Packs
To get monitoring data into our on-prem SCOM we looked into two major options.
Option #1:
The official Azure management pack from Microsoft. The official MP’s discovery process for adding new tenants cannot be automated. It relies on a GUI where you sign in to the tenant, and it does not provide any “umbrella” functionality for companies enrolled in the CSP program.
Option #2:
Daniele Grandini’s Azure/OMS management pack. Daniele’s management packs provide insight into Log Analytics, Azure Backup and Automation, but rely on the official Microsoft MP for initial discovery. They focus on the solutions within the “Monitoring + management” (formerly known as OMS) space in Azure. Since many of the alerting features from OMS/Log Analytics are moving to Azure Monitor, I reached out to Daniele and asked whether he had looked into creating a management pack for that. He had looked into it a little, but was also concerned about the rapid changes. Unfortunately this MP is bound to the initial discovery from the official Azure MP, and a service provider managing several hundred tenants (and growing) cannot have that limitation. I hope to be able to help Daniele with the upcoming Azure Monitor MP.
Here’s where our problems started. I wanted to discover all our managed tenants automatically. Taking advantage of being a CSP, we set out to create our own management pack(s). I have created one management pack for the CSP platform that integrates with the Partner Center API (see example in this blog post) to do the initial discovery. Tenants and subscriptions are populated as objects in SCOM. Further, using a Partner Center Managed Application we can pre-consent access to all managed tenants. That means we can use this application’s credentials to authenticate against each of our managed tenants, bypassing the limitation in the official management pack. All resources are then created as objects with a hosting relationship to resource group, subscription and tenant. Basic monitoring is done through the Azure Resource Health API.
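As an illustration of the basic monitoring step, the sketch below translates a Resource Health availability status into a SCOM health state. It is a Python sketch rather than the management pack’s real code. The `availabilityState` values are the ones Resource Health reports, but the mapping itself and the name `scom_health_state` are my assumptions for this example.

```python
# Assumed mapping from Resource Health availability states to SCOM health states.
# The post does not spell out how the management pack translates them.
AVAILABILITY_TO_SCOM = {
    "Available": "Success",
    "Degraded": "Warning",
    "Unavailable": "Error",
}


def scom_health_state(status: dict) -> str:
    """Return a SCOM health state for one availability status document.

    `status` is expected to look like the JSON returned for a resource's
    current availability status, with the state under properties.availabilityState.
    Anything missing or unrecognised ("Unknown" included) maps to "Uninitialized".
    """
    state = status.get("properties", {}).get("availabilityState", "Unknown")
    return AVAILABILITY_TO_SCOM.get(state, "Uninitialized")


# Hand-written sample payload, trimmed to the one field the sketch reads.
sample = {"properties": {"availabilityState": "Degraded"}}
print(scom_health_state(sample))  # Warning
```

Treating “Unknown” as not monitored instead of unhealthy avoids raising alerts when the platform simply has no recent signal for a resource.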
Below is a diagram showing the structure of our CSP management pack.
Credentials used to authenticate against Partner Center and the Azure tenants are provided through SCOM RunAs accounts.
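As a rough companion to the diagram, the hosting chain (tenant, subscription, resource group, resource) can be derived from an Azure resource ID. This is a minimal Python sketch, not the management pack’s actual discovery code. `HostedResource` and `parse_resource_id` are hypothetical names, the IDs in the example are placeholders, and only top-level resource IDs are handled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HostedResource:
    """Hypothetical model of one discovered resource and its hosting parents."""
    tenant_id: str
    subscription_id: str
    resource_group: str
    resource_type: str
    name: str

    def hosting_chain(self) -> List[str]:
        """Outermost host first: tenant, subscription, resource group, resource."""
        return [self.tenant_id, self.subscription_id, self.resource_group, self.name]


def parse_resource_id(tenant_id: str, resource_id: str) -> HostedResource:
    """Split a top-level ARM resource ID into its hosting levels.

    Expected shape:
    /subscriptions/<sub>/resourceGroups/<rg>/providers/<namespace>/<type>/<name>
    Nested (child) resources have more segments and are rejected here.
    """
    parts = resource_id.strip("/").split("/")
    if (
        len(parts) != 8
        or parts[0].lower() != "subscriptions"
        or parts[2].lower() != "resourcegroups"
        or parts[4].lower() != "providers"
    ):
        raise ValueError(f"Not a top-level resource ID: {resource_id}")
    return HostedResource(
        tenant_id=tenant_id,
        subscription_id=parts[1],
        resource_group=parts[3],
        resource_type=f"{parts[5]}/{parts[6]}",
        name=parts[7],
    )


# Placeholder tenant and subscription values, for illustration only.
vm = parse_resource_id(
    "contoso.onmicrosoft.com",
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/rg-monitoring/providers/Microsoft.Compute/virtualMachines/vm01",
)
print(vm.hosting_chain())
```

The tenant is passed in separately because a resource ID does not carry it; in our case it comes from the Partner Center discovery.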
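To show what “authenticate against each tenant” can look like, here is a hedged Python sketch that only builds the token requests, one per tenant, without sending anything. It assumes the Azure AD client credentials grant against the v1 token endpoint with Azure Resource Manager as the resource. Whether that exact grant matches a given pre-consented application is an assumption, and `build_token_request` and the tenant names are hypothetical. In the management pack the client ID and secret would come from the RunAs account, not from literals.

```python
from typing import Tuple
from urllib.parse import urlencode

ARM_RESOURCE = "https://management.azure.com/"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Tuple[str, str]:
    """Return (url, form_body) for a client credentials token request.

    Nothing is sent; the caller would POST the body to the URL with
    Content-Type application/x-www-form-urlencoded.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "resource": ARM_RESOURCE,
        }
    )
    return url, body


# Placeholder tenants and credentials, for illustration only.
managed_tenants = ["contoso.onmicrosoft.com", "fabrikam.onmicrosoft.com"]
requests_by_tenant = {
    tenant: build_token_request(tenant, "app-id-placeholder", "secret-placeholder")
    for tenant in managed_tenants
}

for tenant, (url, _body) in requests_by_tenant.items():
    print(tenant, "->", url)
```

The same application credentials are reused for every tenant; only the tenant segment of the URL changes, which is what makes automated discovery across hundreds of tenants practical.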
Our next step in SCOM and Azure integration is to create an Azure Monitor management pack that references the CSP management pack. This will add the richer monitoring provided by Azure Monitor. Due to many recent changes to the Monitor platform I have decided to wait and see where we end up. At the time of writing Azure Monitor has two new alert features in preview and none of their APIs are officially documented. I will come back with examples when I have something tangible.
Summary
To provide effective monitoring as a service provider for customers who span on-prem and cloud environments, we recommend the following:
- For “bread and butter” monitoring, use a combination of SCOM and Azure Monitor
- If you are in the CSP program, create a management pack using the CSP REST APIs (hopefully I can share our MP later) combined with a custom Azure Monitor MP
- Not a CSP? Look into a combination of the official MP and Daniele’s management packs.
- Deploy Log Analytics as default to all tenants. This will give you an advantage when customers require custom solutions and log sources.
Wrapping up
All service providers do their monitoring differently, but hopefully you have gotten some ideas on how you can do yours. Our solution is far from finished, but I feel we have a structure that is future proof (the modern type of future). Hopefully we can share the SCOM management packs later, but feel free to contact me about specifics. Just remember I cannot share the MP itself at this point in time.
Until further notice, this will be the closing post on how you can do Azure Monitoring as a service provider.
Big thanks to Kevin Green and Cameron Fuller for providing feedback and for reaching out to other community friends on my behalf.