Q: How do I get my teammates engaged in SCOM?
A: Disable Email alerting
A: Let me explain what we did.
I had an interesting conversation the other day about how to get people more involved with SCOM. After going through what we have done over the years, I figured it’s a topic worth sharing. Even if Operations Manager is replaced by OMS or another product in the future, it still plays a significant role and will continue to do so for many more years.
In this post I will go through some of the steps I think have been game changers for the involvement of application and service owners whose environments are monitored with SCOM. I won’t reproduce every step we took, as that would be too environment-specific, but I will explain how and why I think you should look into these topics.
Season SCOM with data from external sources.
Using data from your CMDB as additional properties on servers (and other objects) in SCOM makes a huge difference. I manage a large environment that monitors many different customers, each with many services and applications running. It helps to know which customer a server belongs to, what kind of backup it runs, which patch regime it follows, and so on. This is data you typically find in a CMDB, and by default SCOM is totally unaware of it.
We use an in-house developed CMDB system that is fully detached from any SCOM environment, but it has an API. By creating a management pack that extends the Windows Computer class, we now have the following extended properties on all servers monitored by SCOM:
- Customer name
- Type (Physical, virtual)
- Environment (Test, dev, production)
- Services (Applications running)
All of these properties can be used for almost anything, such as group creation.
If you haven’t done this already, I strongly recommend connecting SCOM to a CMDB.
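As a rough illustration of the idea, the sketch below maps CMDB records to the four extended properties listed above and groups servers by one of them. It is not our management pack: the record fields, property names and function names are all assumptions made up for the example, and fetching the records from a CMDB API is left out.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CmdbRecord:
    """One server as a hypothetical CMDB API might return it."""
    hostname: str
    customer: str
    server_type: str      # e.g. "Physical" or "Virtual"
    environment: str      # e.g. "Test", "Dev" or "Production"
    services: tuple[str, ...]


def to_extended_properties(record: CmdbRecord) -> dict[str, str]:
    """Flatten a CMDB record into string properties (names are illustrative)."""
    return {
        "CustomerName": record.customer,
        "Type": record.server_type,
        "Environment": record.environment,
        "Services": ", ".join(record.services),
    }


def build_property_map(records: list[CmdbRecord]) -> dict[str, dict[str, str]]:
    """Key the properties by lower-cased hostname so lookups are case-insensitive."""
    return {r.hostname.lower(): to_extended_properties(r) for r in records}


def group_by(property_map: dict[str, dict[str, str]], key: str) -> dict[str, list[str]]:
    """Group hostnames by the value of one property, e.g. 'Environment'."""
    groups: dict[str, list[str]] = {}
    for host, props in property_map.items():
        groups.setdefault(props[key], []).append(host)
    return {value: sorted(hosts) for value, hosts in groups.items()}


if __name__ == "__main__":
    records = [
        CmdbRecord("SRV01", "Contoso", "Virtual", "Production", ("IIS", "SQL")),
        CmdbRecord("srv02", "Fabrikam", "Physical", "Test", ("SQL",)),
    ]
    print(group_by(build_property_map(records), "Environment"))
```

Grouping by a property is the same kind of lookup a dynamic group in SCOM would do once the properties exist on the server objects.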
Replace the existing console.
The only people who need the SCOM console are the SCOM admins. For everyone else, the only reasonable solution is to invest in a web-based system, whether third-party or developed in-house. There are a few commercial products out there, like SquaredUp and Savision, and I encourage you to check them out. Below are two screenshots showing the difference between SquaredUp and the local console, which should be reason enough to invest in this.
SCOM Object state dashboard (who uses this?)
SquaredUp default installation showing a windows server object.
Tasks are a hidden gem. SCOM has an agent running on “all” servers in your environment, and this agent can run scripts for you at the click of a button. We have developed a management pack with a few tasks that were requested by my colleagues.
You spend less time logging on to these servers and get the output directly.
Below is the output from the task showing free disk space. It is a simple PowerShell script packaged as a task targeting Windows computers.
A few examples of other tasks:
- Add or remove management group
- List local administrators
- Top x memory-consuming processes
- Restart agent
- Start Windows service
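To give a feel for how small such a task script can be, here is a sketch of a free disk space report. Our task is a PowerShell script that is not reproduced here; this is a Python stand-in for the same idea, and the function names and output format are assumptions chosen for the example.

```python
import shutil


def free_space_line(path, total_bytes, free_bytes):
    """Format one line of the report; sizes are shown in GiB."""
    gib = 1024 ** 3
    percent = 100 * free_bytes / total_bytes if total_bytes else 0.0
    return (
        f"{path} {free_bytes / gib:.1f} GiB free of "
        f"{total_bytes / gib:.1f} GiB ({percent:.1f}%)"
    )


def disk_report(paths):
    """Return one report line per path, using the standard library only."""
    lines = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        lines.append(free_space_line(path, usage.total, usage.free))
    return lines


if __name__ == "__main__":
    # "." is just the current drive or mount point; a real task would
    # enumerate the volumes on the target server instead.
    for line in disk_report(["."]):
        print(line)
```

Whatever the script prints is what the operator sees as task output, so a few plain, readable lines per volume are usually enough.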
Alert to ticket creation.
If you’re not using SCSM, you probably don’t have a good connection to your ticketing system, or any connection at all. You can send an email directly, but chances are that it won’t work very well. Let’s say you have an alert storm and you are sending alerts through an SMTP channel to your ticketing system. You will probably end up with 100 tickets that have no connection at all to the actual alerts. Maybe you get two tickets for each alert as well, one for resolution state New and the other for Closed? That’s 200 tickets, or rather 198, because two business-critical alerts are still unresolved, but you don’t see them.
With SquaredUp we created a function for ‘on demand’ ticket creation from alerts, directly from the console, using its built-in functionality and an external script.
In an alert storm the operations team can quickly look at their dashboards and see which alerts are still present, not the ones that are already resolved. Creating tickets for these alerts makes sense, as they will have to be looked into further. Below is a diagram showing how we set this up.
Alongside this flow, we update each ticket with a new message when the alert is closed.
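The decision logic behind this flow can be sketched in a few lines. This is not our external script: the alert dictionaries, the ticket lookup and the function names are hypothetical. The only SCOM-specific fact used is that the built-in resolution states are 0 for New and 255 for Closed.

```python
# SCOM's built-in resolution states: 0 is New, 255 is Closed.
RESOLUTION_NEW = 0
RESOLUTION_CLOSED = 255


def alerts_needing_tickets(alerts, ticket_by_alert_id):
    """Return alerts that are still open and do not have a ticket yet."""
    return [
        alert
        for alert in alerts
        if alert["resolution_state"] != RESOLUTION_CLOSED
        and alert["id"] not in ticket_by_alert_id
    ]


def closure_updates(alerts, ticket_by_alert_id):
    """Map ticket ID to an update message for closed alerts that have a ticket."""
    return {
        ticket_by_alert_id[alert["id"]]: f"Alert '{alert['name']}' was closed in SCOM."
        for alert in alerts
        if alert["resolution_state"] == RESOLUTION_CLOSED
        and alert["id"] in ticket_by_alert_id
    }


if __name__ == "__main__":
    # Hypothetical sample data: four alerts, two of which already have tickets.
    alerts = [
        {"id": "a1", "name": "Disk full", "resolution_state": RESOLUTION_NEW},
        {"id": "a2", "name": "Service stopped", "resolution_state": RESOLUTION_CLOSED},
        {"id": "a3", "name": "High CPU", "resolution_state": RESOLUTION_NEW},
        {"id": "a4", "name": "Heartbeat failure", "resolution_state": RESOLUTION_CLOSED},
    ]
    tickets = {"a3": "T-100", "a4": "T-101"}
    print(alerts_needing_tickets(alerts, tickets))
    print(closure_updates(alerts, tickets))
```

Keeping a lookup from alert ID to ticket ID is what preserves the connection between ticket and alert that the plain SMTP approach loses.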
Support different alert platforms.
What I mean is that you should try to integrate SCOM so that alerts can be consumed on other platforms. I have blogged earlier on how to post messages to Microsoft Teams and Mattermost. This can also be done with Slack. If you don’t use any of these collaboration tools, consult your colleagues; they probably have some great ideas!
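As a minimal sketch of what such an integration can look like, the example below turns an alert into a chat message and posts it to an incoming webhook. It assumes an endpoint that accepts a JSON body with a single text field, which is the simplest format Slack and Mattermost incoming webhooks take; other platforms may expect a different shape. The alert fields, the function names and the webhook URL are placeholders, not taken from my earlier posts.

```python
import json
import urllib.request


def build_chat_message(alert):
    """Build a simple JSON-serialisable payload with a single text field."""
    text = f"[{alert['severity']}] {alert['name']} on {alert['server']}"
    return {"text": text}


def post_to_webhook(webhook_url, payload):
    """POST the payload as JSON and return the HTTP status code."""
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical alert; the fields are assumptions for the example.
    alert = {"severity": "Critical", "name": "Disk full", "server": "srv01"}
    print(json.dumps(build_chat_message(alert)))
    # To actually send it, pass your own webhook URL:
    # post_to_webhook("https://example.invalid/hooks/your-hook-id", build_chat_message(alert))
```

Keeping the message builder separate from the HTTP call makes it easy to add a second platform later by swapping only the payload format.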
Stop being personally involved with SCOM alerts.
As a SCOM administrator, how many times have you found yourself investigating an alert outside your field without notifying anyone else? Probably too many. You’re not going to solve all the alerts, and there is a reason the application is being monitored in the first place: someone wanted it. Sit down with your team and figure out a solution together.
Big Data and Events
Splunk, OMS, Elasticsearch: it doesn’t matter which. If you manage to tie your existing SCOM environment to event and big data systems, you will be amazed. Again, the built-in OMS, Event log and Web API plugins in SquaredUp can be used for this. A few examples:
- Display SQL recommendations from the OMS SQL Assessment on all servers running SQL.
- Show change tracking events on the alert page.
- Show missing security updates on the Windows server perspective.
Disable email alerts
It may be a bold statement, but if you manage to implement a few of the things I have listed and maintain good tuning and MP implementation procedures, chances are that you can start to disable alerting by email, or at least get your colleagues more involved once they have the chance to properly use all the data and possibilities that a SCOM installation in your environment offers.