Maintenance Mode based on Event
Automating Maintenance Mode in OpsMgr is probably one of the most blogged topics, but every environment has different needs and different solutions to this problem. This is just another one.
I suggest you gather your information and browse the internet to find the solution that best fits your needs.
Earlier, I posted a solution that puts a SCOM group into maintenance mode using PowerShell and Windows Task Scheduler. That sorted out a problem with computers that were scheduled to reboot once per day. This post is about putting computer objects into maintenance mode based on a Windows event ID. The problem we are trying to solve is people forgetting to enable MM before a reboot, or Windows Update automatically rebooting computers at some point during its patch window. Both scenarios write an event to the System event log, and since we’re working with events, this solution can be adapted to trigger MM on any event ID you like.
What we need to do:
- Create a new rule to look for the events we want (1074 in this case)
- Create the Powershell script to actually do the work
- Create a command channel to call the script
- Create a command channel subscription and subscriber
Creating the Alert Rule
There are several events that you could use to trigger this rule. In this post I will be using event ID 1074 for user-initiated reboots and event ID 22 for Windows Update.
I assume you already know how to create a rule and a new management pack, so we will skip that part. The rule configuration will look like this:
The highlighted areas are the differences between the two groups. Parameter 1 contains Explorer.EXE when a user clicks Start > Restart and so on. Event ID 22 is logged when Windows Update has finished installing new updates. After this event is logged, the computer will reboot within 15 minutes (default). If you want to enable MM for updates at an earlier stage, I suggest looking for events logged when patches are downloaded, installed or similar.
You will find the parameter number in the XML of the event. Once you have the formula in place, complete your rule with alert message, severity, etc. For reference, this is what my formula looks like:
( ( ( Event Source Equals Microsoft-Windows-WindowsUpdateClient ) AND ( Event ID Equals 22 ) ) OR ( ( Parameter 5 Equals restart ) AND ( Parameter 1 Equals Explorer.EXE ) AND ( Event Source Equals User32 ) AND ( Event ID Equals 1074 ) ) )
If you don’t want to trigger maintenance mode for servers that are being shut down, you can use Parameter 5, which holds the shutdown type. Putting “restart” in as the parameter value will only trigger the rule when a computer is rebooted, not when it is shut down.
Using event parameters, rather than a wildcard search through the entire event description, reduces the performance impact on the agent computer. If you want to know why, check Kevin Holman’s post about event detection.
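To see which parameter number holds which value before writing the rule criteria, the event can be dumped on a test computer. This is only a minimal sketch: it assumes that OpsMgr's "Parameter N" corresponds to the Nth data value of the event (index N-1 in the Properties collection), and the variable names are illustrative.

```powershell
# Fetch the most recent reboot/shutdown event (ID 1074, source User32) from the System log.
$rebootEvent = Get-WinEvent -FilterHashtable @{ LogName = 'System'; ProviderName = 'User32'; Id = 1074 } -MaxEvents 1 -ErrorAction SilentlyContinue

if ($null -eq $rebootEvent) {
    Write-Warning "No 1074 event found in the System log."
}
else {
    # Print each event data value with a 1-based number.
    $number = 1
    foreach ($property in $rebootEvent.Properties) {
        "Parameter {0}: {1}" -f $number, $property.Value
        $number++
    }
}
```

If the assumption holds, the process name should show up as Parameter 1 and the shutdown type as Parameter 5, matching the formula above.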
Creating the Powershell script
During my research on maintenance mode automation, I came across many PowerShell scripts. Some of them were written for OpsMgr 2007, which means they won’t work ‘out of the box’. One of my resources was operatingquadrant, which published something similar back in 2009.
This is how my script looks now. It is not the most complex script out there, but it does its job for now.
param($sHost) # Computer to put in MM. Passed from the command channel: $Data/Context/DataItem/ManagedEntityDisplayName$

$ServerName = "ScomMS.fqdn" # Your desired management server
Import-Module OperationsManager # Load the OpsMgr PowerShell module
New-SCOMManagementGroupConnection -ComputerName $ServerName # Connect to the management group

$Time = (Get-Date).AddMinutes(20) # Maintenance mode ends 20 minutes from now
$class = Get-SCOMClass -Name "Microsoft.Windows.Computer" # Query the class
$computerObj = $class | Get-SCOMMonitoringObject | Where-Object { $_.Name -like "$sHost*" } # Find the computer object(s) of that class

#====== Event log config ======#
$logname = "Application"
$logsrc = "Maintenance Mode Script"
$eventid = "1010"
$eventlvl = "Information"

#==== Run this once to register the new event source ====#
#New-EventLog -LogName $logname -Source $logsrc

# Write the event, then put the object in maintenance mode
Write-EventLog -LogName $logname -Source $logsrc -EntryType $eventlvl -EventId $eventid -Message "The following objects were put into MM: $computerObj"
Start-SCOMMaintenanceMode -Instance $computerObj -EndTime $Time -Reason PlannedOperatingSystemReconfiguration -Comment "MM started automatically."
This script also writes an event to the management server’s Application log each time it runs. To be able to write the events, you will have to register a new event source. If you don’t want to write these events, just comment out the whole event log part.
To register a new source run the following line in the script:
New-EventLog -LogName $logname -Source $logsrc
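Registering a source that already exists raises an error, so the registration can be made safe to re-run. A minimal sketch, reusing the `$logname` and `$logsrc` values from the script; the existence check is my addition and not part of the original script.

```powershell
$logname = "Application"
$logsrc  = "Maintenance Mode Script"

# SourceExists searches all event logs, so run this from an elevated prompt.
if (-not [System.Diagnostics.EventLog]::SourceExists($logsrc)) {
    # Register the source once; later runs skip this branch.
    New-EventLog -LogName $logname -Source $logsrc
}
```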
Without the event log part, you should have a script like this:
param($sHost) # Computer to put in MM. Passed from the command channel: $Data/Context/DataItem/ManagedEntityDisplayName$

$ServerName = "ScomMS.fqdn" # Your desired management server
Import-Module OperationsManager # Load the OpsMgr PowerShell module
New-SCOMManagementGroupConnection -ComputerName $ServerName # Connect to the management group

$Time = (Get-Date).AddMinutes(15) # Maintenance mode ends 15 minutes from now
$class = Get-SCOMClass -Name "Microsoft.Windows.Computer" # Query the class
$computerObj = $class | Get-SCOMMonitoringObject | Where-Object { $_.Name -like "$sHost*" } # Find the computer object(s) of that class

# Put the object in maintenance mode
Start-SCOMMaintenanceMode -Instance $computerObj -EndTime $Time -Reason PlannedOperatingSystemReconfiguration -Comment "MM started automatically."
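The `-like "$sHost*"` filter can return nothing (for example an unmanaged computer) or more than one object (srv1 also matches srv10). A guard like the following could sit just before Start-SCOMMaintenanceMode. It is only a sketch: it assumes `$sHost` and `$computerObj` are set as in the script, and the `$found` variable, the warning texts and the exit code are my own choices.

```powershell
# Normalize the lookup result to an array and drop any $null entry,
# so .Count is reliable for zero, one or many matches.
$found = @($computerObj | Where-Object { $null -ne $_ })

if ($found.Count -eq 0) {
    Write-Warning "No Windows computer object matches '$sHost'; nothing was put in MM."
    exit 1
}

if ($found.Count -gt 1) {
    # Not fatal, but worth knowing: every match will be put in maintenance mode.
    Write-Warning "'$sHost' matches $($found.Count) computers: $($found.DisplayName -join ', ')"
}
```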
Command Channel
Operations Manager is perfectly capable of triggering a command or script when an alert is raised. I will show how to set up a command channel that triggers the PowerShell script above. Be aware of the command channel’s limit on asynchronous processes (see the footnote).
Our “AutoMaintenanceMode” command channel is set up as in this picture. The one thing you need to pay attention to is how we pass the computer name to our PowerShell script. In the command line parameters, put this after your script path: $Data/Context/DataItem/ManagedEntityDisplayName$
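Before relying on the command channel, the script can be run by hand on the management server with the same single argument the channel would pass. A minimal sketch; the script path and computer name are hypothetical, and the management server name is the placeholder used in the script.

```powershell
# Hypothetical values: adjust the script path and computer name to your environment.
$scriptPath = 'C:\Scripts\MaintenanceMode.ps1'
$computer   = 'server01.contoso.local'

# Call the script with the computer name as its first positional argument,
# the same way the command channel passes ManagedEntityDisplayName.
& $scriptPath $computer

# Confirm the result: list the matching computer objects and their MM state.
Import-Module OperationsManager
New-SCOMManagementGroupConnection -ComputerName 'ScomMS.fqdn'
Get-SCOMClass -Name 'Microsoft.Windows.Computer' |
    Get-SCOMClassInstance |
    Where-Object { $_.Name -like "$computer*" } |
    Select-Object DisplayName, InMaintenanceMode
```

If the script works, InMaintenanceMode should read True for the computer shortly after the call.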
Next is to set up our subscriber and subscription.
Setting up a subscriber and a subscription for that subscriber is pretty straightforward. The first thing we need is a Maintenance Mode subscriber. I suggest you name it so it’s easy to see that this is an automated process.
In the channel tab, choose command and then select your command channel.
Complete the wizard, and continue to set up a subscription. Again, a descriptive name makes the subscription easy to understand.
Next, set up your criteria. Here’s what we use. The only one you actually need is “created by specific rule […]”
For the subscriber and channel, you simply select the ones you created earlier. The summary page will look something like this:
Name
Auto: Put Computers in MM when booted
Description
Criteria
Notify on all alerts where
created by Reboot initiated rules or monitors (e.g., sources)
and of a Information severity
and of a Low priority
Subscribers
Maintenance Mode
Channels
AutoMaintenanceMode
That’s all there is to it! If everything is set up correctly, the next time one of your colleagues reboots a computer, SCOM will automatically place that object into Maintenance Mode.
And as always: test before implementation.
Footnote
When you use a command channel for this operation, you are limited by the “maximum number of asynchronous processes”, which defaults to 5. When the limit is hit, you will get an alert that mentions “script dropped” or “Operations Manager failed to start a process due to lack of resources”.
What is actually happening is that when the limit of 5 is reached, OpsMgr protects itself from starvation in case of an alert storm or similar. The limit can be raised, but I suggest waiting to see whether you actually have this problem.
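If you do need to raise it, the limit is commonly described as a registry value named AsyncProcessLimit on the management server. The sketch below rests on that assumption: the key path (including the product's own "Executer" spelling) and the example value of 10 are not from this post, so verify them against Microsoft's documentation before changing anything.

```powershell
# Assumed location of the setting; verify before use.
$key = 'HKLM:\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Modules\Global\Command Executer'

# Create the key if it is missing, then set the limit (10 is only an example value).
if (-not (Test-Path $key)) {
    New-Item -Path $key -Force | Out-Null
}
New-ItemProperty -Path $key -Name 'AsyncProcessLimit' -PropertyType DWord -Value 10 -Force | Out-Null

# Read the value back to confirm what was written.
(Get-ItemProperty -Path $key -Name 'AsyncProcessLimit').AsyncProcessLimit
```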
Another way is to drop the command channel entirely and use SMA, Orchestrator or similar to catch the alert and run the PowerShell script.