We’re pleased to let you know that our second-favourite Belgian “van Damme” has returned for another guest blog, which means that if he keeps this up he’ll soon dislodge Jean-Claude from the coveted number one spot (Universal Soldier is an absolute film).
Jasper’s latest piece is on tuning SCOM alerts and is sufficiently juicy that it needs to be split into two parts. As always, we encourage you to check out the original content on Jasper’s blog but, with his blessing, we’ve also provided a copy for you below. Enjoy!
You can also read part two of this Tuning Alerts series.
By Jasper Van Damme
One of the obstacles when deploying SCOM for the first time is getting a handle on the number of alerts. In my opinion, this is one of the reasons why SCOM sometimes has a bad reputation.
Luckily, there are a few things you can do to relieve you of some of the ‘alert burden’ 😊. This post is part one of hopefully many to get your alerts under control.
The first piece of advice I can give you is to set specific SCOM-related alerts to informational.
Some alerts include:
Whilst the alerts are not completely unimportant, they are often categorized as ‘Warning’ or ‘Critical’ alerts, making them seem like a bigger issue than they actually are.
Once you have a few management packs imported, you will see these alerts recurring a lot, sometimes making up as much as 40% of the alert count. That is a lot of alerts that are only related to SCOM itself!
The cause of these alerts is usually a temporary issue such as a backup, and if they do not recur, they are not worth any attention. In addition, troubleshooting these alerts requires good knowledge of SCOM, as you may need to analyze how the rule or monitor is retrieving its data.
Finally, the truly critical agent alerts, such as Heartbeat failures and Failed to connect to computer, are generated by monitors, which are not affected by these overrides.
In other words, for most operators, these alerts do not offer a lot of value.
By setting these alerts as informational, you can then filter them from the Active Alerts view by only showing the Critical / Warning alerts.
If you still want to view these alerts, you can go to the Operations Manager folder. I would then focus on alerts that have a high repeat count, as this may indicate an issue with WMI or other resources.
With this approach you can still see which servers are having a lot of SCOM issues, which you would lose if you disabled the rule completely.
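The filtering idea above can be sketched in a few lines of Python. This is only an illustration, assuming you have exported your alerts to a simple list of records. The `Alert` class, its field names, the sample alert names and the `min_repeats` threshold are all hypothetical and are not part of SCOM.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    # Hypothetical record for an exported alert; not a SCOM type.
    name: str
    server: str
    severity: str  # "Information", "Warning" or "Critical"
    repeat_count: int


def active_view(alerts):
    """Keep only what an operator would see when filtering on Critical / Warning."""
    return [a for a in alerts if a.severity in ("Warning", "Critical")]


def noisy_informational(alerts, min_repeats=10):
    """Informational alerts with a high repeat count, highest first."""
    noisy = [
        a for a in alerts
        if a.severity == "Information" and a.repeat_count >= min_repeats
    ]
    return sorted(noisy, key=lambda a: a.repeat_count, reverse=True)


# Made-up sample data, for illustration only.
alerts = [
    Alert("Heartbeat failure", "SRV01", "Critical", 1),
    Alert("Hypothetical SCOM rule alert A", "SRV02", "Information", 42),
    Alert("Hypothetical SCOM rule alert B", "SRV03", "Information", 3),
    Alert("Hypothetical disk space alert", "SRV02", "Warning", 2),
    Alert("Hypothetical SCOM rule alert A", "SRV04", "Information", 15),
]

for a in active_view(alerts):
    print("ACTIVE:", a.severity, a.server, a.name)

for a in noisy_informational(alerts):
    print("CHECK:", a.server, a.name, "repeats:", a.repeat_count)
```

The informational alerts stay in the data, so the servers with the most repeats can still be picked out, which is the advantage over disabling the rule.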
To create overrides for this, simply go to the authoring pane in the SCOM console, and scope to Health Service.
I would then recommend changing the severity of these alert rules:
Right click the rule you’d like to change the severity for:
Change the severity to 0 (informational) and store the override in your SCOM override management pack. Click OK.
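If the numeric value is hard to remember, a small lookup like the one below may help when documenting your overrides. It is a minimal sketch that assumes the usual mapping of 0 = Information, 1 = Warning and 2 = Critical. The `Severity` enum and the `describe_override` helper are hypothetical names used for illustration and do not call SCOM in any way.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Assumed mapping of the numeric severity override values.
    INFORMATION = 0
    WARNING = 1
    CRITICAL = 2


def describe_override(rule_name, value):
    """Return a readable note for a severity override (documentation only)."""
    label = Severity(value).name.title()
    return f"{rule_name}: severity override {value} = {label}"


print(describe_override("Hypothetical SCOM rule", 0))
```

An unknown value such as 5 raises a `ValueError`, which is a cheap way to catch a typo in your override notes.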
If you are responsible for your SCOM environment, do not forget to check on the Operations Manager alerts, especially when you have imported a new management pack.
This wraps up this blog post. Hopefully it has helped you get those alerts under control!
Jasper is a Belgian freelance IT consultant with 10 years of infrastructure experience, working both internally and externally in environments ranging from small (1 server) to large (1,000+ servers) across a variety of industries.