Communications Provider Accelerates Incident Detection and Resolution using AIOps from Moogsoft
Alert volume also dropped by 90%-plus, and customer-impacting incidents fell by 30%
Overview
This leading outsourcer of cloud-based communications and collaboration solutions for enterprises had lost visibility and control over its IT environment.
“I bought Moogsoft to gain insight into our alerts so that I can sleep better at night.”
– Director of Technology
Key Challenges
This 10-year-old organization used a variety of system monitoring and management tools like Microsoft SCOM (System Center Operations Manager), Splunk, SolarWinds, and Cacti, as well as various homegrown solutions to create email notifications for their operations teams. With about 15 people across the NOC, systems operations, infrastructure and applications teams, managing incidents proactively was a big challenge.
Through SCOM, operations teams had visibility into 40% of the total alert volume. The rest was turned off to avoid further alert overload. From these email alerts, 300 to 400 tickets were created each week for the NOC team to manage, but 70% of these tickets were closed without any action taken. Furthermore, when a P1/P2 incident did occur, all-hands conference calls were conducted.
It took the operations teams about two hours to detect incidents and another two hours to resolve them. They were operating reactively — over 70% of incidents were detected by customers first.
“Because there was such a high volume of alerts, we could only look at critical alerts when things were breaking,” the NOC manager said. “The ‘lows’ and ‘mediums’ that could be leading to problems would always be missed. It was like firefighting.”
“Because of SCOM’s server-level focus, it was very difficult to determine whether a larger part of the environment was being affected as a whole, since we were just concentrating on alerts coming in from one server,” the NOC manager said.
After years of challenges, they decided to evaluate Moogsoft.
Moogsoft
Today, all data from across their toolsets, including SCOM, feed into Moogsoft, which is now a direct interface into their ticketing system.
“We are using the same tools but the way in which we are using them has completely changed. We have turned on all alerts and are sending everything to Moogsoft for full visibility,” the NOC manager said.
Moogsoft has helped this organization achieve a 90% reduction in workloads, a 30% reduction in customer-identified incidents, a 75% reduction in MTTD (mean time to detect), and a 25% reduction in MTTR (mean time to resolve).
CUSTOMER PROFILE
Key Challenges