This blog is the fifth and final in the series, “How AIOps Liberates IT from a Rules-based Approach”.
Summary
1. AIOps liberates IT professionals from rules-driven workflows that are repetitive mental drudgery, providing the opportunity to advance new skills, knowledge, and productivity.
2. Because AIOps is a mathematical machine learning approach, a single algorithm can replace the logic of hundreds of thousands of rules.
3. IT Operations must move beyond the antiquity of rules-based solutions and put the modern machine learning of AIOps to work, sooner rather than later.
I grew up in England close to the birthplace of the Industrial Revolution. The wrenching, cataclysmic social changes in the transition to new steam-powered and automated manufacturing processes between 1780 and 1830 were a prelude to a new world. An agrarian, subsistence economy was revolutionized by machines used to automate production of textiles and other goods at a mind-boggling scale. You could say the drive to mechanize became the philosophical core of twentieth-century life. Orderly systems. Cogs in a wheel. Predictable behaviors. Pre-determined results. All governed by rules. All beautifully captured by Charlie Chaplin in Modern Times.
Moving Beyond Rules
The days of the “Spinning Jenny” are long gone. But rules are still everywhere. As we grapple again with the turmoil of change in the twenty-first century, it’s a good time to ask why. This question is particularly germane as many people wonder if AI and machine learning (ML) are about to trample life as we know it – just as the Luddites feared machines would during the Industrial Revolution.
The rules I speak of are not the guardrails of everyday life like obeying traffic laws, complying with health regulations, and paying your taxes (or else).
Moogsoft is about improving IT Operations at enterprises which – despite being run globally at scale on esoteric virtual cloud systems – typically bet their performance and uptime on rules.
In this educational blog series, we’ve dissected serious limitations with rules-based solutions and why they are insufficient to effectively manage IT Operations. We’ve learned that:
- Rules have the illusion of simplicity, but instead add exponential complexity because they are brittle and easy to break.
- Rules are expensive to maintain and carry hidden costs.
- Rules are unpredictable in complex environments due to their tiny scope.
- Rules are undecidable in real-world failure scenarios, which renders them deficient for continuous service assurance.
AI and Machine Learning are New Enablers
We’ve also described how AI and ML liberate IT from each and every limitation of rules. The fundamental difference is how AI/ML uses monitoring data. A rules-based system separates the infrastructure you’re trying to understand from the data it produces. To predict trouble, it applies pre-built logic to the alerts generated by system events. As I’ve outlined in this blog series, this doesn’t always work as advertised.
AI/ML takes the opposite approach. It does not treat data as separate from the system. You cannot reliably work forward from a system state to the events it will produce, but you can work backward from the events you observe to the state that produced them. This approach assumes there is signal in the noise produced by alerts, which is interpreted mathematically by statistical machine learning to infer the existence of issues worth investigating. And if you think that notion is an oddity, this relationship of the system to the observer has been common currency for centuries in everything from eastern mysticism to quantum mechanics.
In the practice of IT operations, we are just now realizing you cannot separate the two.
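To make "event to state" a little more concrete, here is a minimal Python sketch. It is an illustration only, not Moogsoft's algorithm: the function name `unusual_intervals`, the five-interval window, and the z-score threshold of 3.0 are all assumptions chosen for the example. It looks at nothing but the stream of alert counts and flags the intervals where the count departs sharply from its own recent history.

```python
from statistics import mean, stdev

def unusual_intervals(alert_counts, window=5, z_threshold=3.0):
    """Flag intervals whose alert count is far above the trailing window.

    Hypothetical helper for illustration: `window` and `z_threshold`
    are assumed values, not settings from any real product.
    """
    flagged = []
    for i in range(window, len(alert_counts)):
        history = alert_counts[i - window:i]
        mu = mean(history)
        # A perfectly flat history has zero spread; fall back to 1.0
        # (an assumption) so the division below is always defined.
        sigma = stdev(history) or 1.0
        z = (alert_counts[i] - mu) / sigma
        if z >= z_threshold:
            flagged.append(i)
    return flagged

# Alerts per minute; minute 6 is a burst against a quiet background.
counts_per_minute = [4, 5, 6, 5, 4, 5, 30, 5, 6, 4]
print(unusual_intervals(counts_per_minute))  # prints [6]
```

No rule names a host, a metric, or a threshold in absolute terms. The "normal" level is inferred from the events themselves, which is the point of the paragraph above.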
AIOps allows you to discover incidents previously not detected by a rules-based solution. Statistical ML can use the same algorithm to infer one type of instance from other types. Algorithms themselves are error-resistant and don’t need to have all of the data to reach reliable conclusions. Algorithms are deterministic – meaning that they always produce the same output for the same input. Algorithms work no matter the order in which the data is processed. Because AIOps is a mathematical ML approach, a single algorithm can replace the logic of hundreds of thousands of rules. In fractions of a second.
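As a rough illustration of those properties, the Python sketch below groups alerts by how much of their text they share, with no per-host or per-metric rule. It is a toy built on assumptions, not the algorithm any AIOps product actually uses: the names `jaccard`, `tokenize`, and `cluster_alerts`, the word-level tokenization, and the 0.5 threshold are all hypothetical. The grouping is the set of connected components of a similarity graph, so the result is the same for the same input and does not depend on the order in which alerts arrive.

```python
from itertools import combinations

def jaccard(a: frozenset, b: frozenset) -> float:
    """Share of tokens two alerts have in common, from 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def tokenize(alert: str) -> frozenset:
    """Assumed tokenization: lowercase, split on whitespace."""
    return frozenset(alert.lower().split())

def cluster_alerts(alerts, threshold=0.5):
    """Group alerts linked by similarity >= threshold (assumed value).

    Uses union-find to compute connected components, which do not
    depend on the order of the input.
    """
    tokens = {a: tokenize(a) for a in alerts}
    parent = {a: a for a in tokens}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(tokens, 2):
        if jaccard(tokens[a], tokens[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for a in tokens:
        groups.setdefault(find(a), set()).add(a)
    return {frozenset(g) for g in groups.values()}

alerts = [
    "db01 disk latency high",
    "web01 cpu usage high",
    "db02 disk latency high",
    "web02 cpu usage high",
]
clusters = cluster_alerts(alerts)
# Same input in a different order yields the same clusters.
print(clusters == cluster_alerts(list(reversed(alerts))))  # prints True
print(len(clusters))  # prints 2
```

A rules-based equivalent would need one rule per host and message pattern, and a new rule every time a new host or message appeared. Here a previously unseen alert such as "db03 disk latency high" would fall into the existing disk-latency group with no change to the code.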
All of this sounds so straightforward, it’s almost magical. So why isn’t everyone using AIOps?
Breakthroughs and Impediments for Change
For one thing, the application of statistical ML in IT operations entails fairly new techniques. The concepts have been around since Alan Turing’s breakthrough work in the 1930s. The foundations for modern AI algorithms were laid during the 1960s, ‘70s, and ‘80s. The earliest commercial applications were stock trading systems, followed by others such as fraud detection and handwriting recognition.
Three recent breakthroughs have finally facilitated the wide adoption of AI/ML for IT Operations.
First, statistical ML requires very powerful computers, which have become common only in the last decade. Second, statistical ML requires lots and lots of data. The ease of storing, accessing, and using Big Data is finally practical thanks to the global cloud, and of course thanks to the ironclad continuity of Moore’s Law on storage and compute. Third, knowledge that was once the exclusive preserve of academic computer science is now spreading to the wider IT Ops community.
What may be the only remaining barrier to pervasive use of AIOps? Resistance by IT professionals!
We’re only human. People tend to be suspicious of new ML approaches because they don’t understand them. The topic is certainly complex. It’s easier for people to wrap their heads around Boolean logic, which is an old and familiar way of thinking. Their comfort zone is stretched when you talk about neural networks, similarity as a range rather than a Boolean, back propagation, high-level calculus, category theory, homology and probability – all advanced mathematical terms unfamiliar to many IT professionals.
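Some of these ideas are less intimidating than they sound. Take "similarity as a range rather than a Boolean". The short Python sketch below compares two alert messages both ways, using the standard library's `difflib.SequenceMatcher`. The function names `boolean_match` and `similarity` and the sample messages are made up for the illustration, and character-level matching is only one of many ways to score similarity.

```python
from difflib import SequenceMatcher

def boolean_match(a: str, b: str) -> bool:
    """Rules-style comparison: the strings match exactly or not at all."""
    return a == b

def similarity(a: str, b: str) -> float:
    """Graded comparison: 0.0 (nothing shared) to 1.0 (identical)."""
    return SequenceMatcher(None, a, b).ratio()

first = "disk latency high on db01"
second = "disk latency high on db02"

print(boolean_match(first, second))         # prints False
print(round(similarity(first, second), 2))  # prints 0.96
```

The Boolean test says the two messages are simply different. The graded score says they are almost the same, which is closer to how an experienced operator would read them.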
AIOps Brings a Future of Change and Hope
Despite these issues, there’s great hope for AIOps because it is a small part of what I call Industrial Revolution 2.0 – which encapsulates all of the innovations springing from AI/ML. The first Industrial Revolution lifted people from a life of misery, hunger and poverty. In Revolution 2.0, people will see positive albeit unpredictable improvements in their work and life.
The first Industrial Revolution produced then-unpredictable innovations that permanently changed society. Technological advances spanned textile manufacture, iron production, steam power, machine tools, chemicals, cement, gas lighting, glass making, papermaking, agriculture, mining and transportation. Social effects included the factory system, improved standards of living, clothing and consumer goods, urbanization, better life for women and families, and safer labor conditions.
Powered by AI/ML, Revolution 2.0 will also change our lives. I am optimistic these changes will be positive. For example, if you are an IT Operations professional, consider your current work life. Frankly, the rules-based world mostly provides workflows that are mental drudgery: menial, repetitive, and low-pay. AI/ML will automate that miserable work and give IT professionals the opportunity to up-level the application of their knowledge in a more productive and enjoyable fashion.
We cannot ignore the massive changes that AI will bring to job roles and career opportunities. The World Economic Forum predicts the use of AI will cause 75 million jobs to be displaced by 2022. However, 133 million new ones will be created – a net increase of 58 million. With this change comes the requirement that today’s workers “reskill” for these new opportunities. The 2018 study projects that “no less than 54 percent of all employees will require significant re-skilling and upskilling.” If you work in IT, it is very important to begin planning for this change NOW.
Moogsoft has many large enterprise customers that are already reaping huge benefits from AIOps. Our new approach is enabling them to accelerate mean-time-to-resolution of operational incidents, improve service assurance for customers, simplify the management of cloud infrastructure, and more effectively manage digital transformation initiatives.
It’s time for IT Operations to move beyond the antiquity of rules-based legacy solutions and put the modern machine learning of AIOps to work. It is a better approach for delivering continuous service assurance to the enterprise.
Read the previous blog in this series: The Undecidable Challenge of Rules