Possibly the biggest problem in IT today is the increasingly narrow focus and specialization which prevent anyone from gaining an overall understanding of what is going on with all of the various bits and pieces of IT infrastructure.
Last week’s IP Expo Europe event highlighted this problem: it was split into a number of sub-events, all located under one roof and distinguished by different colors, including on the walls and even the carpet. From a vendor’s point of view, it’s difficult to choose which area to set up in. For someone working in AIOps the choice is easier: AIOps by definition is about working across all of these different domains.
As Information Volume Goes Up, Data Quality Drops
These different themes also came up in the keynote by journalist and author Andrew Keen, titled “When AI Kills.” Taken at face value, that topic would not seem to be terribly well aligned with Moogsoft and AIOps, but his theme was actually more about various misapplications and negative consequences of IT.
Perhaps naturally for a journalist, one aspect he touched on repeatedly was the impact of AI and of tech in general on the media. His argument was that when anyone can publish and there are no gatekeepers, the overall quality of media goes down and this outpouring of free content risks breaking the business model of traditional media outlets.
There is an interesting analogy to IT enterprise service assurance here. In both worlds, the old model had only a small number of information sources, whether newspapers or monitoring tools, which inherently limits the amount of information being generated at any time. It was feasible to consume, if not all of the data, at least a good sampling — enough to assemble a good picture of the actual situation.
As the number of information sources increases, though, and the output for each source goes up, the quality of individual packages of information — news articles or monitoring events — inevitably goes down. Old models of information management break as they struggle to keep up with radically more dynamic data lifecycles, played out at enormously accelerated timescales.
Moving Away from the Metal
In a world of bare-metal servers, each dedicated to a specific use case, monitoring events and metrics can easily be tied to business impacts. However, over the years we introduced virtualization, multiplied the layers of the application stack, automated many daily operations, and are now bringing these various trends to the next level with serverless or FaaS architectures. With all of these layers of abstraction between the hardware and its users, the direct connection between infrastructure and business services which we once relied on is no longer possible.
This is the shift from monitoring to observability: a focus on understanding the system based on its external behavior first, rather than trying to model it based on definitions, events, and metrics that must be laid down beforehand.
No one vendor can possibly offer everything that is needed to address these new realities. Instead, much of the conversation at IP Expo was about the emerging notion of the IT Ops toolchain. Multi-vendor toolchains that integrate commercial, open-source, and home-grown tools have long been a reality in DevOps, and the idea is now gaining acceptance on the IT Ops side as well. To name just the vendors whose booths neighbored Moogsoft at the event, we had the likes of Slack, Chef, Puppet, and Red Hat, and even Cisco, which has embraced the reality of the diverse software stack.
IT Ops in Transition
The truth is that this transition from monolithic and tightly managed to distributed and loosely coupled infrastructure is still very much in progress, which is why I titled my own presentation “Why The Cloud Still Needs Sysadmins.” All of this technology still requires people to run it and to decipher the data. The tools do not understand your company’s business, or the things you specifically care about that other users might not.
That’s a job for people. They can do it well, provided they are not constantly distracted by a flood of irrelevant alerts, each of which must be investigated and documented, with the documentation tracked, archived, and made available for future review, before the whole process starts over for the next alert.
This is where AIOps comes in. Far from Andrew Keen’s theme of AI killing value, specialized AI techniques developed specifically for IT Ops help to unlock new value from information and capabilities that were already present in the organization. The knowledge is out there; it is simply not easily accessible when it is needed.
Who Brings the Intelligence to AIOps?
Another aspect worth mentioning is one that came up in several discussions at the Moogsoft booth. AI is a field that is still developing rapidly, at least when it comes to practical applications. IT professionals are legitimately concerned about the need to hire hordes of data scientists and invest time and resources that they do not have before seeing any results.
Moogsoft’s approach to AIOps is that it should be a tool for IT Ops professionals, ready for them to use in the real world – not the basis for a year-long project in a lab.
Between our massive automated noise reduction, correlation across multiple different data sources, and most importantly, our innovative “digital war room” collaboration environment, Moogsoft AIOps uses AI and machine learning to help make the best use of the tools and skills that are already present in the organization. This is how IT can move from being the cause of business problems — outages, failures, or performance slowdowns — to the enabler of business success.
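To make the ideas of noise reduction and cross-source correlation a little more concrete, here is a minimal sketch in Python. It is not Moogsoft's algorithm: it assumes a simple rule-based approach (collapse repeated events for the same host and check, then group events that arrive close together in time), and the `Event` fields, source names, and the 60-second window are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A monitoring event; every field here is a hypothetical example schema."""
    timestamp: float  # seconds since an arbitrary start
    source: str       # which monitoring tool emitted the event
    host: str
    check: str


def deduplicate(events):
    """Collapse repeats of the same (host, check) pair.

    Returns a list of (first_event, repeat_count) tuples, ordered by when
    each distinct (host, check) pair was first seen.
    """
    counts = defaultdict(int)
    first_seen = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        key = (ev.host, ev.check)
        counts[key] += 1
        first_seen.setdefault(key, ev)
    return [(first_seen[key], counts[key]) for key in first_seen]


def correlate(events, window=60.0):
    """Group events that occur within `window` seconds of the previous event.

    This is a naive time-proximity heuristic, regardless of which source
    reported each event.
    """
    groups = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        if groups and ev.timestamp - groups[-1][-1].timestamp <= window:
            groups[-1].append(ev)
        else:
            groups.append([ev])
    return groups


# Illustrative data: five raw events from three hypothetical tools.
events = [
    Event(0, "infra_monitor", "db01", "disk_full"),
    Event(5, "infra_monitor", "db01", "disk_full"),
    Event(12, "app_logs", "web01", "db_timeout"),
    Event(20, "infra_monitor", "db01", "disk_full"),
    Event(900, "network_probe", "lb02", "latency"),
]

unique = deduplicate(events)
situations = correlate([ev for ev, _ in unique])

for group in situations:
    print([(ev.source, ev.host, ev.check) for ev in group])
```

In this toy data set the three repeated disk alerts collapse into one, and the disk alert is grouped with the application timeout that followed it, so an operator would look at two candidate incidents instead of five separate alerts. A real system would need far more than a fixed time window to decide what belongs together.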