Extending a Ticket Analyzer for AI-based Alert Management in Various IoT Domains

For quite some time now, data scientists and machine learning and deep learning (ML/DL) experts have been using various techniques to automate the triaging and analysis of tickets in high-tech sectors. Extracting meaning and context from bug reports or tickets enables partial automation of the workflow in the defect management process. When this is integrated with an automated digital workflow tool, defect management can be automated end to end, which significantly improves its productivity and accuracy. Apart from automation, an AI/ML-based ticket analyzer addresses the following problems of a manual process.

Figure 1: Schematic representation of a ticket analyzer
  • Effort, experience requirement, and bias: Significant effort is spent on allocating bugs found during testing to a development team or a support center. For a large development project, a team of triaging engineers who understand the underlying system characteristics and the secret sauce of the architecture is deployed. It has been found that even with a trained and experienced team, bias creates delay and inaccuracy in allocation.
  • Assignment of severity: While priority is process and context sensitive, the severity of a bug depends on the design and the system requirements. Understanding severity takes experienced engineers, and a greenhorn needs some time before being able to assign the correct severity to bugs and allocate them to the appropriate developer.
  • Removal of noise and “not a bug”: Significant effort goes into determining whether a reported bug is really a bug or an intended feature, in other words, whether it is a false positive.
  • Root cause for a set of bugs: Sometimes, a bug or an issue manifests itself in different ways and in different places in the software. Identifying the root cause can save a lot of effort during the solution process. In a manual system, that process depends on the experience level of the triaging engineer.

Today, some well-known workflow management products use some form of AI-based tooling for categorization and assignment, and these products are commercially available.

Although ticket analyzers started with the development process of large software projects and the automation of infrastructure maintenance, they have good potential, when applied appropriately, for automating many other operator-managed processes in various IoT domains. For example:

  • Alerts and logs of an IaaS application in a data center
  • Alerts of a SCADA or building management system
  • Alerts of an IoT-ized smart city command center
  • Alert-based retrofitted predictive maintenance module in an IoT-ized system
  • Patch identification and management based on the defect classification

These processes can be automated with a similar scheme. The knowledge base of these systems, which an experienced human operator learns and uses to identify, categorize, and troubleshoot problems, can be encapsulated in a properly trained AI system similar to the one in a ticket analyzer. The essential components of this type of AI system are:

Figure 2:  Components of an AI/ML-based alert manager of an IoT system

The basic idea of the system in Figure 2 is to divide the alerts, bugs, or logs into multiple categories and automatically direct the categorized alerts to the appropriate attendants or processes.

Historical Data: Properly labeled historical data for the appropriate categories should be prepared from the raw data. This is an important step that must be crafted around the requirements of the workflow. A large amount of data is necessary for the system to be accurate.

Text Analytics: This is the most important step in the entire AI system. First, topics are identified for categorization. Various natural language toolkits can be used to build the models of topics and identify topics that will be used for classification in the later steps. In some cases, a rules engine can also be used for topic generation.
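As a rough illustration of the rules-engine route to topic generation, the sketch below tags alert text with topics using keyword patterns. The topic names, the keywords, and the helper names (`TOPIC_RULES`, `tag_topics`, `label_history`) are assumptions made up for this example; a real deployment would derive them from its own alert vocabulary or from a topic model.

```python
import re

# Hypothetical keyword rules: topic name -> pattern that signals the topic.
TOPIC_RULES = {
    "network": re.compile(r"\b(link|packet|latency|timeout)\b", re.IGNORECASE),
    "storage": re.compile(r"\b(disk|volume|iops|filesystem)\b", re.IGNORECASE),
    "power": re.compile(r"\b(voltage|battery|ups)\b", re.IGNORECASE),
}


def tag_topics(alert_text):
    """Return the sorted list of topics whose rule matches the alert text."""
    return sorted(
        topic for topic, pattern in TOPIC_RULES.items() if pattern.search(alert_text)
    )


def label_history(alert_texts):
    """Pair each historical alert with its topics, or ["unknown"] if none match."""
    return [(text, tag_topics(text) or ["unknown"]) for text in alert_texts]
```

Alerts that end up as "unknown" are a useful signal that the rule set, or the list of topics itself, needs another look before a model is trained on the labels.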

Model Development: Next, an ML or DL model is trained to classify text according to the aforementioned topics. In the recent past, an RNN with an LSTM layer has performed very well for a reasonable number of topics and a reasonable level of complexity. The trained model can be deployed for online classification.
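To make the train-then-classify flow concrete, here is a minimal sketch that uses a multinomial naive Bayes classifier with add-one smoothing. This is a deliberately simple stand-in chosen so the example runs on the standard library alone; it is not the RNN-based model mentioned above, and the class and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a real pipeline would do more."""
    return text.lower().split()


class NaiveBayesAlertClassifier:
    """Multinomial naive Bayes over alert text, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # topic -> token counts
        self.class_counts = Counter()            # topic -> number of alerts
        self.vocabulary = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        """Return the topic with the highest log-probability, or None if untrained."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

The same `fit` and `predict` shape carries over when the model is swapped for a neural network: labeled historical alerts go in during training, and a topic comes out for each new alert.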

Classification and Clustering: In actual deployment, the aforementioned models are deployed as part of the alert management pipeline. Whenever alerts are generated, the text associated with them is processed through the model and classified into topics. Clustering is used for root cause analysis of a group of alerts generated ahead of a specific problem.
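One simple way to picture the clustering step is to group alerts whose wording overlaps heavily, on the assumption that near-identical alerts often share a root cause. The sketch below is a greedy single-pass grouping based on Jaccard similarity of token sets; the function names and the 0.5 threshold are illustrative assumptions, not a recommended setting.

```python
def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_alerts(alert_texts, threshold=0.5):
    """Greedily group alerts by similarity to the first alert of each cluster."""
    clusters = []  # each entry: {"tokens": set of tokens, "members": list of texts}
    for text in alert_texts:
        tokens = set(text.lower().split())
        for cluster in clusters:
            if jaccard(tokens, cluster["tokens"]) >= threshold:
                cluster["members"].append(text)
                break
        else:
            # No existing cluster is similar enough: start a new one.
            clusters.append({"tokens": tokens, "members": [text]})
    return [cluster["members"] for cluster in clusters]
```

A large cluster that forms shortly before an incident is a natural starting point for root cause analysis, since its members can be examined together rather than one alert at a time.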

Though the core of an alert management system is text analytics, which is very similar to a ticket analyzer, most alerts carry variables and their associated values. Their quantitative nature enables a few useful use cases to be implemented in an alert management system.

Variables in Alerts: Unlike tickets, alerts contain variables with their values. When the value of a variable in an alert is below a threshold, the alert is simply system noise. Sometimes, whether an alert is noise is determined by a group of variables. An alert management system can identify the cause and the severity of a problem from those variables and inform the attendant accordingly.
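The threshold check described above can be sketched as follows. The `Alert` structure, the variable names, and the threshold values are all hypothetical; the sketch treats an alert as noise only when every variable it knows about is below its threshold, and otherwise reports the variable with the highest value-to-threshold ratio as a rough pointer to the cause.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    source: str
    message: str
    variables: dict = field(default_factory=dict)


# Hypothetical per-variable thresholds; real values come from the monitored system.
NOISE_THRESHOLDS = {"temperature_c": 70.0, "cpu_percent": 85.0}


def is_noise(alert, thresholds=NOISE_THRESHOLDS):
    """True when every variable with a known threshold is below that threshold."""
    known = [name for name in alert.variables if name in thresholds]
    if not known:
        return False  # nothing to judge by, so keep the alert
    return all(alert.variables[name] < thresholds[name] for name in known)


def likely_cause(alert, thresholds=NOISE_THRESHOLDS):
    """Return (variable, value/threshold) for the highest ratio, or None."""
    ratios = {
        name: alert.variables[name] / thresholds[name]
        for name in alert.variables
        if name in thresholds
    }
    if not ratios:
        return None
    name = max(ratios, key=ratios.get)
    return name, ratios[name]
```

The ratio doubles as a crude severity score: a value just above 1.0 is a marginal breach, while a much larger value suggests the attendant should look at it first.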

Failure Prediction: It is possible to identify performance degradation and suboptimal operation based on some variables in the alert management system. A properly designed alert management system not only channels the alerts into specific categories or finds the root cause of a problem, but also sets a threshold that signals an impending degradation or failure.
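As one possible reading of this idea, the sketch below derives an early-warning level from readings taken during healthy operation and then estimates how soon a rising variable would reach a given threshold. Both the mean-plus-k-sigma rule and the straight-line extrapolation are assumptions chosen for simplicity, and the function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def warning_threshold(healthy_readings, k=3.0):
    """Early-warning level: mean of healthy readings plus k standard deviations."""
    return mean(healthy_readings) + k * pstdev(healthy_readings)


def samples_until(readings, threshold):
    """Estimate how many more samples until a rising variable reaches threshold.

    Returns 0 if the latest reading is already at or above the threshold,
    and None if there are too few readings or no upward trend.
    """
    n = len(readings)
    if n < 2:
        return None
    if readings[-1] >= threshold:
        return 0
    # Least-squares slope of the readings against their sample index.
    mean_x = (n - 1) / 2
    mean_y = sum(readings) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(readings))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    slope = numerator / denominator
    if slope <= 0:
        return None
    return math.ceil((threshold - readings[-1]) / slope)
```

A small result from `samples_until` means the variable is on course to cross the level soon, which is the moment to raise a predictive alert rather than wait for the failure itself.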

Automatic Trigger of Action: An alert management system can automatically trigger an action for immediate mitigation of the problem for which alerts are generated.
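A common way to wire actions to alert categories is a dispatch table. The sketch below is a minimal version of that pattern; the categories, the handler names, and the returned strings are placeholders, and a real handler would call into the system being managed instead of returning a description.

```python
# Registry of mitigation handlers, keyed by alert category.
ACTIONS = {}


def action_for(category):
    """Decorator that registers a handler for one alert category."""
    def register(func):
        ACTIONS[category] = func
        return func
    return register


@action_for("overheat")
def reduce_load(alert):
    # Placeholder: a real handler would throttle or migrate workloads.
    return f"reduce load on {alert['source']}"


@action_for("link_down")
def fail_over(alert):
    # Placeholder: a real handler would switch traffic to a backup path.
    return f"fail over {alert['source']} to backup link"


def trigger(alert):
    """Run the handler for the alert's category, or escalate if none exists."""
    handler = ACTIONS.get(alert.get("category"))
    if handler is None:
        return "escalate to operator"
    return handler(alert)
```

Falling back to a human operator for unrecognized categories keeps the automation conservative: only problems with a known, registered mitigation are acted on without review.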

Summary: AI-based tools for triaging bugs and analyzing the tickets raised by test engineers have been in use for some time now. These tools significantly improve the efficiency of allocation and the productivity of bug fixing, and they help automate the whole workflow of triaging, allocating, and fixing bugs. Some of the core modules of these AI tools can be used to develop an automated alert management system.

AI-based alert management systems can improve the productivity of IoT, IaaS application, SCADA, and similar systems. Additional features and use cases need to be implemented on top of a ticket analyzer to make an efficient and complete alert manager.


