The Evolution of AI in Network Operations: From AIOps to GenAI

Generative AI is Accelerating AI Adoption

Artificial Intelligence (AI) is evolving rapidly and beginning to transform IT operations, with expanding use cases that bring efficiency to labor-intensive network operations, particularly in Network Operations Centers (NOCs) and Security Operations Centers (SOCs). The recently formed ONUG Collaborative AI-Driven NOC/SOC Automation Project is exploring this evolution and its implications in this post, the first in a series on the topic.

Over the past several years, AIOps has become a buzzword that vendors use to promise AI-assisted operations. Vendors began to offer AI-based features that use machine learning (ML) to provide network insights, anomaly detection, streamlined processes and more. While AI that leverages a predictive ML model is expected to provide significant value, building the technology stack and assembling training data present many challenges. Most organizations have struggled to do both on their own, and only a few of the large vendors have been able to launch AI-enhanced capabilities.

The advent of Generative AI (GenAI) was led by the launch of OpenAI’s ChatGPT in 2022, followed by Google’s Bard (now Gemini) and others. These Generative Pre-trained Transformer (GPT) style models leverage neural networks and deep learning to create new content based on what they have learned from existing content. The approach is very straightforward for users and similar to internet searching: a text-based prompt is fed into a Large Language Model (LLM) that is pre-trained on large data sets and can generate text and perform tasks.

ChatGPT Opened GenAI to the World

OpenAI’s release of ChatGPT marked a significant milestone, providing consumers free access to an LLM that is pre-trained on very large data sets. As IT users began to experiment with ChatGPT, they found that it added value to their day jobs because the training data included networking domain knowledge, networking vendor-specific information, programming languages and more. Early users were excited to see that it could generate example vendor-specific network configurations, code, requirement documents, technical documentation and more.

Before most employers established policies around the use of GenAI, IT teams, including operations staff, started using chat-based products like ChatGPT, Microsoft Copilot, GitHub Copilot and others. Example use cases that benefited IT operations include:

  • Sample document creation, including operational processes, technical documentation, requirement documentation and more
  • Automation assistance for routine tasks like CLI config generation and data analysis (including log data)
  • Generating example code and data models including technologies like Python, JSON, and others
  • Gaining insights into emerging cyber threats by aggregating and analyzing data from various sources
  • Using the chat-based interface to enable non-experts to get domain-specific knowledge to help with daily tasks and operations

These use cases demonstrate how GenAI can augment IT, network operations, and security operations by assisting in routine tasks, improving response times, and enhancing overall cybersecurity resilience.

GenAI-powered Co-Pilots Lead the Charge

With the emergence of GenAI products targeting the enterprise and powered by rapidly evolving LLMs, including OpenAI’s GPT-3.5/GPT-4, Google’s Gemini, Meta’s Llama 3, Anthropic’s Claude 3 and others, AI capabilities and use cases for the NOC/SOC are expanding. These products offer on-prem or cloud-based enterprise editions with direct API access to the AI model, enabling integration with existing management tools and the ability to train models with specific datasets. Enterprises are gaining access to “AI-powered products” that enhance efficiency and augment human capabilities in NOC/SOC operations. API integration enables vendors to offer their own chatbot that leverages the capabilities of a foundational LLM, along with function call integration that adds functionality for specific use cases.
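To make the idea of function call integration more concrete, here is a minimal Python sketch of how a vendor chatbot might route a model-requested function call to an existing management tool. It is an illustration only, under stated assumptions: the registry, the get_interface_status function, the JSON shape of the model output and the device names are all hypothetical, and the model response is simulated rather than fetched from any real LLM API.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry of functions the chatbot is allowed to call.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(func: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function so a model-issued call can reach it by name."""
    TOOLS[func.__name__] = func
    return func


@tool
def get_interface_status(device: str, interface: str) -> dict:
    # Stand-in for a query to an existing management tool (assumption);
    # it returns canned data instead of contacting a device.
    return {"device": device, "interface": interface, "oper_status": "up"}


def dispatch(model_output: str) -> str:
    """Run the function call described by the model's JSON output."""
    call = json.loads(model_output)
    name = call.get("name")
    if name not in TOOLS:
        # Anything outside the registry is refused rather than executed.
        return json.dumps({"error": f"unknown function: {name}"})
    result = TOOLS[name](**call.get("arguments", {}))
    return json.dumps(result)


# Simulated model response; a real integration would receive this from the LLM.
simulated = (
    '{"name": "get_interface_status", '
    '"arguments": {"device": "core-sw1", "interface": "Ethernet1/1"}}'
)
reply = dispatch(simulated)
print(reply)
```

In this sketch the model never touches the network directly. It can only name a registered function, and the result would be passed back to the model to phrase an answer for the operator.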

Gartner’s Andrew Lerner published a note titled “Invasion of the Network Co-Pilots” and called out 2024 as the year of the network co-pilot powered by GenAI.

The initial wave of productization focuses on chatbot-style interfaces, offering integration with existing products and assistance to end-users in NOC/SOC environments. Example use cases of GPT-powered copilots integrated with products used for network and security operations include:

  • Custom support chatbots
  • Code generation (e.g., Python, Ansible and domain-specific) to automate tasks and configurations
  • Generating vendor-specific CLI configurations, improving interoperability
  • Tailored data analytics
  • Correlating data sets to derive actionable insights about network health and device performance
  • Providing information about compliance and security posture, aiding in regulatory adherence

GenAI is also being applied outside the typical GPT-like models and interfaces. For example, security product developers are combining GenAI with Reinforcement Learning to augment threat libraries for advanced IDS solutions. Also, internal operational data is being integrated with established Large Language Models using Retrieval Augmented Generation (RAG).
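As a rough illustration of the RAG pattern, the Python sketch below retrieves the most relevant internal note for a question and places it in the prompt that would be sent to an LLM. Several things here are assumptions made for illustration: the runbook entries are invented, the prompt wording is arbitrary, and simple keyword overlap stands in for the embedding-based vector search that production RAG systems commonly use.

```python
import re
from typing import List, Set

# Hypothetical internal operational notes (invented for this example).
RUNBOOK = [
    "BGP neighbor down: check interface status, then verify peer reachability.",
    "High CPU on a switch: identify the top process and review recent config changes.",
    "Failed VPN login: confirm the account is not locked and review authentication logs.",
]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: List[str], k: int = 1) -> List[str]:
    """Rank documents by how many words they share with the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, documents: List[str]) -> str:
    """Combine the retrieved context with the question into one prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("Why is my BGP neighbor down?", RUNBOOK)
print(prompt)
```

The point of the pattern is that the operational data stays in a store the enterprise controls and is supplied to the model at question time, rather than being used to retrain the model.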

Enterprise IT is Cautious but Optimistic

While many organizations recognize the benefits of these capabilities, corporate policies often restrict the use of publicly shared services to limit the leakage of proprietary or protected data into the public domain. Architecture review boards are typically in place or being established to govern the usage of AI. Legal compliance, data security, and the accuracy of results, including the risk of AI hallucinations, are paramount considerations. Striking a balance between leveraging AI for operational efficiency and safeguarding sensitive information remains a challenge. These concerns could slow broader adoption; however, the technology is moving fast and is expected to address these challenges as it evolves, driven by productivity and efficiency gains.

Limits on GenAI accuracy and correctness mean that initial use cases will require human review and oversight. In this wave of initial adoption, use cases in their current state will likely be limited to “co-pilots” for productivity-enhancing features, not auto-pilot and not human replacement. General-purpose GenAI platforms (like ChatGPT) are more likely to be error-prone for very specific domain knowledge, whereas productized chatbot interfaces (co-pilots integrated into vendor products) are expected to be trained and tested for specific supported scenarios. This puts pressure on vendors to ensure appropriate training data is used and guardrails are implemented to minimize errors.
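One simple form a guardrail could take is a check on generated CLI output before it ever reaches a human reviewer. The Python sketch below is only an assumption about how such a check might look: the deny list, the function name and the sample generated lines are hypothetical, and a real product would need far more thorough validation.

```python
from typing import List, Tuple

# Hypothetical deny list of disruptive command prefixes (illustrative only).
BLOCKED_PREFIXES = ("reload", "erase", "write erase", "delete", "format")


def review_generated_config(lines: List[str]) -> Tuple[List[str], List[str]]:
    """Split generated CLI lines into (candidates for human review, rejected)."""
    candidates: List[str] = []
    rejected: List[str] = []
    for line in lines:
        command = line.strip()
        if not command:
            continue  # ignore blank lines in the generated output
        if command.lower().startswith(BLOCKED_PREFIXES):
            rejected.append(command)
        else:
            candidates.append(command)
    return candidates, rejected


# Sample model output, invented for this example.
generated = ["interface Ethernet1/1", " description uplink to core", "reload", ""]
candidates, rejected = review_generated_config(generated)

print("For human review:", candidates)
print("Rejected by guardrail:", rejected)
```

Note that even the lines that pass are only candidates for review. Nothing in this sketch pushes configuration to a device, which matches the co-pilot rather than auto-pilot model described above.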

The integration of AI, particularly GenAI, is reshaping the landscape of network operations, empowering NOCs and SOCs with advanced capabilities. While the benefits are substantial, enterprises must navigate challenges such as data security and compliance to fully harness the potential of AI in network operations. As GenAI continues to evolve, its impact on NOC/SOC efficiency and effectiveness is poised to grow rapidly, beginning a new era of intelligent IT infrastructure management. The increased productivity this can offer could offset the challenges in skill set development and hiring. Overall, GenAI products and integrations offer a scalable and adaptable solution for NOC/SOC environments, empowering teams to work more efficiently, respond to incidents faster, and stay ahead of emerging threats in today’s dynamic cybersecurity landscape. These benefits will likely be the drivers for early adoption.

Author's Bio

Michael Haugh


Including contributions from Tom Pagan, WWT and members of the ONUG Collaborative AI-Driven NOC/SOC Automation Project Team