This is the second in a series of blog posts by the ONUG Collaborative AI-Driven NOC/SOC Automation project team, which is delving into Generative AI (GenAI) use cases for streamlining and automating NOC/SOC workflows. Here we describe several different approaches for adopting GenAI-enabled applications and tools and conclude by considering the benefits and challenges of each approach.
AI is playing an increasingly important role in network and security operations. The use of machine learning to analyze and derive actionable intelligence from large volumes of operational data has become commonplace, spanning use cases including baselining, anomaly detection, and event correlation, often linked with automated incident response. However, as described in our previous blog post, GenAI technology is poised to dramatically impact NOC/SOC applications and tools.
Potential use cases include natural language interfaces, simplified reporting and documentation, automatic code generation, event diagnosis, and recommendations for remedial actions. GenAI utilizing either Large Language Model (LLM) or Small Language Model (SLM) technologies can be implemented and delivered in various ways, with each approach having inherent advantages and disadvantages. This post describes three approaches and a basic framework for evaluating each, and reaches a straightforward conclusion.
Implementation Options
Let’s consider the three major approaches to adopting GenAI in NOC/SOC environments:
- Publicly-available GenAI services
- GenAI-augmented applications and tools
- Custom solutions developed in-house
Publicly-Available GenAI Services
(LLMs and potentially industry or context-specific Small LMs)
Unless you’ve been living off the grid, you are already familiar with publicly available GenAI services such as ChatGPT and Microsoft Copilot, which have demonstrated the vast power of GenAI and its potential to speed up knowledge work and improve productivity. These services allow everyone to experiment with GenAI, learn how to write effective prompts, and gain firsthand experience with the outputs these general-purpose services generate. They rely primarily on LLMs, although SLMs are gaining traction for domain-specific use cases where the models can be properly trained on relatively modest amounts of data compared to an LLM.
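To make the idea of prompt writing concrete, here is a minimal sketch of sending a NOC-style prompt to a public GenAI service. It assumes the OpenAI Python SDK; the model name and the sample syslog line are illustrative placeholders, and, as discussed under Key Risks below, real operational data should not be sent to public services without governance approval.

```python
# Minimal sketch of prompting a public GenAI service for a NOC-style task.
# Assumes the OpenAI Python SDK; the model name and the sample log line are
# illustrative placeholders, not recommendations.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

sample_log = "%LINEPROTO-5-UPDOWN: Line protocol on Interface Gi0/1, changed state to down"

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system",
         "content": "You are a network operations assistant. Be concise and factual."},
        {"role": "user",
         "content": f"Summarize this syslog message and suggest one next diagnostic step:\n{sample_log}"},
    ],
)
print(response.choices[0].message.content)
```

Even a toy example like this shows why prompt structure matters: a clear role, a bounded task and a single log line produce far more usable output than an open-ended question.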
GenAI-Augmented Applications and Tools
In enterprise IT environments, commercial application and tool suppliers are racing to augment IT operations products with GenAI technologies, not only to improve usability by simplifying how users interact with these products but also to enhance functionality by using GenAI to automate the generation of code, documentation, reports, etc. These products are typically based on LLMs and/or SLMs trained on a combination of publicly available data (for example, natural language processing), application-specific data (for example, code generation) and locally relevant data specific to the user’s environment (for example, documentation and reports). Given the inherent complexity of NOC/SOC operations and the sheer volume of text-based data that must be continuously collected and analyzed, augmenting applications and tools with GenAI capabilities is the proverbial “low-hanging fruit” in these environments.
Custom Solutions Developed In-House
The third approach is to incorporate GenAI technology into custom solutions that are developed in-house (or by contracting with a custom software development shop). This may be the only viable approach for enhancing the capabilities of applications and tools that have been developed in-house and are maintained internally. Alternatively, a use case may be so specific to the NOC/SOC operational environment that custom development is the only viable option. Developing such a custom solution requires selecting the appropriate LLMs and/or SLMs, training those models on data sets specific to each operational use case, and continually tuning them over time to ensure they remain effective.
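As a rough illustration of what that entails, the sketch below fine-tunes a small open model on a curated set of operational text using the Hugging Face transformers and datasets libraries. The base model name, file name and training settings are illustrative assumptions rather than recommendations, and a production effort would add evaluation, guardrails and a recurring retraining loop.

```python
# Minimal sketch of fine-tuning a small open language model on curated
# NOC/SOC text. Assumes the Hugging Face transformers/datasets libraries;
# the model name, data file and hyperparameters are illustrative placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "distilgpt2"                 # placeholder SLM
DATA_FILE = "curated_noc_tickets.jsonl"   # hypothetical curated data set

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Each record is expected to hold a "text" field with one ticket or log summary.
dataset = load_dataset("json", data_files=DATA_FILE, split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="noc-slm", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("noc-slm")
```

The sketch stops at a single training pass; the continual tuning described above means repeating this cycle as new tickets, alerts and runbooks are generated.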
Key Considerations
Evaluating different approaches for adopting GenAI in NOC/SOC environments requires examining each of the following:
- Availability of GenAI products and services
- GenAI data curation
- Fit for purpose
- Ease of adoption
- Key risks
Availability of GenAI Products and Services
- Prudent enterprise IT organizations have severely restricted or prohibited the use of ChatGPT and other publicly available GenAI services, driven by valid concerns about privacy, security, fitness for purpose, accuracy, bias, and other legal and ethical issues. As a general rule, these services are currently not a viable option for the vast majority of NOC/SOC use cases.
- Suppliers are delivering GenAI-augmented applications and tools today, often in the form of conversational interfaces for existing NOC/SOC products and services. This segment of the market is highly competitive, with suppliers vying to retain current customers and win new business by leveraging GenAI as a key differentiator. New capabilities are being announced weekly.
- When it comes to developing and supporting custom GenAI solutions in-house, there is wide variance among enterprise IT organizations in terms of AI expertise, software development capabilities, IT governance policies and risk tolerance. While well-funded and properly staffed teams may decide to incorporate GenAI into their own applications and tools, this will remain a bridge too far for many companies, at least in the near future. The dollars to fund these efforts will also be competing against the dollars necessary to acquire GenAI-augmented applications and tools, which may prove to be an easier sell internally.
GenAI Data Curation
- The LLMs used by public GenAI services are trained on massive amounts of data that is not curated for NOC/SOC use cases, so these are a non-starter for the vast majority of operational requirements.
- GenAI-augmented applications and tools require models that are trained on data that is relevant to operational tasks and processes, which is primarily generated by a wide range of NOC/SOC applications, tools, security devices and network elements. Operations teams must ensure that these data sets are well-curated, up to date and in compliance with internal governance policies.
- Data curation requirements for custom solutions developed in-house are the same as for GenAI-augmented applications and tools, but development teams will likely have to put extra effort into ensuring that all model training and inferencing data is up to date, accurate and properly prepared for ingestion, given that supplier-provided GenAI-augmented applications and tools will have at least some of these checks built in. The sketch after this list illustrates a basic curation pass.
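As a rough illustration of the kind of curation pass described above, the following sketch drops stale and duplicate records from an exported data set before it is used for training. The field names, file names and retention window are assumptions for illustration only.

```python
# Minimal sketch of a curation pass over exported NOC/SOC records before
# model training: drop stale entries and exact duplicates. Field names,
# file names, and the retention window are illustrative assumptions.
import json
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=365)  # assumed freshness window

def curate(records):
    """Yield records that are recent and not exact duplicates."""
    seen = set()
    cutoff = datetime.now(timezone.utc) - MAX_AGE
    for rec in records:
        # Assumes ISO-8601 timestamps with an explicit UTC offset,
        # e.g. "2024-05-01T12:00:00+00:00".
        if datetime.fromisoformat(rec["timestamp"]) < cutoff:
            continue
        if rec["text"] in seen:
            continue
        seen.add(rec["text"])
        yield rec

with open("raw_noc_export.jsonl") as src, open("curated_noc_tickets.jsonl", "w") as dst:
    records = (json.loads(line) for line in src)
    for rec in curate(records):
        dst.write(json.dumps(rec) + "\n")
```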
Fit for Purpose
- As noted above, publicly-available GenAI services using AI models trained on Internet data are generally not fit for purpose in NOC/SOC environments. The potential scope of these models is incredibly broad, but they are useless for NOC/SOC use cases because they are not trained on the proper data sets.
- GenAI-augmented applications and tools are designed to operate within a targeted function or workstream consistent with the application itself, and if the supporting AI models are trained on the proper data sets, by definition they are fit for purpose.
- Custom solutions developed in-house, if based on AI models that are trained on the proper data sets, are also by definition fit for purpose.
Ease of Adoption
- While publicly-available GenAI models are readily available on the Internet, as noted above, they are generally not fit for purpose.
- New AI capabilities in GenAI-augmented applications and tools will typically be delivered by suppliers via product updates or new service offerings, which will simplify adoption, subject to internal IT governance, risk and compliance policies.
- Custom solutions developed in-house require extended timelines for deployment and extensive resources for development and ongoing maintenance, which involves a continuous process of data curation and AI model tuning.
Key Risks
- The greatest perceived risk of utilizing publicly-available GenAI services involves data privacy and security. Sending data to public cloud-based servers for processing can lead to potential data breaches and loss of sensitive information (a basic redaction sketch follows this list). AI model reliability and stability are also concerns. If a GenAI service returns inaccurate data that is used by NOC/SOC personnel, it could result in disrupted operations and possibly security breaches.
- Relying on GenAI-augmented applications and tools could create dependencies on specific suppliers, making it difficult to integrate applications and tools from different vendors and prohibitively expensive to switch vendors for a given application or tool. Products may also be tied to specific AI models that fall out of favor over time, resulting in a different, model-specific form of supplier lock-in. It could prove problematic to move on from a supplier that utilizes a sub-optimal AI model or one that is not properly trained, tested and continuously updated.
- Custom solutions developed in-house not only involve high development and maintenance costs but also require recruiting, hiring and retaining in-house talent. There is always the risk that the development cycle drags on too long, with negative consequences for NOC/SOC operational effectiveness and operating expense.
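One basic mitigation for the data privacy risk noted above is to redact obvious identifiers before any text leaves the operational environment. The sketch below illustrates the idea with a few regular expressions; the patterns and placeholder tokens are illustrative and are not a substitute for a proper data-loss-prevention control.

```python
# Minimal sketch of redacting obvious identifiers from text before it is
# sent to any external GenAI service. The patterns are illustrative and do
# not constitute a complete data-loss-prevention control.
import re

REDACTIONS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip>"),                # IPv4 addresses
    (re.compile(r"\b[0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){5}\b"), "<mac>"),  # MAC addresses
    (re.compile(r"\b[\w.-]+\.(?:corp|internal|local)\b"), "<host>"),     # assumed internal domains
]

def redact(text: str) -> str:
    """Replace identifiers matched by the patterns above with placeholders."""
    for pattern, placeholder in REDACTIONS:
        text = pattern.sub(placeholder, text)
    return text

print(redact("BGP peer 10.20.30.40 on core-rtr1.corp flapped at 02:14 UTC"))
# -> "BGP peer <ip> on <host> flapped at 02:14 UTC"
```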
Conclusion
For all of the reasons described above, publicly-available GenAI services are the least viable approach for implementing GenAI use cases in NOC/SOC environments. While widely available and extremely powerful at generating compelling content from models trained on Internet data, these services miss the mark within the narrowly defined confines of IT operations.
GenAI-augmented NOC/SOC applications and tools are not only becoming widely available but are also generally the most attractive option for leveraging GenAI in the near term, subject to the caveats noted above regarding data curation and risk mitigation. Customers benefit from packaged solutions that are fully supported by suppliers and, ideally, fit for purpose.
Custom solutions developed in-house offer great potential for unique operational requirements that can’t be adequately addressed by product suppliers; however, the development resources, infrastructure, processes and inherent risks are significant barriers to this approach. The well-known industry-wide shortage of AI talent is also a daunting challenge that may prove insurmountable.