Artificial Intelligence (AI) is sparking innovation and transforming the data center landscape. Massive buildouts are required to support each new AI model, introducing increasing scale and complexity with every iteration. Staying competitive demands innovative approaches to boost operational efficiency and manage rising costs. In this blog, we’ll explore how a Network Source of Truth (NSoT), combined with a comprehensive automation strategy, can be a cornerstone of this transformation. The scale of AI computational power, and consequently energy consumption, is expected to surge in the coming…
As technology progresses and permeates all facets of modern life, the sophistication and scale of cyber threats continue to grow, presenting a formidable challenge to businesses worldwide. The consequences of a data breach can be catastrophic, meaning cybersecurity is at the top of the priority list. As such, it’s imperative for organizations to find efficient and effective solutions to protect their networks, safeguard sensitive data, and ensure business continuity. Cybersecurity Challenges in Modern Networks With your watch, refrigerator, and car all connected online, malicious actors…
Why has network automation failed to live up to expectations? After more than two decades of effort, millions of man-hours, and hundreds of tools created, network automation has largely not achieved its intended objective of eliminating manual operations. Multiple industry surveys (e.g., Enterprise Management Associates, Gartner) show that most enterprises have automated less than half of their data center tasks. Why is this so? When asked, most IT network operators would say that “automation” should help them do manual, repetitive tasks faster. But fast…
In today’s dynamic network environments, traditional MELT telemetry (metrics, events, logs, and traces) falls short when uncovering “unknown unknowns”: unforeseen issues that can cripple network performance and endanger network security. Packets, and the metadata generated from packet data, offer a more granular, real-time view, providing the level of detail necessary to identify these undetected anomalies and veiled vulnerabilities. The Shortcomings of MELT Telemetry While MELT telemetry can alert you to “known unknowns” (anticipated issues you’ve prepared for), its reliance on aggregation and predefined triggers leaves it blind to unexpected network events or…
On Oct 23-24 in New York City, we’ll be at the ONUG AI Networking Summit Fall 2024. Join us at the show and talk to Antonio Nearly, an LLM-powered holographic recreation of HPE CEO Antonio Neri that showcases an array of cutting-edge AI technologies. Antonio Nearly is highly knowledgeable about everything HPE, and you can ask him about anything else too — from professional soccer teams to blue whales. All you have to do is press a button and speak your question into a mic. It works…
Out-of-band monitoring techniques are necessary for AI clusters to provide trustworthy inferences. Out-of-band solutions provide latency analysis, reduce points of failure, and place no additional burden on the network. [1] For AI clusters, the result of high latency is erroneous inferences. AI clusters are nodal networks of GPUs that store and process inferences from machine learning models. [2] High latency in AI cluster inference is enough to produce incorrect inferences. The nodes of an AI cluster can create incomplete calculations on requests due to…
I recently came across a blog by Bruce Davie in The Register that sheds light on the advancements in virtual network automation. While his analysis of virtual networks is insightful, he raises an important question about the state of physical network automation: “Are we actually any closer to the automation of networking than we were a decade ago?” My answer is a definitive ‘Yes’. Back in 2014, when network automation for physical networks was still in its infancy, we faced a steep learning curve….
The expanding domain of Generative AI is quickly becoming a pivotal aspect of enterprise innovation, with an impressive 66% of participants in IDC’s Future Enterprise Resiliency and Spending Survey recognizing its critical role. As this landscape evolves, the management of AI workloads across diverse environments becomes a crucial challenge, with issues such as latency, security, and connectivity at the forefront. In our view, a hybrid and multicloud networking strategy is not merely an option but a necessity for enterprises that leverage AI and operate within…
This is the second in a series of blog posts by the ONUG Collaborative AI-Driven NOC/SOC Automation project team, which is delving into Generative AI (GenAI) use cases for streamlining and automating NOC/SOC workflows. Here we describe several different approaches for adopting GenAI-enabled applications and tools and conclude by considering the benefits and challenges of each approach. AI is playing an increasingly important role in network and security operations. The use of machine learning to analyze and derive actionable intelligence from large volumes of operational…
You can’t escape it: artificial intelligence is defining the current era of technology whether we like it or not. Although AI has been around in various forms for decades (the concept of a neural network model was first proposed in the 1940s!), the transformer model architecture, the recent advent of LLMs, and significant advancements in computational hardware have armed us with AI capabilities that have never been seen before. All of a sudden, computers are capable of complicated natural language processing tasks requiring long-range context, allowing…