Why has network automation failed to live up to expectations? After more than two decades of effort, millions of man-hours, and hundreds of tools created, network automation has largely not achieved its intended objective of eliminating manual operations. Multiple industry association surveys (e.g., Enterprise Management Association, Gartner) show that most enterprises have automated less than half of their data center tasks. Why is this so? When asked, most IT network operators would say that “automation” should help them do manual, repetitive tasks faster. But fast…
In today’s dynamic network environments, traditional MELT telemetry (metrics, events, logs, and traces) falls short when uncovering “unknown unknowns”: unforeseen issues that can cripple network performance and endanger network security. Packets, and the metadata generated from packet data, offer a more granular, real-time view, providing the level of detail necessary to identify these undetected anomalies and veiled vulnerabilities. The Shortcomings of MELT Telemetry While MELT telemetry can alert you to “known unknowns” (anticipated issues you’ve prepared for), its reliance on aggregation and predefined triggers leaves it blind to unexpected network events or…
On Oct 23-24 in New York City, we’ll be at the ONUG AI Networking Summit Fall 2024. Join us at the show and talk to Antonio Nearly, an LLM-powered holographic recreation of HPE CEO Antonio Neri that showcases an array of cutting-edge AI technologies. Antonio Nearly is highly knowledgeable about everything HPE, and you can ask him about anything else too — from professional soccer teams to blue whales. All you have to do is press a button and speak your question into a mic. It works…
Out-of-band monitoring techniques are necessary for AI clusters to provide trustworthy inferences. Out-of-band solutions provide latency analysis, decrease points of failure, and do not add burden to the network. [1] For AI clusters, the result of high latency is erroneous inferences. AI clusters are nodal networks of GPUs that store and process inferences from machine learning models. [2] High latency in AI cluster inference is enough to produce incorrect inferences. The nodes of an AI cluster can create incomplete calculations on requests due to…
I recently came across an insightful blog by Bruce Davie in The Register that sheds light on the advancements in virtual network automation. While his analysis of virtual networks is insightful, he raises an important question about the state of physical network automation, “Are we actually any closer to the automation of networking than we were a decade ago?” My answer is a definitive ‘Yes’. Back in 2014, when network automation for physical networks was still in its infancy, we faced a steep learning curve….
The expanding domain of Generative AI is quickly becoming a pivotal aspect of enterprise innovation, with an impressive 66% of participants in IDC’s Future Enterprise Resiliency and Spending Survey recognizing its critical role. As this landscape evolves, the management of AI workloads across diverse environments becomes a crucial challenge, with issues such as latency, security, and connectivity at the forefront. In our view, a hybrid and multicloud networking strategy is not merely an option but a necessity for enterprises that leverage AI and operate within…
The network, at the very heart of the digital transformation, is the catalyst behind the rapid adoption of the cloud and the exponential growth of AI. It’s under immense pressure to meet the ever-increasing demand for scale, agility, and cost-effectiveness. Over the past two decades, we’ve witnessed significant shifts in networking approaches. It’s crucial to understand our past, avoid past mistakes, and build on our successes to navigate the future. Physical networks are the traditional model of connectivity. In this model, an organization procures…
As we broaden the possibilities of our digital world, the blend of artificial intelligence (AI) and automation is carving out new paths in how we manage and fortify networks. This transformative duo holds immense potential to empower teams, revolutionize roles, and navigate challenges with unprecedented efficiency and agility. The recent surge in AI growth has heralded a monumental transformation in network management and monitoring. This surge is not merely a trend but a profound shift in how networks operate, with far-reaching implications for businesses across…
In this era of hyper-distributed infrastructure where our users, apps, and data are everywhere, network complexity is often a barrier to maintaining and improving application performance. Businesses want to leverage the capabilities of cloud architectures and environments to improve operational outcomes. In a world where slow is the new down, a network that just keeps up with the business is not enough. More than ever, networks impact application performance and influence business outcomes from app velocity to operational efficiency and new service deployment at scale….
Leveraging CDN Principles Supercharges your WAN Performance In a world where CIOs have a ‘Cloud First / SaaS First’ manifesto, new applications and services will be prioritized for cloud deployment; existing on-prem applications will be candidates for migration to SaaS or Cloud-hosted alternatives, and cost savings will be a key driver. Where scalability and speed of delivery are in constant demand, the job of an enterprise infrastructure team is to deliver the (increasingly Cloud hosted) applications to their intended end-users in the most performant,…