Spring 2026 Archives - ONUG

Inside eBay’s AI-Powered Network: SONiC, Automation, and AI-Assisted NOC Operations

by Guest Author

March 10, 2026

eBay operates one of the most advanced enterprise networks built on SONiC and large-scale automation. In this technical deep dive, Rick Casarez provides an inside look at the architecture, design principles, and operational decisions behind eBay’s production SONiC deployment. Rick will also demonstrate how eBay is applying LLMs to assist and automate NOC workflows, enabling engineers to investigate incidents faster, interpret telemetry, and streamline troubleshooting runbooks. The session concludes with a look at how eBay’s network architecture is evolving to support AI workloads and…

Networking for AI Workloads: Making the Network AI-Aware – Marvell Semiconductor, Inc. Triple-T

by Guest Author

March 4, 2026

Networking plays a critical role in the accelerated compute fabric. Key features that enable high-performance networking for AI workloads are being defined and ratified in UEC and ESUN (OCP). Marvell contributes to both of these industry standards and has developed a full portfolio of scale-out and scale-up switches powered by SONiC to enable best-in-class AI fabrics.

From Latency to Learning: Adapting Campus Fabrics for LLM Inference via SONiC – Celestica Triple-T

by Guest Author

March 4, 2026

As campus networks transition from general connectivity to AI-ready infrastructure, the surge in LLM inference traffic demands a fundamental architectural shift. Localized AI applications require high throughput and ultra-low latency that legacy systems weren’t built to handle. This session explores how SONiC’s open-source framework provides the telemetry, load balancing, and agility needed to adapt campus fabrics for heavy inference workloads. Join us for a deep dive into the foundational capabilities of SONiC that turn bandwidth-heavy AI demands into seamless learning experiences.

From AI Ambition to AI Infrastructure: Your Blueprint to Success for Enterprise Networks – Cisco Triple-T

by Guest Author

March 4, 2026

We know most enterprises aren’t running tens of thousands of GPUs (yet!), but AI is quickly accelerating every industry. Before you can run AI, you need a network built to carry it. That means your network must be smart, secure, and ready to evolve. This session cuts through the complexity with a practical look at how enterprises can get AI infrastructure up and running without disrupting existing workloads. Learn what it takes to run both traditional and AI workloads seamlessly, with unified operations, automation,…

Reinventing AI Networking for Performance, Openness, and Scale – Upscale AI

by Guest Author

March 4, 2026

AI networking has become one of the industry’s most critical bottlenecks, limiting the speed, scalability, and efficiency of modern AI training and inference workloads. As clusters grow, operators require deterministic latency, extreme bandwidth, and open architectures that foster flexibility, interoperability, and rapid innovation. Upscale AI will explore how SONiC, an open-source network operating system, enables scalable and efficient AI networking. Upscale AI will also discuss practical strategies for adopting SONiC across the AI stack, and how cohesive, low-latency system design optimizes performance, accelerates deployment, and…

Making AI Multitenancy Measurable: A Practical Approach to Job Completion Time SLAs – Keysight Triple-T

by Guest Author

March 4, 2026

In multitenant AI clusters, bandwidth is not the SLA—Job Completion Time (JCT) is. In this joint session, NextHop.ai and Keysight present a real-world study of how load balancing modes, ECN thresholds, and congestion control tuning directly impact JCT in shared AI Ethernet fabrics. Using production-class switching and traffic emulation, we demonstrate how overly conservative settings can double completion times and waste GPU cycles—and how systematic tuning makes JCT predictable. Attendees will gain a practical framework for designing and validating AI networks around measurable performance outcomes,…

Securing Agentic AI: From Insight to Oversight – Netskope Triple-T

by Guest Author

March 4, 2026

AI has evolved from passive insights to active decision-making—drafting emails, generating code, and executing tasks across enterprise systems. As AI agents grow more powerful, they also introduce unpredictability, expanding attack surfaces and compliance risks. In this session, Netskope shows how to move beyond simply enabling AI to securing it at every touchpoint. We’ll outline a practical blueprint to protect AI agents end-to-end across users, applications, and data. Learn how to maintain visibility into agent behavior, apply advanced DLP and threat protection to reduce misuse, and…

AI Recommends, Automation Executes: A Practical Framework for GenAI + Deterministic Automation in Network Ops – Network to Code Triple-T

by Guest Author

March 4, 2026

GenAI is powerful in network operations—but it’s probabilistic by nature, meaning it produces “most likely” answers that can vary with small prompt changes, making it risky as a direct control mechanism for repeatable, auditable change workflows. In contrast, deterministic automation (RPA in ops contexts) is built for consistent execution: enforcing config standards, running verified OS upgrade workflows, and remediating drift with predictable outcomes. In this session, you’ll get a simple decision framework for when to use GenAI vs. deterministic automation, why trusted data (a network…

Distributed Computing @ Scale for AI Training & Inference – Main Stage Keynote

by Guest Author

March 3, 2026

As AI models continue to scale, both training and inference are growing rapidly in operational importance. Training pushes the limits of compute density and interconnect scale, while inference now dominates production workloads. Together, these forces are reshaping AI system architectures. Meeting these demands requires a next-generation networking fabric that can:
Scale up within and across a small number of racks to tightly couple XPUs for high-throughput training and low-latency inference
Scale out across entire data centers using flat, high-performance topologies that support large-scale training and…

Industry Keynote Presented by Riverbed

by Guest Author

March 2, 2026