Keeping Your AI Network Healthy – A Compute Driven Health-Check Design Proposal – Celestica Triple-T

Know the health of your AI GPU cluster. Given the expected trajectory of complex AI compute systems, it is imperative to know the health of your system. It is equally imperative that this responsibility not be handled in isolation by the network elements alone. Join this session to learn about new compute-triggered telemetry concepts and health management for future AI clusters.

Speakers:

Paul Lysander is on the product management team in the Hardware Platform Solutions group at Celestica. His primary responsibility is the SONiC platform. Before this role, he held various positions on product teams responsible for network virtualization and enterprise networking.

Avinash Natarajan is part of the engineering team in the Hardware Platform Solutions group at Celestica. His primary focus areas are networking, compute, and orchestration. Before this, he held multiple roles on data-networking-centric product teams, much of that time with Force10 Networks/Dell Networking.
