Learn about the critical role of the cloud network in serving AI models for inference. This session provides an overview of the connectivity requirements, AI-specific network optimizations, and design patterns that enable AI inference with minimal latency and optimal use of GPU/TPU accelerators. We will discuss a reference architecture that delivers a unified multi-model inference endpoint with scale, resiliency, API governance, and responsible AI.
Victor Moreno specializes in Cloud Networking Solutions and Architectures at Google Cloud,
where he optimizes Google Cloud's networking products and architectures to enable customer
outcomes as they transition to the cloud and adopt AI. A data networking veteran with
over 25 years of experience, Victor has designed and implemented expansive data networks and
developed innovative networking technologies. His expertise spans diverse IT operations and
industries, from medium-sized businesses to global networks and data centers. He is also a
respected author of numerous books, papers, and standards specifications on network design,
architectures, and protocols.
Register now to receive exclusive access to ONUG content and updates.
Register Here