Run Your SD-WAN Deployment with Network AI

SD-WAN is seemingly taking the enterprise world by storm. The market is big and still growing, with room for at least five major vendors doing a healthy business. And it’s easy to see why. SD-WAN eases a lot of headaches in running a large enterprise WAN. The reliability, redundancy, and security benefits are real.

I was tempted to say that SD-WAN “solves” a lot of WAN headaches, and SD-WAN vendors would have you believe exactly that: that it’s a set-and-forget kind of technology. Anyone who has spent time running a network knows this to be false. Networks are complex systems where change is constant. SD-WAN isn’t magic, and SD-WAN deployments require monitoring, maintenance, and troubleshooting just like any other network technology.

Let’s talk about two specific real-world problems that we’ve seen in customer deployments, though these certainly aren’t the only issues that operations teams might encounter.

  1. Persistent underlay performance problems. Service provider networks can of course suffer outages, loss, and latency, at a single site or across an entire region. If one leg of your underlay connectivity is continuously compromised, you have lost the redundancy that made the deployment reliable in the first place. Worse, SD-WAN vendors’ built-in tooling provides little in the way of automated analysis to detect these issues.
  2. Control plane instability. The control plane is critical to SD-WAN reliability. Control plane state changes on individual devices may be just noise, or they could indicate long-term instability or impending failure. Most SD-WAN products either provide no deep visibility into the health of the control plane or only let you view logs on a per-device basis. Either way, spotting impending issues by hand across a large deployment isn’t feasible.
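
To make the second problem concrete, here is a minimal sketch of the kind of fleet-wide view that per-device log viewers don’t give you. It assumes control plane state changes have already been parsed out of syslog into `(timestamp, device, state)` tuples; that tuple format, the function names, and the `min_flaps` cutoff are all hypothetical, chosen for illustration rather than taken from any vendor’s product.

```python
from collections import Counter

def flap_counts(events):
    """Count up -> down transitions per device.

    `events` is an iterable of (timestamp, device, state) tuples, where
    state is "up" or "down". This input format is an assumption made for
    the sketch, not a real vendor log format.
    """
    last_state = {}
    flaps = Counter()
    for _, device, state in sorted(events):  # process in time order
        if last_state.get(device) == "up" and state == "down":
            flaps[device] += 1
        last_state[device] = state
    return flaps

def unstable_devices(events, min_flaps=3):
    """Devices whose control plane dropped at least `min_flaps` times."""
    counts = flap_counts(events)
    return sorted(d for d, n in counts.items() if n >= min_flaps)
```

Even a simple count like this separates a device that dropped once from one that is flapping repeatedly. What it can’t do is tell you what is normal for your network, and the fixed `min_flaps` cutoff is exactly the kind of hand-tuned threshold that stops scaling as the deployment grows.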

As mentioned above, the tooling and visibility built into SD-WAN products are not really up to the task of detecting and resolving these problems. Many SD-WAN products provide graphs for a wide range of metrics, but without automated analysis and detection, there are simply too many graphs and metrics to have any hope of discovering problems that way.

SD-WAN products do often provide good telemetry streams, however. Syslog and flow data, in IPFIX or proprietary SD-WAN vendor formats, can be the basis for powerful detection capabilities, and that’s where Augtera’s Network AI comes in. By consuming this and other telemetry, Augtera can autonomously build ML-based models of network behavior and find operationally relevant anomalies that indicate trouble brewing in the SD-WAN deployment.

Of course, there are many systems that can consume network data and provide “alerts,” but none have really passed the “operationally relevant” bar. As an example, setting global static thresholds for latency or loss will produce a noisy stream of irrelevant alerts, or miss events that have a real impact on application performance, or both at the same time. Real-time machine learning has a huge advantage over past approaches in producing meaningful notifications for operations teams.
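
The static-threshold problem is easy to demonstrate. The sketch below compares a single global latency threshold with a simple per-site baseline built from a trailing median and median absolute deviation (MAD). To be clear, this is a deliberately basic statistical stand-in used for illustration, not Augtera’s models; the function names, window size, and multiplier are assumptions.

```python
from statistics import median

def static_alerts(samples, threshold_ms):
    """Indices where latency exceeds one global threshold."""
    return [i for i, x in enumerate(samples) if x > threshold_ms]

def baseline_alerts(samples, window=10, k=5.0, min_mad=0.5):
    """Indices where a sample deviates sharply from its own recent history.

    The baseline is the median of the previous `window` samples, and the
    allowed deviation is k times their median absolute deviation (MAD).
    `min_mad` keeps a perfectly flat history from flagging tiny changes.
    """
    alerts = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        base = median(history)
        mad = max(median([abs(x - base) for x in history]), min_mad)
        if abs(samples[i] - base) > k * mad:
            alerts.append(i)
    return alerts

# Hypothetical latency samples in milliseconds.
nearby_site = [10, 11, 10, 12, 11, 10, 11, 12, 10, 11, 40, 11, 10]
remote_site = [148, 152, 150, 151, 149, 150, 152, 148, 151, 150, 151, 149]
```

With a global 100 ms threshold, `static_alerts` flags every sample from the remote site, where roughly 150 ms is simply normal, and says nothing about the nearby site’s jump from about 11 ms to 40 ms. `baseline_alerts` does the opposite: it stays quiet on the remote site and flags only the spike at the nearby one. That is noise and a miss at the same time, from the same threshold.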

Whether you’re considering SD-WAN or have already deployed it, know the benefits, but don’t fall victim to the fallacy that it’s magic. For true success, your network management strategy and technology must evolve as well.

To learn more, watch the webinar “SD-WAN: What if It Breaks or Underperforms?”

Author's Bio

Jim Meehan