AI-Powered SD-WAN Operations and Analytics
Introduction
Imagine your network could predict a path failure before it affects a single user, automatically recommend a better route, and explain its reasoning in plain English. That is no longer a futuristic vision -- it is what AI SD-WAN delivers today. As organizations adopt multi-cloud architectures and SaaS applications and support users working from anywhere, wide-area networking has become far more complex. Network operations teams face relentless pressure to maintain visibility into traffic patterns, ensure application performance, and troubleshoot issues across distributed fabrics that span branches, campuses, data centers, and cloud environments.
Traditional reactive monitoring -- waiting for alarms, manually inspecting dashboards, and correlating data across siloed tools -- simply cannot keep up. The shift toward AI-powered networking transforms IT operations from a reactive posture into a predictive and proactive model. Catalyst SD-WAN now integrates analytics, machine learning forecasting, anomaly detection, predictive path recommendations, and even Large Language Model (LLM) based assistants directly into the SD-WAN Manager platform.
This article provides a comprehensive technical walkthrough of every AI and analytics capability available in the Catalyst SD-WAN ecosystem. You will learn how the SD-WAN Manager Overview Dashboard delivers operational insights at a glance, how Network-Wide Path Insights (NWPI) enables end-to-end troubleshooting, how Retrieval Augmented Generation (RAG) lets you chat with your SD-WAN fabric, and how AIOps features like predictive path recommendations, bandwidth forecasting, and anomaly detection work under the hood. We will also explore the REST API framework that makes all of this programmable and automatable. Whether you are preparing for a certification exam or managing a production SD-WAN deployment, the concepts and techniques covered here will sharpen your operational skills.
Why AI SD-WAN Analytics Matter for Modern Networks
The modern enterprise WAN is fundamentally different from what it was even five years ago. Three major shifts have created a perfect storm of operational complexity:
- Multi-cloud and SaaS adoption: Applications like Microsoft 365, Webex, Google Workspace, Salesforce, and others are consumed as cloud services. Traffic no longer flows predictably from branch to data center -- it breaks out locally to the internet, backhauls through hubs, or traverses SSE (Security Service Edge) nodes.
- Users working from anywhere: Teleworkers, remote users, branch offices, and SMB sites all connect through the SD-WAN fabric. Each user's traffic may take a completely different path depending on policies, circuit availability, and real-time conditions.
- Untrusted transport diversity: SD-WAN fabrics typically run as an overlay across MPLS, broadband internet, LTE/5G, and private circuits. Each transport has different performance characteristics that change throughout the day.
Network operations teams need extensive visibility into network and traffic patterns to manage this complexity. They must use digital and application experience monitoring along with historical trends to simplify IT operations. This is exactly where SD-WAN analytics steps in -- providing the data foundation that AI and ML models consume to deliver actionable intelligence.
The Analytics and Insights Portfolio
Catalyst SD-WAN delivers a comprehensive analytics portfolio that spans several categories:
| Capability | Description |
|---|---|
| Application Experience | Visibility into network and application performance across all sites |
| Historical Trends | Daily, weekly, and monthly aggregated data for trend analysis |
| Scheduled Reports | Automated reporting on application and network health |
| Traffic Flow Patterns | Understanding how traffic distributes across circuits and paths |
| End-to-End Path Visualization | Complete path tracing from source to destination |
| Troubleshooting | Extensive traffic analysis with deep drill-down capabilities |
| App Distribution Across Circuits | Visibility into which applications use which transports |
| Predictive Path Recommendations | AI-driven path optimization suggestions |
| SaaS Traffic Optimization | Intelligent routing for cloud application traffic |
| Bandwidth Forecasting | ML-based capacity planning and usage prediction |
| Anomaly Detection | Automated identification of unusual network patterns |
| Adaptive and Predictive Networking | Proactive network optimization based on learned behaviors |
Pro Tip: The Converged SD-WAN Manager Overview Dashboard (available from release 20.15) gives you a snapshot view of your overall application experience and network health -- including sites, tunnels, and circuits -- in a single landing page. Use it as your first stop for operational awareness before diving into deeper analytics.
How Does the SD-WAN Manager Overview Dashboard Work?
The SD-WAN Manager Overview Dashboard serves as the operational command center for Catalyst SD-WAN. Accessible as the landing page of the manager interface, this dashboard provides several critical capabilities:
- Snapshot view of overall application experience and network health, covering sites, tunnels, and circuits
- Top applications identification across the entire network at a glance
- Time-period comparison allowing you to compare current performance against previous periods
- Quick operational insights into application and network health without navigating through multiple screens
Because the Overview Dashboard is the landing page of the SD-WAN Manager, no extra navigation is needed to reach it. From here, network operators can quickly identify which applications are consuming the most bandwidth, which sites are experiencing degradation, and which tunnels or circuits require attention.
The dashboard consolidates data that would otherwise require visiting dozens of individual device pages, making it an essential starting point for day-2 operations. When combined with the deeper analytics tools described in the following sections, it creates a layered approach to network visibility -- from high-level health at a glance down to individual flow-level troubleshooting.
What Is Network-Wide Path Insights (NWPI) in AI SD-WAN?
Network-Wide Path Insights (NWPI) is one of the most powerful troubleshooting and visibility tools in the Catalyst SD-WAN platform. It serves as a trusted aid for SD-WAN operations, providing three core capabilities:
- True real-time application performance visibility -- see how applications are actually performing across the fabric, not just what the control plane reports
- Design validation with confidence -- verify that your SD-WAN design is working as intended by tracing actual traffic paths
- Simplified troubleshooting with Insight Readouts -- get human-readable explanations of what is happening to traffic, including integrations with ISE and ThousandEyes
How NWPI Works Under the Hood
The NWPI mechanism follows a well-defined sequence:
- The network operator creates a trace on the SD-WAN Manager, specifying the site, VPN, and filter criteria for the traffic of interest
- The SD-WAN Manager instructs the first router in the path to write NWPI metadata into the SD-WAN header of matching packets
- Subsequent routers in the path read the NWPI metadata and use it to send flow information back to the SD-WAN Manager
- The SD-WAN Manager correlates all the per-hop data into a single unified view, providing end-to-end visibility
This approach is elegant because it leverages the existing SD-WAN data plane encapsulation. The NWPI metadata travels alongside the original packet through the fabric, and each node along the path reports its local perspective -- including data policy decisions, QoS queueing behavior, DSCP marking, loss, delay, and jitter measurements.
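The correlation step can be pictured with a small sketch. It is only an illustration of grouping per-hop reports into one ordered path: the `HopReport` fields, the sample routers, and the metric values are all hypothetical and do not reflect the actual NWPI data format or SD-WAN Manager internals.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class HopReport:
    """One router's local view of a traced flow (hypothetical schema)."""
    trace_id: int
    flow_id: str
    hop_index: int   # position of the reporting router along the path
    router: str
    loss_pct: float
    delay_ms: float
    jitter_ms: float


def correlate(reports):
    """Group per-hop reports by (trace, flow) and order each group along the path."""
    flows = defaultdict(list)
    for report in reports:
        flows[(report.trace_id, report.flow_id)].append(report)
    return {key: sorted(hops, key=lambda h: h.hop_index) for key, hops in flows.items()}


def worst_hop(hops, metric="loss_pct"):
    """Return the hop reporting the highest value for the chosen metric."""
    return max(hops, key=lambda h: getattr(h, metric))


# Made-up reports, deliberately out of order, as they might arrive from different routers.
reports = [
    HopReport(1, "flow-a", 1, "Hub1", 0.0, 12.0, 1.5),
    HopReport(1, "flow-a", 0, "Branch1", 2.5, 4.0, 0.8),
    HopReport(1, "flow-a", 2, "DC1", 0.1, 3.0, 0.4),
]

paths = correlate(reports)
path = paths[(1, "flow-a")]
print(" -> ".join(hop.router for hop in path))
print("highest loss at:", worst_hop(path).router)
```

Keying on the trace and flow identifiers is what lets reports arriving independently from each router be stitched into a single end-to-end view.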
Starting an NWPI Trace
To initiate an NWPI trace, navigate to Tools > Network-Wide Path Insights on the SD-WAN Manager. From there, you define the trace parameters including the site, VPN, and any traffic filters you want to apply.
NWPI Insight Summary Views
Once a trace is running, NWPI provides several insight summary views that answer the fundamental operational questions:
| Question | Insight View |
|---|---|
| What is happening? | Overview and Event Insight |
| Who is affected? | User and application identification |
| Where is the issue? | Path-level and hop-level localization |
| When did it occur? | Timeline correlation and trend data |
| Why is it happening? | Root cause analysis with hyperlinked drill-downs |
Each insight view includes hyperlinks that help users quickly spot impacted flows in one click and drill down to a deeper understanding of the root cause.
App Performance Insight
The App Performance Insight readout provides application-centric topology and paths that are discovered by actual traffic of interest. For example, in a typical enterprise deployment, NWPI might reveal:
- Microsoft 365 traffic taking a local breakout from the branch directly to the SaaS cloud via internet (DIA -- Direct Internet Access)
- Amazon traffic being backhauled from a branch to a hub site via MPLS, then breaking out to the SaaS cloud via internet
- Real-time voice traffic flowing between branches through a hub site over MPLS
- Enterprise applications being load-balanced from a branch to multiple hub sites via both MPLS and internet, then forwarded toward the campus or data center via LAN
When performance issues are detected, the readout identifies the specific hop causing degradation. For instance, it might report "Poor performance on the hop from Hub1 to SaaS via internet -- high server network delay, score 3."
Pro Tip: NWPI does not just show you the path -- it shows you what happened to the traffic at each hop. If a firewall policy is dropping packets, NWPI will reveal the firewall drop along with the specific class-map responsible. This eliminates hours of manual troubleshooting.
NWPI Troubleshooting: A Real-World AI SD-WAN Scenario
To illustrate the power of NWPI, consider a real-world troubleshooting scenario. A user named Jack reports that he cannot access an application. Here is how NWPI enables rapid resolution:
- Identify the symptom: The NWPI Event Insight shows packet loss for Jack's traffic
- Locate the problem: NWPI pinpoints that the packet loss is occurring on a specific SD-WAN node -- Jack's traffic is being dropped at the branch router
- Determine the cause: Drilling into the flow-level insight reveals that Jack is accessing a cloud application and experiencing a local drop at the branch
- View policy and configuration: NWPI exposes the exact policy and configuration applied to the user's traffic, showing a firewall drop caused by a specific class-map rule
Without NWPI, this troubleshooting process might involve logging into multiple routers, running show commands, analyzing packet captures, and correlating timestamps across devices. With NWPI, the entire root-cause analysis happens in a single pane of glass within minutes.
NWPI Release Timeline and Feature Evolution
NWPI has evolved significantly across releases, with each version adding new capabilities:
| Release | Key Features |
|---|---|
| 17.4 / 20.4 | On-demand trace with basic filters; flow-level insight (path, DSCP, loss, delay, jitter); flow journey inside SD-WAN edge (data policy, queueing) |
| 17.6 / 20.6 | DNS domain discovery; advanced filters (ART, app visibility); app domain insight; flow-level insight advanced view; app trend and flow trend; intelligent readout for critical use cases |
| 17.9 / 20.9 | Insight summary with overview, app performance insight, event insight, QoS insight, flow-level path insight |
| 17.12 / 20.12 | Synthetic traffic for design validation; multiple VPNs trace support |
| 17.13 / 20.13 | NWPI and ISE integration; user ID grouping field; UX 2.0 global topology and NWPI integration; auto-on NWPI tasks for SLA violation and QoS congestion events |
| 17.14 / 20.14 | NWPI and ThousandEyes integration |
| 17.15 / 20.15 | NWPI support in SD-Routing mode |
| 17.16 / 20.16 | Packet capture replay |
The ISE integration (17.13/20.13) is particularly noteworthy because it allows NWPI to correlate network path data with user identity information from ISE, enabling truly user-centric troubleshooting. The ThousandEyes integration (17.14/20.14) extends visibility beyond the SD-WAN fabric to include internet and SaaS provider path segments.
Understanding AI, ML, Deep Learning, and Generative AI for SD-WAN
Before diving into the specific AIOps features, it is important to understand the AI technology stack that powers these capabilities. The hierarchy of AI technologies builds upon itself:
Artificial Intelligence (AI)
AI is a broad discipline that encompasses all aspects of machine learning and deep learning. It represents the overarching goal of creating systems that can perform tasks that would normally require human intelligence.
Machine Learning (ML)
Machine learning is a subset of AI that uses statistical models to learn from data to perform tasks without explicit programming. In the context of SD-WAN, ML models analyze historical telemetry data -- interface statistics, tunnel performance metrics, application response times -- to identify patterns and make predictions.
Deep Learning
Deep learning is a further subset that utilizes neural networks to model and interpret complex patterns in large datasets. The bandwidth forecasting capability in Catalyst SD-WAN, for example, uses neural networks as part of its forecasting model ensemble.
Generative AI
Generative AI represents the newest frontier -- AI that generates content. In the SD-WAN context, this means AI assistants that can answer questions in natural language, generate troubleshooting analysis, and provide remediation suggestions based on network data.
Large Language Models (LLMs)
LLMs are the engine behind generative AI capabilities. Key characteristics of LLMs include:
- They are designed to understand and generate content in human language
- They are trained on massive datasets through tokenization, vectorization, and fine-tuning
- They answer queries using natural language based on similarity search results
- They provide advanced features like tool calling and function calling, which enables them to interact with external systems like the SD-WAN Manager
Popular LLM frameworks and models relevant to SD-WAN AI operations include GPT (OpenAI), Gemini (Google), Llama 3 (Meta), and the Ollama framework. When selecting an LLM for network operations use cases, organizations must consider data privacy, cost, and specific use cases.
Pro Tip: Generic LLMs do not have access to real-time data, domain-specific data, or the latest data about your network. This limitation results in hallucinations -- confidently stated but incorrect answers. This is why Retrieval Augmented Generation (RAG) is essential for network operations AI.
How Does RAG Enable Chatting with Your AI SD-WAN Fabric?
One of the most exciting developments in AI-powered networking is the ability to have a natural language conversation with your SD-WAN fabric. This is made possible through Retrieval Augmented Generation (RAG), which solves the fundamental limitation of generic LLMs.
The Hallucination Problem
Generic LLMs lack access to three critical categories of information:
- Real-time data -- the current state of your network
- Domain-specific data -- the particular configuration and policies of your SD-WAN deployment
- Latest data -- recent changes, events, and performance metrics
Without access to this information, LLMs produce hallucinations -- responses that sound authoritative but are factually wrong. In network operations, acting on hallucinated troubleshooting advice could make problems worse.
How RAG Solves This
Retrieval Augmented Generation gives an LLM access to external sources of data. In the SD-WAN context, RAG works as follows:
- Data collection: Configuration and operational data is collected from the SD-WAN Manager using its REST APIs
- Data transformation: The raw API data is parsed through several processing steps including data transformation and embedding generation
- Vector storage: The processed data is stored in a vector database, which enables efficient similarity-based retrieval
- Query processing: When a user asks a question, the system retrieves matching information from the vector database and provides it to the LLM
- Response generation: The LLM uses the retrieved context to create high-confidence, factually grounded responses
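The retrieval half of this pipeline can be sketched in a few lines of plain Python. This is a toy, not a production design: the bag-of-words `embed` function stands in for a real embedding model, the in-memory `VectorStore` stands in for a vector database, the text chunks are made-up examples rather than real SD-WAN Manager output, and the final prompt would be handed to an LLM that is not shown here.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a term-frequency vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class VectorStore:
    """Minimal in-memory stand-in for a vector database."""

    def __init__(self):
        self._items = []

    def add(self, text):
        self._items.append((text, embed(text)))

    def search(self, query, k=2):
        query_vec = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(question, store, k=2):
    """Retrieve the k most similar chunks and place them in front of the question."""
    context = "\n".join(f"- {chunk}" for chunk in store.search(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Made-up chunks, as if already collected over REST and transformed into text.
store = VectorStore()
for chunk in [
    "Spoke1 tunnel to Hub1 over mpls: loss 0.1 percent, latency 20 ms",
    "Spoke2 tunnel to Hub1 over internet: loss 3.2 percent, latency 85 ms",
    "Centralized policy AAR-1 prefers mpls for voice traffic",
]:
    store.add(chunk)

question = "What is the loss on the Spoke2 internet tunnel?"
print(build_prompt(question, store, k=1))
```

Because the model only sees the retrieved chunks plus the question, its answer is grounded in data pulled from the fabric rather than in whatever it memorized during training.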
LangChain: The Integration Framework
LangChain serves as the orchestration framework that ties together the various components of a RAG-based SD-WAN AI assistant. Its key strengths include:
- Seamless integration of dataset imports, models, vector databases, and LLMs into a unified pipeline
- Modular design that enables easy swapping of applications, LLM models, and vector databases without rewriting the entire solution
- Streamlined development of RAG solutions, reducing the engineering effort required to build network-aware AI assistants
The NWPI Buddy: Chat with Your SD-WAN Fabric
The practical implementation of this technology is a system called the NWPI Buddy -- an AI assistant that can interact with your SD-WAN fabric through natural language. The architecture works as follows:
- A user asks a question such as "Can you help run a trace and analyze it?"
- The system uses an LLM with tool-calling capability (such as llama3-groq-tool-use) through LangChain to determine that it needs to interact with the SD-WAN Manager
- The LLM uses tool calling to start an NWPI trace on the SD-WAN Manager and collect the resulting insights
- A second LLM instance (such as llama3.3) through LangChain analyzes the collected NWPI insights
- The system provides summarized analysis and remediation suggestions based on the NWPI data
This two-model architecture is deliberate. The first model specializes in tool calling -- deciding which SD-WAN Manager APIs to invoke and in what sequence. The second model specializes in analysis -- interpreting the results and generating human-readable recommendations with suggested remedy actions.
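A rough sketch of that division of labor follows. Both models are replaced by stub functions so the control flow is visible without any LLM or network access; the tool names, their arguments, and the returned data are hypothetical and are not actual SD-WAN Manager API calls.

```python
import json


# Hypothetical tools. Real ones would wrap SD-WAN Manager REST API calls.
def start_trace(site_id, vpn):
    return {"trace_id": 101, "site_id": site_id, "vpn": vpn}


def get_insights(trace_id):
    return {"trace_id": trace_id, "events": [{"hop": "Branch1", "type": "firewall drop"}]}


TOOLS = {"start_trace": start_trace, "get_insights": get_insights}


def run_tool_call(call_json):
    """Execute one tool call that the tool-calling model emitted as JSON."""
    call = json.loads(call_json)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


def stub_planner(question, history):
    """Stand-in for model 1: decides which tool to call next, or None when done."""
    if not history:
        return json.dumps({"name": "start_trace", "arguments": {"site_id": 100, "vpn": 10}})
    if len(history) == 1:
        trace_id = history[0]["trace_id"]
        return json.dumps({"name": "get_insights", "arguments": {"trace_id": trace_id}})
    return None


def stub_analyst(insights):
    """Stand-in for model 2: turns collected insights into a readable summary."""
    events = ", ".join(f"{e['type']} at {e['hop']}" for e in insights["events"])
    return f"Trace {insights['trace_id']}: {events}"


def answer(question):
    history = []
    while (call := stub_planner(question, history)) is not None:
        history.append(run_tool_call(call))
    return stub_analyst(history[-1])


print(answer("Can you help run a trace and analyze it?"))
```

Keeping the tool registry separate from both models means the set of permitted actions is defined in ordinary code, so an unexpected tool name is rejected rather than executed.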
Pro Tip: The power of the NWPI Buddy approach is that it combines the structured, deterministic data from SD-WAN Manager APIs with the natural language understanding of LLMs. The API data ensures accuracy, while the LLM provides accessibility -- network operators do not need to memorize API endpoints or parse JSON responses manually.
AIOps in Catalyst SD-WAN: Simplifying Day-2 Operations
AIOps represents the convergence of all AI and ML capabilities into a unified operations framework. The goal is twofold:
- Mitigate issues before they impact users -- and when issues do occur, offer root cause analysis to reduce Mean Time to Resolution (MTTR)
- Optimize network and application performance to achieve higher operational efficiency
The AIOps portfolio in Catalyst SD-WAN consists of four major capabilities:
| AIOps Capability | Function | Status |
|---|---|---|
| Predictive Path Recommendations | AI-driven path optimization to improve application performance | Available |
| AI Assistant for Networking | Interactive LLM-based assistant for feature and operational queries | Beta |
| Bandwidth Forecasting | AI/ML-based capacity planning and usage prediction | Beta |
| Anomaly Detection | Automated detection of network anomalies across key KPIs | Beta |
How Do Predictive Path Recommendations Improve AI SD-WAN Performance?
Predictive Path Recommendations, powered by ThousandEyes WAN Insights, represent the most mature AIOps capability in Catalyst SD-WAN. This feature transforms IT operations from a reactive model to a truly predictive one.
The Three-Step Process
- Ingest Telemetry: The system collects network telemetry via SD-WAN Analytics. This includes performance metrics across all overlay tunnels and transports.
- Data Analysis: Predictive modeling algorithms forecast potential issues and generate path recommendations. The system identifies when a current path is likely to degrade and calculates the estimated performance gain from switching to an alternative path.
- Feedback Loop: Based on the recommendations, SD-WAN policies can be fine-tuned to make path changes that improve application experience. This creates a continuous improvement cycle.
Out-of-the-Box Application Groups
Predictive Path Recommendations comes with pre-configured application groups for common enterprise applications:
- Office 365
- Webex
- Google Workspace
- Salesforce
- GoTo Meeting
- Voice
These application groups work immediately without requiring custom configuration, enabling rapid time-to-value.
Licensing and Activation
A critical detail for deployment planning: Predictive Path Recommendations is included with the SD-WAN DNA Advantage+ license. Notably, a ThousandEyes Enterprise Agent is not necessary -- the capability works with embedded telemetry.
The activation process involves the TE-EMBED-WANI license:
- For brownfield (existing) customer accounts with DNA Advantage, the embed license SKU is pre-deposited in eligible accounts
- For greenfield (new) DNA Advantage deployments, the embed SKU auto-expands
Reviewing and Applying Recommendations
Once activated, operators can:
- Review recommendation summaries by application group -- seeing which apps would benefit from path changes
- Click on an application group to view detailed per-site recommendations across the network
- View path performance details between two sites, including graphs showing default and recommended path quality, color-coded lines reflecting network path quality, and line charts showing loss, latency, and jitter over these paths
For example, the system might recommend switching from load balancing between "private1" and "private2" paths to exclusively using "private2" for a specific application group at a specific site, because the predictive model forecasts better performance on the dedicated path.
Closed-Loop Automation
The most advanced aspect of Predictive Path Recommendations is closed-loop automation, which automates the process of applying recommendations:
- When a recommendation is applied via closed-loop automation, the workflow creates a copy of the AAR (Application-Aware Routing) policy and a new site list with the site-id of the corresponding site
- It modifies the AAR policy sequence corresponding to the application and applies it to the site where the recommendation should take effect
- If a user makes changes to the centralized policy after deploying recommendations, this triggers a revert action -- the system removes all applied recommendations and reverts the centralized policy to its state before the first recommendation was applied
- After user changes to the centralized policy are complete, you can use the bulk option to select multiple recommendations and re-apply them all at once
Pro Tip: The revert behavior on centralized policy changes is a critical safety mechanism. It ensures that manual policy changes by administrators are never accidentally overridden by automated recommendations. Always complete your manual policy changes first, then re-apply recommendations in bulk.
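The apply, revert, and bulk re-apply cycle can be modeled as a small state machine. The sketch below is a deliberately simplified illustration: it represents the centralized policy as a dictionary of site to preferred path and does not model AAR policy copies, site lists, or any SD-WAN Manager API. The class name and sample values are hypothetical.

```python
import copy


class ClosedLoopPolicy:
    """Toy model of apply / revert-on-manual-change / bulk re-apply."""

    def __init__(self, policy):
        self.policy = policy      # active policy: {site: preferred path}
        self._baseline = None     # snapshot taken before the first recommendation
        self.applied = []         # recommendations currently in effect

    def apply(self, site, path):
        """Apply one recommendation, snapshotting the policy the first time."""
        if self._baseline is None:
            self._baseline = copy.deepcopy(self.policy)
        self.policy[site] = path
        self.applied.append((site, path))

    def manual_change(self, site, path):
        """A user edit first reverts every recommendation, then takes effect."""
        reverted = self.applied
        if self._baseline is not None:
            self.policy = self._baseline
        self._baseline, self.applied = None, []
        self.policy[site] = path
        return reverted           # candidates for a later bulk re-apply

    def bulk_apply(self, recommendations):
        for site, path in recommendations:
            self.apply(site, path)


fabric = ClosedLoopPolicy({"site-100": "private1+private2", "site-200": "mpls"})
fabric.apply("site-100", "private2")
pending = fabric.manual_change("site-200", "biz-internet")
print(fabric.policy)    # recommendation removed, manual change kept
fabric.bulk_apply(pending)
print(fabric.policy)    # recommendation back on top of the manual change
```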
AI Assistant for Networking: LLM-Powered SD-WAN Operations
The AI Assistant for Networking brings interactive, LLM-based intelligence directly into the SD-WAN Manager interface. Currently in beta, this capability requires connectivity to the cloud and serves two primary use cases:
Use Case 1: Knowledge Fetch
The AI Assistant can help with feature-related user queries by fetching information from documentation. Instead of searching through lengthy configuration guides and release notes, operators can ask natural language questions about SD-WAN features and receive contextual answers.
Use Case 2: Network Operational Queries
The assistant also handles operational queries such as health information about the SD-WAN fabric. Operators can ask about the status of specific sites, tunnels, or applications and receive real-time responses based on actual network state.
This dual capability -- documentation knowledge plus operational awareness -- makes the AI Assistant a powerful tool for both learning and day-to-day operations.
How Does SD-WAN Bandwidth Forecasting Use Machine Learning?
Bandwidth forecasting helps organizations with capacity planning by identifying growth trends, visualizing seasonality and surges in circuit usage, and predicting future bandwidth requirements. This capability, currently in beta, addresses one of the most challenging aspects of network planning -- accurately predicting when circuits will reach capacity.
Key Capabilities
- Track historical usage of individual circuits and forecast future usage for the top 50 circuits in the deployment
- Provide forecasts for several weeks into the future
- Compare historical and forecasted bandwidth usage side by side for informed decision-making
Technical Implementation
The bandwidth forecasting engine uses a sophisticated approach combining multiple techniques:
- Statistical models and machine learning techniques, including neural networks, are used to generate forecasts
- Historical interface statistics serve as the training data
- The system evaluates multiple forecasting models and selects the most suitable models based on MAPE (Mean Absolute Percentage Error) scores -- ensuring the most accurate model is used for each circuit
- The system generates a 3-month forecast horizon based on the latest 52 weeks of interface statistics data for the top 50 circuits
- As new data becomes available, the forecast horizon is continuously updated, refining models and improving long-term predictions
The use of MAPE scores for model selection is particularly noteworthy. Rather than relying on a single forecasting algorithm, the system maintains an ensemble of models and automatically selects the one with the lowest prediction error for each individual circuit. This adaptive approach accounts for the fact that different circuits may exhibit different traffic patterns -- some may be highly seasonal, others may show steady linear growth, and others may have unpredictable spikes.
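To make the selection step concrete, here is a minimal sketch of MAPE-based model selection on a holdout window. The three candidate models (last value, mean, and a least-squares linear trend) are simple stand-ins chosen for illustration; they are not the models Catalyst SD-WAN actually uses, and the weekly series is made up.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error in percent; skips points where actual is 0."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to score against")
    return 100.0 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)


def naive_last(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def mean_model(history, horizon):
    """Repeat the historical mean."""
    return [sum(history) / len(history)] * horizon


def linear_trend(history, horizon):
    """Extrapolate an ordinary least-squares line fitted to the history."""
    n = len(history)
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(history)) / sum(
        (x - x_mean) ** 2 for x in range(n)
    )
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n + i) for i in range(horizon)]


MODELS = {"naive_last": naive_last, "mean": mean_model, "linear_trend": linear_trend}


def select_model(series, holdout=4):
    """Fit each model on the early data and score it on the last `holdout` points."""
    train, test = series[:-holdout], series[-holdout:]
    scores = {name: mape(test, model(train, holdout)) for name, model in MODELS.items()}
    return min(scores, key=scores.get), scores


# Made-up weekly peak usage (Mbps) for one circuit with steady linear growth.
weekly_mbps = [100, 104, 108, 112, 116, 120, 124, 128, 132, 136, 140, 144]
best, scores = select_model(weekly_mbps)
print(best, {name: round(score, 2) for name, score in scores.items()})
```

Running the same selection independently per circuit is what lets a seasonal circuit and a steadily growing one end up with different models.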
SD-WAN Anomaly Detection: AI-Driven Network Health Monitoring
Anomaly detection, also currently in beta, provides automated identification of unusual patterns in network behavior. Its benefits include:
- Early problem detection: Identify unusual patterns before they manifest into larger issues that affect users
- Network performance optimization: Fine-tune your network based on detected anomalies to improve end-user experience
- Tunnel anomaly identification: Identify tunnels with anomalies across key network KPIs including loss, latency, and jitter
- Impact radius determination: Assess the scope of an anomaly based on site count, usage levels, and application count
- Chronic issue identification: Determine if an anomaly is a one-time event or a chronic issue by viewing trend information
The impact radius assessment is particularly valuable for prioritizing responses. An anomaly affecting a single tunnel at a low-usage site with few applications is far less urgent than one affecting multiple tunnels at a high-traffic site serving critical applications. The anomaly detection system quantifies this impact to help operations teams triage effectively.
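One way to picture this kind of triage is a simple ranking over the three impact dimensions. The scoring below is purely an assumption made for illustration (each dimension is normalized to the largest observed value and the three are summed with equal weight); the source does not describe how the product actually quantifies impact radius, and the sample anomalies are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anomaly:
    """Hypothetical record for one tunnel anomaly."""
    tunnel: str
    site_count: int
    usage_gb: float
    app_count: int


def impact_scores(anomalies):
    """Score each anomaly from 0 to 3 by normalizing each dimension to its maximum."""
    if not anomalies:
        return {}
    max_sites = max(a.site_count for a in anomalies) or 1
    max_usage = max(a.usage_gb for a in anomalies) or 1
    max_apps = max(a.app_count for a in anomalies) or 1
    return {
        a.tunnel: a.site_count / max_sites + a.usage_gb / max_usage + a.app_count / max_apps
        for a in anomalies
    }


def triage(anomalies):
    """Order anomalies from widest to narrowest impact radius."""
    scores = impact_scores(anomalies)
    return sorted(anomalies, key=lambda a: scores[a.tunnel], reverse=True)


anomalies = [
    Anomaly("spoke1-hub1", site_count=1, usage_gb=5.0, app_count=3),
    Anomaly("hub1-hub2", site_count=8, usage_gb=120.0, app_count=40),
    Anomaly("spoke3-hub2", site_count=4, usage_gb=60.0, app_count=10),
]
for anomaly in triage(anomalies):
    print(anomaly.tunnel)
```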
SD-WAN APIs: The Programmability Foundation for AI-Powered Networking
All of the AI and analytics capabilities discussed so far rely on a robust API framework. The SD-WAN Manager exposes a REST API; REST is a stateless, client-server, cacheable architectural style, here carried over HTTPS. The manager also exposes webhooks for real-time event-driven integration with third-party applications. For device-level configuration, the manager uses NETCONF to configure and manage edge devices.
API Documentation and Testing
The SD-WAN Manager includes built-in API documentation accessible at:
https://<vmanage-ip:port>/apidocs
This Swagger-based interface allows you to explore available APIs, understand their parameters, and execute test API calls directly from the browser.
URI Structure
Understanding the SD-WAN Manager API URI structure is essential for programmatic access:
https://vmanage-ip:port/dataservice/device/bfd/state/device?deviceId=1.1.1.7
Breaking this down:
| Component | Example | Purpose |
|---|---|---|
| Protocol | https:// | Encrypted transport between client and server |
| Server/Host | vmanage-ip:port | Resolves to the IP and port of the SD-WAN Manager |
| Resource (URI) | /dataservice/device/bfd/state/device | The location of the data or object of interest |
| Parameters | ?deviceId=1.1.1.7 | Details to scope, filter, or clarify a request (often optional) |
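
The same breakdown can be reproduced with Python's standard `urllib.parse` module, which is handy when building or validating request URLs in scripts. The hostname `vmanage.example.com` and port `8443` are placeholders assumed for this sketch; only the resource path and the `deviceId` parameter come from the example above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def build_api_url(host, port, resource, **params):
    """Assemble an HTTPS URL from host, port, resource path, and query parameters."""
    return urlunsplit(("https", f"{host}:{port}", resource, urlencode(params), ""))


def split_api_url(url):
    """Break a URL into the components listed in the table above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "resource": parts.path,
        "parameters": parse_qs(parts.query),
    }


# Placeholder host and port; substitute the address of your own SD-WAN Manager.
url = build_api_url(
    "vmanage.example.com", 8443, "/dataservice/device/bfd/state/device", deviceId="1.1.1.7"
)
print(url)
print(split_api_url(url))
```

Using `urlencode` rather than string concatenation keeps parameter values correctly escaped if they ever contain spaces or reserved characters.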
API Categories
SD-WAN Manager APIs are organized into logical resource collections:
| API Category | Use Cases |
|---|---|
| Device Action | Trigger actions on managed devices |
| Device Inventory | Retrieve device information and status |
| Configuration | Manage templates and device configurations |
| Certificate Management | Handle device and controller certificates |
| Monitoring | Access health and performance data |
| Real-Time Monitoring | Get live operational data from devices |
| Multi Cloud | Manage cloud connectivity and integrations |
| Administration | Platform management and user administration |
Practical API Use Cases
The SD-WAN analytics APIs support several practical automation scenarios:
- Device and Monitoring APIs: Retrieve inventory data from the SD-WAN Manager including control connections, OMP peers, BFD sessions, and system status of WAN edge routers using CLI-based Python application scripts
- Configuration APIs: Retrieve lists of templates and policies, activate or deactivate a policy, and edit preferred color in specific Application-Aware Routing policy sequences
- App Route Statistics APIs: Use aggregation query APIs to retrieve Application-Aware Routing statistics (BFD statistics) for overlay tunnels and create reports of average latency, loss, jitter, and vQoE scores
- Alarms APIs: Use simple query APIs to retrieve alarms of specific categories and get detailed information about the consumed events that created specific alarms
- Webhooks: Enable webhooks on the SD-WAN Manager and write API routes to consume data sent from the SD-WAN Manager to a webhook server for real-time event processing
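As an illustration of the reporting side of the App Route Statistics use case, the sketch below averages latency, loss, and jitter per tunnel from records that have already been retrieved. The record keys (`local`, `remote`, `color`, and the metric names) and the sample values are assumptions made for this example; they are not the actual response schema of the SD-WAN Manager API, and no API call is made.

```python
from collections import defaultdict
from statistics import mean

METRICS = ("latency", "loss", "jitter")


def summarize(records):
    """Average each metric per tunnel, keyed by (local, remote, color)."""
    by_tunnel = defaultdict(list)
    for record in records:
        by_tunnel[(record["local"], record["remote"], record["color"])].append(record)
    return {
        tunnel: {metric: round(mean(row[metric] for row in rows), 2) for metric in METRICS}
        for tunnel, rows in by_tunnel.items()
    }


# Made-up samples standing in for statistics already fetched over REST.
records = [
    {"local": "10.0.0.3", "remote": "10.0.0.1", "color": "mpls", "latency": 20.0, "loss": 0.0, "jitter": 2.0},
    {"local": "10.0.0.3", "remote": "10.0.0.1", "color": "mpls", "latency": 30.0, "loss": 1.0, "jitter": 4.0},
    {"local": "10.0.0.3", "remote": "10.0.0.1", "color": "biz-internet", "latency": 80.0, "loss": 3.0, "jitter": 9.0},
]

for tunnel, averages in summarize(records).items():
    print(tunnel, averages)
```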
Lab Topology Example
A typical SD-WAN API lab environment includes:
| Node | System-IP |
|---|---|
| SD-WAN Manager | 100.0.0.1/32 |
| Controller | 100.0.0.101/32 |
| Validator | 100.0.0.101/32 |
| Hub1-onprem | 10.0.0.1/32 |
| Hub2-SIG | 10.0.0.2/32 |
| Spoke1 | 10.0.0.3/32 |
| Spoke2 | 10.0.0.4/32 |
| Spoke3 | 10.0.0.5/32 |
| Spoke4 | 10.0.0.4/32 |
The lab uses service-side subnets of 172.16.1.0/24 and 172.16.2.0/24, with connectivity over both MPLS and internet transports.
Pro Tip: The SD-WAN Manager API is the enabler that makes RAG-based AI assistants possible. When an LLM uses tool calling to interact with your SD-WAN fabric, it is invoking these same REST APIs under the hood. Understanding the API structure helps you build custom AI integrations tailored to your operational workflows.
Frequently Asked Questions
What is the difference between NWPI and traditional SD-WAN monitoring?
Traditional SD-WAN monitoring relies on per-device metrics and control-plane data. NWPI goes further by injecting metadata into the actual SD-WAN data plane header, which means every router in the path reports what happened to the traffic from its local perspective. This provides true end-to-end, flow-level visibility including data policy decisions, QoS queueing behavior, DSCP marking, and per-hop loss, delay, and jitter. Traditional monitoring might tell you a tunnel has high latency; NWPI tells you exactly which hop is causing it and why.
Do I need a ThousandEyes license for Predictive Path Recommendations?
No. Predictive Path Recommendations is included with the SD-WAN DNA Advantage license, and a ThousandEyes Enterprise Agent is not required. The TE-EMBED-WANI license is pre-deposited in eligible brownfield customer accounts with DNA Advantage, and auto-expands for greenfield DNA Advantage deployments. However, if you also deploy ThousandEyes agents, the NWPI integration (available from release 17.14/20.14) provides additional visibility into internet and SaaS provider path segments.
How far into the future can bandwidth forecasting predict?
Bandwidth forecasting produces a forecast with a 3-month horizon, based on the latest 52 weeks of interface statistics data for the top 50 circuits in your deployment. As new data becomes available, the forecast is continuously updated, and the system refines its models to improve long-term prediction accuracy. The system evaluates multiple statistical and ML models, including neural networks, and selects the most accurate model for each circuit based on MAPE (Mean Absolute Percentage Error) scores.
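MAPE is the mean of |actual - forecast| / |actual| over the evaluation points, expressed as a percentage, so a lower score means a better fit. The sketch below shows how per-circuit model selection by MAPE could work in principle; the model names and numbers are made up, and it is not a description of the product's internal implementation.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent.

    Points where the actual value is zero are skipped, because the
    percentage error is undefined there.
    """
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to score against")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def select_best_model(actual, candidate_forecasts):
    """Return (model_name, score) for the candidate with the lowest MAPE."""
    scores = {name: mape(actual, fc) for name, fc in candidate_forecasts.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Hypothetical held-out utilization for one circuit (Mbps) and two candidates
actual = [100.0, 200.0, 400.0]
candidates = {
    "model_a": [110.0, 180.0, 400.0],
    "model_b": [90.0, 260.0, 300.0],
}

best_name, best_score = select_best_model(actual, candidates)
print(best_name, round(best_score, 2))
```

Here model_a has percentage errors of 10%, 10%, and 0%, which average to about 6.67%, while model_b averages about 21.67%, so model_a is selected for this circuit.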
Can the AI Assistant for Networking make changes to my network?
Not directly, based on the scope described here. The AI Assistant for Networking, currently in beta, serves two primary use cases: answering feature-related questions by fetching documentation, and responding to operational queries about network health. It requires connectivity to the cloud. The NWPI Buddy concept demonstrates a more advanced use case in which an LLM uses tool calling to start NWPI traces and collect insights, then provides analysis and remediation suggestions. Applying changes through closed-loop automation, however, is a separate, controlled process under Predictive Path Recommendations that includes safety mechanisms like automatic revert on policy changes.
What network KPIs does anomaly detection monitor?
The anomaly detection capability identifies tunnels with anomalies across three key network KPIs: loss, latency, and jitter. Beyond individual metrics, it also determines the impact radius of each anomaly based on site count, usage levels, and application count. It can further identify whether an anomaly is a one-time event or a chronic issue by analyzing trend information, helping operations teams prioritize their response efforts.
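One way to picture impact-based prioritization is a simple ranking over detected anomalies. The source does not describe how the product actually scores impact radius or decides what counts as chronic, so the weights, the threshold, the field names, and the sample data below are purely illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class TunnelAnomaly:
    tunnel: str
    kpi: str              # "loss", "latency", or "jitter"
    sites_affected: int
    apps_affected: int
    usage_mbps: float
    days_seen: int        # days in the lookback window showing this anomaly


def impact_score(a):
    """Illustrative weighted sum of site count, application count, and usage."""
    return 3.0 * a.sites_affected + 2.0 * a.apps_affected + 0.1 * a.usage_mbps


def classify(a, chronic_after_days=3):
    """Label an anomaly chronic if it recurs on enough days (assumed threshold)."""
    return "chronic" if a.days_seen >= chronic_after_days else "one-time"


# Hypothetical anomalies
anomalies = [
    TunnelAnomaly("hub1-spoke1", "latency", 1, 2, 50.0, 1),
    TunnelAnomaly("hub1-spoke2", "loss", 4, 10, 200.0, 5),
    TunnelAnomaly("hub2-spoke3", "jitter", 2, 3, 20.0, 3),
]

# Highest-impact anomalies first
ranked = sorted(anomalies, key=impact_score, reverse=True)
for a in ranked:
    print(a.tunnel, a.kpi, classify(a))
```

With this sample data the loss anomaly on hub1-spoke2 ranks first because it touches the most sites and applications, and it is also labeled chronic.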
What types of APIs does the SD-WAN Manager expose?
The SD-WAN Manager exposes REST APIs organized into categories including Device Action, Device Inventory, Configuration, Certificate Management, Monitoring, Real-Time Monitoring, Multi Cloud, and Administration. It also supports webhooks for event-driven integration with third-party systems. For device-level configuration management, it uses NETCONF. All API documentation is built into the manager and accessible at the /apidocs endpoint, which provides a Swagger UI for exploring and testing API calls.
Conclusion
The integration of AI and machine learning into Catalyst SD-WAN represents a fundamental shift in how wide-area networks are operated. From the Overview Dashboard that provides instant operational awareness, through NWPI's deep flow-level troubleshooting, to the AIOps suite of predictive path recommendations, bandwidth forecasting, anomaly detection, and LLM-powered AI assistants -- every layer of the platform is becoming more intelligent and more automated.
The key takeaways from this deep dive into AI SD-WAN operations and analytics:
- NWPI provides unmatched end-to-end visibility by embedding metadata in the SD-WAN data plane, enabling flow-level troubleshooting that answers the what, who, where, when, and why of every network issue
- RAG and LangChain make it possible to build AI assistants that combine the accuracy of SD-WAN Manager API data with the accessibility of natural language interaction
- Predictive Path Recommendations transform operations from reactive to predictive, with closed-loop automation that can apply path optimizations while maintaining safety through automatic revert mechanisms
- Bandwidth Forecasting evaluates multiple statistical and ML models and automatically selects the best one per circuit based on MAPE scores to deliver accurate 3-month capacity predictions
- Anomaly Detection quantifies impact radius to help teams prioritize responses based on affected sites, usage, and applications
- SD-WAN REST APIs form the programmable foundation that enables all AI-driven capabilities, from automated data collection to LLM tool calling
As these capabilities mature from beta to general availability, the role of the network engineer will continue to evolve -- from manually configuring and troubleshooting individual devices to orchestrating intelligent, self-optimizing networks. Mastering these AI-powered tools now positions you at the forefront of next-generation network operations.