Lesson 1 of 5

NSO Architecture

Objective

Understand the architecture of a Network Services Orchestrator (NSO) deployment: the Configuration Database (CDB), Device Manager, Service Manager, and Network Element Drivers (NEDs). You will validate basic northbound/southbound connectivity and run example scripts that query NSO and a managed device. This matters in production because NSO is often used as the single source of truth and transaction manager for multi-vendor device configuration changes — for example, orchestrating an OS upgrade across a set of routers while ensuring atomic rollback on failure.

Real-world scenario: In a service provider PoP or enterprise data center, NSO sits on a management network and manages many devices via NETCONF/RESTCONF or CLI NEDs. When you trigger a service change (VPN, VLAN, software upgrade), NSO writes to the CDB, maps the service to device configs via service templates and NEDs, and pushes changes to devices in a transactional way so partial failures can be rolled back.


Topology & Device Table

Network Topology Diagram

Key Concepts (theory + protocol behavior)

  • Configuration Database (CDB) — The CDB is NSO’s authoritative, transactional data store. When a service is instantiated, NSO writes the intended configuration to the CDB first. This enables atomic transactions: either the whole multi-device change is committed, or NSO rolls back to the previous state on failure. Think of the CDB like a bank ledger — you stage a multi-account transfer, then commit or abort.

  • Service Manager — The service manager defines high-level services (for example: L3VPN, VLAN, OS upgrade) and maps them to device-level configuration templates. When you create a service instance, the service manager generates device-specific config fragments that are written to the CDB and then pushed to devices.

  • Device Manager & NEDs (Network Element Drivers) — The device manager handles communication to each device using a NED, which is a device-specific adapter that knows how to translate the abstract configuration into device-specific CLI, NETCONF/YANG, or RESTCONF operations. NEDs are responsible for device semantics and for learning device state when reconciling with the CDB.

  • Southbound protocols (NETCONF, RESTCONF, CLI exec) — NSO manages devices over NETCONF or RESTCONF where possible (YANG-modeled data encoded as XML or JSON, carried over SSH or HTTPS). For devices that expose only a CLI, NSO uses a CLI NED to run commands (the slides refer to an “exec any” feature where NSO can execute CLI commands on the device via the NED). Protocol-level detail: when using NETCONF, NSO performs <get-config> and <edit-config> operations and relies on candidate/commit semantics if the device supports them.

  • State synchronization (reconciliation) — NSO syncs device configuration into the CDB on demand, and the sync can also be scheduled to run periodically. This allows NSO to detect configuration drift (changes made on a device outside NSO) and provides accurate input for service generation. Reconciliation is performed by the device manager, which uses the NED to query device configuration and operational state.

Analogy: Think of NSO as a bank's central transaction engine. The CDB is the ledger; the service manager is a teller translating customer requests into ledger entries; NEDs are the interfaces to different branch systems (each branch has different systems, i.e., different vendor devices).
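
The division of labor between the service manager and the NEDs can be sketched in a few lines of Python. This is a toy model, not NSO code: the VlanService class, the driver functions, and the rendered CLI and XML syntax are all hypothetical and exist only to show one service intent being expanded per device and then rendered differently depending on the driver type.

```python
from dataclasses import dataclass

@dataclass
class VlanService:
    """Hypothetical high-level intent: one VLAN on a list of devices."""
    vlan_id: int
    name: str
    devices: list

def expand_service(svc):
    """Service-manager role: turn intent into one abstract fragment per device."""
    return {dev: {"vlan": {"id": svc.vlan_id, "name": svc.name}} for dev in svc.devices}

def cli_driver(fragment):
    """CLI-driver role: render the fragment as command lines (made-up syntax)."""
    vlan = fragment["vlan"]
    return [f"vlan {vlan['id']}", f" name {vlan['name']}"]

def netconf_driver(fragment):
    """NETCONF-driver role: render the fragment as XML (made-up schema)."""
    vlan = fragment["vlan"]
    return f"<vlan><id>{vlan['id']}</id><name>{vlan['name']}</name></vlan>"

# Toy driver table standing in for NEDs; keys mirror the device types used in this lesson.
DRIVERS = {"cli-ned": cli_driver, "netconf-ned": netconf_driver}

def plan_changes(svc, inventory):
    """Expand the service, then render each fragment with that device's driver."""
    fragments = expand_service(svc)
    return {dev: DRIVERS[inventory[dev]](frag) for dev, frag in fragments.items()}

inventory = {"XR-R1": "cli-ned", "XR-R2": "netconf-ned"}
plan = plan_changes(VlanService(100, "LAB", ["XR-R1", "XR-R2"]), inventory)
for dev, rendered in plan.items():
    print(dev, "->", rendered)
```

The point of keeping expand_service separate from the drivers is that the service definition never mentions vendor syntax; only the driver layer does.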


Step-by-step configuration

Each step below contains the exact commands, explanation, why it matters, and verification with expected output.

Step 1: Verify management-plane connectivity from NSO to devices

What we are doing: Confirm the NSO server can reach each managed device on the management network. If management connectivity fails, NSO cannot perform NETCONF/RESTCONF or CLI operations.

# From the NSO server (lab.nhprep.com)
ping -c 3 192.0.2.11
ping -c 3 192.0.2.12

What just happened: Each ping verifies basic IP connectivity and that the devices respond to ICMP. This is the first troubleshooting step — many orchestration failures are simple network reachability issues (firewall, wrong mgmt IP, cable).

Real-world note: In production, your management network is often isolated and firewalled; ensure NETCONF/SSH and RESTCONF/HTTPS ports are allowed, not just ICMP.

Verify:

# Expected ping output (example for 192.0.2.11)
PING 192.0.2.11 (192.0.2.11) 56(84) bytes of data.
64 bytes from 192.0.2.11: icmp_seq=1 ttl=64 time=1.23 ms
64 bytes from 192.0.2.11: icmp_seq=2 ttl=64 time=1.18 ms
64 bytes from 192.0.2.11: icmp_seq=3 ttl=64 time=1.20 ms

--- 192.0.2.11 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 1.183/1.205/1.233/0.025 ms
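
Because a successful ping does not prove that the management ports are open, a TCP-level check is a useful follow-up. The sketch below is one possible way to do it with the Python standard library. The function names are hypothetical, and the port list assumes the usual defaults (22 for SSH, 830 for NETCONF over SSH, 443 for HTTPS); adjust it to whatever your devices actually listen on.

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False

def check_ports(host, ports=(22, 830, 443)):
    """Return {port: reachable} for each management port of interest."""
    return {port: port_open(host, port) for port in ports}
```

For example, calling check_ports("192.0.2.11") from the NSO server returns a dictionary keyed by port number, so a False entry points directly at the port that a firewall or a disabled service is blocking.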

Step 2: Retrieve NSO version using a provided Python script (RESTCONF)

What we are doing: Invoke a Python script that calls NSO’s RESTCONF API to query the installed NSO version and basic status. This confirms the NSO northbound API is working and the server is up.

# On NSO server (example script invocation)
python3 get_nso_version.py --host lab.nhprep.com --user admin --password Lab@123

What just happened: The Python script issues an HTTPS RESTCONF request to NSO’s API endpoint and parses the JSON response to present the NSO version and status. This demonstrates how automation tools interact with NSO using standard RESTCONF.

Real-world note: Automation scripts should include retries and proper error handling; in production, transient errors such as timeouts are common when querying controllers.

Verify:

# Expected output (JSON printed by the script)
{
  "nso": {
    "version": "5.6.0",
    "mode": "production",
    "uptime": "12 days, 4:33:12"
  }
}
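
The lesson does not show the contents of get_nso_version.py, so the following is only a minimal sketch of what such a script might do, using the standard library. It assumes the version is readable at the tailf-ncs-monitoring:ncs-state/version path and that the server presents a certificate the client trusts (the lab curl examples use -k instead). The helper names are hypothetical, and the raw RESTCONF reply will not have the same shape as the formatted output shown above.

```python
import base64
import json
import time
import urllib.error
import urllib.request

# Assumed path for the NSO version leaf; verify it against your NSO release.
VERSION_PATH = "/restconf/data/tailf-ncs-monitoring:ncs-state/version"

def build_request(host, user, password, path=VERSION_PATH):
    """Build a RESTCONF GET with basic auth and a JSON Accept header."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(f"https://{host}{path}")
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Accept", "application/yang-data+json")
    return req

def with_retries(call, attempts=3, delay=1.0):
    """Run call(); retry on network-level errors, but not on HTTP error replies."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except urllib.error.HTTPError:
            raise  # e.g. 401: retrying will not fix bad credentials
        except OSError:
            if attempt == attempts:
                raise
            time.sleep(delay * attempt)  # simple linear backoff

def get_version(host, user, password):
    """Fetch and decode the version resource, retrying transient failures."""
    req = build_request(host, user, password)

    def call():
        with urllib.request.urlopen(req, timeout=10) as resp:
            return json.load(resp)

    return with_retries(call)
```

Separating build_request from with_retries keeps the retry policy reusable for the other RESTCONF calls in this lesson.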

Step 3: Retrieve device OS/version via NSO using "exec any" (CLI NED) through a Python script

What we are doing: Use a Python script that leverages NSO’s capability to run a CLI command (e.g., "show version") on a managed device via the device NED (the “exec any” feature). This shows how NSO can collect operational data from devices, even when NETCONF/YANG is not available.

# On NSO server
python3 get_device_version.py --host lab.nhprep.com --device XR-R1 --user admin --password Lab@123

What just happened: The script invokes NSO’s northbound API to request execution of a CLI command on XR-R1. NSO’s device manager uses the device’s NED to open a session (SSH) to the device and execute the command; the output is returned to the script. This illustrates the NED translating a high-level request into device-specific execution.

Real-world note: Using CLI NEDs for operational commands is useful for legacy devices; however, parsing CLI output can be brittle compared to structured NETCONF/YANG data.

Verify:

# Expected output (example of 'show version' returned by NSO)
Device: XR-R1 (192.0.2.11)
Cisco IOS XR Software, Version 7.5.1
Build Information:
  Built by: build_user
  Built on: 2024-11-04-12:33:00
System uptime is 25 weeks, 3 days, 4 hours, 12 minutes
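
To make the point about brittle CLI parsing concrete, here is a small sketch that extracts the version and uptime from the sample output above with regular expressions. The patterns are assumptions tied to this exact wording; a different platform or software release could phrase these lines differently, which is precisely why structured NETCONF/YANG data is preferable when available.

```python
import re

# Patterns written against the sample output in this lesson only.
VERSION_RE = re.compile(r"Cisco IOS XR Software, Version (\S+)")
UPTIME_RE = re.compile(r"uptime is (.+)")

def parse_show_version(text):
    """Return the version and uptime strings, or None where a line is missing."""
    version = VERSION_RE.search(text)
    uptime = UPTIME_RE.search(text)
    return {
        "version": version.group(1) if version else None,
        "uptime": uptime.group(1).strip() if uptime else None,
    }

sample = """Device: XR-R1 (192.0.2.11)
Cisco IOS XR Software, Version 7.5.1
Build Information:
  Built by: build_user
  Built on: 2024-11-04-12:33:00
System uptime is 25 weeks, 3 days, 4 hours, 12 minutes
"""
print(parse_show_version(sample))
```

Returning None for a missing field, rather than raising, lets the caller decide whether an unparsed line is an error or just an unfamiliar output format.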

Step 4: Inspect devices known to NSO (device inventory) via RESTCONF

What we are doing: Query NSO’s device inventory to see which devices are registered and managed. This shows the device manager’s view and which NEDs are bound to each device.

# Example RESTCONF curl call on NSO server (requires credentials)
curl -s -k -u admin:Lab@123 -H "Accept: application/yang-data+json" https://lab.nhprep.com/restconf/data/tailf-ncs:devices | python3 -m json.tool

What just happened: The curl call retrieves the devices list from NSO's RESTCONF data model; the output shows device names, management IPs, and NED types. This lets you verify that XR-R1 and XR-R2 are present and which NED is used for each.

Real-world note: Device inventory is the authoritative list NSO uses for orchestration; missing devices mean services targeting those devices will fail.

Verify:

# Expected JSON output (abridged; only the keys relevant to this lesson are shown)
{
  "devices": {
    "device": [
      {
        "name": "XR-R1",
        "address": "192.0.2.11",
        "port": 22,
        "authgroup": "default",
        "device-type": "cli-ned"
      },
      {
        "name": "XR-R2",
        "address": "192.0.2.12",
        "port": 22,
        "authgroup": "default",
        "device-type": "netconf-ned"
      }
    ]
  }
}
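
A script can perform the same inventory check automatically. The sketch below works on the simplified JSON shape shown above. It also accepts a top-level key of "tailf-ncs:devices", since RESTCONF JSON replies normally qualify the top-level member with the module name. The function name and the expected-device list are hypothetical.

```python
import json

def summarize_inventory(payload, expected=("XR-R1", "XR-R2")):
    """Return ({name: device-type}, [missing expected names]) from an inventory reply."""
    root = payload.get("tailf-ncs:devices") or payload.get("devices", {})
    devices = root.get("device", [])
    by_name = {dev["name"]: dev.get("device-type") for dev in devices}
    missing = [name for name in expected if name not in by_name]
    return by_name, missing

# Sample data copied from the simplified expected output in this lesson.
sample = json.loads("""
{
  "devices": {
    "device": [
      {"name": "XR-R1", "address": "192.0.2.11", "port": 22,
       "authgroup": "default", "device-type": "cli-ned"},
      {"name": "XR-R2", "address": "192.0.2.12", "port": 22,
       "authgroup": "default", "device-type": "netconf-ned"}
    ]
  }
}
""")

by_name, missing = summarize_inventory(sample)
print(by_name)
print("missing:", missing)
```

An empty missing list means every device the services depend on is registered; anything else should be fixed before a service change is attempted.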

Step 5: Explain how a service push flows through NSO (no destructive changes)

What we are doing: Walk through (conceptually) the sequence when NSO instantiates a service and pushes configs to devices. No destructive commands are run in this lesson.

# Conceptual sequence (no CLI to run):
1. User submits a service instance request (northbound API / CLI).
2. Service manager transforms the request into device config fragments.
3. Fragments are written to the CDB in a transactional manner.
4. Device manager and NEDs push the changes to each device (NETCONF/CLI).
5. On success, NSO commits; on failure, NSO rolls back changes per transaction semantics.

What just happened: This sequence shows why NSO provides atomic multi-device changes: the CDB + transaction model ensure that either all devices are configured consistently or none are changed, avoiding partial state.
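
The all-or-nothing behavior described above can be illustrated with a toy model. This is not how NSO implements transactions internally; FakeDevice and push_all are hypothetical stand-ins that only show the idea of snapshotting each device before a change and restoring every touched device if any single push fails.

```python
class FakeDevice:
    """In-memory stand-in for a managed device (illustration only)."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail      # when True, every push to this device is rejected
        self.config = []

    def apply(self, lines):
        if self.fail:
            raise RuntimeError(f"{self.name}: push rejected")
        self.config.extend(lines)

    def restore(self, snapshot):
        self.config = list(snapshot)

def push_all(devices, lines):
    """Apply lines to every device; on any failure, restore all touched devices."""
    snapshots = {}
    try:
        for dev in devices:
            snapshots[dev.name] = list(dev.config)  # remember the prior state
            dev.apply(lines)
    except RuntimeError:
        for dev in devices:
            if dev.name in snapshots:
                dev.restore(snapshots[dev.name])
        return False
    return True

r1, r2 = FakeDevice("XR-R1"), FakeDevice("XR-R2", fail=True)
ok = push_all([r1, r2], ["vlan 100"])
print("committed" if ok else "rolled back", r1.config, r2.config)
```

In this run XR-R2 rejects the change, so XR-R1 is restored to its earlier (empty) configuration and neither device ends up with a partial change.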

Real-world note: Always test service templates in a lab and enable dry-run/preview features before applying to production devices.

Verify:

# Verification is conceptual:
# Inspect the device configuration held in the CDB (example RESTCONF call to the devices subtree)
curl -s -k -u admin:Lab@123 -H "Accept: application/yang-data+json" https://lab.nhprep.com/restconf/data/tailf-ncs:devices | python3 -m json.tool

# Expected: before commit, the service's configuration fragments are staged in the CDB; after commit, they have been pushed to the devices and are visible in each device's configuration in NSO

Verification Checklist

  • Check 1: NSO server can reach devices — verify with ping from NSO: ping -c 3 192.0.2.11 (expected 0% packet loss).
  • Check 2: NSO RESTCONF responds to API requests — verify with Python script: python3 get_nso_version.py --host lab.nhprep.com --user admin --password Lab@123 (expected JSON with version).
  • Check 3: NSO can execute CLI commands on a device via NED — verify with: python3 get_device_version.py --host lab.nhprep.com --device XR-R1 --user admin --password Lab@123 (expected device show version output).
  • Check 4: Devices appear in NSO inventory — verify with: curl -s -k -u admin:Lab@123 -H "Accept: application/yang-data+json" https://lab.nhprep.com/restconf/data/tailf-ncs:devices (expected JSON listing XR-R1 and XR-R2).

Common Mistakes

Symptom | Cause | Fix
NSO cannot reach device (connection timed out) | Wrong management IP or firewall blocking SSH/NETCONF/RESTCONF | Verify mgmt IPs, open required ports (SSH/NETCONF/HTTPS) on management network, check routing
RESTCONF curl returns 401 Unauthorized | Wrong credentials used for API | Use the correct NSO username and password (example password: Lab@123). If NSO itself cannot log in to a device, check that the authgroup in NSO maps to the correct device credentials
Script returns empty device list | Devices not added/registered in NSO inventory | Add device configuration to NSO inventory, ensure NED type is correct and management IP is reachable
Execution of CLI commands returns unexpected output | Inconsistent prompt/CLI differences across device models (CLI NED parsing) | Use device-specific NED or adjust parsing rules in the NED; prefer NETCONF/YANG where available

Key Takeaways

  • The CDB is NSO’s transactional source of truth; it enables atomic multi-device changes and rollback on error — essential for safe orchestration in production.
  • Service Manager maps high-level service intent to device-level configuration fragments; this is how NSO provides service abstraction across vendors.
  • Device Manager & NEDs translate NSO’s device-agnostic config into vendor-specific operations using NETCONF/RESTCONF or CLI. Use NETCONF/YANG where possible for structured data and better reliability.
  • Always validate management connectivity, device inventory, and the northbound API before performing service changes; small network or credential issues are the most common operational problems.

Final tip: In production, build automation with idempotency, error handling, and logging. Use dry-run and staging environments to validate service templates and NED behavior before rolling out to live devices.


This lesson introduced NSO’s architecture and demonstrated how to validate basic connectivity and API access. The next lesson will walk through creating a simple service template and observing how NSO maps that to device configs and the CDB.