Multi-Device Automation Scripts
Objective
This lesson teaches you how to build and run multi-device Python automation scripts that configure several IOS XE devices concurrently. You will learn how to structure an inventory, use threading (concurrency) to push configuration to multiple devices, add robust error handling and retries, and implement logging so you can audit what happened. In production, these techniques let you provision hundreds or thousands of switches consistently and quickly — reducing human error and mean time to deploy.
Real-world scenario: An operations team must deploy a standardized management VLAN and NTP configuration to dozens of access switches after a maintenance window. Using a multi-device Python script with concurrency and logging turns an hours-long manual job into a repeatable, auditable task.
Quick Recap
Refer to the topology used in Lesson 1 for connectivity. This lesson does not add new devices or new IP addresses; it operates against the same IOS XE devices from Lesson 1. Example hostnames used in scripts are:
- sw1.lab.nhprep.com
- sw2.lab.nhprep.com
- sw3.lab.nhprep.com
(We use hostnames in examples so you can substitute the corresponding management IPs from your lab inventory.)
Topology (reference)
Use the same physical/logical topology from Lesson 1. This lesson targets the management plane of the devices (their management IPs reachable from your automation workstation).
Device Table
| Device | Role | Management FQDN |
|---|---|---|
| sw1 | Access switch | sw1.lab.nhprep.com |
| sw2 | Access switch | sw2.lab.nhprep.com |
| sw3 | Access switch | sw3.lab.nhprep.com |
Key Concepts — theory before hands-on
- Concurrency (Threading): Running multiple configuration operations concurrently reduces total elapsed time. Python threads suit network automation because the work is I/O-bound: the interpreter releases the GIL while a thread waits on the network, so concurrent pushes overlap their wait time. In production, this means much faster mass changes.
- Idempotence & Templates: Use configuration templates so repeated runs cause no adverse effects. Think of templates like a recipe — the same ingredients produce the same cake.
- Error handling & retries: Network devices can fail transiently. Implement retries with exponential backoff and capture device error responses to avoid partial deployments.
- Logging & Auditing: Detailed logs (timestamp, device, action, result) are essential for post-change troubleshooting and compliance. In production, logs feed ticketing and change management.
- Verification loops: Always verify the change with explicit show commands after pushing config — automation must include verification to be trustworthy.
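The concurrency gain can be sketched with a stand-in for the device call (fake_push and the 0.2-second delay below are illustrative, not a real RESTCONF request):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_push(device, delay=0.2):
    # Simulate a network I/O-bound call (e.g., a RESTCONF request).
    time.sleep(delay)
    return f"{device}: done"

devices = ["sw1", "sw2", "sw3"]

start = time.monotonic()
with ThreadPoolExecutor(max_workers=len(devices)) as ex:
    results = list(ex.map(fake_push, devices))
elapsed = time.monotonic() - start

# Serial execution would take ~0.6 s; threaded execution takes
# roughly the time of the slowest single task (~0.2 s).
print(results, round(elapsed, 1))
```

The same pattern is what the later scripts use: the total run time approaches that of the slowest device rather than the sum of all devices.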
Step-by-step configuration
Step 1: Create an inventory and configuration template
What we are doing: Define which devices to target and what configuration to apply. A structured inventory (YAML or JSON) and a parameterized text template keep the script generic and reusable.
! No device CLI commands for this step. The following are shell/Python files created on the automation host.
Create inventory.json:
{
"devices": [
{"host": "sw1.lab.nhprep.com", "username": "admin", "password": "Lab@123"},
{"host": "sw2.lab.nhprep.com", "username": "admin", "password": "Lab@123"},
{"host": "sw3.lab.nhprep.com", "username": "admin", "password": "Lab@123"}
]
}
Create mgmt_template.txt (simple config to ensure the management VLAN and NTP; note the $-style placeholders, which are what Python's string.Template in the scripts below expects):
hostname $hostname
vlan 10
 name MGMT
interface Vlan10
 ip address $mgmt_ip 255.255.255.0
!
ntp server 198.18.1.1
What just happened: You created a device list with credentials and a configuration template. The template variables (hostname, mgmt_ip) let one template serve many devices. This matters because templates ensure consistency across multiple devices and support idempotence when the automation engine renders them.
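As a sketch of the rendering step: Python's built-in string.Template substitutes $-style placeholders (Jinja-style {{ }} braces would require the jinja2 library instead). The inline template here is a shortened stand-in for mgmt_template.txt:

```python
from string import Template

# Shortened inline copy of the template, using $-style placeholders.
tmpl = Template(
    "hostname $hostname\n"
    "interface Vlan10\n"
    " ip address $mgmt_ip 255.255.255.0\n"
)

# One template, many devices: only the variables change per device.
rendered = tmpl.substitute(hostname="sw1", mgmt_ip="192.0.2.10")
print(rendered)
```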
Real-world note: Never store plaintext credentials in production inventory. Use vaults or secure credential stores. Here we use plain text for lab simplicity.
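A lightweight step in that direction (a sketch, not a full vault integration) is to keep passwords out of the inventory file entirely and inject them from an environment variable at run time; the variable name LAB_DEVICE_PASSWORD is an arbitrary choice for this example:

```python
import os

# Inventory entries now carry no secrets; only host and username.
devices = [{"host": "sw1.lab.nhprep.com", "username": "admin"}]

# Read the shared lab password from the environment. The fallback value
# is for lab convenience only; in production, fail hard if it is unset.
password = os.environ.get("LAB_DEVICE_PASSWORD", "Lab@123")

for dev in devices:
    dev["password"] = password  # merged in memory only, never written to disk
```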
Verify:
! On the automation workstation:
cat inventory.json
Expected output:
{
"devices": [
{"host": "sw1.lab.nhprep.com", "username": "admin", "password": "Lab@123"},
{"host": "sw2.lab.nhprep.com", "username": "admin", "password": "Lab@123"},
{"host": "sw3.lab.nhprep.com", "username": "admin", "password": "Lab@123"}
]
}
Step 2: Build a simple single-threaded push script (baseline)
What we are doing: Create a baseline Python script that reads the inventory, renders the template per device, and pushes config serially using RESTCONF (one HTTP PUT per device). A baseline script establishes functionality before adding concurrency.
! No device CLI commands for this step.
Create push_baseline.py:
#!/usr/bin/env python3
import json

import requests
from string import Template

with open('inventory.json') as f:
    inv = json.load(f)

# mgmt_template.txt must use $-style placeholders ($hostname, $mgmt_ip):
# string.Template does not understand Jinja-style {{ }} braces.
template = Template(open('mgmt_template.txt').read())

for dev in inv['devices']:
    payload = template.substitute(hostname=dev['host'].split('.')[0],
                                  mgmt_ip='192.0.2.10')  # lab value; in practice, per-device from inventory
    url = f"https://{dev['host']}/restconf/data/Cisco-IOS-XE-native:native"
    r = requests.put(url, data=payload, auth=(dev['username'], dev['password']),
                     headers={'Content-Type': 'application/yang-data+json'},
                     verify=False)  # lab only: skips TLS certificate validation
    print(dev['host'], r.status_code, r.text)
What just happened: The script reads the inventory, fills the template for each device, and issues a RESTCONF PUT against the native YANG path. The HTTP status code indicates success (200/201/204) or failure. Starting single-threaded simplifies debugging and confirms the RESTCONF path is correct before you scale.
Real-world note: A strict RESTCONF implementation expects a JSON (yang-data+json) body that matches the device's YANG model, not CLI-style text; treat the payload here as a lab simplification. Endpoints and YANG paths must match the device's model version, so always test against a single device before scaling.
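For reference, a strict yang-data+json payload targeting just the hostname leaf might look like the sketch below; the exact structure depends on the Cisco-IOS-XE-native model version on your device, so treat it as illustrative:

```python
import json

# Hypothetical yang-data+json body for the hostname leaf of the native model.
payload = {"Cisco-IOS-XE-native:native": {"hostname": "sw1"}}
body = json.dumps(payload)

# With requests, this would be sent along the lines of:
# requests.patch(url, data=body,
#                headers={"Content-Type": "application/yang-data+json"},
#                auth=(user, pwd), timeout=10, verify=False)
print(body)
```

A PATCH merges the leaf into existing config, whereas a PUT on the full native path replaces everything beneath it, which is why testing on one device first matters.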
Verify:
! On automation host, run:
python3 push_baseline.py
Expected output (example successful run):
sw1.lab.nhprep.com 200 {"message":"Configuration applied"}
sw2.lab.nhprep.com 200 {"message":"Configuration applied"}
sw3.lab.nhprep.com 200 {"message":"Configuration applied"}
Step 3: Add threading for concurrency and timeouts
What we are doing: Convert the baseline into a concurrent script using ThreadPoolExecutor to push to devices in parallel, and set per-request timeouts to avoid hung threads.
! No device CLI commands for this step.
Create push_concurrent.py:
#!/usr/bin/env python3
import json

import requests
from string import Template
from concurrent.futures import ThreadPoolExecutor, as_completed

with open('inventory.json') as f:
    inv = json.load(f)

template = Template(open('mgmt_template.txt').read())

def push_config(dev):
    payload = template.substitute(hostname=dev['host'].split('.')[0],
                                  mgmt_ip='192.0.2.10')
    url = f"https://{dev['host']}/restconf/data/Cisco-IOS-XE-native:native"
    try:
        r = requests.put(url, data=payload, auth=(dev['username'], dev['password']),
                         headers={'Content-Type': 'application/yang-data+json'},
                         timeout=10, verify=False)
        return (dev['host'], r.status_code, r.text)
    except requests.exceptions.RequestException as e:
        return (dev['host'], 'ERROR', str(e))

with ThreadPoolExecutor(max_workers=5) as ex:
    futures = [ex.submit(push_config, d) for d in inv['devices']]
    for fut in as_completed(futures):
        print(fut.result())
What just happened: You introduced concurrency via a thread pool, so multiple devices are configured in parallel. Each request has a 10-second timeout to prevent a blocked thread. This design reduces overall run time and prevents the script from stalling on an unreachable device.
Real-world note: Choose max_workers based on your workstation and network device CPU to avoid overwhelming devices or the network control plane.
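One way to apply that advice (a sketch with illustrative numbers): cap the pool relative to the inventory size, and put an overall deadline on the whole batch via as_completed's timeout parameter:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed, TimeoutError

def push_config(dev):
    return (dev, "SUCCESS")  # stand-in for the real RESTCONF push

devices = [f"sw{n}" for n in range(1, 8)]
max_workers = min(5, len(devices))  # illustrative cap; tune per environment

results = []
with ThreadPoolExecutor(max_workers=max_workers) as ex:
    futures = [ex.submit(push_config, d) for d in devices]
    try:
        # Stop collecting results if the whole batch exceeds 60 seconds.
        for fut in as_completed(futures, timeout=60):
            results.append(fut.result())
    except TimeoutError:
        results.append(("batch", "TIMED_OUT"))
```

The per-request timeout bounds each device; the batch timeout bounds the whole change window, which is useful when a script runs inside a maintenance window.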
Verify:
! On automation host:
python3 push_concurrent.py
Expected output (example):
('sw2.lab.nhprep.com', 200, '{"message":"Configuration applied"}')
('sw1.lab.nhprep.com', 200, '{"message":"Configuration applied"}')
('sw3.lab.nhprep.com', 200, '{"message":"Configuration applied"}')
Step 4: Implement error handling, retries, and structured logging
What we are doing: Enhance reliability by retrying transient failures and writing structured logs (timestamp, device, outcome) to a file. This gives you both robustness and an audit trail.
! No device CLI commands for this step.
Create push_robust.py:
#!/usr/bin/env python3
import json
import logging
import time

import requests
from string import Template
from concurrent.futures import ThreadPoolExecutor, as_completed

logging.basicConfig(filename='push.log', level=logging.INFO,
                    format='%(asctime)s %(levelname)s %(message)s')

with open('inventory.json') as f:
    inv = json.load(f)

template = Template(open('mgmt_template.txt').read())

def push_config_with_retries(dev, retries=3, backoff=2):
    payload = template.substitute(hostname=dev['host'].split('.')[0],
                                  mgmt_ip='192.0.2.10')
    url = f"https://{dev['host']}/restconf/data/Cisco-IOS-XE-native:native"
    for attempt in range(1, retries + 1):
        try:
            r = requests.put(url, data=payload, auth=(dev['username'], dev['password']),
                             headers={'Content-Type': 'application/yang-data+json'},
                             timeout=10, verify=False)
            if r.status_code in (200, 201, 204):
                logging.info(f"{dev['host']} SUCCESS status={r.status_code}")
                return (dev['host'], 'SUCCESS', r.status_code, r.text)
            logging.warning(f"{dev['host']} BAD_STATUS status={r.status_code} body={r.text}")
            if r.status_code < 500:
                # 4xx errors are not transient; retrying will not help.
                return (dev['host'], 'FAILED', r.status_code, r.text)
        except requests.exceptions.RequestException as e:
            logging.error(f"{dev['host']} EXC {e}")
        time.sleep(backoff ** attempt)  # exponential backoff: 2 s, 4 s, 8 s
    logging.error(f"{dev['host']} FAILED after {retries} attempts")
    return (dev['host'], 'FAILED', None, None)

with ThreadPoolExecutor(max_workers=5) as ex:
    futures = [ex.submit(push_config_with_retries, d) for d in inv['devices']]
    for fut in as_completed(futures):
        print(fut.result())
What just happened: The script now retries transient errors with exponential backoff, logs all important events to push.log, and returns structured tuples for easy downstream processing. Logs are crucial for troubleshooting and change auditing.
Real-world note: Logging to centralized systems (syslog, ELK, or SIEM) is recommended for enterprise visibility and compliance.
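If you do forward logs to a central system, JSON-lines entries are much easier to parse than free text. A minimal sketch (the field names here are arbitrary choices, not a standard schema):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""
    def format(self, record):
        return json.dumps({
            "ts": self.formatTime(record),
            "level": record.levelname,
            "device": getattr(record, "device", None),
            "msg": record.getMessage(),
        })

logger = logging.getLogger("push")
handler = logging.FileHandler("push.jsonl")
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# The 'extra' dict attaches the device name as a structured field.
logger.info("SUCCESS status=200", extra={"device": "sw1.lab.nhprep.com"})
```

A log shipper (syslog forwarder, Filebeat, etc.) can then index these lines by device and level without any custom parsing.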
Verify:
! Run the script:
python3 push_robust.py
! Inspect the log:
cat push.log
Expected push_robust.py stdout:
('sw1.lab.nhprep.com', 'SUCCESS', 200, '{"message":"Configuration applied"}')
('sw2.lab.nhprep.com', 'SUCCESS', 200, '{"message":"Configuration applied"}')
('sw3.lab.nhprep.com', 'SUCCESS', 200, '{"message":"Configuration applied"}')
Expected push.log entries:
2026-04-02 12:00:01,234 INFO sw1.lab.nhprep.com SUCCESS status=200
2026-04-02 12:00:01,567 INFO sw2.lab.nhprep.com SUCCESS status=200
2026-04-02 12:00:02,001 INFO sw3.lab.nhprep.com SUCCESS status=200
Step 5: Verify device configuration with show commands and collect output
What we are doing: After pushing config, run verification show commands on each device to confirm the desired state. This script uses SSH (or a RESTCONF GET) to pull verification output and logs it for auditing.
! Example verification via device CLI (SSH). The following are commands you run on the network devices to verify configuration.
Verification show commands (run on each device, or via automation that executes them), each followed by its expected output:

show running-config | include hostname
hostname sw1

show running-config | section interface Vlan10
interface Vlan10
 ip address 192.0.2.10 255.255.255.0

show ntp status
Clock is synchronized, stratum 2, reference is 198.18.1.1
What just happened: These show commands confirm the hostname, management VLAN interface, and NTP status. The outputs demonstrate that the template rendered correctly and the devices are synchronized to the designated NTP server.
Real-world note: Verification must be part of your automation pipeline. Automated rollbacks or alerting should trigger if verification fails.
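A simple way to make verification machine-checkable (a sketch; the sample output stands in for text you would collect over SSH or RESTCONF) is to diff the expected lines against the device's show output:

```python
def verify_device(show_output, expected_lines):
    """Return the expected lines that are missing from the show output."""
    return [line for line in expected_lines if line not in show_output]

# Stand-in for output collected from sw1 after the push.
sample_output = """hostname sw1
interface Vlan10
 ip address 192.0.2.10 255.255.255.0
"""

expected = ["hostname sw1", "ip address 192.0.2.10 255.255.255.0"]
missing = verify_device(sample_output, expected)

if missing:
    print("VERIFY FAILED:", missing)  # trigger an alert or rollback here
else:
    print("VERIFY OK")
```

Returning the missing lines (rather than a bare pass/fail) makes the failure log actionable: the operator sees exactly which configuration did not land.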
Verify (automation host approach):
! Example automated verification using RESTCONF GET for the hostname (conceptual):
curl -k -u admin:Lab@123 https://sw1.lab.nhprep.com/restconf/data/Cisco-IOS-XE-native:native/hostname
Expected output (conceptual JSON):
{"hostname":"sw1"}
Verification Checklist
- Check 1: Configuration templates rendered correctly — verify with show running-config | include hostname on each device.
- Check 2: Management interface exists with the expected IP — verify with show running-config | section interface Vlan10.
- Check 3: NTP is configured and the device is synchronized — verify with show ntp status.
Common Mistakes
| Symptom | Cause | Fix |
|---|---|---|
| Script times out on a subset of devices | Wrong hostname or DNS not resolving lab hostnames | Use management IPs or ensure DNS resolves hostnames; verify with ping from automation host |
| HTTP/RESTCONF 401 Unauthorized | Wrong credentials in inventory | Update inventory credentials or use a secure credential store; test with one device first |
| Partial deployment (some devices configured, others not) | No retries and transient network glitches | Add retries/backoff and logging; re-run script with idempotent templates |
| Logs show "SSL certificate verify failed" | Devices use self-signed certs and verify=True | For lab, use verify=False in requests; in production, manage proper CA-signed certs |
Key Takeaways
- Use templates and an inventory to make multi-device changes repeatable and idempotent — this prevents configuration drift.
- Concurrency (threading) speeds up mass deployments; tune thread counts to match your environment and device capabilities.
- Always implement retries, timeouts, and structured logging — this provides resilience and auditability.
- Automation must include verification steps; if verification fails, automation should either rollback or alert operators for manual intervention.
Final real-world insight: In production networks, automation is only as good as its safety net — testing, verification, logging, and secure credential handling are non-negotiable. Start small, verify, then scale your multi-device automation.