Roles and Reusable Automation
Objective
In this lesson you will learn how to organize Ansible automation using roles so your network provisioning becomes modular, reusable, and maintainable. You will create an inventory, initialize role skeletons, author a reusable overlay role that configures VLAN-to-VNI mappings on NX‑OS leaf switches, and run the playbook against the lab devices. This matters in production because consistent, repeatable device provisioning (idempotent automation) reduces human error and shortens deployment windows when you scale to many devices.
Topology
ASCII diagram showing management IPs used by Ansible for each NX‑OS device (these are the exact addresses used in the lab inventory).
+------------------------+      +------------------------+
|        Spine-1         |      |        Spine-2         |
| Management: 10.15.1.11 |      | Management: 10.15.1.12 |
+------------------------+      +------------------------+
            |                                |
            |                                |
+---------------------------------------------------------+
|                       Leaf Fabric                        |
+---------------------------------------------------------+
         |                   |                   |
+-----------------+ +-----------------+ +-----------------+
|     Leaf-1      | |     Leaf-2      | |     Leaf-3      |
| Mgmt: 10.15.1.13| | Mgmt: 10.15.1.14| | Mgmt: 10.15.1.15|
+-----------------+ +-----------------+ +-----------------+
Device Table
| Device Role | Hostname (logical) | Management IP |
|---|---|---|
| Spine 1 | spine1 | 10.15.1.11 |
| Spine 2 | spine2 | 10.15.1.12 |
| Leaf 1 | leaf1 | 10.15.1.13 |
| Leaf 2 | leaf2 | 10.15.1.14 |
| Leaf 3 | leaf3 | 10.15.1.15 |
Introduction
In this lesson we focus on Roles and Reusable Automation with Ansible for NX‑OS. We'll create a role called overlay that encapsulates VLAN-to-VNI configuration so the same role can be applied to multiple leaf switches consistently. In production, teams use roles to separate concerns — e.g., underlay (routing), overlay (VXLAN/VLAN mapping), and common (NTP, logging) — so changes are localized and easier to test and review. A real-world scenario: when a data center team needs to onboard a new tenant network, a single role can create the VLANs and map them to VXLAN VNIs across all leafs with one invocation.
Quick Recap
We use the same fabric topology introduced earlier (two spines, three leafs). The Ansible control node runs playbooks against the management IPs listed above. Inventory groups are spines and leafs. We will not add new devices in this lesson.
Key Concepts (theory before hands-on)
- Roles: a structured way to package tasks, templates, and variables. Think of a role as a folder that contains everything needed to accomplish one function (like "overlay"). This makes automation reusable across projects.
- Idempotence: Ansible modules (and network modules) are designed to be idempotent — running the same role multiple times converges the device to the intended state without making repeated changes.
- Connection model (network_cli): With NX‑OS we commonly use ansible.netcommon.network_cli (CLI over SSH) which uses a persistent connection to push config; this reduces overhead and supports network-specific modules.
When Ansible runs against a network device using network_cli, it opens a persistent SSH session and sends configuration in structured blocks — not one command at a time like an interactive user.
- Separation of intent and execution: Inventory and group_vars hold the intent (which devices and what variables). Roles hold how to implement that intent (tasks/templates).
- Module behavior (cisco.nxos.nxos_vlans): This module accepts a list of VLAN definitions and will create/merge them on the device; when applied on leafs it will configure VLAN IDs and mapped VNIs (overlay intent).
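The persistent-connection behavior described above can also be tuned centrally in ansible.cfg. A minimal sketch; the timeout values here are illustrative assumptions, not required lab settings:

```ini
# ansible.cfg (place next to your playbooks)
[defaults]
# Lab convenience only; keep host key checking on in production
host_key_checking = False

[persistent_connection]
# Seconds an idle persistent SSH session stays open for reuse
connect_timeout = 30
# Seconds to wait for a single command to return before failing
command_timeout = 60
```

Raising command_timeout is a common fix when large configuration pushes to slower devices time out mid-task.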
Step-by-step configuration
Step 1: Prepare a Python environment for Ansible
What we are doing: Create a dedicated Python virtual environment so Ansible and its collections are installed in an isolated, reproducible location. Running automation in a virtualenv avoids conflicts with system Python and makes CI pipelines deterministic.
# Install a specific Python version (example using pyenv)
pyenv install 3.9.11
# Create a virtualenv named 'ansible' using that Python
pyenv virtualenv 3.9.11 ansible
# Create a working directory for Ansible development
mkdir my_ansible_dir
# Tell pyenv to use the 'ansible' virtualenv for this directory
cd my_ansible_dir
pyenv local ansible
What just happened: pyenv install fetches and installs Python 3.9.11. pyenv virtualenv creates a separate virtual environment named ansible. pyenv local ansible writes a file that forces this directory to use that virtualenv, ensuring subsequent pip install and ansible commands run in the isolated environment. This avoids library/version conflicts and matches the reference guidance.
Real-world note: In production pipelines you’ll pin both Python and Ansible versions so CI runs reliably each time.
Verify:
pyenv versions
# Expected output:
# system
# * 3.9.11/envs/ansible (set by /path/to/my_ansible_dir/.python-version)
# 3.9.11
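With the virtualenv active, Ansible itself and the NX-OS modules still need to be installed into it. A minimal sketch; the version pin is an example, not a lab requirement — pin whatever versions your team has validated:

```shell
# Install Ansible into the active 'ansible' virtualenv
# (version floor is an example; pin tested versions in CI)
pip install "ansible-core>=2.12"

# Install the Cisco NX-OS collection, which provides cisco.nxos.nxos_vlans
ansible-galaxy collection install cisco.nxos
```

Running `pip freeze > requirements.txt` afterwards captures the exact versions so CI can reproduce the environment.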
Step 2: Initialize the role skeleton for overlay
What we are doing: Use the Ansible Galaxy role skeleton generator to create the directory structure for the overlay role. This provides canonical locations for tasks/, templates/, and vars/ so we and others can find role artifacts quickly.
ansible-galaxy init overlay
What just happened: ansible-galaxy init overlay creates a role folder named overlay with the standard subdirectories (tasks, handlers, templates, vars, defaults, files, meta, and tests) and a placeholder README.md. This standard layout makes roles predictable and reusable.
Real-world note: Teams commit roles to source control (git) so they can be reviewed and used in CI/CD pipelines.
Verify:
tree overlay
# Expected output (abridged; the skeleton also includes defaults/, files/,
# handlers/, meta/, and tests/ directories):
# overlay
# ├── README.md
# ├── tasks
# │   └── main.yml
# ├── templates
# └── vars
#     └── main.yml
Step 3: Create the Ansible inventory and group variables
What we are doing: Define the device groups, connection settings, and credentials in inventory.yml. This file tells Ansible which devices belong to spines and leafs, which connection plugin to use, and the username/password to log in. Centralizing credentials and connection type in the inventory ensures all playbooks use consistent connectivity.
# Create inventory.yml with the following content
# (exact host IPs from the lab)
all:
  vars:
    ansible_connection: ansible.netcommon.network_cli
    ansible_user: "nxos_username"
    ansible_password: "Lab@123"
    ansible_network_os: cisco.nxos.nxos
  children:
    spines:
      hosts:
        10.15.1.11:
        10.15.1.12:
    leafs:
      hosts:
        10.15.1.13:
        10.15.1.14:
        10.15.1.15:
What just happened: The inventory.yml maps management IPs to logical groups and sets ansible_connection to network_cli so Ansible uses SSH in a network-aware way. ansible_network_os signals Ansible to use NX‑OS-specific modules and behaviors.
Real-world note: In production you would protect credentials via an Ansible Vault or a secrets manager; the plain text password here is for lab demonstration only.
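The Vault approach mentioned above can replace the plain-text password with an encrypted value. A sketch of the workflow; the commands are standard ansible-vault usage, and the lab password is reused here only for illustration:

```shell
# Encrypt just the password value; paste the resulting !vault block into
# inventory.yml in place of the plain-text ansible_password line
ansible-vault encrypt_string 'Lab@123' --name 'ansible_password'

# At run time, supply the vault password so Ansible can decrypt the value
ansible-playbook -i inventory.yml site.yml --ask-vault-pass
```

In CI pipelines, `--vault-password-file` pointing at a file injected from a secrets manager replaces the interactive prompt.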
Verify:
# Show inventory file
cat inventory.yml
# Expected output: the YAML content exactly as created above
# Test connectivity (with network_cli, the ping module verifies that Ansible
# can log in to each device over SSH; it is not an ICMP ping)
ansible all -i inventory.yml -m ping
# Expected output:
# 10.15.1.11 | SUCCESS => {"changed": false, "ping": "pong"}
# 10.15.1.12 | SUCCESS => {"changed": false, "ping": "pong"}
# 10.15.1.13 | SUCCESS => {"changed": false, "ping": "pong"}
# 10.15.1.14 | SUCCESS => {"changed": false, "ping": "pong"}
# 10.15.1.15 | SUCCESS => {"changed": false, "ping": "pong"}
Step 4: Author the overlay role task to configure VLAN-to-VNI mappings
What we are doing: Implement the role task that configures VLANs and mapped VNIs using the cisco.nxos.nxos_vlans module. This encapsulates overlay intent so it can be reused against any leaf group.
# Create overlay/tasks/main.yml (the role skeleton from Step 2) with the
# following content. Note: NX-OS VLAN names cannot contain spaces, so
# underscores are used.
- name: Configure VLAN-to-VNI Mappings
  cisco.nxos.nxos_vlans:
    config:
      - name: Web_Servers
        vlan_id: 101
        mapped_vni: 10101
      - name: DB_Servers
        vlan_id: 102
        mapped_vni: 10102
      - name: vMotion
        vlan_id: 103
        mapped_vni: 10103
    state: merged
What just happened: The task declares a list of VLAN dictionaries under config:. The cisco.nxos.nxos_vlans module will ensure VLANs 101/102/103 exist and are associated with the specified mapped VNIs. Using state: merged means Ansible will add or update the VLAN entries, leaving other VLANs intact — aligning with idempotent behavior.
Real-world note: Packaging VLAN-to-VNI as a role lets you reuse identical overlay logic across different sites by changing only variables, not the task code.
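One way to realize that reuse is to move the VLAN data out of the task file and into group variables, so the role reads intent from the inventory. A sketch; the file name group_vars/leafs.yml and the variable name overlay_vlans are illustrative assumptions about your project layout:

```yaml
# group_vars/leafs.yml -- intent lives with the inventory group
# (file path and variable name are assumptions for this sketch)
overlay_vlans:
  - { name: Web_Servers, vlan_id: 101, mapped_vni: 10101 }
  - { name: DB_Servers,  vlan_id: 102, mapped_vni: 10102 }
  - { name: vMotion,     vlan_id: 103, mapped_vni: 10103 }

# overlay/tasks/main.yml -- implementation just consumes the variable
- name: Configure VLAN-to-VNI Mappings
  cisco.nxos.nxos_vlans:
    config: "{{ overlay_vlans }}"
    state: merged
```

With this split, onboarding a new site means editing one group_vars file; the role's task code never changes.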
Verify:
# Syntax check the playbook that will use the role (site.yml, built in Step 5)
ansible-playbook -i inventory.yml site.yml --syntax-check
# Expected output:
# playbook: site.yml
# (No syntax errors detected.)
Step 5: Build the main playbook and run it against the leaf group
What we are doing: Create a top-level playbook that applies common, underlay, and overlay roles to appropriate groups. For this lesson we'll run only the overlay role against leafs to demonstrate reuse.
# site.yml (main playbook)
- hosts: spines,leafs
  gather_facts: false
  roles:
    - role: common     # from earlier lessons
    - role: underlay   # from earlier lessons

- hosts: leafs
  gather_facts: false
  roles:
    - role: overlay
# Run the playbook against the lab inventory
ansible-playbook -i inventory.yml site.yml
What just happened: The playbook defines two plays. The first play runs common and underlay on both spines and leafs; the second play runs overlay only on leafs. Running the playbook applies the tasks from the overlay role on each leaf, creating or merging VLANs as defined. Under the hood, Ansible opens a persistent network_cli SSH session to each device, executes the necessary NX-OS configuration commands, and keeps the session open for reuse until it times out or the run ends.
Real-world note: In production you commonly run these playbooks from CI/CD pipelines where a merge request triggers syntax checks and a staged apply to a test group before production.
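The staged-apply idea above can be rehearsed from the command line before any pipeline exists. A sketch using standard ansible-playbook flags:

```shell
# Dry run: report what would change, and show config diffs, without
# touching the devices
ansible-playbook -i inventory.yml site.yml --check --diff

# Staged apply: limit the first real run to a single leaf before
# rolling out to the whole group
ansible-playbook -i inventory.yml site.yml --limit 10.15.1.13
```

Note that check mode depends on module support; cisco.nxos config modules generally support it, but always confirm on a non-production device first.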
Verify:
# On a leaf (example 10.15.1.13) check VLAN configuration using NX-OS CLI
# (This is a device CLI verification; run on the device or via an SSH session)
show vlan brief
# Expected output (VLANs 101-103 present):
# VLAN Name                             Status    Ports
# ---- -------------------------------- --------- -------------------------------
# 1    default                          active
# 101  Web_Servers                      active
# 102  DB_Servers                       active
# 103  vMotion                          active
# <additional VLANs may be present>
# Additionally, confirm the VLAN-to-VNI mapping (example NX-OS commands;
# exact output varies by platform and software version)
show running-config vlan 101-103
# Expected: each VLAN stanza includes its vn-segment, e.g.
#   vlan 101
#     vn-segment 10101
show nve vni
# Expected (on platforms with an NVE interface configured): VNIs 10101,
# 10102, and 10103 listed with state Up and type L2
Verification Checklist
- Check 1: Inventory is valid and all hosts respond. Run `ansible all -i inventory.yml -m ping` and expect `SUCCESS` from each IP.
- Check 2: Role structure exists. `tree overlay` should show `tasks/main.yml` and `vars/main.yml`.
- Check 3: VLANs were created on all leafs. Run `show vlan brief` on each leaf (10.15.1.13, .14, .15) and confirm VLANs 101, 102, and 103 exist.
- Check 4: VNIs are associated (if the platform supports VXLAN). Run `show nve vni` and confirm VNIs 10101, 10102, and 10103 are mapped to VLANs 101–103.
Common Mistakes
| Symptom | Cause | Fix |
|---|---|---|
| Ansible tasks fail with authentication errors | Wrong username or password in inventory (ansible_user/ansible_password) | Ensure credentials are correct; for lab use ansible_password: "Lab@123" or use Vault in production |
| Playbook cannot connect (timeout) | ansible_connection not set to network_cli or firewall blocking SSH | Set ansible_connection: network_cli in inventory and verify SSH connectivity to the device IPs |
| Role not found when running playbook | Role directory not in playbook search path or role name mismatch | Ensure a folder named overlay exists in the roles/ path and playbook references - role: overlay |
| VLANs not configured or removed unexpectedly | Using state: replaced instead of state: merged (or not using module parameters correctly) | Use state: merged to add/update VLANs without removing others; double-check module parameters |
Key Takeaways
- Use roles to make network automation modular and reusable; a role encapsulates tasks, variables, and templates for one functional area (e.g., overlay).
- Inventory and group variables express intent (which devices, credentials, and connection type); roles express implementation.
- For NX‑OS, `ansible.netcommon.network_cli` is the recommended connection plugin (persistent SSH), so network modules behave predictably and efficiently.
- Always verify automation results on the devices (for example, `show vlan brief` and the NVE VNI show commands); automation should be validated just like manual configuration.
Tip: Treat roles as libraries: keep them small and focused. When a role accumulates too many responsibilities, split it into multiple roles (e.g., `overlay-vlan`, `overlay-bgp`) so reuse and testing stay simple.
Next lesson will cover testing and CI validation of these roles (syntax checks, linting, and staged deployments).