Lesson 6 of 7

Ansible Collections and Galaxy

Objective

By the end of this lesson you will be able to install Ansible collections from Galaxy, build an inventory for a spine/leaf fabric, and use the FQCN module cisco.nxos.nxos_vlans inside a role to deploy VLAN-to-VNI mappings on leaf switches.

Introduction
In this lesson we will learn how to install and use Ansible Collections from Ansible Galaxy to manage Cisco network devices. We will install Cisco collections (cisco.nxos, cisco.ios, cisco.asa), prepare an inventory that targets spine and leaf switches, and author a simple overlay task using the FQCN (fully qualified collection name) module cisco.nxos.nxos_vlans to create VLAN-to-VNI mappings. This matters in production because collections package vendor-supported modules and plugins so you can reliably automate device configuration at scale — for example, deploying consistent VLAN/VNI mappings across multiple leaf switches in a VXLAN EVPN fabric.

Quick real-world scenario: In a data center migration you must push the same overlay VLAN and VNI configuration to dozens of leaf switches. Using a vendor collection from Galaxy ensures you use supported module semantics and reduces manual CLI errors.

Topology

Quick Recap — same topology used in lesson 1 (no new devices/IPs added).

ASCII topology (management addresses shown on each device):

                 +----------------+    +----------------+
                 |   Spine-1      |    |   Spine-2      |
                 | mgmt:10.15.1.11|    | mgmt:10.15.1.12|
                 +-------+--------+    +-------+--------+
                         \                  /
                          \                /
                           \              /
            +---------------+--------------+---------------+
            |               |              |               |
+----------------+  +----------------+  +----------------+  (leafs)
|   Leaf-1       |  |   Leaf-2       |  |   Leaf-3       |
| mgmt:10.15.1.13|  | mgmt:10.15.1.14|  | mgmt:10.15.1.15|
+----------------+  +----------------+  +----------------+

Tip: Management IPs in the topology are the addresses Ansible will use to connect to each device.

Device Table

Device Role | Hostname | Management IP
Spine       | Spine-1  | 10.15.1.11
Spine       | Spine-2  | 10.15.1.12
Leaf        | Leaf-1   | 10.15.1.13
Leaf        | Leaf-2   | 10.15.1.14
Leaf        | Leaf-3   | 10.15.1.15

Key Concepts

  • Ansible Collections: Bundles of modules, plugins, and roles published on Galaxy. Use collections to get vendor-maintained modules (for example, cisco.nxos). Think of a collection like an app store package: it groups the code and documentation you need to manage a platform.
  • FQCN (Fully Qualified Collection Name): Always call modules using their FQCN (e.g., cisco.nxos.nxos_vlans). This prevents ambiguity when different collections provide modules with the same name.
  • Connection plugin (ansible.netcommon.network_cli): Ansible uses a connection plugin to open a persistent CLI session over SSH to network devices. Setting the inventory variable ansible_connection to ansible.netcommon.network_cli tells Ansible to use this transport.
  • Declarative module behavior: Network modules accept a desired state (e.g., state: merged) — the module computes the necessary CLI and applies changes. In production, this reduces drift because the module enforces intent rather than issuing raw commands.
  • Role separation (underlay vs overlay): In network automation, separate underlay (interfaces, routing) and overlay (VLANs, VNIs, VRFs) into roles. This mirrors real operational separation and allows staged deployments.
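
These resource modules can also report current state, which is useful for audits. A minimal sketch using cisco.nxos.nxos_vlans with state: gathered (run against a lab device; the variable name current_vlans is an illustrative choice):

```yaml
# Read the device's current VLAN configuration as structured data
- name: Gather current VLAN state
  cisco.nxos.nxos_vlans:
    state: gathered
  register: current_vlans

# Print the gathered list (vlan_id, name, mapped_vni, ...)
- name: Show gathered VLANs
  ansible.builtin.debug:
    var: current_vlans.gathered
```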

Steps

Step 1: Install Ansible and the Cisco collections from Galaxy

What we are doing: Install Ansible and fetch the Cisco collections that provide the vendor modules we will use (cisco.nxos, cisco.ios, cisco.asa). Installing collections from Galaxy guarantees you are using packaged, documented modules maintained by vendors or community.

pip install ansible
ansible-galaxy collection install cisco.nxos cisco.ios cisco.asa cisco.dcnm

What just happened:

  • pip install ansible installs the Ansible package on the control host. It automatically pulls in ansible-core (the engine) as a dependency, plus a curated set of community collections, so a separate pip install ansible-core is not needed.
  • ansible-galaxy collection install ... downloads the named collections from Galaxy and places them in your local collections path so Ansible can import FQCN modules from them.

Real-world note: In production you often pin collection versions and install them from an internal Galaxy mirror for reproducible runs and to meet change-control requirements.
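
To pin versions in practice, collections can be declared in a requirements file and installed in one step. A minimal sketch (the version numbers below are illustrative; pin the versions you have actually validated):

```yaml
# requirements.yml -- illustrative pins, not recommended versions
collections:
  - name: cisco.nxos
    version: "2.0.0"
  - name: cisco.ios
    version: "2.0.0"
  - name: cisco.asa
    version: "1.1.0"
```

Install with ansible-galaxy collection install -r requirements.yml.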

Verify:

ansible-galaxy collection list

Expected output (example):

# /home/ansible/.ansible/collections/ansible_collections
Collection        Version
---------------------------
cisco.nxos        2.0.0
cisco.ios         2.0.0
cisco.asa         1.1.0
cisco.dcnm        1.0.0

Step 2: Create the inventory file with connection variables

What we are doing: Create an inventory YAML that defines the control-node connection method, credentials, and lists the spine and leaf management IPs. This tells Ansible where to run tasks and how to connect.

# inventory.yml
all:
  vars:
    ansible_connection: ansible.netcommon.network_cli
    ansible_user: "nxos_username"
    ansible_password: "Lab@123"
    ansible_network_os: cisco.nxos.nxos
  children:
    spines:
      hosts:
        10.15.1.11:
        10.15.1.12:
    leafs:
      hosts:
        10.15.1.13:
        10.15.1.14:
        10.15.1.15:

What just happened:

  • ansible_connection selects the network CLI plugin so Ansible opens persistent SSH sessions to devices.
  • ansible_user and ansible_password supply credentials (in this lab password is Lab@123). In production use vault/encrypted secrets rather than plaintext.
  • ansible_network_os tells the network_cli connection plugin which platform it is talking to, so the matching terminal and cliconf plugins (here, NX-OS from the cisco.nxos collection) are loaded.

Real-world note: Using group-level variables (spines, leafs) keeps credentials and connection details consistent across many devices and simplifies role targeting.
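
To keep the plaintext password out of the inventory file, the connection variables can move into group_vars and reference a vaulted value. A hedged sketch (vault_nxos_password is an assumed variable name you would define in an Ansible Vault-encrypted vars file, for example with ansible-vault encrypt_string):

```yaml
# group_vars/all.yml -- shared connection details for spines and leafs
ansible_connection: ansible.netcommon.network_cli
ansible_network_os: cisco.nxos.nxos
ansible_user: "nxos_username"
# vault_nxos_password is assumed to be defined in a vault-encrypted vars file
ansible_password: "{{ vault_nxos_password }}"
```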

Verify:

cat inventory.yml

Expected output:

all:
  vars:
    ansible_connection: ansible.netcommon.network_cli
    ansible_user: "nxos_username"
    ansible_password: "Lab@123"
    ansible_network_os: cisco.nxos.nxos
  children:
    spines:
      hosts:
        10.15.1.11:
        10.15.1.12:
    leafs:
      hosts:
        10.15.1.13:
        10.15.1.14:
        10.15.1.15:

Step 3: Initialize roles skeleton and inspect directory layout

What we are doing: Create a role skeleton for the overlay work and inspect the generated files. Roles encapsulate tasks, variables, templates and are the recommended structure for re-usable automation.

ansible-galaxy init overlay
tree overlay

What just happened:

  • ansible-galaxy init overlay generates the standard role directory structure (tasks, defaults, vars, templates). Roles make it easier to separate "what to configure" from "how to run it."
  • tree overlay lists the created files so you can edit tasks/main.yml and put overlay automation there.

Real-world note: Teams often version-control roles and use them in multiple playbooks; the role skeleton enforces consistent organization.

Verify:

tree overlay

Expected output:

overlay
├── README.md
├── defaults
│   └── main.yml
├── files
├── handlers
│   └── main.yml
├── meta
│   └── main.yml
├── tasks
│   └── main.yml
├── templates
├── tests
│   ├── inventory
│   └── test.yml
└── vars
    └── main.yml

Step 4: Create an overlay task using the FQCN cisco.nxos.nxos_vlans

What we are doing: Author the overlay role task that configures VLAN-to-VNI mappings on leaf switches using the vendor module cisco.nxos.nxos_vlans. Using the module FQCN ensures we call the correct implementation from the installed collection.

# overlay/tasks/main.yml
---
- name: Configure VLAN-to-VNI Mappings
  cisco.nxos.nxos_vlans:
    config:
      - name: Web_Servers
        vlan_id: 101
        mapped_vni: 10101
      - name: DB_Servers
        vlan_id: 102
        mapped_vni: 10102
      - name: vMotion
        vlan_id: 103
        mapped_vni: 10103
    state: merged

What just happened:

  • The task calls cisco.nxos.nxos_vlans (FQCN) and provides a config list with VLAN names, IDs, and mapped VNIs.
  • state: merged tells the module to ensure these VLAN entries exist without removing unrelated configuration. The module will translate intent into the NX-OS CLI necessary to create or update VLANs and associated VNI mappings.

Real-world note: Use state: merged when you want to append or update configuration safely; use state: replaced only when you want to enforce an exact set.
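
As the VLAN list grows, a common refactor is to move the data into the role's vars directory and keep the task generic. A minimal sketch using the skeleton created in Step 3 (the variable name overlay_vlans is an illustrative choice):

```yaml
# overlay/vars/main.yml -- VLAN/VNI data lives with the role
overlay_vlans:
  - name: Web_Servers
    vlan_id: 101
    mapped_vni: 10101
  - name: DB_Servers
    vlan_id: 102
    mapped_vni: 10102
```

```yaml
# overlay/tasks/main.yml -- the task now only expresses intent
---
- name: Configure VLAN-to-VNI Mappings
  cisco.nxos.nxos_vlans:
    config: "{{ overlay_vlans }}"
    state: merged
```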

Verify:

  1. Run the playbook targeting the leafs (example playbook shown below; the common and underlay roles come from earlier lessons in this series):
# main.yml (playbook)
---
- hosts: spines, leafs
  gather_facts: false
  roles:
    - role: common
    - role: underlay

- hosts: leafs
  gather_facts: false
  roles:
    - role: overlay

Run:

ansible-playbook -i inventory.yml main.yml

Expected result snippet (Ansible run summary):

PLAY [leafs] *******************************************************************

TASK [overlay : Configure VLAN-to-VNI Mappings] ********************************
ok: [10.15.1.13]
ok: [10.15.1.14]
ok: [10.15.1.15]

PLAY RECAP *********************************************************************
10.15.1.13                : ok=1    changed=0    unreachable=0    failed=0
10.15.1.14                : ok=1    changed=0    unreachable=0    failed=0
10.15.1.15                : ok=1    changed=0    unreachable=0    failed=0

  2. Verify on a leaf switch using NX-OS CLI to confirm the VLANs were created:
show vlan brief

Expected output (example):

VLAN Name                             Status    Ports
---- -------------------------------- --------- -------------------------------
1    default                          active    -
101  Web_Servers                      active    -
102  DB_Servers                       active    -
103  vMotion                          active    -

Real-world note: After applying overlay tasks, check related overlay state (NVE, EVPN routes) before declaring the change fully operational. Automating verification steps (post-change tests) is critical in production.
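
Those post-change tests can themselves be automated. A hedged sketch that gathers VLAN state with the same resource module and asserts the overlay VLANs exist (the playbook name and fail message are illustrative):

```yaml
# verify_overlay.yml -- illustrative post-change check for the leafs
---
- hosts: leafs
  gather_facts: false
  tasks:
    - name: Gather current VLAN state
      cisco.nxos.nxos_vlans:
        state: gathered
      register: vlan_state

    # selectattr with the 'in' test keeps only the overlay VLAN IDs
    - name: Assert VLANs 101-103 are present
      ansible.builtin.assert:
        that:
          - vlan_state.gathered | selectattr('vlan_id', 'in', [101, 102, 103]) | list | length == 3
        fail_msg: "One or more overlay VLANs are missing"
```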

Verification Checklist

  • Check 1: Collections installed — run ansible-galaxy collection list and confirm cisco.nxos, cisco.ios, cisco.asa are listed.
  • Check 2: Inventory correctness — run cat inventory.yml and ensure spine and leaf IPs (10.15.1.11..15) are present and ansible_connection is set to ansible.netcommon.network_cli.
  • Check 3: Role skeleton — run tree overlay and confirm tasks/main.yml exists.
  • Check 4: VLANs deployed — run show vlan brief on a leaf and confirm VLANs 101, 102, 103 exist with the expected names.

Common Mistakes

Symptom | Cause | Fix
"collection not found" error from ansible-galaxy | Collection not installed or wrong collection name | Re-run ansible-galaxy collection install <collection> with the correct name (e.g., cisco.nxos)
Playbook fails to connect to devices | Wrong connection plugin or credentials | Ensure ansible_connection: ansible.netcommon.network_cli and the ansible_user/ansible_password values are correct; use Vault for secrets in production
Module not found when using a short name (e.g., nxos_vlans) | Short names are ambiguous when multiple collections provide similar modules | Use the FQCN, e.g., cisco.nxos.nxos_vlans
VLANs created but VNIs not mapped, or NVE shows no VNIs | Overlay role ran against the wrong group, or wrong state value | Verify the play targets leafs, check role/task placement, and confirm state (merged vs replaced)

Key Takeaways

  • Use Ansible Galaxy collections (cisco.nxos, cisco.ios, cisco.asa) to access vendor-supported modules — this improves reliability and maintainability in production.
  • Always call network modules by their FQCN to avoid ambiguity and ensure the correct collection provides the behavior you expect.
  • Inventory variables (ansible_connection, ansible_user, ansible_password, ansible_network_os) are critical — they define how Ansible connects and which network platform modules to use.
  • Structure automation with roles (common, underlay, overlay) so that you can deploy changes predictably and reuse components across environments.

Warning: Never store plaintext production credentials in inventory files; use Ansible Vault or a secrets manager in real networks.