
Cisco Multivendor Telemetry Collection With Crosswork

Admin
March 26, 2026
Tags: multivendor telemetry, network telemetry, streaming telemetry, crosswork data gateway, network automation

Cisco Multivendor Telemetry Collection Solution

Introduction

Modern enterprise networks are rarely built on a single vendor's equipment. Routers from one manufacturer sit alongside switches from another, firewalls from a third, and wireless controllers from yet another. When it comes time to collect multivendor telemetry from all of these devices, engineers face a serious challenge: how do you unify data collection across platforms that speak different protocols, use different data models, and expose different metrics? The answer lies in a purpose-built collection architecture that abstracts the complexity of individual device protocols and presents a single, scalable interface for gathering and distributing network telemetry data.

This article explores how a centralized telemetry collection solution works in multivendor environments. We will cover the architecture and deployment models, walk through every supported collection protocol, explain the anatomy of collection job payloads, break down monitoring and troubleshooting workflows, and examine the powerful "collect once, distribute many" optimization paradigm. Whether you are building a custom automation pipeline or integrating with existing network management applications, this guide provides the technical depth you need to design and operate a production-grade network telemetry collection system.

By the end of this article, you will understand how to create collection jobs via both GUI and API, how to structure payloads for different sensor types, how to monitor collection and distribution health independently, and how to consume collected data in downstream applications using messaging buses like Kafka and gRPC.

What Is Multivendor Telemetry Collection?

At its core, multivendor telemetry collection is the process of gathering operational data from network devices made by different vendors through a single, unified collection layer. Rather than deploying separate collectors for each vendor's equipment or protocol, a centralized solution normalizes the collection workflow so that engineers interact with one set of APIs and one management interface regardless of the underlying device type.

The Problem It Solves

In a typical enterprise network, you might need to collect interface counters from Cisco IOS-XE routers using Model-Driven Telemetry (MDT), pull CPU utilization from third-party switches using SNMP, and gather routing table information from another vendor's equipment using gNMI. Without a unified collection layer, each of these workflows requires its own tooling, its own monitoring, and its own integration with downstream analytics platforms.

A centralized telemetry collection solution addresses this by providing:

  • Protocol abstraction — support for SNMP, MDT, gNMI, CLI, SNMP TRAP, and syslog through a single platform
  • Vendor-agnostic collection — the ability to collect from both Cisco and third-party devices
  • Unified API and GUI — a single interface for creating, managing, and monitoring collection jobs
  • Scalable distribution — collected data flows to downstream applications through standard messaging buses

Beyond Telemetry

Although streaming telemetry was the original use case for this type of collection architecture, its utility extends well beyond real-time metrics gathering. The same collection infrastructure serves three distinct categories of network data needs:

| Use Case Category | Description | Protocols Used |
|---|---|---|
| Telemetry Systems | Real-time operational metrics and counters | gNMI, MDT, SNMP (poll) |
| Inventory Systems | Device and component inventory data | SNMP, CLI, gNMI |
| Monitoring Systems | Network event detection and alerting | SNMP TRAP, Syslog |

This "single collection point" philosophy means that one deployment can satisfy telemetry, inventory, and monitoring requirements simultaneously, reducing operational overhead and eliminating redundant collection infrastructure.

Architecture of the Multivendor Telemetry Collection Solution

The collection architecture is built around a decoupled design where the collection layer operates independently from the application layer. This separation is critical for scalability and resilience.

Core Components

The architecture consists of several key components working together:

  1. Infrastructure Layer — provides the management plane, including the GUI and API interfaces used to configure and monitor collection jobs
  2. Data Gateway — the workhorse of the system, responsible for actually connecting to network devices, collecting data, and forwarding it to destinations
  3. Plugin System — protocol-specific plugins (MDT, SNMP, gNMI, CLI) that handle the details of each collection protocol
  4. Messaging Bus — Kafka or gRPC servers that act as the distribution layer between the Data Gateway and downstream applications
  5. Downstream Applications — customer or vendor applications (such as Grafana, InfluxDB, or custom tools) that consume the collected data

Decoupled Collection and Application Layers

One of the most important architectural decisions is the complete decoupling of the collection layer from the application layer. The Data Gateway operates as a standalone collection entity that can be deployed, scaled, and managed independently of the applications consuming its data. This provides several advantages:

  • Horizontal scaling — additional Data Gateway instances can be deployed to handle increased collection load without modifying the application layer
  • Application offloading — the collection workload is removed from application servers, freeing them to focus on data processing and presentation
  • N+M redundancy — multiple Data Gateway instances provide fault tolerance without requiring application-level redundancy changes

Pro Tip: When planning your deployment, size your Data Gateway instances based on the number of devices and collection cadence, not on the number of downstream applications. The collection side is almost always the bottleneck, not the distribution side.

Lab Topology Example

A practical lab environment for learning multivendor telemetry collection typically includes:

  • The infrastructure management platform with its GUI and API interfaces
  • One or more Data Gateway instances
  • Network devices from multiple vendors
  • A Kafka messaging bus for data distribution
  • Downstream visualization and storage tools such as Grafana, InfluxDB, and Kafka consumers

The target in a lab environment is often to collect interface counters from multiple devices using different protocols (MDT, SNMP, gNMI) and verify that the data arrives correctly at each destination.

What Protocols Does Multivendor Telemetry Support?

The collection solution supports a comprehensive set of protocols, each suited to different device types, data requirements, and operational scenarios.

SNMP (Simple Network Management Protocol)

SNMP remains one of the most widely supported monitoring protocols across network vendors. The collection platform supports both standard and proprietary MIBs for gathering data from any SNMP-capable device. For SNMP polling, sensor data is specified using:

  • A MIB name
  • A MIB table name
  • A MIB scalar variable

SNMP is particularly useful for third-party devices that may not support modern telemetry protocols.

Model-Driven Telemetry (MDT)

MDT provides streaming telemetry capabilities where the network device pushes data to the collector at a configured cadence. MDT supports:

  • OpenConfig YANG data models — vendor-neutral models for common operational data
  • Native YANG data models — vendor-specific models that expose platform-specific counters and features
  • OpenConfig data models with vendor extensions — hybrid models that augment standard schemas with vendor-specific leaves

Sensor data for MDT is specified using an XPath expression that identifies the data tree path within the YANG model.
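As a sketch, an MDT sensor entry might look like the following Python dict. The key names are illustrative placeholders modeled on the payload sections described later in this article, not the platform's exact schema; the XPath shown is the standard OpenConfig interface-counters path.

```python
# Illustrative MDT sensor entry; key names are placeholders.
mdt_sensor_entry = {
    "sensor_data": {
        "mdt_sensor": {
            # XPath into the OpenConfig YANG model for interface counters
            "path": "openconfig-interfaces:interfaces/interface/state/counters",
        }
    },
    "cadence": 30000,  # collection frequency in milliseconds (30 s)
}
```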

gNMI (gRPC Network Management Interface)

gNMI is a modern, gRPC-based protocol for network management that supports both streaming telemetry and configuration management. Like MDT, gNMI uses YANG data models and XPath-style paths for sensor specification. The key difference lies in message encoding: gNMI defines multiple encoding options:

| Encoding | Description |
|---|---|
| PROTO | Protocol Buffers binary encoding |
| JSON | Standard JSON encoding |
| JSON_IETF | JSON encoding following IETF conventions |
| ASCII | Human-readable text encoding |
| Binary | Raw binary encoding |

The actual encoding available depends on the vendor's implementation. Not all devices support all encoding types.

Pro Tip: When collecting from multi-vendor environments, pay close attention to which encodings each device supports. A mismatch between the requested encoding and what the device supports is one of the most common causes of collection failures.

CLI (Command Line Interface)

CLI-based collection allows data to be gathered by executing CLI commands on devices and parsing the output. Sensor data for CLI collection is simply the CLI command to execute. It is important to note that CLI collection support is limited to Cisco devices. For third-party devices, CLI collection would require additional customization.

SNMP TRAP

Unlike SNMP polling, which actively queries devices for data, SNMP TRAP collection listens for unsolicited notifications sent by devices. Sensor data for TRAP collection is specified using a Trap OID (Object Identifier) that identifies the specific trap to listen for.

Syslog

Syslog collection captures log messages sent by network devices. Sensor data for syslog is specified using a combination of:

  • Severity number — indicating the importance level of the message
  • Facility number — indicating the subsystem that generated the message
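These two numbers are the same facility and severity that make up a syslog message's PRI value: RFC 5424 defines PRI as facility times 8 plus severity. A small helper makes the relationship concrete:

```python
def syslog_pri(facility: int, severity: int) -> int:
    """Compose the syslog PRI value (RFC 5424: PRI = facility * 8 + severity)."""
    return facility * 8 + severity

def split_pri(pri: int) -> tuple[int, int]:
    """Recover (facility, severity) from a received PRI value."""
    return pri // 8, pri % 8

# Example: facility local7 (23) at severity warning (4) yields PRI 188.
```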

Protocol Comparison Summary

| Protocol | Direction | Sensor Specification | Multi-Vendor | Best For |
|---|---|---|---|---|
| SNMP Poll | Collector pulls | MIB name/table/scalar | Yes | Legacy devices, broad compatibility |
| MDT | Device pushes | YANG XPath | Cisco + select vendors | High-frequency Cisco telemetry |
| gNMI | Device pushes or collector pulls | YANG XPath | Yes | Modern multi-vendor streaming |
| CLI | Collector pulls | CLI command | Cisco only | Data not available via other protocols |
| SNMP TRAP | Device pushes | Trap OID | Yes | Event-driven monitoring |
| Syslog | Device pushes | Severity + Facility | Yes | Log aggregation and alerting |

Deployment Models for Multivendor Telemetry

The collection solution supports multiple deployment models to accommodate different organizational requirements and infrastructure architectures.

On-Premises Deployment

In an on-premises deployment, the entire collection infrastructure runs within the customer's data center. This model is suited for organizations that need to keep all collected telemetry data within their own network boundaries. On-premises deployments support:

  • One or more Data Gateway instances collecting from multi-vendor devices
  • Software offloading through dynamic Data Gateway deployment
  • Multi-vendor enablement across the device fleet
  • Multiple instances for large-scale environments

Cloud Deployment

Cloud-based deployments enable secure data collection and distribution to cloud-hosted applications. In this model, Data Gateway instances typically run on-premises, close to the devices, while forwarding collected data securely to cloud-based analytics and management platforms.

Customer Application Integration

For organizations building custom applications, the Data Gateway provides data collection capabilities that support both Cisco and third-party devices. This deployment model is particularly valuable for:

  • Custom SNMP and telemetry data collection
  • Third-party device data collection testing in customer labs
  • Integration with proprietary analytics and automation pipelines

Pro Tip: When evaluating deployment models, consider starting with a lab environment that mirrors your production topology. Use the lab to validate collection from all device types and protocols before deploying to production.

How to Create Collection Jobs for Multivendor Telemetry

Collection jobs are the fundamental unit of work in the telemetry collection system. Each job defines what data to collect, from which devices, using which protocol, and where to send the results. Jobs can be created in three ways: through the GUI, through the API (using tools like Postman), or through automation tools.

Creating Jobs via the GUI

The graphical interface provides a guided workflow for creating collection jobs. This is the simplest method and is ideal for ad-hoc collection needs or for learning the system. The GUI is part of the infrastructure management platform and provides forms for specifying all job parameters.

Creating Jobs via the API

For programmatic and repeatable job creation, the API accepts JSON payloads that define every aspect of a collection job. This method is essential for integrating collection into automation pipelines.

Creating Jobs via Automation Tools

For large-scale deployments, automation tools can be used to create collection jobs programmatically, enabling infrastructure-as-code approaches to telemetry collection management.

Anatomy of a Collection Job Payload

The API payload for creating a collection job (createCollectionJob) consists of five major sections:

1. Device Set

The device set defines which network devices are in scope for the collection job. Devices can be specified in two ways:

  • By individual device — using an array of device_ids identified by their Universally Unique Identifiers (UUIDs). Device UUIDs can be retrieved either through the GUI or via a nodes API query.
  • By device group — using a device group ID identified by device "tags" configured in the management UI. This is useful for collecting from logical groups of devices (e.g., all core routers or all branch switches).
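The two device-set shapes can be sketched as follows. The UUIDs, the group name, and the key names are placeholders for illustration, not the platform's exact schema:

```python
# Illustrative device-set fragments; UUIDs, group name, and key names
# are placeholders.
device_set_by_uuid = {
    "device_ids": [
        "11111111-1111-1111-1111-111111111111",  # placeholder node UUID
        "22222222-2222-2222-2222-222222222222",
    ]
}

device_set_by_group = {
    "device_group": "core-routers",  # device tag configured in the management UI
}
```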

2. Sensor Input Configuration

The sensor_input_configs section is a list of sensor definitions, each containing:

  • sensor_data — the specific data to collect (varies by protocol type)
  • cadence — the collection frequency in milliseconds

The sensor data field accepts one of seven types depending on the collection protocol:

| Sensor Type | Protocol | Data Specification |
|---|---|---|
| snmp_sensor | SNMP | MIB name, table, or scalar variable |
| cli_sensor | CLI | CLI command string |
| mdt_sensor | MDT | YANG XPath expression |
| gnmi_sensor | gNMI | YANG XPath expression |
| gnmi_standard_sensor | gNMI (standard) | YANG XPath expression |
| trap_sensor | SNMP TRAP | Trap OID |
| syslog_sensor | Syslog | Severity and Facility numbers |

A single collection job can include multiple sensor data entries for different sensors, allowing you to collect multiple metrics in a single job.
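A multi-sensor input list might be sketched like this, mixing an SNMP sensor and an MDT sensor in one job. Key names are illustrative placeholders; IF-MIB/ifTable and the OpenConfig path are standard identifiers:

```python
# Illustrative sensor_input_configs list with two sensors in one job;
# key names are placeholders, not the platform's exact schema.
sensor_input_configs = [
    {   # SNMP sensor: poll the ifTable from IF-MIB every 60 s
        "sensor_data": {"snmp_sensor": {"mib": "IF-MIB", "table": "ifTable"}},
        "cadence": 60000,  # milliseconds
    },
    {   # MDT sensor: stream interface counters every 30 s
        "sensor_data": {"mdt_sensor": {
            "path": "openconfig-interfaces:interfaces/interface/state/counters"}},
        "cadence": 30000,
    },
]
```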

3. Sensor Output Configuration

The sensor_output_configs section defines where collected data should be sent. It is also a list, and each entry contains:

  • sensor_data — must match the corresponding sensor data from the input configuration
  • destination — the Kafka or gRPC server to receive the data

The destination is composed of two identifiers:

  • destination_id — the UUID of the Kafka or gRPC server, previously configured under Data Gateway Global Settings in the GUI
  • context_id — the Kafka topic name. If the destination is a gRPC server, the context_id is ignored

Pro Tip: Only one destination can be defined per sensor_data entry. If you need to send the same sensor data to multiple destinations, you will need to create separate collection jobs or leverage the collect-once-distribute-many paradigm described later in this article.

4. Application Context

The application context serves as the unique identifier for each collection job within the system. It consists of two user-defined strings:

  • context_id — a string identifying the collection context
  • application_id — a string identifying the application requesting the collection

Together, these values form a unique job identifier. Duplicate application contexts are not allowed — each collection job must have a unique combination of context_id and application_id.

5. Collection Mode

The collection mode specifies which protocol to use for the collection job. This determines how the Data Gateway communicates with the target devices to gather the requested sensor data.

Payload Example Walkthrough

A complete collection job payload brings all five sections together. Here is the logical flow of a typical payload:

  1. Specify the Node UUID of the target device
  2. Define the YANG Data Model and Data Tree Path for the sensor input
  3. Set the Kafka UUID as the destination
  4. Specify the Kafka Topic name for data routing
  5. Assign a unique Application Context for job identification

Each of these elements maps directly to a field in the JSON payload, making it straightforward to construct jobs programmatically once you understand the structure.
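The five steps above can be sketched as one assembled payload. Everything here is a hedged illustration: the UUIDs, topic name, key names, and API URL are placeholders, and the real field schema comes from your platform's API reference.

```python
import json
import urllib.request

# Hypothetical end-to-end createCollectionJob payload assembling the five
# sections described above. All identifiers below are placeholders.
INTERFACE_COUNTERS = "openconfig-interfaces:interfaces/interface/state/counters"

payload = {
    # 1. Device set: the target device's node UUID (placeholder value)
    "device_set": {"device_ids": ["11111111-1111-1111-1111-111111111111"]},
    # 2. Sensor input: YANG data model path and collection cadence
    "sensor_input_configs": [
        {"sensor_data": {"mdt_sensor": {"path": INTERFACE_COUNTERS}},
         "cadence": 30000},  # milliseconds
    ],
    # 3. Sensor output: Kafka server UUID plus topic name (context_id);
    #    sensor_data must match the input entry above
    "sensor_output_configs": [
        {"sensor_data": {"mdt_sensor": {"path": INTERFACE_COUNTERS}},
         "destination": {
             "destination_id": "22222222-2222-2222-2222-222222222222",
             "context_id": "telemetry-interface-counters",  # Kafka topic
         }},
    ],
    # 4. Application context: the combination must be unique per job
    "application_context": {"context_id": "lab-demo",
                            "application_id": "my-telemetry-app"},
    # 5. Collection mode: protocol used to reach the devices
    "collection_mode": "MDT",
}

def submit_job(api_url: str, token: str) -> bytes:
    """POST the payload to a (placeholder) collection-job endpoint."""
    req = urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # network call; not run here
        return resp.read()
```

Note how the output entry repeats the input's sensor_data verbatim; that match is what ties a destination to a sensor.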

How to Monitor Multivendor Telemetry Collection Jobs

Effective monitoring is essential for maintaining reliable telemetry collection at scale. The platform provides comprehensive monitoring capabilities on both the ingress (collection) and egress (distribution) sides.

Understanding Ingress and Egress Monitoring

Monitoring is implemented on both sides of the data pipeline:

  • Ingress monitoring — tracks incoming messages from network devices to the Data Gateway
  • Egress monitoring — tracks outgoing messages from the Data Gateway to the messaging bus (Kafka or gRPC)

This dual-sided monitoring is available through both the API and the GUI and applies to both internally initiated and customer-initiated collections.

Collection Monitoring

The collection monitoring interface provides a hierarchical view:

  1. Jobs List — a top-level view of all active collection jobs and their status
  2. Per-Job Device List — drill down into any job to see individual devices, including devices impacted by collection issues
  3. Per-Device Collection Metrics — detailed metrics for each device, including the collection protocol in use, success rates, and error information

This hierarchy makes it easy to identify exactly which devices are experiencing collection problems and what protocol-level issues may be causing them.

Distribution Monitoring

Distribution monitoring provides per-destination metrics that track whether collected data is successfully reaching its configured Kafka or gRPC endpoint.

A critical behavior to understand is the independence of collection and distribution status:

  • If a destination (Kafka or gRPC server) becomes unreachable, collection from network devices will still be reported as successful
  • However, distribution will be reported as failed
  • The overall job status will be marked as degraded

This distinction is important for troubleshooting. A degraded job status does not necessarily mean that data collection from devices has failed — it may mean that the downstream messaging bus is experiencing issues while collection continues normally.

Pro Tip: When investigating degraded collection jobs, always check both the collection and distribution monitoring views independently. The root cause is often on the distribution side (e.g., a Kafka broker that is down) rather than the collection side.
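The status logic described above can be captured in a few lines. This is a sketch of the reported behavior, not the platform's actual implementation:

```python
def job_status(collection_ok: bool, distribution_ok: bool) -> str:
    """Derive overall job status from independent collection and
    distribution health, per the behavior described above."""
    if collection_ok and distribution_ok:
        return "success"
    if collection_ok and not distribution_ok:
        # Devices are still being polled; only the messaging bus is failing.
        return "degraded"
    return "failed"
```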

Inside the Data Gateway Message Stream

Understanding the format and structure of Data Gateway messages is essential for building applications that consume collected telemetry data.

Message Format

Data Gateway messages follow the Google Protocol Buffers (protobuf) definition. The proto files that define the message schema can be compiled into client libraries for multiple programming languages, making it possible to build consumers in Python, Go, Java, and others.

These proto files must be used to parse messages that the Data Gateway posts to the Kafka or gRPC messaging bus. Without the correct protobuf definitions, consumer applications will not be able to deserialize the message payload.

Message Structure

Each Data Gateway message contains a header and a payload:

Message Header includes:

  • Node name — the hostname of the device that produced the data
  • Node UUID — the unique identifier of the source device
  • Collection start and end times — timestamps bracketing the collection interval
  • Sensor data — identifies which sensor path produced this data
  • Application Context and ID — maps the message back to the originating collection job

Message Payload:

The payload contains the actual collected data in the format dictated by the collection protocol and encoding.
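While prototyping a consumer, the header fields above could be mirrored in a plain dataclass. This is only a stand-in: the authoritative schema is whatever the compiled Data Gateway proto files generate.

```python
from dataclasses import dataclass

@dataclass
class MessageHeader:
    """Illustrative mirror of the Data Gateway message header fields;
    the real types come from the platform's protobuf definitions."""
    node_name: str             # hostname of the source device
    node_uuid: str             # unique identifier of the source device
    collection_start_ms: int   # collection interval start timestamp
    collection_end_ms: int     # collection interval end timestamp
    sensor_data: str           # sensor path that produced the data
    application_context: str   # maps back to the originating job
    application_id: str
```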

Customer Application Integration

The Data Gateway does not offer direct integration options with customer applications. Instead, it requires an external messaging bus — either Kafka or gRPC — as an intermediary.

The integration flow is:

  1. Data Gateway collects data from network devices
  2. Data Gateway publishes messages to the Kafka topic or gRPC server
  3. Customer applications consume messages from the messaging bus

The message format on the bus can be either PROTO (Protocol Buffers binary) or JSON, depending on the configuration.

If customer applications need data retention (historical queries rather than real-time streaming), they should implement an intermediate data lake between the messaging bus and the application. The messaging bus itself is a transit layer, not a storage layer.

The Collect Once, Distribute Many Paradigm in Multivendor Telemetry

One of the most powerful optimization features of the collection architecture is the collect once, distribute many paradigm. This feature dramatically reduces the load on network devices by ensuring that each data path is collected only once, regardless of how many applications request it.

How Cadence Optimization Works

When multiple applications request the same sensor path from the same device but at different cadences, the system applies intelligent optimization:

  1. Input-side optimization — the Data Gateway collects at the lowest cadence (highest frequency) configured across all requesting applications
  2. Output-side distribution — each destination receives data at its requested cadence, as long as it is a multiple of the input-side cadence
  3. Rounding behavior — if an output cadence is not an exact multiple of the input cadence, the output is rounded down to the nearest multiple

Practical Example

Consider three applications requesting the same sensor path from the same device:

| Job | Device | Sensor | Destination | Requested Cadence |
|---|---|---|---|---|
| Job A | Device 10 | Sensor X | Destination 1 | 5 seconds |
| Job B | Device 10 | Sensor X | Destination 2 | 25 seconds |
| Job C | Device 10 | Sensor X | Destination 3 | 43 seconds |

Without optimization, the Data Gateway would create three separate collection streams to the same device for the same data, tripling the load on the device. With the collect-once paradigm:

Input Stage:

  • Collection cadence is set to 5 seconds (the minimum of 5, 25, and 43)
  • Only one collection stream is created to the device

Output Stage:

  • Destination 1 receives data every 5 seconds (exact multiple of 5)
  • Destination 2 receives data every 25 seconds (exact multiple of 5)
  • Destination 3 receives data every 40 seconds (43 rounded down to nearest multiple of 5)
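The cadence arithmetic in this example can be captured in a few lines. This is a sketch of the stated rules, not the platform's actual scheduler:

```python
def effective_cadences(requested: list[int]) -> dict[int, int]:
    """Apply the collect-once rules: collect at the minimum requested
    cadence, then round each output cadence down to the nearest
    multiple of that base."""
    base = min(requested)
    return {r: (r // base) * base for r in requested}

# effective_cadences([5, 25, 43]) -> {5: 5, 25: 25, 43: 40}
```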

Cadence Removal Behavior

When a collection job with the lowest cadence is removed, the system automatically adjusts:

  • The next lowest cadence among remaining jobs becomes the new collection cadence
  • Distribution cadences for remaining destinations are recalculated based on the new input cadence

This dynamic adjustment ensures that collection always operates at the optimal frequency without manual intervention.

Pro Tip: When designing your collection jobs, be intentional about cadence values. Using cadences that are multiples of a common base (e.g., 5, 10, 15, 30 seconds) ensures that the rounding behavior does not cause unexpected deviations from your desired collection frequency.

Building a Python Consumer for Multivendor Telemetry Data

Once collection jobs are running and data is flowing to the messaging bus, the next step is building consumer applications that process the collected data. The reference architecture supports implementing consumers in any programming language that can compile Protocol Buffers definitions.

Consumer Architecture

A simple consumer application follows this pattern:

  1. Compile the protobuf definitions — use the Data Gateway proto files to generate language-specific message classes
  2. Connect to the messaging bus — establish a connection to the Kafka cluster or gRPC server
  3. Subscribe to topics — listen on the Kafka topic (context_id) configured in the collection job's output configuration
  4. Deserialize messages — use the compiled protobuf classes to parse incoming messages
  5. Process data — extract the relevant metrics from the message payload and forward them to your analytics pipeline, database, or alerting system
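The five steps above can be sketched as a minimal consumer loop. This assumes the third-party `kafka-python` package and a compiled protobuf class named `TelemetryMessage` generated from the Data Gateway proto files; both names are assumptions for illustration, not part of the platform's documented interface.

```python
def consume_telemetry(topic: str, brokers: list[str]) -> None:
    """Minimal consumer-loop sketch; assumes kafka-python and a
    hypothetical generated protobuf class TelemetryMessage."""
    from kafka import KafkaConsumer             # third-party: kafka-python
    from dg_proto_pb2 import TelemetryMessage   # hypothetical generated class

    # Subscribe to the Kafka topic (the job's context_id)
    consumer = KafkaConsumer(topic, bootstrap_servers=brokers)
    for record in consumer:
        msg = TelemetryMessage()
        msg.ParseFromString(record.value)  # deserialize the protobuf payload
        # Hand off to your analytics pipeline, database, or alerting system
        print(msg)
```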

Integration with Data Storage

For applications that require historical data analysis rather than just real-time streaming, an intermediate data lake should be implemented between the messaging bus and the application layer. Common architectures include:

  • Kafka to InfluxDB — for time-series metrics storage and visualization with Grafana
  • Kafka to Elasticsearch — for log-style data with Kibana dashboards
  • gRPC to custom database — for specialized analytics applications

The messaging bus serves as a transit layer, not a persistent storage layer. Customer applications must implement their own data retention strategy.

Optimizing Collection Jobs for Scale

As your telemetry collection deployment grows, optimization becomes critical for maintaining performance and minimizing impact on network devices.

Job Consolidation

Rather than creating individual collection jobs for each metric on each device, consolidate related sensors into fewer, broader jobs. Each collection job can contain multiple sensor input configurations, allowing you to collect several metrics from the same device in a single job.

Device Grouping

Use device tags and group-based device sets rather than individual device UUIDs when possible. This approach:

  • Simplifies job management as devices are added or removed
  • Enables automatic inclusion of new devices that match the group criteria
  • Reduces the number of API calls needed to manage collection

Cadence Planning

Align collection cadences across applications to maximize the benefit of the collect-once paradigm:

  • Use base cadences that are common factors (e.g., 5, 10, 30 seconds)
  • Avoid prime-number cadences that do not align well with other applications
  • Consider the minimum cadence your network devices can support without performance impact

Monitoring at Scale

At scale, proactive monitoring of both collection and distribution health is essential. Set up alerting on:

  • Degraded job status — indicates distribution issues even when collection succeeds
  • Failed collection metrics — indicates device-level issues (unreachable devices, authentication failures, unsupported paths)
  • Cadence drift — monitor whether actual collection frequency matches the configured cadence

Frequently Asked Questions

What protocols does multivendor telemetry collection support?

The solution supports six collection protocols: SNMP (polling with standard and proprietary MIBs), Model-Driven Telemetry (MDT), gNMI, CLI, SNMP TRAP, and Syslog. Each protocol is implemented as a plugin within the Data Gateway, and a single collection deployment can use all protocols simultaneously across different devices and collection jobs.

Can I collect telemetry from non-Cisco devices?

Yes, multivendor device support is a core capability. SNMP, gNMI, SNMP TRAP, and Syslog collection work with third-party devices that support these standard protocols. MDT collection works with devices that support YANG-based streaming telemetry. CLI collection, however, is currently limited to Cisco devices — third-party CLI collection would require additional customization.

How does collected data reach my custom applications?

The Data Gateway does not provide direct integration with customer applications. Instead, it publishes collected data to an external messaging bus — either Kafka or gRPC. Your applications connect to the messaging bus to consume data. Messages follow the Google Protocol Buffers (protobuf) format and must be deserialized using the Data Gateway proto files, which can be compiled for multiple programming languages. If you need data retention for historical analysis, you should implement an intermediate data lake between the messaging bus and your application.

What happens when a Kafka broker goes down?

If the distribution destination (Kafka or gRPC server) becomes unreachable, the Data Gateway continues collecting data from network devices — collection is reported as successful. However, distribution is reported as failed, and the overall job status is marked as degraded. This design ensures that temporary messaging bus outages do not disrupt the collection process itself. Once the destination recovers, distribution resumes.

How does the collect-once paradigm reduce device load?

When multiple applications request the same sensor path from the same device at different cadences, the Data Gateway collects only once at the lowest configured cadence (highest frequency). It then distributes data to each destination at the appropriate cadence. For example, if three applications request the same interface counters at 5, 25, and 43-second intervals, the device is polled only every 5 seconds. Destination 1 gets every sample, Destination 2 gets every fifth sample, and Destination 3 gets every eighth sample (43 rounded down to 40, which is 8 times 5).

Can I create collection jobs programmatically?

Yes, collection jobs can be created through the API using structured JSON payloads, through the GUI for manual operations, or through automation tools for large-scale deployments. The API approach is essential for integrating telemetry collection into infrastructure-as-code workflows and CI/CD pipelines. Each payload defines the device set, sensor inputs, output destinations, application context, and collection mode.

Conclusion

Building a unified multivendor telemetry collection architecture is essential for any organization managing a diverse network infrastructure. The key takeaways from this guide are:

  1. Protocol flexibility matters — supporting SNMP, MDT, gNMI, CLI, TRAP, and Syslog through a single platform eliminates the need for protocol-specific collection tools
  2. Decoupled architecture scales — separating the collection layer from the application layer enables horizontal scaling and N+M redundancy
  3. Structured payloads enable automation — understanding the five sections of a collection job payload (device set, sensor input, sensor output, application context, and collection mode) is the foundation for programmatic job management
  4. Dual-sided monitoring prevents blind spots — independently monitoring collection and distribution health reveals issues that a single status indicator would hide
  5. Collect once, distribute many saves resources — intelligent cadence optimization minimizes device load while satisfying multiple application requirements

As networks continue to grow in complexity and vendor diversity, mastering telemetry collection becomes an increasingly valuable skill for network engineers and automation professionals. The concepts covered in this article — from payload construction to cadence optimization — apply broadly across any environment where centralized, multivendor data collection is needed.

To deepen your understanding of network automation and telemetry, explore the hands-on courses available at NHPREP that cover related topics including gRPC, gNMI, model-driven telemetry, and network programmability.