Lesson 3 of 7

Unsupervised Learning — Anomaly Detection

Objective

In this lesson you will configure routers to export flow telemetry (NetFlow) and use on-device flow inspection to create the data feed that an unsupervised learning system can cluster for anomaly detection. You will see how to enable flow collection, verify exported flow records, generate baseline traffic, simulate a traffic anomaly, and inspect flow state locally. This matters in production because flow telemetry is the raw input for ML models that automatically detect unusual behavior such as DDoS, scanning, or data exfiltration — enabling faster, data-driven responses.

Quick Recap

We continue with the topology used in Lesson 1: two edge routers (R1 and R2), a distribution switch (SW1), and two hosts (HostA, HostB). For this lesson we add a flow collector / ML appliance (Collector) that receives NetFlow export. No other devices or IPs from the original topology are changed.

ASCII topology (exact IPs shown on every interface)

                 +-------------------------+
                 |        Collector        |
                 |     192.168.100.10      |
                 +------------+------------+
                              |
                      192.168.100.1/24
                              |
                 +------------+------------+
                 |           R1            |
                 | Gi0/0: 10.0.0.1/24      |
                 | Gi0/1: 192.168.100.1/24 |
                 +------+-----------+------+
                        |           |
                       SW1          R2 (10.0.1.1/24)
                      /   \
                 HostA     HostB
          10.0.0.10/24     10.0.0.11/24

Device table

Device     Role                    Key IP(s)
R1         Edge router / exporter  Gi0/0: 10.0.0.1, Gi0/1: 192.168.100.1
R2         Upstream router         Gi0/0: 10.0.1.1
SW1        Distribution switch     (Layer 2)
HostA      Client                  10.0.0.10
HostB      Server                  10.0.0.11
Collector  NetFlow / ML engine     192.168.100.10

Key: we will configure NetFlow on R1 to export to Collector (192.168.100.10). R1’s Gi0/1 is the path to the collector and upstream traffic.

Key Concepts (theory + practical)

  • NetFlow (flow telemetry) — NetFlow is a router feature that aggregates packet headers into flow records (5-tuple + counters). In production networks NetFlow is commonly used as the data source for analytics and ML systems because it summarizes traffic with low overhead while preserving useful signals (source/dest IP, ports, bytes, packets, protocol, timestamps).

    When NetFlow export is enabled, the router periodically sends template records (NetFlow v9/IPFIX) and the aggregated flow records to the collector. The collector reconstructs flows and timestamps for downstream analysis.

  • Unsupervised learning & clustering — Algorithms like k-means or DBSCAN group flow records by similarity (e.g., byte/packet counts, duration, ports, protocol). Anomalies are points or clusters that diverge from the baseline.

    In production, clustering helps flag outliers (e.g., a host suddenly sending 10x bytes to many destinations).

  • Why sample vs. full flow — Full NetFlow captures every flow but can be heavy at high throughput. Sampling reduces volume but also reduces signal fidelity. For anomaly detection, choose sampling carefully: too aggressive sampling can hide short-lived anomalies (e.g., port scans).

    Use sampled NetFlow where capacity is limited; use unsampled where full fidelity is required.

  • On-device inspection — Routers provide immediate visibility via commands like show ip cache flow — useful for quick investigations before the ML pipeline processes telemetry.

    On-device checks are fast for operational troubleshooting while ML systems give long-term, statistical detection.

  • Protocol behavior — NetFlow v9/IPFIX uses templates: the exporter sends a template that describes the record format; the collector must receive the template before interpreting data. Template retransmission frequency matters to ensure collectors that restarted can re-learn format.
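To make the template dependency concrete, here is a minimal Python sketch of how a collector might decode NetFlow v9 FlowSets. It is an illustration only, not the Collector's actual implementation: the function name parse_flowset, the templates dictionary, and the sample bytes are assumptions made for this example, and only a handful of v9 field types are handled. Note how the same data FlowSet is undecodable until the template with the matching ID has been seen.

```python
import struct

# Subset of NetFlow v9 field type numbers used in this sketch.
FIELD_NAMES = {1: "in_bytes", 2: "in_pkts", 4: "protocol",
               8: "ipv4_src_addr", 12: "ipv4_dst_addr"}
ADDRESS_FIELDS = {8, 12}

templates = {}  # template_id -> [(field_type, field_length), ...]


def parse_flowset(flowset):
    """Decode one FlowSet. Returns [] for templates, None if the template is unknown."""
    flowset_id, length = struct.unpack("!HH", flowset[:4])
    body = flowset[4:length]
    if flowset_id == 0:  # template FlowSet: learn the record layout
        offset = 0
        while offset + 4 <= len(body):
            template_id, field_count = struct.unpack("!HH", body[offset:offset + 4])
            offset += 4
            fields = []
            for _ in range(field_count):
                fields.append(struct.unpack("!HH", body[offset:offset + 4]))
                offset += 4
            templates[template_id] = fields
        return []
    if flowset_id not in templates:  # data arrived before its template
        return None
    fields = templates[flowset_id]
    record_len = sum(flen for _, flen in fields)
    records = []
    offset = 0
    # Trailing padding is shorter than one record, so the loop stops before it.
    while record_len and offset + record_len <= len(body):
        record = {}
        for ftype, flen in fields:
            raw = body[offset:offset + flen]
            name = FIELD_NAMES.get(ftype, f"field_{ftype}")
            if ftype in ADDRESS_FIELDS:
                record[name] = ".".join(str(b) for b in raw)
            else:
                record[name] = int.from_bytes(raw, "big")
            offset += flen
        records.append(record)
    return records


# Hypothetical template 256: src addr, dst addr, protocol, packets, bytes.
layout = [(8, 4), (12, 4), (4, 1), (2, 4), (1, 4)]
template_fs = struct.pack("!HHHH", 0, 8 + 4 * len(layout), 256, len(layout))
template_fs += b"".join(struct.pack("!HH", t, n) for t, n in layout)
record = bytes([10, 0, 0, 1, 10, 0, 1, 1, 1]) + struct.pack("!II", 200, 24000)
data_fs = struct.pack("!HH", 256, 4 + len(record) + 3) + record + b"\x00" * 3

early = parse_flowset(data_fs)   # template not yet known
parse_flowset(template_fs)       # collector learns template 256
late = parse_flowset(data_fs)    # same bytes, now decodable
```

Here early is None because the data arrived before its template, while late holds one decoded record. This is why a collector that restarts stays blind until the exporter resends its templates.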

Step-by-step configuration

Step 1: Configure NetFlow export on R1

What we are doing: We enable NetFlow export on R1 so that the router sends flow records to the Collector (192.168.100.10). This supplies the ML appliance with the telemetry it needs for clustering and anomaly detection.

R1# configure terminal
R1(config)# ip flow-export destination 192.168.100.10 2055
R1(config)# ip flow-export version 9
R1(config)# ip flow-export template timeout-rate 1
R1(config)# ip flow-cache timeout active 60
R1(config)# exit
R1# write memory

What just happened:

  • ip flow-export destination 192.168.100.10 2055 defines the collector IP and UDP port where NetFlow data is sent.
  • ip flow-export version 9 selects NetFlow v9 (template-based), which is widely used and flexible for ML systems.
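  • ip flow-export template timeout-rate 1 sets the template resend interval to 1 minute, so a restarted collector quickly receives a template again.
  • ip flow-cache timeout active 60 sets the active-flow timeout. The argument is in minutes, not seconds, so long-lived flows are exported every 60 minutes, while idle flows are still flushed by the inactive timeout (15 seconds by default). For near-real-time detection a much lower value, such as 1, is typical; the setting trades timeliness against export volume.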
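The interaction between the active and inactive timeouts can be sketched with a small Python model. This is a simplified illustration, not how IOS implements its cache: the function export_times and its arguments are hypothetical, and the values assume the settings that appear in this lesson's cache output (active 60 minutes, inactive 15 seconds).

```python
ACTIVE_TIMEOUT = 60 * 60   # seconds; the lesson's cache output reports "active 60 minutes"
INACTIVE_TIMEOUT = 15      # seconds; the inactive value reported in the cache output


def export_times(packet_times, active=ACTIVE_TIMEOUT, inactive=INACTIVE_TIMEOUT):
    """Return (export_time, packet_count) tuples for a single flow key.

    packet_times: sorted packet arrival times in seconds for that key.
    An entry expires when it has been idle for `inactive` seconds or has
    existed for `active` seconds, whichever comes first.
    """
    exports = []
    start = last = None
    count = 0
    for t in packet_times:
        if start is not None:
            expiry = min(last + inactive, start + active)
            if t >= expiry:                 # entry aged out before this packet
                exports.append((expiry, count))
                start = None
        if start is None:                   # open a new cache entry
            start, count = t, 0
        last = t
        count += 1
    if start is not None:                   # final entry ages out after the last packet
        exports.append((min(last + inactive, start + active), count))
    return exports


short_burst = export_times(range(20))          # one packet per second for 20 s
long_flow = export_times(range(0, 7201, 10))   # one packet every 10 s for 2 h
```

In this model the short burst is exported once, 15 seconds after its last packet, while the long flow is cut into one record per active-timeout interval. A short-lived flow therefore reaches the collector quickly regardless of the active timeout; the active value mainly governs how stale the data for long-running flows can get.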

Real-world note: In production, coordinate the export port and template behavior with the collector team; mismatched ports or missing templates appear as “uninterpretable” data on the collector.

Verify:

R1# show ip flow export
Exporting flows to 192.168.100.10 (via GigabitEthernet0/1)
  VRF: (none)
  Export format: NetFlow v9
  Exporter: 192.168.100.10
  Destination Port: 2055
  Templates were sent 5 times
  Template data records sent: 5
  Flow records exported: 0
  Uptime for flow exporter: 00:05:12
  Export packets sent: 5
  No export access-lists configured

Step 2: Enable NetFlow accounting on interfaces

What we are doing: NetFlow only collects flows on interfaces where accounting is enabled. We turn on NetFlow (flow caching) on the interface(s) that see traffic to/from hosts and upstream.

R1# configure terminal
R1(config)# interface GigabitEthernet0/0
R1(config-if)# ip route-cache flow
R1(config-if)# exit
R1(config)# interface GigabitEthernet0/1
R1(config-if)# ip route-cache flow
R1(config-if)# end
R1# write memory

What just happened:

  • ip route-cache flow enables NetFlow caching on the specified interface. Packets received on that interface are evaluated and aggregated into flow entries. Enabling it on both the LAN-facing and upstream interfaces captures both directions of each conversation.

Real-world note: In high-throughput interfaces consider sampled NetFlow or dedicated flow processors; enabling flow on all interfaces without planning can impact CPU on lower-end routers.
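To get a feel for what sampling costs in detection fidelity, the short Python sketch below estimates how often a flow is missed entirely. It assumes independent random 1-in-N packet sampling, which is a simplification (deterministic samplers behave differently), and the function name miss_probability is hypothetical.

```python
def miss_probability(packets_in_flow, sample_rate):
    """Chance that random 1-in-N packet sampling sees none of a flow's packets."""
    return (1 - 1 / sample_rate) ** packets_in_flow


# Compare a short probe with longer flows at a 1-in-100 sampling rate.
for pkts in (1, 3, 20, 200):
    print(f"{pkts:>4} packets: missed with probability {miss_probability(pkts, 100):.3f}")
```

Under these assumptions a 3-packet probe is missed about 97% of the time at 1-in-100 sampling, which is why aggressive sampling tends to hide port scans while still catching high-volume flows.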

Verify:

R1# show running-config interface GigabitEthernet0/0
Building configuration...

Current configuration : 91 bytes
!
interface GigabitEthernet0/0
 ip address 10.0.0.1 255.255.255.0
 ip route-cache flow
end

R1# show running-config interface GigabitEthernet0/1
Building configuration...

Current configuration : 98 bytes
!
interface GigabitEthernet0/1
 ip address 192.168.100.1 255.255.255.0
 ip route-cache flow
end

Step 3: Generate baseline traffic (ping from R1 to HostB)

What we are doing: We create normal baseline-like traffic so the router produces typical flow records. Baseline data is what unsupervised models use to learn normal behavior.

R1# ping 10.0.0.11 repeat 20
Type escape sequence to abort.
Sending 20, 100-byte ICMP Echos to 10.0.0.11, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!
Success rate is 100 percent (20/20), round-trip min/avg/max = 1/2/10 ms

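What just happened: R1 sent 20 ICMP echo requests to HostB. Because the packets share the same flow key (source, destination, protocol, and input interface), NetFlow aggregates them into a single ICMP flow entry with a packet count of 20 rather than 20 separate flows. The entry is exported to the collector once it expires: after the inactive timeout when the pings stop, or at the active timeout if the flow is still running.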
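The aggregation step can be sketched in a few lines of Python. This is a toy model of a flow cache, not router code: the aggregate function and the packet dictionaries are hypothetical, and the 120-byte packet size is an assumption chosen only so the totals match the byte count in the sample cache output below.

```python
from collections import defaultdict


def aggregate(packets):
    """Group packets into flow entries keyed by interface, addresses, protocol and ports."""
    cache = defaultdict(lambda: {"pkts": 0, "bytes": 0})
    for p in packets:
        key = (p["in_if"], p["src"], p["dst"], p["proto"], p["sport"], p["dport"])
        cache[key]["pkts"] += 1
        cache[key]["bytes"] += p["size"]
    return dict(cache)


# 20 identical ICMP echoes (protocol 1, no ports) from 10.0.0.1 to HostB.
pings = [
    {"in_if": "Gi0/0", "src": "10.0.0.1", "dst": "10.0.0.11",
     "proto": 1, "sport": 0, "dport": 0, "size": 120}
    for _ in range(20)
]
cache = aggregate(pings)
```

All 20 packets land in one cache entry, which is why the verification output shows a single ICMP flow line with a packet count of 20.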

Real-world note: Use representative baseline traffic (web, DNS, file transfer) so the ML model learns typical behavior across multiple protocols and flows.

Verify:

R1# show ip cache flow
IP packet size distribution
  64 bytes:         20 packets
  576 bytes:        0 packets
  1500 bytes:       0 packets

Protocol statistics:
  ICMP: packets 20, bytes 2400
  TCP:  packets 0, bytes 0
  UDP:  packets 0, bytes 0

Flow TTL: 64
Flow timeout: active 60 minutes, inactive 15 seconds

Cache hash: 8 buckets, 0 collisions

TCP flows:
  None

UDP flows:
  None

ICMP flows:
  SrcIf  SrcIP        DstIf  DstIP         Pkts  Bytes

  GigabitEthernet0/0 10.0.0.1  GigabitEthernet0/0 10.0.0.11  20    2400

Step 4: Simulate an anomaly — traffic spike / scan

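What we are doing: We simulate an anomalous event by sending a burst of traffic toward R2. A single ping target produces one high-volume flow (a volume spike); to imitate a scan or DDoS symptom, with many short-lived flows to many destinations, script pings across multiple targets. Either way, the resulting flow records should appear as outliers when clustered.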

R1# ping 10.0.1.1 repeat 200
Type escape sequence to abort.
Sending 200, 100-byte ICMP Echos to 10.0.1.1, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Success rate is 100 percent (200/200), round-trip min/avg/max = 3/5/25 ms

What just happened: R1 sent 200 ICMP echo requests to R2 in quick succession. Because they share one flow key, the cache shows a single flow with a high packet and byte count; scripting multiple destinations would instead produce many short-lived flows. The exporter sends these records to the collector when the flows age out (inactive timeout) or reach the active timeout.

Real-world note: Attackers often create large numbers of short flows across many ports and destinations — clustering algorithms usually spot a sudden increase in distinct flows or a cluster of high-volume flows as anomalous.
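Before clustering, flow records are usually summarized into per-host features such as flow count and destination fan-out. The Python sketch below shows one way this might look. The function source_features, the record fields, and the sample data (including the scanning host 10.0.0.99) are hypothetical and not part of the lab topology.

```python
from collections import defaultdict


def source_features(flows):
    """Summarise flow records per source IP for use as clustering features."""
    acc = defaultdict(lambda: {"flows": 0, "dsts": set(), "ports": set(), "bytes": 0})
    for f in flows:
        a = acc[f["src"]]
        a["flows"] += 1
        a["dsts"].add(f["dst"])
        a["ports"].add(f["dport"])
        a["bytes"] += f["bytes"]
    return {
        src: {
            "flows": a["flows"],
            "distinct_dsts": len(a["dsts"]),
            "distinct_ports": len(a["ports"]),
            "bytes": a["bytes"],
        }
        for src, a in acc.items()
    }


# Hypothetical records: a client talking to one server, and a host sweeping a subnet.
normal = [{"src": "10.0.0.10", "dst": "10.0.0.11", "dport": 443, "bytes": 1500}] * 3
scan = [{"src": "10.0.0.99", "dst": f"10.0.1.{i}", "dport": 22, "bytes": 60}
        for i in range(1, 51)]
features = source_features(normal + scan)
```

The scanning host stands out by its distinct-destination count rather than its byte volume, which is why fan-out features are worth exporting alongside packet and byte counters.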

Verify:

R1# show ip cache flow
IP packet size distribution
  64 bytes:         220 packets
  576 bytes:        0 packets
  1500 bytes:       0 packets

Protocol statistics:
  ICMP: packets 220, bytes 26400
  TCP:  packets 0, bytes 0
  UDP:  packets 0, bytes 0

Flow TTL: 64
Flow timeout: active 60 minutes, inactive 15 seconds

Cache hash: 8 buckets, 1 collisions

ICMP flows:
  SrcIf  SrcIP        DstIf  DstIP         Pkts  Bytes

  GigabitEthernet0/0 10.0.0.1  GigabitEthernet0/1 10.0.1.1  200   24000
  GigabitEthernet0/0 10.0.0.1  GigabitEthernet0/0 10.0.0.11  20    2400

Also verify exporter statistics increased:

R1# show ip flow export
Exporting flows to 192.168.100.10 (via GigabitEthernet0/1)
  VRF: (none)
  Export format: NetFlow v9
  Exporter: 192.168.100.10
  Destination Port: 2055
  Templates were sent 6 times
  Template data records sent: 6
  Flow records exported: 420
  Uptime for flow exporter: 00:12:34
  Export packets sent: 12
  No export access-lists configured

Step 5: Use local top-talkers for immediate anomaly triage

What we are doing: Before or alongside ML detection, use the router’s local top-talkers inspection to quickly identify hosts generating abnormal flow counts. This helps validate whether the anomaly is real and where it originates.

R1# show ip cache flow
(IP flow cache output same as previous; to narrow it to the hosts of interest, filter the output:)
R1# show ip cache flow | include 10.0.0.1|10.0.0.11|10.0.1.1
  GigabitEthernet0/0 10.0.0.1  GigabitEthernet0/1 10.0.1.1  200   24000
  GigabitEthernet0/0 10.0.0.1  GigabitEthernet0/0 10.0.0.11  20    2400

What just happened: You queried the flow cache and filtered for the IPs of interest. The output shows that 10.0.0.1 → 10.0.1.1 accounts for 200 packets / 24000 bytes — a clear spike versus the baseline to HostB. This quick triage can be used to create an alert or to provide context for the ML detection.

Real-world note: Router top-talkers are invaluable for incident response — they’re fast and available even if the collector or ML pipeline is delayed.
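If you capture the flow lines off the router, ranking them takes only a few lines of Python. The sketch below assumes the simplified six-column layout shown in this lesson; real show ip cache flow output varies by platform and release, so the parser would need adjusting. The names parse_flow_lines and top_talkers are hypothetical.

```python
SAMPLE = """\
  SrcIf  SrcIP        DstIf  DstIP         Pkts  Bytes
  GigabitEthernet0/0 10.0.0.1  GigabitEthernet0/1 10.0.1.1  200   24000
  GigabitEthernet0/0 10.0.0.1  GigabitEthernet0/0 10.0.0.11  20    2400
"""


def parse_flow_lines(text):
    """Turn flow lines (lesson layout: 6 whitespace-separated columns) into dicts."""
    flows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 6 or not (parts[4].isdigit() and parts[5].isdigit()):
            continue  # skip headers, blank lines, and anything unexpected
        src_if, src_ip, dst_if, dst_ip, pkts, nbytes = parts
        flows.append({"src_if": src_if, "src_ip": src_ip, "dst_if": dst_if,
                      "dst_ip": dst_ip, "pkts": int(pkts), "bytes": int(nbytes)})
    return flows


def top_talkers(flows, n=5):
    """Return the n flows with the highest byte counts, largest first."""
    return sorted(flows, key=lambda f: f["bytes"], reverse=True)[:n]


ranked = top_talkers(parse_flow_lines(SAMPLE))
for f in ranked:
    print(f'{f["src_ip"]} -> {f["dst_ip"]}: {f["pkts"]} pkts, {f["bytes"]} bytes')
```

Sorting by bytes is one choice among several; ranking by packet count or by number of flows per source may surface scans more readily than byte volume does.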

Verify:

R1# show ip cache flow
IP packet size distribution
  64 bytes:         220 packets
  576 bytes:        0 packets
  1500 bytes:       0 packets

Protocol statistics:
  ICMP: packets 220, bytes 26400

ICMP flows:
  SrcIf  SrcIP        DstIf  DstIP         Pkts  Bytes
  GigabitEthernet0/0 10.0.0.1  GigabitEthernet0/1 10.0.1.1  200   24000
  GigabitEthernet0/0 10.0.0.1  GigabitEthernet0/0 10.0.0.11  20    2400

Verification Checklist

  • Check 1: NetFlow exporter configured and enabled — verify with show ip flow export and confirm Exporter IP and Flow records exported count.
  • Check 2: Flow accounting turned on for relevant interfaces — verify with show running-config interface GigabitEthernet0/0 and show running-config interface GigabitEthernet0/1 showing ip route-cache flow.
  • Check 3: Flow cache contains entries representing baseline and anomaly traffic — verify with show ip cache flow and inspect flows, packet and byte counters.

Common Mistakes

  • Symptom: show ip flow export shows 0 flows exported even though there is traffic.
    Cause: NetFlow not enabled on interfaces (no ip route-cache flow).
    Fix: Enable ip route-cache flow on the interfaces that see traffic.

  • Symptom: Collector receives data but cannot interpret records.
    Cause: Exporter using NetFlow v9 but collector lacking templates (or templates timeout).
    Fix: Ensure collector is listening on the configured UDP port and ip flow-export template timeout-rate is low enough; confirm collector supports v9/IPFIX.

  • Symptom: High CPU on router after enabling flow.
    Cause: Flow enabled on high-throughput interface without sampling on a low-capability router.
    Fix: Consider sampled NetFlow or hardware-assisted flow, or enable on fewer interfaces.

  • Symptom: Exported flow counts are lower than expected.
    Cause: Active timeout too long or flows are aggregated differently.
    Fix: Reduce ip flow-cache timeout active for more frequent exports, or adjust flow aging.

Key Takeaways

  • NetFlow is the fundamental telemetry feed used by ML systems for unsupervised clustering and anomaly detection — get the exporter and interface accounting right first.
  • Unsupervised learning (clustering) looks for deviations from the baseline; proper baseline data (representative traffic) is crucial.
  • On-device commands (show ip cache flow, show ip flow export) are essential for quick triage and to confirm the router-side telemetry pipeline before trusting downstream ML outputs.
  • In production, coordinate exporter format, port, and template behavior with the collector team; consider sampling vs. fidelity trade-offs for high-throughput links.

Tip: After finishing this lesson, hand the exported NetFlow data to your ML team or a lab collector and use clustering (k-means or density-based) to visualize flow clusters and validate that the simulated anomaly appears as an outlier.
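As a starting point for that validation, here is a minimal density-based clustering (DBSCAN) sketch in pure Python. It is a teaching aid under stated assumptions, not a production detector: the baseline flows are invented values scattered around the lesson's 20-packet ping, the eps and min_pts settings are arbitrary choices for this toy data, and a real pipeline would more likely use a maintained library implementation.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN. Returns one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE            # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nj = neighbours(j)
            if len(nj) >= min_pts:
                queue.extend(nj)         # j is a core point: keep expanding
    return labels


# (packets, bytes) per flow: five hypothetical baseline flows plus the simulated spike.
flows = [(18, 2160), (20, 2400), (22, 2640), (19, 2280), (21, 2520), (200, 24000)]
features = [(math.log10(p), math.log10(b)) for p, b in flows]
labels = dbscan(features, eps=0.2, min_pts=3)
anomalies = [flows[i] for i, label in enumerate(labels) if label == -1]
```

With this toy data the five baseline flows form one cluster and the 200-packet flow is labeled noise, so it ends up in anomalies. The log scaling keeps packet and byte counts on comparable ranges; with real telemetry, both the feature scaling and eps need tuning against a representative baseline.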