Secure Firewall Clustering: Deployments and Enhancements
Introduction
Imagine your organization's perimeter firewall is processing traffic at near-maximum capacity, and a single hardware failure could bring down the entire security posture of your network. How do you scale throughput beyond the limits of a single appliance while maintaining stateful failover? The answer lies in firewall clustering — a technology that combines multiple physical or virtual firewalls into a single logical device, delivering both massive throughput and built-in redundancy.
Firewall clustering has been a cornerstone of enterprise network security for years, and the evolution of this technology continues to push boundaries. Modern deployments can achieve up to 1.8 Tbps of throughput with AVC and IPS enabled, spanning hardware appliances, private cloud hypervisors, and public cloud instances across AWS, Azure, and GCP. Whether you are designing a data center security architecture, preparing for a CCIE Security lab, or managing a production FTD cluster, understanding the deployment models, node roles, traffic flow mechanics, and recent enhancements is essential.
In this article, we will walk through every major aspect of Cisco Secure Firewall clustering: the fundamental architecture, cluster modes, cloud deployments with auto-scaling, hardware model considerations, integration with ACI and VPN, health monitoring, and backup and restore strategies.
What Is Firewall Clustering and Why Does It Matter?
At its core, firewall clustering combines multiple firewalls into a single logical device. This approach delivers two critical benefits:
- Increased Throughput — By distributing traffic processing across multiple nodes, a cluster can handle far more bandwidth than any individual appliance.
- Redundancy with Stateful Failover — If one node fails, the remaining nodes continue processing traffic without dropping established connections.
However, clustering is not free. Approximately 20% of capacity is consumed by overhead, which includes:
- Cluster Control Link (CCL) traffic between nodes
- CPU and memory overhead for cluster coordination
- Flow creation and state synchronization processes
Despite this overhead, the raw throughput numbers are remarkable. A 16-node cluster of 4245 appliances can achieve 1.79 Tbps with AVC and IPS enabled, where each individual node contributes approximately 140 Gbps. Alternatively, a 4-node cluster of 6170 appliances can reach 1.8 Tbps, with each node contributing 570 Gbps. An upcoming 16-node 6100 cluster will push these numbers even further.
Pro Tip: When sizing a cluster, always account for the ~20% overhead. Your effective throughput per node will be lower than the standalone specification due to CCL traffic and coordination costs.
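As a quick sanity check, the ~20% overhead can be folded directly into a sizing estimate. The sketch below is illustrative only; the 175 Gbps standalone rating is a hypothetical figure chosen so the per-node result lands near the ~140 Gbps cited above:

```python
def effective_per_node(standalone_gbps: float, overhead: float = 0.20) -> float:
    """Discount a node's standalone rating by the ~20% clustering overhead."""
    return standalone_gbps * (1 - overhead)

def cluster_throughput(standalone_gbps: float, nodes: int) -> float:
    """Aggregate effective throughput for a cluster of identical nodes."""
    return effective_per_node(standalone_gbps) * nodes

# A hypothetical node rated 175 Gbps standalone contributes ~140 Gbps
# once CCL traffic and coordination costs are accounted for.
print(round(effective_per_node(175), 1))      # 140.0
print(round(cluster_throughput(175, 16), 1))  # 2240.0
```

Treat the result as a planning ceiling, not a guarantee; real throughput also depends on traffic mix and enabled inspection features.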
How Do Cluster Nodes Work? Understanding Roles and States
A firewall cluster consists of one control node and one or more data nodes. All nodes process transit connections, but the control node carries additional responsibilities.
The Control Node
The control node is elected through the cluster control protocol and serves as the single authority for:
- Configuration replication to all data nodes
- Cluster control protocol management
- Control node election coordination
There is exactly one control node in any cluster at any given time.
The Data Nodes
All remaining nodes in the cluster are data nodes. They process transit traffic just like the control node but defer to the control node for configuration and cluster management decisions.
Node States in Traffic Processing
When traffic flows through a cluster, each node can take on one of four states relative to a given flow:
| State | Role |
|---|---|
| Owner | Receives the first packet of a flow and owns the connection |
| Backup Owner | Stores state information received from the owner for seamless transfer to a new owner if the original owner fails |
| Flow Director | Keeps track of the flow owner so other nodes can locate it |
| Flow Forwarder | Forwards packets to the owner when they arrive at the wrong node |
These states are assigned per-flow, not per-node. A single node can simultaneously be the owner of one flow, the backup of another, and the forwarder of yet another.
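A common way to assign the Flow Director deterministically is to hash the flow's identifying tuple across the cluster membership, so every node independently computes the same director without coordination. A minimal sketch of the idea (the hash function and node names here are illustrative; the platform's actual hash is internal):

```python
import hashlib

NODES = ["node_a", "node_b", "node_c"]

def flow_director(flow: tuple, nodes: list = NODES) -> str:
    """Deterministically map a flow's 5-tuple to a director node."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return nodes[int.from_bytes(digest[:4], "big") % len(nodes)]

flow = ("192.0.2.10", 51514, "10.1.1.5", 443, "tcp")
# Every node computes the same director for the same flow,
# so any node can locate the owner by asking the director.
assert flow_director(flow) == flow_director(flow)
```

Because the mapping is a pure function of the flow, no state needs to be distributed just to find the director itself.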
How Does Firewall Clustering Handle Asymmetric Traffic Flows?
One of the most important challenges in firewall clustering is handling asymmetric flows — situations where packets belonging to the same connection arrive at different cluster nodes. The cluster handles TCP and UDP asymmetric flows differently.
TCP Asymmetric Flow Handling
Consider a scenario with three cluster nodes (A, B, and C), a client, and a server:
- The client sends a TCP SYN, which arrives at Node A.
- Node A becomes the Owner, adds a TCP SYN Cookie to the packet, and delivers it to the server.
- The server responds with a TCP SYN-ACK, which arrives at Node C (a different node — this is the asymmetric part).
- Node C reads the TCP SYN Cookie, identifies Node A as the owner, redirects the packet to Node A, and becomes the Flow Forwarder.
- Node A delivers the TCP SYN-ACK to the client.
- Node A updates the Flow Director (Node B) with the flow ownership information.
The TCP SYN Cookie is the key mechanism that enables the cluster to reunite asymmetric TCP flows without requiring a director lookup on the initial SYN-ACK.
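The ownership-in-the-cookie idea can be sketched in a few lines. This is purely illustrative — the real cookie format is internal to the platform, and `OWNER_BITS` is an invented parameter — but it shows how a node ID embedded in the initial sequence number survives the round trip and comes back in the SYN-ACK's acknowledgment number:

```python
OWNER_BITS = 4  # hypothetical: low bits of the ISN reserved for the owner ID

def encode_owner(isn: int, node_id: int) -> int:
    """Stamp node_id into the low OWNER_BITS of the initial sequence number."""
    return (isn & ~((1 << OWNER_BITS) - 1)) | node_id

def decode_owner(ack: int) -> int:
    """Recover the owner's node ID from the SYN-ACK acknowledgment number.
    TCP acknowledges ISN + 1, so subtract 1 before masking."""
    return (ack - 1) & ((1 << OWNER_BITS) - 1)

cookie = encode_owner(0x1A2B3C40, node_id=3)  # Node A stamps itself as owner
assert decode_owner(cookie + 1) == 3          # Node C recovers the owner
```

The payoff is that the node receiving the asymmetric SYN-ACK recovers the owner locally, with no director query on the return path.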
UDP Asymmetric Flow Handling
UDP and other pseudo-stateful connections follow a different process because there is no SYN Cookie mechanism:
- The client sends a UDP packet, which arrives at Node A.
- Node A queries the Flow Director (Node B) to check if this flow already has an owner.
- The director responds: not found (it is a new connection).
- Node A becomes the Owner, delivers the packet to the server, and updates the director.
- The server responds, and the response arrives at Node C.
- Node C queries the Flow Director (Node B).
- The director returns the owner: Node A.
- Node C redirects the packet to Node A and becomes the Flow Forwarder.
- Node A delivers the response to the client.
Starting with version 10.0.0, a feature called Cluster Redirect flow offload further optimizes this process by reducing the overhead of repeated redirections for long-lived flows.
Pro Tip: Understanding the difference between TCP and UDP asymmetric flow handling is critical for troubleshooting cluster performance issues. TCP uses SYN Cookies for fast owner identification, while UDP requires director queries, adding a small amount of latency to the first few packets.
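The UDP sequence above can be simulated with a toy director table (class and node names are hypothetical, chosen to mirror the steps):

```python
class FlowDirector:
    """Toy stand-in for Node B's director role: maps flows to owners."""
    def __init__(self):
        self.owners = {}  # flow 5-tuple -> owning node name

    def query(self, flow):
        return self.owners.get(flow)  # None means "not found": new connection

    def update(self, flow, owner):
        self.owners[flow] = owner

director = FlowDirector()
flow = ("192.0.2.10", 40000, "10.1.1.5", 53, "udp")

# First packet arrives at Node A: no owner yet, so A claims the flow.
if director.query(flow) is None:
    director.update(flow, "node_a")

# Return packet arrives at Node C: the director resolves the owner,
# so C forwards to Node A instead of processing the flow itself.
owner = director.query(flow)
print(owner)  # node_a
```

The extra round trip to the director is exactly the latency cost the Pro Tip above refers to.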
Firewall Clustering Modes: Individual vs. Spanned
Cisco Secure Firewall clusters support two distinct interface modes, each suited to different network architectures.
Individual Mode
In Individual Mode, each cluster node maintains its own unique IP addresses on each interface. Load balancing to the cluster is performed at Layer 3 using one of three mechanisms:
- Policy-Based Routing (PBR)
- Equal-Cost Multi-Path (ECMP)
- Intelligent Traffic Director (ITD)
Individual mode operates in routed mode only and is supported on the following platforms:
- FTDv (virtual)
- 3100 series
- 4200 series
- 6100 series
For example, in a three-node individual mode cluster, each node would have its own outside and inside IP addresses:
| Node | Outside IP | Inside IP |
|---|---|---|
| Control | 192.0.2.2 | 10.1.1.2 |
| Data 1 | 192.0.2.3 | 10.1.1.3 |
| Data 2 | 192.0.2.4 | 10.1.1.4 |
The cluster would share a virtual IP for management purposes: 192.0.2.1 (outside) and 10.1.1.1 (inside).
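With per-node addresses like these, an upstream Layer 3 device can spread flows across the nodes by hashing each flow's 5-tuple onto the list of next hops. A rough sketch of the idea (CRC32 stands in for whatever hash the router actually uses):

```python
import zlib

# Per-node outside addresses from the table above.
NEXT_HOPS = ["192.0.2.2", "192.0.2.3", "192.0.2.4"]

def ecmp_next_hop(src: str, sport: int, dst: str, dport: int,
                  proto: str = "tcp", hops: list = NEXT_HOPS) -> str:
    """Hash a flow's 5-tuple to one of the equal-cost next hops."""
    key = f"{src}:{sport}->{dst}:{dport}/{proto}".encode()
    return hops[zlib.crc32(key) % len(hops)]

# All packets of a flow hash to the same node, keeping each flow
# symmetric from the router's point of view.
hop = ecmp_next_hop("198.51.100.7", 51514, "10.1.1.50", 443)
assert hop == ecmp_next_hop("198.51.100.7", 51514, "10.1.1.50", 443)
```

PBR and ITD achieve the same per-flow pinning with different selection mechanics; the common requirement is that a given flow always lands on the same node.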
Spanned Mode
In Spanned Mode, the cluster presents a single shared IP address on each interface, and load balancing is performed at Layer 2 using spanned EtherChannels. The upstream and downstream switches distribute traffic across the cluster nodes via EtherChannel hashing.
Spanned mode supports both routed and transparent firewall modes and is available on:
- 3100 series
- 4200 series
- 6100 series
Note that FTDv does not support spanned mode — virtual deployments are limited to individual mode.
| Feature | Individual Mode | Spanned Mode |
|---|---|---|
| Load Balancing Layer | Layer 3 (PBR/ECMP/ITD) | Layer 2 (EtherChannel) |
| Firewall Modes | Routed only | Routed and Transparent |
| IP Addressing | Unique per node | Single shared IP |
| Supported Platforms | FTDv, 3100, 4200, 6100 | 3100, 4200, 6100 |
Pro Tip: Choose individual mode when you need Layer 3 flexibility and are working with virtual firewalls. Choose spanned mode when your switching infrastructure supports EtherChannel and you need transparent mode support.
Cluster Control Link: The Backbone of Firewall Clustering
The Cluster Control Link (CCL) is an interface (or set of interfaces) dedicated to carrying control and data traffic between cluster nodes; it cannot share an interface with data traffic.
The CCL carries the following types of traffic:
- Heartbeat packets for health monitoring
- Cluster control protocol messages for data path coordination
- Control node election communications
- Configuration replication from the control node to data nodes
- Data packets belonging to traffic flows that need to be forwarded to other units (asymmetric flow handling)
- State replication for stateful failover
- Connection ownership queries from flow directors and forwarders
CCL in Virtual Deployments (VXLAN)
In private and public cloud deployments, the CCL uses VXLAN encapsulation with unicast transport. Each virtual firewall node has a VTEP (VXLAN Tunnel Endpoint) and a corresponding VTEP source interface. Traffic between nodes is encapsulated in VNI (VXLAN Network Identifier) tunnels.
VXLAN encapsulation adds 54 bytes of overhead to each packet. Combined with the standard cluster traffic overhead of 100 bytes, the total overhead is 154 bytes. Therefore:
CCL MTU = Data interface MTU + 154 bytes
This MTU requirement is critical. If the CCL MTU is not set correctly, cluster communication will experience fragmentation, leading to performance degradation or outright failures.
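The arithmetic is simple enough to encode directly as a sizing check (a sketch; jumbo-frame support and maximum MTU vary by platform):

```python
CLUSTER_OVERHEAD_BYTES = 100  # standard cluster forwarding overhead
VXLAN_OVERHEAD_BYTES = 54     # VXLAN encapsulation overhead

def ccl_mtu(data_interface_mtu: int) -> int:
    """Minimum CCL MTU for a VXLAN-transported cluster control link."""
    return data_interface_mtu + CLUSTER_OVERHEAD_BYTES + VXLAN_OVERHEAD_BYTES

print(ccl_mtu(1500))  # 1654
print(ccl_mtu(9000))  # 9154 -- jumbo-frame data interfaces
```
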
Firewall Clustering in the Private Cloud
Private cloud clustering supports individual mode only, with a maximum of 16 nodes arranged in configurations such as four FTDv nodes on each of four hypervisors (a 4x4 layout).
Key characteristics of private cloud clustering:
- CCL uses unicast VXLAN transport
- Nodes register to FMC first, then the cluster is built
- Each hypervisor hosts one or more FTDv instances
- VTEP source interfaces are configured on each virtual node
- VXLAN peers must be defined between nodes
The deployment workflow differs from hardware clusters:
- Deploy FTDv instances on each hypervisor
- Register all nodes to FMC
- Build the cluster from FMC
- Configure policies and deploy
Firewall Clustering in the Public Cloud: AWS, Azure, and GCP
Public cloud clustering takes the same foundational concepts but adapts them to cloud-native networking constructs. Public cloud clusters support individual mode only with a maximum of 16 nodes.
A key difference from private cloud is the deployment workflow: in public cloud, the cluster is built first with a Day 0 configuration, and then nodes register to FMC using auto-registration.
Auto-Scaling
One of the most powerful features of public cloud clustering is auto-scaling, which automatically adds or removes firewall instances based on resource utilization:
- Scale-out trigger: CPU or memory usage exceeds a threshold (e.g., 80%)
- Scale-in trigger: CPU or memory usage drops below a threshold (e.g., 20%)
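The trigger logic reduces to a threshold check. The sketch below assumes one common policy — scale out when either resource is hot, scale in only when both are idle, to avoid flapping — but the actual policy and thresholds are tunable per deployment:

```python
def autoscale_decision(cpu_pct: float, mem_pct: float,
                       scale_out: float = 80, scale_in: float = 20) -> str:
    """Return 'scale_out', 'scale_in', or 'hold' from utilization readings.
    Assumed policy: scale out if either resource exceeds the high threshold;
    scale in only when both are below the low threshold."""
    if cpu_pct > scale_out or mem_pct > scale_out:
        return "scale_out"
    if cpu_pct < scale_in and mem_pct < scale_in:
        return "scale_in"
    return "hold"

print(autoscale_decision(85, 40))  # scale_out
print(autoscale_decision(10, 10))  # scale_in
print(autoscale_decision(50, 50))  # hold
```

In practice this evaluation runs inside the provider's orchestration layer (Lambda, Functions/Logic Apps, or Cloud Functions), fed by the monitoring tool listed below.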
The monitoring and orchestration tools vary by cloud provider:
| Cloud Provider | Monitoring Tool | Orchestration Tool | Cluster + Auto-Scale Support |
|---|---|---|---|
| AWS | CloudWatch | Lambda | Version 7.6 |
| Azure | Azure Monitor | Functions and Logic Apps | Version 7.7 |
| GCP | Cloud Monitoring | Cloud Functions | Version 10.0.0 |
When auto-scaling triggers a new instance:
- A Day 0 configuration is automatically generated
- A new vFTD instance is created
- The instance is registered to FMC
- Policy is deployed to the new node
The Public Cloud Tradeoff
An important architectural decision in public cloud deployments is whether to use clustering behind a load balancer or standalone firewalls:
- Firewalls in a cluster — You get stateful failover but may sacrifice some throughput due to cluster overhead.
- Firewalls not in a cluster (standalone behind a load balancer) — You get better throughput but lose stateful failover between instances.
Pro Tip: If your application requires session persistence and you cannot tolerate connection drops during failover, clustering is the right choice even with the throughput tradeoff. If your workloads are stateless or can handle reconnections, standalone instances behind a load balancer may deliver better raw performance.
GCP Cluster Deployments
GCP clustering supports both north-south and east-west traffic patterns:
- North-South (Incoming): Traffic flows through an External Load Balancer (ELB) to the firewall cluster's target pool, then through an Internal Load Balancer (ILB) with NAT to reach server VMs.
- North-South (Outgoing): The reverse path flows from server VMs through the ILB, through the cluster, and out via the ELB with NAT.
- East-West: Traffic between application VMs (e.g., App1 and App2) flows through ILBs on both sides of the cluster, with health check probes ensuring only healthy nodes receive traffic.
AWS Cluster Deployments with Gateway Load Balancer (GWLB)
AWS deployments leverage the Gateway Load Balancer (GWLB) for transparent insertion of firewalls into the traffic path. The GWLB sits in a Service VPC and connects to application VPCs via Gateway Load Balancer Endpoints and Private Link.
Key benefits of the GWLB approach:
- Transparent insertion of firewalls without modifying application routing
- Stickiness of flows to ensure consistent processing
- Target failover when a node becomes unhealthy
AWS supports three deployment patterns:
- North-South Single Arm: Internet Gateway to GWLB Endpoint (Application VPC) to GWLB (Service VPC) with a target group containing the cluster instances.
- North-South Dual Arm: Similar to single arm but with NAT in the application path.
- East-West: Traffic between Application 1 VPC and Application 2 VPC routes through an AWS Transit Gateway with VPC attachments to the GWLB Endpoint and Service VPC.
AWS also supports Multi-AZ deployments, where the GWLB target group spans instances across AZ1, AZ2, and AZ3, providing both clustering redundancy and availability zone resilience.
Azure Cluster Deployments
Azure supports two primary firewall clustering patterns:
- North-South with GWLB: A Public Load Balancer handles inbound traffic, which is chained to a Gateway Load Balancer in a separate VNET. The firewall cluster instances sit in a backend pool, and traffic is encapsulated using VXLAN over UDP. Internal servers receive traffic after inspection.
- East-West with ILB: Traffic between application servers (App1 and App2) flows through Internal Load Balancers on each side of the firewall cluster, providing segmented inspection for lateral traffic.
Hardware Models and Configuration Workflows for Firewall Clustering
The configuration workflow for building a firewall cluster differs significantly between hardware platforms.
4100/9300 Series Workflow
For the legacy 4100 and 9300 chassis-based platforms, the configuration is split between FXOS and FMC:
- Interface configuration is performed in FXOS
- Cluster creation is performed in FXOS
- Bootstrap configuration is applied in FXOS
- Registration of the control node is done in FMC
- Adding other nodes to the existing cluster is done in FMC
- Remaining configuration (policies, NAT, etc.) is done in FMC
3100/4200/6100 Series Workflow
The newer 3100, 4200, and 6100 platforms simplify the process by moving everything to FMC:
- Registration of all nodes is done in FMC
- Cluster creation is done in FMC
- All remaining configuration is done in FMC
This is a significant improvement over the 4100/9300 workflow, as it eliminates the need to manage configurations across two separate management planes (FXOS and FMC).
| Step | 4100/9300 | 3100/4200/6100 |
|---|---|---|
| Interface Config | FXOS | FMC |
| Cluster Creation | FXOS | FMC |
| Bootstrap Config | FXOS | N/A |
| Node Registration | FMC (control first) | FMC (all nodes) |
| Policy Config | FMC | FMC |
Split-Spanned EtherChannel for Multi-Site Deployments
For organizations with two physical sites, split-spanned EtherChannel enables a cluster to span both locations:
- Each site has its own local data EtherChannel on the switch side
- A single spanned data EtherChannel exists on the cluster side
- The CCL is extended at Layer 2 between sites
- Connections are localized to the site where they originate
- Upstream switches use vPC or VSS for local redundancy
This design ensures that traffic stays local to each site whenever possible, minimizing inter-site CCL traffic while maintaining full cluster benefits.
Clustering in ACI Multi-Pod
Firewall clustering integrates with ACI Multi-Pod environments using:
- Anycast addressing for consistent reachability
- Split-spanned EtherChannel across ACI pods
- IPN (Inter-Pod Network) connectivity for CCL traffic between pods
Firewall Clustering with VPN and Routing Protocols
Site-to-Site VPN in Clusters
VPN support in firewall clusters has evolved significantly. In versions prior to 10.0.0, site-to-site VPN tunnels were established only with the control node. This meant:
- All VPN traffic had to be processed by a single node
- The control node became a bottleneck for encrypted traffic
- If the control node failed, VPN tunnels had to be re-established
Distributed Site-to-Site VPN (Version 10.0.0+)
Starting with version 10.0.0, distributed site-to-site VPN enables VPN tunnel endpoints on all cluster nodes, not just the control node. This dramatically improves VPN throughput and eliminates the single-node bottleneck.
Distributed VPN is supported on:
- Models: 4200 and 6100
- Mode: Spanned EtherChannel, Routed
The distributed VPN traffic flow works as follows:
- A new VPN session arrives at Node A, which becomes the VPN Session Owner.
- Node A decrypts the traffic and delivers it to the destination.
- The Orchestrator (Control Node) ensures equal distribution of VPN sessions across nodes.
- Node A chooses Node B as the Backup Session Owner.
- If subsequent VPN traffic arrives at Node D (a different node), Node D becomes the Forwarder and redirects traffic to the owner (Node A).
The control node serves as the Orchestrator, ensuring that VPN sessions are evenly distributed across all cluster nodes for optimal load balancing.
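The orchestrator's balancing goal can be pictured as a least-loaded assignment (a hypothetical model of the behavior, not the platform's actual algorithm):

```python
def assign_vpn_session(session_counts: dict) -> str:
    """Give a new VPN session to the node with the fewest active sessions."""
    owner = min(session_counts, key=session_counts.get)
    session_counts[owner] += 1
    return owner

counts = {"node_a": 3, "node_b": 1, "node_c": 2}
assert assign_vpn_session(counts) == "node_b"  # least loaded wins
assert counts["node_b"] == 2                   # its load is now tracked
```

Whatever the internal mechanics, the observable effect is the one described above: session ownership spreads evenly instead of piling onto the control node.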
Routing Protocol Peering Behavior
The behavior of routing protocol peering differs between cluster modes:
- Spanned Mode: Only the control node establishes routing protocol peering with external routers. This simplifies the routing adjacency model.
- Individual Mode: All nodes establish routing protocol peering with external routers, since each node has its own unique IP addresses.
How Do You Monitor Firewall Clustering Health?
Monitoring the health of a firewall cluster is critical for maintaining reliability. Two primary tools are available.
CLI Commands
The cluster exec command allows you to run commands across all cluster nodes simultaneously from any single node:
```
cluster exec show flow-offload statistics
cluster exec show interface
```
These commands execute on every node in the cluster and return consolidated output, making it easy to compare metrics across nodes without connecting to each one individually.
Health Monitoring Dashboard
The Health Monitoring Dashboard in FMC provides:
- Ease of troubleshooting with a centralized view
- Graphical view of events across all cluster nodes
- Health status of each node with real-time indicators
The dashboard eliminates the need to run CLI commands on each node individually, giving administrators a single pane of glass for cluster health visibility.
Pro Tip: Make it a practice to review the Health Monitoring Dashboard regularly, not just during incidents. Catching a degraded node early can prevent a cascading failure scenario during a traffic spike.
Management Interface and Data Interface Considerations
Each node in a firewall cluster has a management interface that is used to manage the device individually. An important constraint to remember:
- You cannot manage a clustered firewall via its data interface — management must go through the dedicated management interface.
This means your management network must have reachability to every cluster node's management interface independently. In large-scale deployments with many nodes, this requires careful management network design to ensure out-of-band access to all nodes.
The data interfaces are the interfaces that carry transit traffic (the traffic being inspected by the firewall). In spanned mode, these are configured as spanned EtherChannels. In individual mode, each node has its own data interfaces with unique IP addresses.
Backup and Restore Strategies for Firewall Clustering
The fundamental principle of cluster backup is straightforward: because the control node replicates configuration to all data nodes, backing up the control node's configuration captures the cluster-wide policy. However, each node maintains its own local state (flow tables, connection entries), which is inherently transient and is reconstructed through cluster synchronization after a restore.
Ensure that your backup procedures include:
- FMC configuration backups (which include cluster definitions)
- Individual node bootstrap configurations (especially for 4100/9300 platforms where FXOS configuration is separate)
- Documentation of CCL interface assignments and MTU settings
Frequently Asked Questions
What is the maximum throughput achievable with firewall clustering?
The maximum throughput depends on the hardware model and cluster size. A 16-node cluster of 4245 appliances achieves approximately 1.79 Tbps with AVC and IPS enabled (about 140 Gbps per node). A 4-node cluster of 6170 appliances achieves approximately 1.8 Tbps (about 570 Gbps per node). A 16-node 6100 cluster is upcoming and will further increase the ceiling.
Should I use firewall clustering in a public cloud deployment?
It depends on your requirements. Clustering behind a public cloud load balancer provides stateful failover, meaning connections survive node failures. However, standalone firewalls (not clustered) behind a load balancer may deliver better raw throughput because they avoid the ~20% cluster overhead. If your applications require session persistence, choose clustering. If they are stateless, standalone instances may perform better.
What is the difference between individual mode and spanned mode?
Individual mode assigns unique IP addresses to each node and uses Layer 3 load balancing (PBR, ECMP, or ITD). It supports routed mode only and works with FTDv, 3100, 4200, and 6100 platforms. Spanned mode uses a single shared IP address with Layer 2 EtherChannel load balancing, supports both routed and transparent modes, and works with 3100, 4200, and 6100 platforms. FTDv does not support spanned mode.
How does distributed VPN work in a firewall cluster?
Available starting with version 10.0.0, distributed site-to-site VPN allows VPN tunnel endpoints on all cluster nodes rather than only the control node. The control node acts as an Orchestrator to ensure even distribution of VPN sessions. Supported on 4200 and 6100 models in spanned EtherChannel routed mode. Each session has an owner, a backup session owner, and potential forwarders for asymmetric traffic.
What MTU should I set for the Cluster Control Link in virtual deployments?
In virtual deployments using VXLAN for the CCL, the total overhead is 154 bytes (100 bytes for cluster traffic + 54 bytes for VXLAN encapsulation). Therefore, the CCL MTU should be set to data interface MTU + 154 bytes. For example, if your data interface MTU is 1500 bytes, the CCL MTU should be at least 1654 bytes.
Which cloud providers support auto-scaling with firewall clusters?
Auto-scaling with firewall clusters is supported on AWS (version 7.6), Azure (version 7.7), and GCP (version 10.0.0). Each provider uses its native monitoring and orchestration tools: AWS uses CloudWatch and Lambda, Azure uses Azure Monitor with Functions and Logic Apps, and GCP uses Cloud Monitoring with Cloud Functions.
Conclusion
Firewall clustering is one of the most powerful technologies available for scaling network security infrastructure while maintaining high availability. From the fundamental architecture of control and data nodes, through the nuanced handling of asymmetric TCP and UDP flows, to the diverse deployment options across hardware appliances, private cloud hypervisors, and public cloud instances on AWS, Azure, and GCP — mastering this technology is essential for any security engineer working at scale.
Key takeaways from this guide:
- Clustering combines multiple firewalls into a single logical device, delivering up to 1.8 Tbps of throughput with stateful failover, at the cost of approximately 20% overhead.
- Individual mode and spanned mode serve different architectural needs — choose based on your Layer 2/Layer 3 topology and platform support.
- Public cloud deployments introduce auto-scaling capabilities and require careful consideration of the clustering vs. standalone tradeoff.
- Distributed VPN (version 10.0.0+) eliminates the control-node bottleneck for site-to-site VPN, spreading tunnel endpoints across all nodes.
- The Health Monitoring Dashboard provides critical visibility into cluster node status and should be part of your operational monitoring practice.
Whether you are designing a new data center security architecture or optimizing an existing FTD cluster deployment, the concepts and deployment patterns covered here will help you make informed decisions and build resilient, high-performance security infrastructure.
To deepen your knowledge and get hands-on practice with these technologies, explore the CCIE Security preparation resources available at NHPREP.