Secure Firewall Clustering: Deployments and Enhancements
Introduction
Imagine your organization's perimeter firewall is processing traffic at near-maximum capacity, and a single hardware failure could bring down the entire security posture of your network. How do you scale throughput beyond the limits of a single appliance while maintaining stateful failover? The answer lies in firewall clustering — a technology that combines multiple physical or virtual firewalls into a single logical device, delivering both massive throughput and built-in redundancy.
Firewall clustering has been a cornerstone of enterprise network security for years, and the evolution of this technology continues to push boundaries. Modern deployments can achieve up to 1.8 Tbps of throughput with AVC and IPS enabled, spanning hardware appliances, private cloud hypervisors, and public cloud instances across AWS, Azure, and GCP. Whether you are designing a data center security architecture, preparing for a CCIE Security lab, or managing a production FTD cluster, understanding the deployment models, node roles, traffic flow mechanics, and recent enhancements is essential.
In this article, we will walk through every major aspect of Cisco Secure Firewall clustering: the fundamental architecture, cluster modes, cloud deployments with auto-scaling, hardware model considerations, integration with ACI and VPN, health monitoring, and backup and restore strategies.
What Is Firewall Clustering and Why Does It Matter?
At its core, firewall clustering combines multiple firewalls into a single logical device. This approach delivers two critical benefits:
- Increased Throughput — By distributing traffic processing across multiple nodes, a cluster can handle far more bandwidth than any individual appliance.
- Redundancy with Stateful Failover — If one node fails, the remaining nodes continue processing traffic without dropping established connections.
However, clustering is not free. Approximately 20% of capacity is consumed by overhead, which includes:
- Cluster Control Link (CCL) traffic between nodes
- CPU and memory overhead for cluster coordination
- Flow creation and state synchronization processes
Despite this overhead, the raw throughput numbers are remarkable. A 16-node cluster of 4245 appliances can achieve 1.79 Tbps with AVC and IPS enabled, where each individual node contributes approximately 140 Gbps. Alternatively, a 4-node cluster of 6170 appliances can reach 1.8 Tbps, with each node contributing 570 Gbps. An upcoming 16-node 6100 cluster will push these numbers even further.
Pro Tip: When sizing a cluster, always account for the ~20% overhead. Your effective throughput per node will be lower than the standalone specification due to CCL traffic and coordination costs.
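As a quick sanity check, the ~20% overhead can be folded directly into a sizing estimate. The sketch below is illustrative only; the 175 Gbps standalone rating is a hypothetical figure chosen so the per-node result lands near the ~140 Gbps cited above:

```python
def effective_per_node(standalone_gbps: float, overhead: float = 0.20) -> float:
    """Discount a node's standalone rating by the ~20% clustering overhead."""
    return standalone_gbps * (1 - overhead)

def cluster_throughput(standalone_gbps: float, nodes: int) -> float:
    """Aggregate effective throughput for a cluster of identical nodes."""
    return effective_per_node(standalone_gbps) * nodes

# A hypothetical node rated 175 Gbps standalone contributes ~140 Gbps
# once CCL traffic and coordination costs are accounted for.
print(round(effective_per_node(175), 1))      # 140.0
print(round(cluster_throughput(175, 16), 1))  # 2240.0
```

Treat the result as a planning ceiling, not a guarantee; real throughput also depends on traffic mix and enabled inspection features.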
How Do Cluster Nodes Work? Understanding Roles and States
A firewall cluster consists of one control node and one or more data nodes. All nodes process transit connections, but the control node carries additional responsibilities.
The Control Node
The control node is elected through the cluster control protocol and serves as the single authority for:
- Configuration replication to all data nodes
- Cluster control protocol management
- Control node election coordination
There is exactly one control node in any cluster at any given time.
The Data Nodes
All remaining nodes in the cluster are data nodes. They process transit traffic just like the control node but defer to the control node for configuration and cluster management decisions.
Node States in Traffic Processing
When traffic flows through a cluster, each node can take on one of four states relative to a given flow:
| State | Role |
|---|---|
| Owner | Receives the first packet of a flow and owns the connection |
| Backup Owner | Stores state information received from the owner for seamless transfer to a new owner if the original owner fails |
| Flow Director | Keeps track of the flow owner so other nodes can locate it |
| Flow Forwarder | Forwards packets to the owner when they arrive at the wrong node |
These states are assigned per-flow, not per-node. A single node can simultaneously be the owner of one flow, the backup of another, and the forwarder of yet another.
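A common way to assign the Flow Director deterministically is to hash the flow's identifying tuple across the cluster membership, so every node independently computes the same director without coordination. A minimal sketch of the idea (the hash function and node names here are illustrative; the platform's actual hash is internal):

```python
import hashlib

NODES = ["node_a", "node_b", "node_c"]

def flow_director(flow: tuple, nodes: list = NODES) -> str:
    """Deterministically map a flow's 5-tuple to a director node."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return nodes[int.from_bytes(digest[:4], "big") % len(nodes)]

flow = ("192.0.2.10", 51514, "10.1.1.5", 443, "tcp")
# Every node computes the same director for the same flow,
# so any node can locate the owner by asking the director.
assert flow_director(flow) == flow_director(flow)
```

Because the mapping is a pure function of the flow, no state needs to be distributed just to find the director itself.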
How Does Firewall Clustering Handle Asymmetric Traffic Flows?
One of the most important challenges in firewall clustering is handling asymmetric flows — situations where packets belonging to the same connection arrive at different cluster nodes. The cluster handles TCP and UDP asymmetric flows differently.
TCP Asymmetric Flow Handling
Consider a scenario with three cluster nodes (A, B, and C), a client, and a server:
- The client sends a TCP SYN, which arrives at Node A.
- Node A becomes the Owner, adds a TCP SYN Cookie to the packet, and delivers it to the server.
- The server responds with a TCP SYN-ACK, which arrives at Node C (a different node — this is the asymmetric part).
- Node C reads the TCP SYN Cookie, identifies Node A as the owner, redirects the packet to Node A, and becomes the Flow Forwarder.
- Node A delivers the TCP SYN-ACK to the client.
- Node A updates the Flow Director (Node B) with the flow ownership information.
The TCP SYN Cookie is the key mechanism that enables the cluster to reunite asymmetric TCP flows without requiring a director lookup on the initial SYN-ACK.
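The ownership-in-the-cookie idea can be sketched in a few lines. This is purely illustrative — the real cookie format is internal to the platform, and `OWNER_BITS` is an invented parameter — but it shows how a node ID embedded in the initial sequence number survives the round trip and comes back in the SYN-ACK's acknowledgment number:

```python
OWNER_BITS = 4  # hypothetical: low bits of the ISN reserved for the owner ID

def encode_owner(isn: int, node_id: int) -> int:
    """Stamp node_id into the low OWNER_BITS of the initial sequence number."""
    return (isn & ~((1 << OWNER_BITS) - 1)) | node_id

def decode_owner(ack: int) -> int:
    """Recover the owner's node ID from the SYN-ACK acknowledgment number.
    TCP acknowledges ISN + 1, so subtract 1 before masking."""
    return (ack - 1) & ((1 << OWNER_BITS) - 1)

cookie = encode_owner(0x1A2B3C40, node_id=3)  # Node A stamps itself as owner
assert decode_owner(cookie + 1) == 3          # Node C recovers the owner
```

The payoff is that the node receiving the asymmetric SYN-ACK recovers the owner locally, with no director query on the return path.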
UDP Asymmetric Flow Handling
UDP and other pseudo-stateful connections follow a different process because there is no SYN Cookie mechanism:
- The client sends a UDP packet, which arrives at Node A.
- Node A queries the Flow Director (Node B) to check if this flow already has an owner.
- The director responds: not found (it is a new connection).
- Node A becomes the Owner, delivers the packet to the server, and updates the director.
- The server responds, and the response arrives at Node C.
- Node C queries the Flow Director (Node B).
- The director returns the owner: Node A.
- Node C redirects the packet to Node A and becomes the Flow Forwarder.
- Node A delivers the response to the client.
Starting with version 10.0.0, a feature called Cluster Redirect flow offload further optimizes this process by reducing the overhead of repeated redirections for long-lived flows.
Pro Tip: Understanding the difference between TCP and UDP asymmetric flow handling is critical for troubleshooting cluster performance issues. TCP uses SYN Cookies for fast owner identification, while UDP requires director queries, adding a small amount of latency to the first few packets.
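The UDP sequence above can be simulated with a toy director table (class and node names are hypothetical, chosen to mirror the steps):

```python
class FlowDirector:
    """Toy stand-in for Node B's director role: maps flows to owners."""
    def __init__(self):
        self.owners = {}  # flow 5-tuple -> owning node name

    def query(self, flow):
        return self.owners.get(flow)  # None means "not found": new connection

    def update(self, flow, owner):
        self.owners[flow] = owner

director = FlowDirector()
flow = ("192.0.2.10", 40000, "10.1.1.5", 53, "udp")

# First packet arrives at Node A: no owner yet, so A claims the flow.
if director.query(flow) is None:
    director.update(flow, "node_a")

# Return packet arrives at Node C: the director resolves the owner,
# so C forwards to Node A instead of processing the flow itself.
owner = director.query(flow)
print(owner)  # node_a
```

The extra round trip to the director is exactly the latency cost the Pro Tip above refers to.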
Firewall Clustering Modes: Individual vs. Spanned
Cisco Secure Firewall clusters support two distinct interface modes, each suited to different network architectures.
Individual Mode
In Individual Mode, each cluster node maintains its own unique IP addresses on each interface. Load balancing to the cluster is performed at Layer 3 using one of three mechanisms:
- Policy-Based Routing (PBR)
- Equal-Cost Multi-Path (ECMP)
- Intelligent Traffic Director (ITD)
Individual mode operates in routed mode only and is supported on the following platforms:
- FTDv (virtual)
- 3100 series
- 4200 series
- 6100 series
For example, in a three-node individual mode cluster, each node would have its own outside and inside IP addresses:
| Node | Outside IP | Inside IP |
|---|---|---|
| Control | 192.0.2.2 | 10.1.1.2 |
| Data 1 | 192.0.2.3 | 10.1.1.3 |
| Data 2 | 192.0.2.4 | 10.1.1.4 |
The cluster would share a virtual IP for management purposes: 192.0.2.1 (outside) and 10.1.1.1 (inside).
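With per-node addresses like these, an upstream Layer 3 device can spread flows across the nodes by hashing each flow's 5-tuple onto the list of next hops. A rough sketch of the idea (CRC32 stands in for whatever hash the router actually uses):

```python
import zlib

# Per-node outside addresses from the table above.
NEXT_HOPS = ["192.0.2.2", "192.0.2.3", "192.0.2.4"]

def ecmp_next_hop(src: str, sport: int, dst: str, dport: int,
                  proto: str = "tcp", hops: list = NEXT_HOPS) -> str:
    """Hash a flow's 5-tuple to one of the equal-cost next hops."""
    key = f"{src}:{sport}->{dst}:{dport}/{proto}".encode()
    return hops[zlib.crc32(key) % len(hops)]

# All packets of a flow hash to the same node, keeping each flow
# symmetric from the router's point of view.
hop = ecmp_next_hop("198.51.100.7", 51514, "10.1.1.50", 443)
assert hop == ecmp_next_hop("198.51.100.7", 51514, "10.1.1.50", 443)
```

PBR and ITD achieve the same per-flow pinning with different selection mechanics; the common requirement is that a given flow always lands on the same node.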
Spanned Mode
In Spanned Mode, the cluster presents a single shared IP address on each interface, and load balancing is performed at Layer 2 using spanned EtherChannels. The upstream and downstream switches distribute traffic across the cluster nodes via EtherChannel hashing.
Spanned mode supports both routed and transparent firewall modes and is available on:
- 3100 series
- 4200 series
- 6100 series
Note that FTDv does not support spanned mode — virtual deployments are limited to individual mode.
| Feature | Individual Mode | Spanned Mode |
|---|---|---|
| Load Balancing Layer | Layer 3 (PBR/ECMP/ITD) | Layer 2 (EtherChannel) |
| Firewall Modes | Routed only | Routed and Transparent |
| IP Addressing | Unique per node | Single shared IP |
| Supported Platforms | FTDv, 3100, 4200, 6100 | 3100, 4200, 6100 |
Pro Tip: Choose individual mode when you need Layer 3 flexibility and are working with virtual firewalls. Choose spanned mode when your switching infrastructure supports EtherChannel and you need transparent mode support.
Cluster Control Link: The Backbone of Firewall Clustering
The Cluster Control Link (CCL) is an interface (or set of interfaces) dedicated to carrying control and data traffic between cluster nodes; it cannot share an interface with data traffic.
The CCL carries the following types of traffic:
- Heartbeat packets for health monitoring
- Cluster control protocol messages for data path coordination
- Control node election communications
- Configuration replication from the control node to data nodes
- Data packets belonging to traffic flows that need to be forwarded to other units (asymmetric flow handling)
- State replication for stateful failover
- Connection ownership queries from flow directors and forwarders
CCL in Virtual Deployments (VXLAN)
In private and public cloud deployments, the CCL uses VXLAN encapsulation with unicast transport. Each virtual firewall node has a VTEP (VXLAN Tunnel Endpoint) and a corresponding VTEP source interface. Traffic between nodes is encapsulated in VNI (VXLAN Network Identifier) tunnels.
VXLAN encapsulation adds 54 bytes of overhead to each packet. Combined with the standard cluster traffic overhead of 100 bytes, the total overhead is 154 bytes. Therefore:
CCL MTU = Data interface MTU + 154 bytes
This MTU requirement is critical. If the CCL MTU is not set correctly, cluster communication will experience fragmentation, leading to performance degradation or outright failures.
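The arithmetic is simple enough to encode directly as a sizing check (a sketch; jumbo-frame support and maximum MTU vary by platform):

```python
CLUSTER_OVERHEAD_BYTES = 100  # standard cluster forwarding overhead
VXLAN_OVERHEAD_BYTES = 54     # VXLAN encapsulation overhead

def ccl_mtu(data_interface_mtu: int) -> int:
    """Minimum CCL MTU for a VXLAN-transported cluster control link."""
    return data_interface_mtu + CLUSTER_OVERHEAD_BYTES + VXLAN_OVERHEAD_BYTES

print(ccl_mtu(1500))  # 1654
print(ccl_mtu(9000))  # 9154 -- jumbo-frame data interfaces
```
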
Firewall Clustering in the Private Cloud
Private cloud clustering supports individual mode only, with a maximum of 16 nodes arranged in configurations such as four FTDv nodes on each of four hypervisors (a 4x4 layout).
Key characteristics of private cloud clustering:
- CCL uses unicast VXLAN transport
- Nodes register to FMC first, then the cluster is built
- Each hypervisor hosts one or more FTDv instances
- VTEP source interfaces are configured on each virtual node
- VXLAN peers must be defined between nodes
The deployment workflow differs from hardware clusters:
- Deploy FTDv instances on each hypervisor
- Register all nodes to FMC
- Build the cluster from FMC
- Configure policies and deploy
Firewall Clustering in the Public Cloud: AWS, Azure, and GCP
Public cloud clustering takes the same foundational concepts but adapts them to cloud-native networking constructs. Public cloud clusters support individual mode only with a maximum of 16 nodes.
A key difference from private cloud is the deployment workflow: in public cloud, the cluster is built first with a Day 0 configuration, and then nodes register to FMC using auto-registration.
Auto-Scaling
One of the most powerful features of public cloud clustering is auto-scaling, which automatically adds or removes firewall instances based on resource utilization:
- Scale-out trigger: CPU or memory usage exceeds a threshold (e.g., 80%)
- Scale-in trigger: CPU or memory usage drops below a threshold (e.g., 20%)
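The trigger logic reduces to a threshold check. The sketch below assumes one common policy — scale out when either resource is hot, scale in only when both are idle, to avoid flapping — but the actual policy and thresholds are tunable per deployment:

```python
def autoscale_decision(cpu_pct: float, mem_pct: float,
                       scale_out: float = 80, scale_in: float = 20) -> str:
    """Return 'scale_out', 'scale_in', or 'hold' from utilization readings.
    Assumed policy: scale out if either resource exceeds the high threshold;
    scale in only when both are below the low threshold."""
    if cpu_pct > scale_out or mem_pct > scale_out:
        return "scale_out"
    if cpu_pct < scale_in and mem_pct < scale_in:
        return "scale_in"
    return "hold"

print(autoscale_decision(85, 40))  # scale_out
print(autoscale_decision(10, 10))  # scale_in
print(autoscale_decision(50, 50))  # hold
```

In practice this evaluation runs inside the provider's orchestration layer (Lambda, Functions/Logic Apps, or Cloud Functions), fed by the monitoring tool listed below.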
The monitoring and orchestration tools vary by cloud provider:
| Cloud Provider | Monitoring Tool | Orchestration Tool | Cluster + Auto-Scale Support |
|---|---|---|---|
| AWS | CloudWatch | Lambda | Version 7.6 |
| Azure | Azure Monitor | Functions and Logic Apps | Version 7.7 |
| GCP | Cloud Monitoring | Cloud Functions | Version 10.0.0 |
When auto-scaling triggers a new instance:
- A Day 0 configuration is automatically generated
- A new vFTD instance is created
- The instance is registered to FMC
- Policy is deployed to the new node
The Public Cloud Tradeoff
An important architectural decision in public cloud deployments is whether to use clustering behind a load balancer or standalone firewalls:
- Firewalls in a cluster — You get stateful failover but may sacrifice some throughput due to cluster overhead.
- Firewalls not in a cluster (standalone behind a load balancer) — You get better throughput but lose stateful failover between instances.
Pro Tip: If your application requires session persistence and you cannot tolerate connection drops during failover, clustering is the right choice even with the throughput tradeoff. If your workloads are stateless or can handle reconnections, standalone instances behind a load balancer may deliver better raw performance.
GCP Cluster Deployments
GCP clustering supports both north-south and east-west traffic patterns:
- North-South (Incoming): Traffic flows through an External Load Balancer (ELB) to the firewall cluster's target pool, then through an Internal Load Balancer (ILB) with NAT to reach server VMs.
- North-South (Outgoing): The reverse path flows from server VMs through the ILB, through the cluster, and out via the ELB with NAT.
- East-West: Traffic between application VMs (e.g., App1 and App2) flows through ILBs on both sides of the cluster, with health check probes ensuring only healthy nodes receive traffic.
AWS Cluster Deployments with Gateway Load Balancer (GWLB)
AWS deployments leverage the Gateway Load Balancer (GWLB) for transparent insertion of firewalls into the traffic path. The GWLB sits in a Service VPC and connects to application VPCs via Gateway Load Balancer Endpoints and Private Link.
Key benefits of the GWLB approach:
- Transparent insertion of firewalls without modifying application routing
- Stickiness of flows to ensure consistent processing
- Target failover when a node becomes unhealthy
AWS supports three deployment patterns:
- North-South Single Arm: Internet Gateway to GWLB Endpoint (Application VPC) to GWLB (Service VPC) with a target group containing the cluster instances.
- North-South Dual Arm: Similar to single arm but with NAT in the application path.
- East-West: Traffic between Application 1 VPC and Application 2 VPC routes through an AWS Transit Gateway with VPC attachments to the GWLB Endpoint and Service VPC.
AWS also supports Multi-AZ deployments, where the GWLB target group spans instances across AZ1, AZ2, and AZ3, providing both clustering redundancy and availability zone resilience.
Azure Cluster Deployments
Azure supports two primary firewall clustering patterns:
- North-South with GWLB: A Public Load Balancer handles inbound traffic, which is chained to a Gateway Load Balancer in a separate VNET. The firewall cluster instances sit in a backend pool, and traffic is encapsulated using VXLAN over UDP. Internal servers receive traffic after inspection.
- East-West with ILB: Traffic between application servers (App1 and App2) flows through Internal Load Balancers on each side of the firewall cluster, providing segmented inspection for lateral traffic.
Hardware Models and Configuration Workflows for Firewall Clustering
The configuration workflow for building a firewall cluster differs significantly between hardware platforms.
4100/9300 Series Workflow
For the legacy 4100 and 9300 chassis-based platforms, the configuration is split between FXOS and FMC:
- Interface configuration is performed in FXOS
- Cluster creation is performed in FXOS
- Bootstrap configuration is applied in FXOS
- Registration of the control node is done in FMC
- Adding other nodes to the existing cluster is done in FMC
- Remaining configuration (policies, NAT, etc.) is done in FMC
3100/4200/6100 Series Workflow
The newer 3100, 4200, and 6100 platforms simplify the process by moving everything to FMC:
- Registration of all nodes is done in FMC
- Cluster creation is done in FMC
- All remaining configuration is done in FMC
This is a significant improvement over the 4100/9300 workflow, as it eliminates the need to manage configurations across two separate management planes (FXOS and FMC).
| Step | 4100/9300 | 3100/4200/6100 |
|---|---|---|
| Interface Config | FXOS | FMC |
| Cluster Creation | FXOS | FMC |
| Bootstrap Config | FXOS | N/A |
| Node Registration | FMC (control first) | FMC (all nodes) |
| Policy Config | FMC | FMC |
Split-Spanned EtherChannel for Multi-Site Deployments
For organizations with two physical sites, split-spanned EtherChannel enables a cluster to span both locations:
- Each site has its own local data EtherChannel on the switch side
- A single spanned data EtherChannel exists on the cluster side
- The CCL is extended at Layer 2 between sites
- Connections are localized to the site where they originate
- Upstream switches use vPC or VSS for local redundancy
This design ensures that traffic stays local to each site whenever possible, minimizing inter-site CCL traffic while maintaining full cluster benefits.
Clustering in ACI Multi-Pod
Firewall clustering integrates with ACI Multi-Pod environments using:
- Anycast addressing for consistent reachability
- Split-spanned EtherChannel across ACI pods
- IPN (Inter-Pod Network) connectivity for CCL traffic between pods
Firewall Clustering with VPN and Routing Protocols
Site-to-Site VPN in Clusters
VPN support in firewall clusters has evolved significantly. In versions prior to 10.0.0, site-to-site VPN tunnels were established only with the control node. This meant:
- All VPN traffic had to be processed by a single node
- The control node became a bottleneck for encrypted traffic
- If the control node failed, VPN tunnels had to be re-established
Distributed Site-to-Site VPN (Version 10.0.0+)
Starting with version 10.0.0, distributed site-to-site VPN enables VPN tunnel endpoints on all cluster nodes, not just the control node. This dramatically improves VPN throughput and eliminates the single-node bottleneck.
Distributed VPN is supported on:
- Models: 4200 and 6100
- Mode: Spanned EtherChannel, Routed
The distributed VPN traffic flow works as follows:
- A new VPN session arrives at Node A, which becomes the VPN Session Owner.
- Node A decrypts the traffic and delivers it to the destination.
- The Orchestrator (Control Node) ensures equal distribution of VPN sessions across nodes.
- Node A chooses Node B as the Backup Session Owner.
- If subsequent VPN traffic arrives at Node D (a different node), Node D becomes the Forwarder and redirects traffic to the owner (Node A).
The control node serves as the Orchestrator, ensuring that VPN sessions are evenly distributed across all cluster nodes for optimal load balancing.
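The orchestrator's balancing goal can be pictured as a least-loaded assignment (a hypothetical model of the behavior, not the platform's actual algorithm):

```python
def assign_vpn_session(session_counts: dict) -> str:
    """Give a new VPN session to the node with the fewest active sessions."""
    owner = min(session_counts, key=session_counts.get)
    session_counts[owner] += 1
    return owner

counts = {"node_a": 3, "node_b": 1, "node_c": 2}
assert assign_vpn_session(counts) == "node_b"  # least loaded wins
assert counts["node_b"] == 2                   # its load is now tracked
```

Whatever the internal mechanics, the observable effect is the one described above: session ownership spreads evenly instead of piling onto the control node.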
Routing Protocol Peering Behavior
The behavior of routing protocol peering differs between cluster modes:
- Spanned Mode: Only the control node establishes routing protocol peering with external routers. This simplifies the routing adjacency model.
- Individual Mode: All nodes establish routing protocol peering with external routers, since each node has its own unique IP addresses.
How Do You Monitor Firewall Clustering Health?
Monitoring the health of a firewall cluster is critical for maintaining reliability. Two primary tools are available.
CLI Commands
The cluster exec command allows you to run commands across all cluster nodes simultaneously from any single node:
```
cluster exec show flow-offload statistics
cluster exec show interface
```
These commands execute on every node in the cluster and return consolidated output, making it easy to compare metrics across nodes without connecting to each one individually.
Health Monitoring Dashboard
The Health Monitoring Dashboard in FMC provides:
- Ease of troubleshooting with a centralized view
- Graphical view of events across all cluster nodes
- Health status of each node with real-time indicators
The dashboard eliminates the need to run CLI commands on each node individually, giving administrators a single pane of glass for cluster health visibility.
Pro Tip: Make it a practice to review the Health Monitoring Dashboard regularly, not just during incidents. Catching a degraded node early can prevent a cascading failure scenario during a traffic spike.
Management Interface and Data Interface Considerations
Each node in a firewall cluster has a management interface that is used to manage the device individually. An important constraint to remember:
- You cannot manage a clustered firewall via its data interface — management must go through the dedicated management interface.
This means your management network must have reachability to every cluster node's management interface independently. In large-scale deployments with many nodes, this requires careful management network design to ensure out-of-band access to all nodes.
The data interfaces are the interfaces that carry transit traffic (the traffic being inspected by the firewall). In spanned mode, these are configured as spanned EtherChannels. In individual mode, each node has its own data interfaces with unique IP addresses.
Backup and Restore Strategies for Firewall Clustering
The fundamental principle of cluster backup is straightforward: because the control node replicates configuration to all data nodes, backing up the control node's configuration captures the cluster-wide policy. However, each node maintains its own local state (flow tables, connection entries), which is inherently transient and is reconstructed through cluster synchronization after a restore.
Ensure that your backup procedures include:
- FMC configuration backups (which include cluster definitions)
- Individual node bootstrap configurations (especially for 4100/9300 platforms where FXOS configuration is separate)
- Documentation of CCL interface assignments and MTU settings
Frequently Asked Questions
What is the maximum throughput achievable with firewall clustering?
The maximum throughput depends on the hardware model and cluster size. A 16-node cluster of 4245 appliances achieves approximately 1.79 Tbps with AVC and IPS enabled (about 140 Gbps per node). A 4-node cluster of 6170 appliances achieves approximately 1.8 Tbps (about 570 Gbps per node). A 16-node 6100 cluster is upcoming and will further increase the ceiling.
Should I use firewall clustering in a public cloud deployment?
It depends on your requirements. Clustering behind a public cloud load balancer provides stateful failover, meaning connections survive node failures. However, standalone firewalls (not clustered) behind a load balancer may deliver better raw throughput because they avoid the ~20% cluster overhead. If your applications require session persistence, choose clustering. If they are stateless, standalone instances may perform better.
What is the difference between individual mode and spanned mode?
Individual mode assigns unique IP addresses to each node and uses Layer 3 load balancing (PBR, ECMP, or ITD). It supports routed mode only and works with FTDv, 3100, 4200, and 6100 platforms. Spanned mode uses a single shared IP address with Layer 2 EtherChannel load balancing, supports both routed and transparent modes, and works with 3100, 4200, and 6100 platforms. FTDv does not support spanned mode.
How does distributed VPN work in a firewall cluster?
Available starting with version 10.0.0, distributed site-to-site VPN allows VPN tunnel endpoints on all cluster nodes rather than only the control node. The control node acts as an Orchestrator to ensure even distribution of VPN sessions. Supported on 4200 and 6100 models in spanned EtherChannel routed mode. Each session has an owner, a backup session owner, and potential forwarders for asymmetric traffic.
What MTU should I set for the Cluster Control Link in virtual deployments?
In virtual deployments using VXLAN for the CCL, the total overhead is 154 bytes (100 bytes for cluster traffic + 54 bytes for VXLAN encapsulation). Therefore, the CCL MTU should be set to data interface MTU + 154 bytes. For example, if your data interface MTU is 1500 bytes, the CCL MTU should be at least 1654 bytes.
Which cloud providers support auto-scaling with firewall clusters?
Auto-scaling with firewall clusters is supported on AWS (version 7.6), Azure (version 7.7), and GCP (version 10.0.0). Each provider uses its native monitoring and orchestration tools: AWS uses CloudWatch and Lambda, Azure uses Azure Monitor with Functions and Logic Apps, and GCP uses Cloud Monitoring with Cloud Functions.
Conclusion
Firewall clustering is one of the most powerful technologies available for scaling network security infrastructure while maintaining high availability. From the fundamental architecture of control and data nodes, through the nuanced handling of asymmetric TCP and UDP flows, to the diverse deployment options across hardware appliances, private cloud hypervisors, and public cloud instances on AWS, Azure, and GCP — mastering this technology is essential for any security engineer working at scale.
Key takeaways from this guide:
- Clustering combines multiple firewalls into a single logical device, delivering up to 1.8 Tbps of throughput with stateful failover, at the cost of approximately 20% overhead.
- Individual mode and spanned mode serve different architectural needs — choose based on your Layer 2/Layer 3 topology and platform support.
- Public cloud deployments introduce auto-scaling capabilities and require careful consideration of the clustering vs. standalone tradeoff.
- Distributed VPN (version 10.0.0+) eliminates the control-node bottleneck for site-to-site VPN, spreading tunnel endpoints across all nodes.
- The Health Monitoring Dashboard provides critical visibility into cluster node status and should be part of your operational monitoring practice.
Whether you are designing a new data center security architecture or optimizing an existing FTD cluster deployment, the concepts and deployment patterns covered here will help you make informed decisions and build resilient, high-performance security infrastructure.
To deepen your knowledge and get hands-on practice with these technologies, explore the CCIE Security preparation resources available at NHPREP.