Lesson 3 of 5

High Availability and Clustering

Introduction

In production networks, a single firewall is a single point of failure. If that device goes down, every connection it protects goes with it. That is why every serious security deployment relies on high availability (HA) — the ability to keep traffic flowing even when hardware or software fails.

This lesson covers the stateful HA options available on Cisco Secure Firewall (ASA and FTD platforms), ranging from a simple Active/Standby failover pair all the way to Active/Active clustering with up to 16 appliances. You will learn the roles each cluster member plays, how new connections are established across a cluster, how the Cluster Control Link keeps everything synchronized, and how to deploy a cluster through the Firewall Management Center (FMC). By the end, you will be able to choose the right HA model for a given data center design and understand the mechanics behind each option.

Key Concepts

Stateful HA Options at a Glance

Cisco Secure Firewall supports three stateful HA deployment models, summarized below.

Single Active/Standby Failover
  • Devices: One logical device with two concrete devices
  • PBR required: No
  • IP/MAC behavior: The Active/Standby pair presents a single MAC/IP entry (e.g., 10.1.1.1)
  • Description: Traditional failover pair; one unit forwards traffic while the standby monitors and takes over on failure

Active/Active Cluster
  • Devices: One logical device with one concrete device (multi-node)
  • PBR required: Only if the cluster is stretched across pods
  • IP/MAC behavior: The cluster presents a single MAC/IP entry (e.g., 10.1.1.1)
  • Description: Up to 16 appliances or modules combine into one traffic-processing system using Spanned EtherChannel mode

Several Active/Standby Nodes (Symmetric PBR)
  • Devices: One logical device with multiple concrete devices
  • PBR required: Yes; Symmetric PBR ensures each flow is handled by the same Active node in both directions
  • IP/MAC behavior: Each Active node has a unique MAC/IP entry (e.g., 10.1.1.1, 10.1.1.2, 10.1.1.3)
  • Description: Multiple independent Active/Standby pairs, each with its own IP, load-balanced with Symmetric PBR

Core Terminology

  • Control Node — The cluster member that synchronizes the cluster configuration to all other members. Every cluster has exactly one Control Node.
  • Data Node — Any cluster member that is not the Control Node. Data Nodes process traffic but receive their configuration from the Control Node.
  • Flow Owner — The cluster member that receives the first packet of a given flow. Ownership is nondeterministic — it depends on which unit the upstream switch or router sends the packet to.
  • Flow Director — The cluster member that keeps track of which node owns a specific flow. The Director role is deterministic, meaning it is calculated from the flow parameters so every node agrees on which member is the Director for a given connection.
  • Flow Forwarder — A cluster member that receives a packet belonging to a flow it does not own. It redirects the packet to the correct Flow Owner.
  • Cluster Control Link (CCL) — A dedicated out-of-band link used for internode communication, asymmetric traffic redirection to the Flow Owner, and state sharing across all cluster members.

Important: Cluster nodes share connection state through the CCL, but they do not share IPS state. Keep this in mind when sizing inspection workloads.
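The deterministic Director role can be pictured as a hash over the flow parameters: because every member runs the same computation on the same inputs, all members agree on the Director with no negotiation. A minimal sketch, using a toy hash and hypothetical node names rather than Cisco's actual algorithm:

```python
import hashlib

def flow_director(flow, nodes):
    """Pick the Director for a flow deterministically.

    Every member hashes the same 5-tuple, so all members compute the
    same answer. (Toy model; not Cisco's real selection algorithm.)
    """
    src_ip, src_port, dst_ip, dst_port, proto = flow
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}/{proto}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return nodes[digest % len(nodes)]

nodes = ["ftd-1", "ftd-2", "ftd-3"]
flow = ("10.1.1.10", 49152, "198.51.100.7", 443, "tcp")

# Any member computing the Director for this flow gets the same node.
assert flow_director(flow, nodes) == flow_director(flow, nodes)
assert flow_director(flow, nodes) in nodes
```

Contrast this with the Owner role, which is nondeterministic: it simply falls to whichever member happens to receive the first packet.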

How It Works

Clustering Architecture

When you build an ASA or FTD cluster, up to 16 appliances or modules combine into a single logical traffic-processing system. The cluster preserves all the benefits of traditional failover — virtual IP and MAC addresses provide first-hop redundancy, and connection states are preserved after a single member failure — while adding true horizontal scalability.

Key architectural properties of clustering include:

  • Single management entity — All members are managed as one device.
  • Virtual IP and MAC addresses — Provide seamless first-hop redundancy for connected hosts.
  • Fully distributed data plane — New and existing connections are processed across all members.
  • Elastic scaling — Throughput and maximum concurrent connections scale as you add members.
  • Stateless external load balancing — Upstream and downstream switches distribute traffic using standard EtherChannel or routing; no proprietary load balancer is needed.
  • No data-plane member-to-member communication — All internode traffic uses the out-of-band Cluster Control Link, keeping the data interfaces clean.
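The "stateless external load balancing" property is worth a closer look: each adjacent switch hashes packet headers independently to choose an EtherChannel member link, and the two switches need not agree. A toy illustration (real switches use vendor-specific hashes over configurable header fields; the IPs are examples):

```python
# Toy per-switch EtherChannel hash keyed on the packet's source IP.
# (Real switches use vendor-specific hashes; this only illustrates
# that each switch hashes independently.)
def member_for(packet_src_ip: str, num_members: int) -> int:
    return sum(int(octet) for octet in packet_src_ip.split(".")) % num_members

# Forward direction: the upstream switch hashes the client's source IP.
forward = member_for("10.1.1.10", 3)      # client -> server
# Return direction: the downstream switch hashes the server's source IP.
reverse = member_for("198.51.100.7", 3)   # server -> client

# The two directions can land on different cluster members; this is
# the asymmetry the Owner/Director/Forwarder roles exist to resolve.
assert forward != reverse
```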

New TCP Connection Flow

Understanding how a cluster handles a brand-new TCP connection is essential. The following step-by-step walkthrough describes the process when a client initiates a connection to a server through the cluster.

  1. Client sends a TCP SYN. The upstream switch delivers the SYN to one of the cluster members based on its EtherChannel hash.
  2. Receiving unit becomes the Flow Owner. That unit adds a TCP SYN Cookie to the packet and forwards it to the server on the outside interface.
  3. The Owner updates the Flow Director. The Owner notifies the deterministic Director node so the Director knows who owns this flow.
  4. Server responds with a TCP SYN-ACK. The return packet may arrive on a different cluster member because the downstream switch has its own EtherChannel hash.
  5. Receiving unit becomes a Flow Forwarder. The unit that receives the SYN-ACK inspects the TCP SYN Cookie, identifies the Owner, and redirects the packet to the Owner through the CCL.
  6. Flow Owner delivers the SYN-ACK to the client. From this point forward, any member that receives a packet for this flow and is not the Owner will redirect it through the CCL.

Key takeaway: The TCP SYN Cookie embedded by the Owner is what allows any other cluster member to identify and redirect return traffic to the correct Owner, even when packets arrive asymmetrically.
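The six steps above can be sketched as a toy simulation. Node names are hypothetical, and a shared dictionary stands in for the real mechanism (the SYN Cookie plus Director queries over the CCL):

```python
# Stand-in for the Director's ownership table, which real members
# would query over the CCL. (Toy model; names are illustrative.)
director_table = {}

class ClusterNode:
    def __init__(self, name):
        self.name = name
        self.owned_flows = set()

    def receive_syn(self, flow):
        # Steps 1-3: the unit that sees the first packet becomes the
        # Flow Owner and registers itself with the Director.
        self.owned_flows.add(flow)
        director_table[flow] = self.name

    def receive_packet(self, flow):
        if flow in self.owned_flows:
            return "deliver"                  # Step 6: the Owner handles it
        # Steps 4-5: a non-Owner becomes a Flow Forwarder, identifies
        # the Owner, and redirects the packet over the CCL.
        return f"redirect over CCL to {director_table[flow]}"

ftd1, ftd2 = ClusterNode("ftd-1"), ClusterNode("ftd-2")
flow = ("10.1.1.10", 49152, "198.51.100.7", 443)

ftd1.receive_syn(flow)                                        # SYN lands on ftd-1
assert ftd2.receive_packet(flow) == "redirect over CCL to ftd-1"  # SYN-ACK on ftd-2
assert ftd1.receive_packet(flow) == "deliver"
```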

Interface Modes

Clustering supports two interface modes: Spanned EtherChannel and individual interface mode. In environments integrated with ACI (Application Centric Infrastructure), only Spanned EtherChannel mode is supported. In Spanned mode, a single port-channel spans all cluster members, and the upstream switches see one logical link. This is the mode used when deploying clusters across ACI Multi-Pod fabrics.

Configuration Example

Creating an FTD Cluster in FMC

Cluster creation is performed from the Firewall Management Center. The following elements are configured during cluster setup:

! Step 1 — Define the cluster
!   Cluster Name:        FTD-Cluster-01
!   Secret Key:          Lab@123
!   Topology:            Spanned EtherChannel

! Step 2 — Configure the first cluster node
!   Management IP:       (assign per your addressing plan)
!   CCL Interface:       (dedicated interface for Cluster Control Link)

! Step 3 — Add additional cluster members
!   Each additional node requires its own management IP
!   All nodes share the same cluster name and secret key

In the FMC wizard you will provide:

  • Cluster Name — A descriptive name for the cluster.
  • Secret Key — A shared secret that authenticates cluster members to each other.
  • First Cluster Node — The initial node, which becomes the Control Node.
  • CCL Information — The interface and addressing for the Cluster Control Link.
  • Additional Cluster Members — Each Data Node you want to join the cluster.

Once the cluster is deployed, the Control Node pushes its configuration to all Data Nodes. All members share a single virtual IP and MAC address for each data interface, so connected devices see one logical firewall.
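Under the hood, FMC pushes a bootstrap configuration to each node. On a classic ASA cluster the equivalent CLI looks roughly like the following; the unit name, port-channel number, and CCL addressing are illustrative, not values from this lesson:

! Illustrative ASA-style cluster bootstrap (example addressing)
cluster interface-mode spanned force
!
cluster group FTD-Cluster-01
  local-unit unit-1
  cluster-interface Port-channel1 ip 192.168.100.1 255.255.255.0
  priority 1
  key Lab@123
  enable

The unit with the lowest priority value that joins first is elected Control Node; every subsequent node repeats the same block with its own local-unit name and CCL IP.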

PBR Deployment for Clusters in ACI Multi-Pod

When the cluster is stretched across ACI pods, Policy-Based Routing (PBR) steers traffic into the firewall. Both pods use the same PBR destination IP.

! PBR service graph pointing to the cluster virtual IP
!   Pod1 — FW PBR IP: 10.1.0.1
!   Pod2 — FW PBR IP: 10.1.0.1
!
! Both pods share the same PBR destination because the cluster
! presents a single virtual IP across the Spanned Port-Channel.

Because the cluster uses a Spanned Port-Channel across pods, the PBR service graph in each pod references the same firewall IP address (10.1.0.1). ACI delivers traffic to the local cluster members, and the CCL handles any asymmetric redirection between pods.

Best practice: When stretching a cluster across ACI Multi-Pod, keep the Cluster Control Link latency under 50 ms — the maximum supported for inter-pod communication in virtual metro cluster designs.

Real-World Application

Multi-Pod Data Center Deployments

The most compelling use case for clustering is the active-active data center design. Organizations that run ACI Multi-Pod fabrics across two sites (Site A and Site B) can stretch a single FTD cluster across both pods. This gives them:

  • Predictable traffic flow — Firewall localization to a single pod keeps most traffic local to the site.
  • Seamless failover — If an entire pod loses its cluster members, the surviving pod continues to process all flows.
  • Cross-pod connection state synchronization — Every node shares connection state over the CCL, so failover within and between pods preserves established connections.

The inter-pod network supports up to 50 ms latency, which accommodates metro-distance separation between sites.

Choosing the Right HA Model

  • Simple redundancy with two devices — Single Active/Standby Failover
  • Horizontal scale-out behind a single virtual IP — Active/Active Cluster (Spanned EtherChannel)
  • Multiple independent pairs with per-node IPs — Several Active/Standby Nodes with Symmetric PBR
  • Stretched across ACI pods — Active/Active Cluster with PBR service graph

Design Considerations

  • Cluster Control Link sizing — The CCL carries redirected data packets and state synchronization. Size it appropriately for your expected asymmetric traffic volume.
  • IPS state is not shared — Each node performs its own inspection independently. Keep this in mind when evaluating threat detection consistency across members.
  • Spanned EtherChannel is required for ACI — If you are deploying in an ACI fabric, individual interface mode is not supported; plan for Spanned mode from the start.
  • Symmetric PBR for multi-pair designs — When deploying multiple Active/Standby pairs, Symmetric PBR ensures that both directions of a flow traverse the same Active node, preventing dropped connections.

Summary

  • Clustering combines up to 16 ASA or FTD appliances into a single logical firewall, providing both high availability and elastic throughput scaling.
  • Every cluster has one Control Node (configuration leader) and one or more Data Nodes; traffic flows use the roles of Owner, Director, and Forwarder to handle asymmetry.
  • The Cluster Control Link is the backbone of the cluster — it carries redirected traffic, state synchronization, and internode communication, but data interfaces never carry member-to-member cluster traffic.
  • Three stateful HA models exist: single Active/Standby failover, Active/Active clustering with a shared virtual IP, and multiple Active/Standby nodes with Symmetric PBR.
  • For ACI Multi-Pod deployments, a single FTD cluster stretched across pods with Spanned EtherChannel and PBR provides seamless failover and predictable traffic flow with up to 50 ms inter-pod latency.

Next up: Continue with the next lesson in this course to explore firewall policy design and rule optimization on Cisco Secure Firewall.