FTD High Availability
Objective
Configure and validate FTD High Availability: set up Active/Standby failover and a basic clustering check for FTD devices. This ensures traffic continuity and horizontal scaling in production networks where uptime and stateful flow preservation are critical — for example, in a data center perimeter where an outage on one firewall must not interrupt ongoing user sessions.
Topology
(Note: this lesson references the same topology used in earlier lessons. Only new interfaces and IPs for HA are shown here.)
ASCII topology diagram (exact IPs on every interface)
+----------------------------------+  HA Link   +----------------------------------+
| FTD-Primary                      |------------| FTD-Secondary                    |
| mgmt: 192.168.1.10/24            |            | mgmt: 192.168.1.11/24            |
| Gi0/0: 203.0.113.1/24 (outside)  |            | Gi0/0: 203.0.113.2/24 (outside)  |
| Gi0/1: 10.10.10.1/24 (inside)    |            | Gi0/1: 10.10.10.2/24 (inside)    |
| HA1: 10.255.255.1/30 (failover)  |            | HA1: 10.255.255.2/30 (failover)  |
+----------------------------------+            +----------------------------------+
                 |                                               |
                 |                                               |
          Upstream Router                                 Internal Switch
           203.0.113.254                                   10.10.10.254
Device Table
| Device | Role |
|---|---|
| FTD-Primary (FTD1) | Primary unit; active during normal operation in Active/Standby failover |
| FTD-Secondary (FTD2) | Secondary unit; standby during normal operation in Active/Standby failover |
IP Addressing
| Device | Interface | IP Address |
|---|---|---|
| FTD-Primary | mgmt | 192.168.1.10/24 |
| FTD-Secondary | mgmt | 192.168.1.11/24 |
| FTD-Primary | Gi0/0 (outside) | 203.0.113.1/24 |
| FTD-Secondary | Gi0/0 (outside) | 203.0.113.2/24 |
| FTD-Primary | Gi0/1 (inside) | 10.10.10.1/24 |
| FTD-Secondary | Gi0/1 (inside) | 10.10.10.2/24 |
| FTD-Primary | HA1 (failover, GigabitEthernet1/2) | 10.255.255.1/30 |
| FTD-Secondary | HA1 (failover, GigabitEthernet1/2) | 10.255.255.2/30 |
Quick Recap
This lesson builds on the network you've already deployed. We add an HA link between the two FTD appliances (10.255.255.0/30) and configure Active/Standby failover so the active unit handles traffic while the standby maintains synchronized state. Clustering is discussed conceptually for scaling beyond two units.
Key Concepts (theory + behavior)
- Active/Standby Failover: One unit actively forwards traffic while the standby keeps a synchronized copy of the configuration and (where supported) connection state. Think of it as a hot spare that can take over with minimal disruption.
  - Protocol behavior: The pair exchanges hello messages over the HA link; when the active unit stops replying or reports a failure, the standby transitions to active.
- State Synchronization: FTD, building on its ASA heritage, replicates configuration and selected connection state to the standby, so established TCP sessions can survive a failover with little or no loss.
  - Practical: In production, stateful synchronization is critical for VoIP calls, long-lived TCP transfers, and VPNs.
- Clustering vs. HA: Clustering distributes traffic across multiple active nodes for horizontal scaling, whereas Active/Standby focuses on redundancy. Because any node may receive a packet, the cluster forwards packets to the node that owns the connection, so both directions of a flow are inspected in one place.
  - Real-world: Use clustering when you need both redundancy and capacity that grows as nodes are added (inter-node overhead usually keeps the gain below strictly linear).
- Health Monitoring: Interfaces and security services (such as Snort instances) are monitored; an interface failure or health-check failure can trigger failover, or cause a cluster to reassign connections.
  - Behavior: If more than 50% of the Snort instances on a unit are down, the unit is treated as failed and failover can be triggered.
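The health-monitoring rules above can be sketched as a small decision function. This is a toy model, not the device's actual algorithm: the `UnitHealth` fields, the missed-hello limit of 3, and the one-interface failure threshold are all assumptions chosen for illustration. Only the "more than 50% of Snort instances down" rule comes from the text.

```python
from dataclasses import dataclass


@dataclass
class UnitHealth:
    """Snapshot of one unit's health (hypothetical fields for illustration)."""
    snort_up: int            # Snort instances currently running
    snort_total: int         # Snort instances expected on this unit
    interfaces_up: int       # monitored interfaces currently up
    interfaces_total: int    # monitored interfaces configured
    missed_hellos: int = 0   # consecutive hello messages not answered


def is_failed(h: UnitHealth, max_missed_hellos: int = 3,
              interface_failure_threshold: int = 1) -> bool:
    """Return True if this unit should be considered failed."""
    # Assumed rule: too many consecutive unanswered hellos means the peer is gone.
    if h.missed_hellos >= max_missed_hellos:
        return True
    # Rule from the text: strictly more than 50% of Snort instances down.
    snort_down = h.snort_total - h.snort_up
    if h.snort_total > 0 and snort_down * 2 > h.snort_total:
        return True
    # Assumed rule: a configurable number of monitored interfaces down.
    interfaces_down = h.interfaces_total - h.interfaces_up
    return interfaces_down >= interface_failure_threshold


def standby_should_take_over(active: UnitHealth, standby: UnitHealth) -> bool:
    """Fail over only when the active unit is failed and the standby is healthy."""
    return is_failed(active) and not is_failed(standby)
```

Note that exactly half of the Snort instances being down does not cross the threshold in this model; the comparison is strict.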
Step-by-step configuration
Step 1: Prepare HA link and management connectivity on both FTDs
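What we are doing: Configure the management and dedicated HA interfaces so the two FTD units can reach each other for synchronization and heartbeats. Without this physical and IP reachability the failover pair cannot exchange state. The commands in this lesson use ASA-style failover syntax to show the underlying settings; on FTD managed by FMC or FDM, the same parameters are normally entered through the manager.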
! On FTD-Primary
configure terminal
interface Management0/0
ip address 192.168.1.10 255.255.255.0
no shutdown
exit
interface GigabitEthernet1/2
description HA1-link-to-secondary
ip address 10.255.255.1 255.255.255.252
no shutdown
exit
write memory
exit
! On FTD-Secondary
configure terminal
interface Management0/0
ip address 192.168.1.11 255.255.255.0
no shutdown
exit
interface GigabitEthernet1/2
description HA1-link-to-primary
ip address 10.255.255.2 255.255.255.252
no shutdown
exit
write memory
exit
What just happened: The management interfaces provide out-of-band access and carry management-plane traffic (e.g., to FMC). The dedicated HA interface provides an isolated channel for failover heartbeats and state replication. A /30 provides exactly two usable host addresses, which suits a point-to-point link and leaves no room for other hosts on the segment.
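As a quick sanity check of the HA addressing, the following sketch uses Python's standard `ipaddress` module to confirm that both HA addresses are distinct usable hosts in the same subnet. The function name `check_ha_link` is a hypothetical helper, not part of any Cisco tooling.

```python
import ipaddress


def check_ha_link(primary: str, secondary: str) -> ipaddress.IPv4Network:
    """Validate that two HA addresses (in CIDR form) can talk on one link.

    Returns the shared network, or raises ValueError describing the problem.
    """
    p = ipaddress.ip_interface(primary)
    s = ipaddress.ip_interface(secondary)
    if p.network != s.network:
        raise ValueError(f"different subnets: {p.network} vs {s.network}")
    if p.ip == s.ip:
        raise ValueError("both units use the same address")
    usable = list(p.network.hosts())  # excludes network and broadcast addresses
    if p.ip not in usable or s.ip not in usable:
        raise ValueError("address is the network or broadcast address")
    return p.network


network = check_ha_link("10.255.255.1/30", "10.255.255.2/30")
print(network, "usable hosts:", [str(h) for h in network.hosts()])
```

With the lesson's addresses the check passes; a typo such as 10.255.255.5/30 on one side would raise a "different subnets" error.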
Real-world note: Use an isolated physical link or a dedicated VLAN for HA traffic to avoid congestion or accidental exposure of synchronization traffic.
Verify:
show interface ip brief
Expected output on FTD-Primary:
Interface IP-Address OK? Method Status Protocol
Management0/0 192.168.1.10 YES manual up up
GigabitEthernet1/2 10.255.255.1 YES manual up up
GigabitEthernet0/0 203.0.113.1 YES manual up up
GigabitEthernet0/1 10.10.10.1 YES manual up up
Expected output on FTD-Secondary:
Interface IP-Address OK? Method Status Protocol
Management0/0 192.168.1.11 YES manual up up
GigabitEthernet1/2 10.255.255.2 YES manual up up
GigabitEthernet0/0 203.0.113.2 YES manual up up
GigabitEthernet0/1 10.10.10.2 YES manual up up
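If you capture this output from both units (for example, over SSH), a short script can flag any interface that is not up/up. The sketch below assumes the column layout shown above, with Status and Protocol as the last two columns; `interfaces_not_up` is a hypothetical helper name.

```python
from typing import List


def interfaces_not_up(brief_output: str) -> List[str]:
    """Return names of interfaces whose Status or Protocol column is not 'up'."""
    problems = []
    for line in brief_output.splitlines():
        fields = line.split()
        # Skip blank lines, short lines, and the header row.
        if len(fields) < 6 or fields[0] == "Interface":
            continue
        name, status, protocol = fields[0], fields[-2], fields[-1]
        if status != "up" or protocol != "up":
            problems.append(name)
    return problems


sample = """\
Interface IP-Address OK? Method Status Protocol
Management0/0 192.168.1.10 YES manual up up
GigabitEthernet1/2 10.255.255.1 YES manual up up
GigabitEthernet0/0 203.0.113.1 YES manual down down
"""
print(interfaces_not_up(sample))
```

An empty list means every listed interface is up/up; anything else names the interfaces to check before enabling failover.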
Step 2: Enable and configure failover on both units
What we are doing: Turn on failover and designate primary/secondary roles. We also bind the HA (failover) interface name to the logical failover link. This establishes which unit should be active and how they communicate.
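! On FTD-Primary
configure terminal
failover lan unit primary
failover lan interface HA1 GigabitEthernet1/2
failover interface ip HA1 10.255.255.1 255.255.255.252 standby 10.255.255.2
! Reuse the failover link as the state link so connection state is replicated
failover link HA1 GigabitEthernet1/2
! Enable failover last, after the link parameters are defined
failover
write memory
exit
! On FTD-Secondary
configure terminal
failover lan unit secondary
failover lan interface HA1 GigabitEthernet1/2
! Same command as on the primary: the first address belongs to the primary unit, the standby address to the secondary
failover interface ip HA1 10.255.255.1 255.255.255.252 standby 10.255.255.2
failover link HA1 GigabitEthernet1/2
failover
write memory
exit
What just happened: Declaring the units primary and secondary defines their expected roles (the primary becomes active when both units boot together). The failover lan interface command ties a physical interface to a logical HA link name (HA1). The failover interface ip command assigns the addresses used for failover communication; it is identical on both units, with the first address used by the primary and the standby address by the secondary. The failover link command designates the state link, which is what enables stateful failover. Finally, the failover command turns the feature on once the units know how to reach each other.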
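To make the per-unit differences explicit, here is a sketch that renders the failover commands for either role. It assumes the ASA convention that only the `failover lan unit` line differs between the two units, and that the state link shares the failover link; the function name and parameters are hypothetical.

```python
from typing import List


def failover_commands(unit: str, link_name: str, interface: str,
                      primary_ip: str, mask: str, secondary_ip: str,
                      stateful: bool = True) -> List[str]:
    """Render ASA-style failover commands for one unit of the pair."""
    if unit not in ("primary", "secondary"):
        raise ValueError("unit must be 'primary' or 'secondary'")
    commands = [
        f"failover lan unit {unit}",
        f"failover lan interface {link_name} {interface}",
        # Same line on both units: first IP is the primary's, standby IP the secondary's.
        f"failover interface ip {link_name} {primary_ip} {mask} standby {secondary_ip}",
    ]
    if stateful:
        # Assumption: the state link reuses the failover link.
        commands.append(f"failover link {link_name} {interface}")
    commands.append("failover")  # enable failover last
    return commands


args = ("HA1", "GigabitEthernet1/2", "10.255.255.1", "255.255.255.252", "10.255.255.2")
for role in ("primary", "secondary"):
    print(f"! {role}")
    print("\n".join(failover_commands(role, *args)))
```

Generating both configurations from one set of parameters avoids the common mistake of swapping the addresses on the secondary.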
Real-world note: In production, cable the data interfaces of both devices identically, and consider a separate state link or a redundant failover link where the platform supports it.
Verify:
show failover
Expected output on FTD-Primary (the outputs in this step are simplified; real show failover output is more verbose):
Failover On
Unit: Primary
This host: Active
Peer: Secondary (Standby)
Heartbeat interface: HA1 (GigabitEthernet1/2) 10.255.255.1
Stateful failover: Enabled
Configuration replication: Successful
Interface states:
GigabitEthernet0/0 (outside) - Active
GigabitEthernet0/1 (inside) - Active
Standby IP addresses:
HA1: 10.255.255.2
Expected output on FTD-Secondary:
Failover On
Unit: Secondary
This host: Standby
Peer: Primary (Active)
Heartbeat interface: HA1 (GigabitEthernet1/2) 10.255.255.2
Stateful failover: Enabled
Configuration replication: Successful
Interface states:
GigabitEthernet0/0 (outside) - Standby
GigabitEthernet0/1 (inside) - Standby
Active IP addresses (peer):
HA1: 10.255.255.1
Step 3: Verify state synchronization and active/standby behavior with a test flow
What we are doing: Generate a simple TCP flow through the active unit, force a failover, and confirm the connection is still present on the new active unit. This indicates that sessions should be preserved when a real failover occurs.
! On a test client, initiate a long-lived TCP connection to a server behind the FTD pair.
! After connection established, simulate a failover by forcing the primary to standby.
! On FTD-Primary (force failover); no failover active is a privileged EXEC command, so do not enter configuration mode
no failover active
! Back on the test client, check that TCP session remains established or re-establishes quickly.
What just happened: Triggering failover forces the primary to relinquish active role; the standby takes over. Because stateful synchronization was enabled, the failover should preserve the TCP flow or cause minimal disruption. In practice, replication latency and the type of traffic determine exact behavior.
Real-world note: For production, test with representative traffic (VoIP, database connections) because different protocols tolerate failover differently.
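One way to watch a long-lived session during the forced failover is a small probe client that keeps a single TCP connection open and sends periodic messages over it. The sketch below assumes an echo-style service on the far side that answers every message; the function name, host, and port are hypothetical and not part of the lab topology.

```python
import socket
import time


def probe_session(host: str, port: int, probes: int = 5,
                  interval: float = 1.0, timeout: float = 2.0) -> int:
    """Hold one TCP connection open and send periodic probes over it.

    Returns how many probes were answered before the session broke
    (equal to `probes` if the session survived the whole run).
    """
    answered = 0
    with socket.create_connection((host, port), timeout=timeout) as sock:
        for i in range(probes):
            try:
                sock.sendall(f"probe-{i}\n".encode())
                reply = sock.recv(1024)
            except OSError:
                # Covers timeouts and resets; the session did not survive.
                break
            if not reply:
                # Empty read means the peer closed the connection.
                break
            answered += 1
            time.sleep(interval)
    return answered
```

Start the probe with a run long enough to span the failover (for example, `probe_session("10.10.10.20", 7, probes=60)` against a hypothetical echo server), then force the failover. A return value equal to `probes` suggests the existing connection was carried over rather than re-established.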
Verify:
show failover
show conn detail
Expected output snippet for show conn detail on the new active unit (FTD-Secondary after failover):
TCP connection 192.0.2.50:34567 -> 10.10.10.20:443
State: ESTABLISHED
Age: 00:12:30
Owner: local
Translated: no
Protocol: TCP
Step 4: Validate health monitoring and interface tracking
What we are doing: Confirm interfaces and security services are being monitored; configure tracking if necessary. Interface health or security module failure triggers failover to preserve service availability.
! Show current interface and service health monitoring
show failover
show monitor-interface
show module
show service-policy
What just happened: These commands display monitored component states, ensuring the failover feature will react to interface down events and service failures. The platform watches both control-plane and data-plane indicators before switching.
Real-world note: Configure explicit interface tracking for critical links (e.g., upstream gateway) to ensure failover if the active unit's egress is lost.
Verify:
show failover
Expected relevant excerpt:
Interface states:
GigabitEthernet0/0 (outside) - up
GigabitEthernet0/1 (inside) - up
Tracked objects:
Upstream-gateway-track - up
Failover reason: No outstanding failures
Step 5: Discuss clustering and horizontal scale (conceptual check)
What we are doing: We will check cluster readiness and discuss how clustering differs from Active/Standby. Clustering adds multiple active nodes for scale; each node contributes capacity and must maintain flow symmetry.
! Conceptual verification command (on FTD, clustering itself is configured through FXOS/FMC rather than at this CLI)
show cluster info
What just happened: This command shows cluster membership and node states. Clustering is typically configured and managed centrally and requires planning for connection ownership and traffic redirection.
Real-world note: Use clustering in high-throughput environments where adding nodes should increase throughput and connection capacity. Expect the gain per node to be somewhat below linear because of inter-node forwarding and state-sharing overhead.
Verify:
show cluster info
Expected output (illustrative example, not literal device output):
Cluster mode: Enabled
Cluster size: 3
Local node: Member 1 (Active)
Member 2: Active
Member 3: Active
Connection owner ID: 1
Cluster health: Healthy
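The idea of a connection owner can be illustrated with a toy model. This is a deliberate simplification and not how the cluster is actually implemented: here the first member to see a flow becomes its owner, and a shared dictionary stands in for the lookup that real members perform among themselves. The class and method names are hypothetical.

```python
class ToyCluster:
    """Toy model of connection ownership across cluster members."""

    def __init__(self, members):
        self.members = list(members)
        self.owners = {}  # flow key -> member that owns the connection

    @staticmethod
    def flow_key(src, sport, dst, dport):
        # Direction-independent key: both directions of a flow map to one entry.
        return tuple(sorted([(src, sport), (dst, dport)]))

    def handle_packet(self, receiving_member, src, sport, dst, dport):
        """Return (owner, action) for a packet arriving at receiving_member."""
        if receiving_member not in self.members:
            raise ValueError("unknown cluster member")
        key = self.flow_key(src, sport, dst, dport)
        # The first member to see the flow becomes its owner.
        owner = self.owners.setdefault(key, receiving_member)
        action = "process" if owner == receiving_member else "forward"
        return owner, action


cluster = ToyCluster([1, 2, 3])
# Client-to-server packet arrives at member 1, which becomes the owner.
print(cluster.handle_packet(1, "192.0.2.50", 34567, "10.10.10.20", 443))
# The return packet lands on member 3 and is forwarded to the owner.
print(cluster.handle_packet(3, "10.10.10.20", 443, "192.0.2.50", 34567))
```

The forwarding step is why asymmetric traffic still gets consistent stateful inspection, and also why added nodes do not yield perfectly linear throughput.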
Verification Checklist
- Check 1: HA link is up on both units. Verify with show interface ip brief and expect the HA addresses 10.255.255.1 and 10.255.255.2 to show up/up.
- Check 2: Failover is enabled and roles are Primary/Secondary. Verify with show failover and expect "Unit: Primary" and "This host: Active" on FTD-Primary.
- Check 3: Connection state is synchronized and survives failover. Verify by creating a flow, forcing failover, then running show conn detail on the new active unit; expect ESTABLISHED entries.
Common Mistakes
| Symptom | Cause | Fix |
|---|---|---|
| Failover state shows "standby" on both units | Both units configured as secondary or no unit designated primary | Reconfigure failover lan unit primary on intended active box and secondary on peer |
| HA link shows interface down or unreachable | Physical link down, wrong IP/subnet, or VLAN misconfig | Check physical cabling, interface admin state, and ensure both HA IPs are in the same /30 subnet |
| Connections drop after failover | Stateful synchronization not enabled or limited by platform resources | Confirm stateful failover is enabled (a state link is configured) and evaluate platform capacity constraints; consider clustering for scale |
| Configuration replication errors | Mismatched software versions or licensing issues | Ensure both units run compatible software and licensing; check logs for replication errors and reconcile versions |
Key Takeaways
- Active/Standby failover provides resilient redundancy by keeping a standby unit ready to take over with minimal downtime; use it when you need fast failover and simple redundancy.
- State synchronization is essential for preserving sessions across failover — test with real traffic types to validate behavior.
- Clustering is a different model (horizontal scaling) useful in high-throughput environments where each additional node increases capacity.
- Always isolate HA traffic on a dedicated physical link or VLAN and monitor health of both interfaces and detection services to ensure predictable failover behavior.
Tip: After any HA change, practice controlled failovers during maintenance windows to validate behavior and measure failover times.
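To put a number on failover time, you can log timestamped probe results (for example, one ping or application request per second) during the controlled failover and compute the longest gap between successes. The sketch below assumes the samples are already collected as `(timestamp_seconds, success)` pairs sorted by time; the function name is hypothetical.

```python
from typing import Iterable, Tuple


def longest_outage(samples: Iterable[Tuple[float, bool]]) -> float:
    """Return the longest time between two consecutive successful probes.

    The result includes one probe interval, so it is an upper bound on the
    real outage rather than an exact measurement.
    """
    longest = 0.0
    last_ok = None
    for timestamp, ok in samples:
        if not ok:
            continue
        if last_ok is not None:
            longest = max(longest, timestamp - last_ok)
        last_ok = timestamp
    return longest


# Probes at 1-second intervals; the probes at t=2 and t=3 failed during failover.
log = [(0.0, True), (1.0, True), (2.0, False), (3.0, False), (4.0, True), (5.0, True)]
print(longest_outage(log))
```

A shorter probe interval tightens the bound, at the cost of more test traffic through the pair.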