Citrix ADC Forced Failover Causes Issue with Cisco ACI

Article ID: CTX238900

Description

When a software failover is performed via “force ha failover”, the virtual IPs (VIPs) are moved from the former Primary to the new Primary. After a period of time (2-3 minutes), testing shows that some of the traffic no longer reaches the virtual servers on the new Primary as expected. This happens because the former Primary Citrix ADC (formerly NetScaler) still sends traffic using the VIP as its source even after the failover. As a result, the VIP of the Citrix ADC may be relearned on the Cisco ACI leaf interface connected to the former Primary instead of the new Primary. This relearning of the VIP occurs when the Citrix ADCs are connected to a Cisco ACI bridge domain (BD) with Unicast Routing enabled.

Resolution

Citrix ADC workaround: Send GARP from the new Primary by executing “send arp all” so that the Cisco ACI leaf keeps its learned location of the Citrix ADC VIP up to date.
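
For readers unfamiliar with GARP, the following is a minimal Python sketch of what a gratuitous ARP frame looks like on the wire: an ARP packet in which the sender and target IP addresses are both the VIP, sent to the Ethernet broadcast address. The MAC address and VIP used here are hypothetical, and the sketch uses the ARP request form; the exact frame emitted by the Citrix ADC is not described in this article.

```python
import socket
import struct


def build_garp_frame(sender_mac: bytes, vip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for `vip`."""
    broadcast = b"\xff" * 6
    # Ethernet header: destination, source, EtherType 0x0806 (ARP)
    eth_header = broadcast + sender_mac + struct.pack("!H", 0x0806)
    ip_bytes = socket.inet_aton(vip)
    # ARP header: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # hardware length 6, protocol length 4, opcode 1 (request)
    arp_header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Gratuitous ARP: sender IP and target IP are the same address
    arp_body = sender_mac + ip_bytes + b"\x00" * 6 + ip_bytes
    return eth_header + arp_header + arp_body


if __name__ == "__main__":
    # Hypothetical MAC of the new Primary and hypothetical VIP
    frame = build_garp_frame(bytes.fromhex("020000000001"), "192.0.2.10")
    print(len(frame), frame.hex())
```

Because the frame is broadcast and carries the VIP as the sender address, a switch that honors GARP can update the VIP's location to the port on which the frame arrived.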

Cisco ACI workaround: To handle the software failover symptom described above, where integrated devices such as the Citrix ADC (formerly NetScaler) may perform source NAT or PBR, or passively move their endpoints, it is recommended to disable “IP Data-plane Learning” as follows:

  • Enable "GARP based detection" as an EP Move Detection Mode on the "L3 Configuration" of the bridge domain. Note that this is not a default configuration and must be enabled explicitly.
  • Use Cisco ACI Service Graph with PBR (policy-based redirect) to the Citrix ADC interface. If PBR is enabled on the Citrix ADC interface, data-plane learning is already disabled for traffic from that interface.
  • If Cisco ACI Service Graph PBR is not enabled on one of the Citrix ADC interfaces:
Note: Disabling endpoint learning at the BD level is supported and QA verified only when this option is used in conjunction with service graphs with PBR. https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-739989.html
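
As an illustration of the "GARP based detection" option, the sketch below builds a JSON payload of the kind that could be posted to the APIC REST API for a bridge domain. The class name `fvBD` and the attribute `epMoveDetectMode` with value `garp` are given from recollection of the APIC object model and should be treated as assumptions to verify against the APIC documentation for the release in use. The bridge domain name is hypothetical, and the sketch deliberately does not contact any APIC.

```python
import json


def bd_garp_detection_payload(bd_name: str) -> dict:
    """Return a bridge-domain payload enabling GARP-based EP move detection.

    Assumption: the APIC object model exposes this setting as the
    `epMoveDetectMode` attribute of class `fvBD`.
    """
    return {
        "fvBD": {
            "attributes": {
                "name": bd_name,
                "epMoveDetectMode": "garp",
            }
        }
    }


if __name__ == "__main__":
    # "ADC-BD" is a hypothetical bridge domain name
    print(json.dumps(bd_garp_detection_payload("ADC-BD"), indent=2))
```

The same setting can be made in the APIC GUI under the bridge domain's "L3 Configuration", as described in the first bullet above.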

Problem Cause

By default, Cisco ACI learns the source IP address of packets via the data plane, which is called IP data-plane learning. Because of this behavior, the following scenario can happen when the Citrix ADC is in an ACI Bridge Domain with Unicast Routing enabled:
  1. After failover, the new Primary sends out GARPs for the VIP/SNIPs, and the Cisco ACI leaf switch updates the MAC/IP information in its table, which is called the endpoint table.
  2. When the former Primary times out or resets old connections, the TCP ACKs and RSTs or UDP datagrams that it sends with the VIP as their source cause the VIP to be unintentionally re-learned on the wrong data-plane endpoint.
  3. By default, Cisco ACI passively learns endpoints as MAC + IP combinations, and therefore forwards packets destined to the VIP to the old Primary, causing an outage.
  4. Cisco ACI learns the MAC + IP from every forwarded packet, behaving as both a switch and a router; unlike a traditional router, it does not look only at ARP/GARP to register new endpoints.
  5. In ACI terminology: the newly active vServer sends a GARP, but afterwards the old active unit sends a TCP reset or a retransmitted ACK (an IP packet), and hence the endpoint moves back to its old location.
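
The sequence above can be illustrated with a toy Python model of a leaf endpoint table. This is a simplified sketch, not ACI code: the class, port names, and VIP are hypothetical, and the model assumes that a GARP always updates the table while ordinary IP packets update it only when IP data-plane learning is enabled.

```python
from dataclasses import dataclass, field


@dataclass
class LeafEndpointTable:
    """Toy model of an IP-to-port endpoint table (hypothetical, not ACI code)."""
    ip_dataplane_learning: bool = True
    ip_to_port: dict = field(default_factory=dict)

    def receive(self, port: str, src_ip: str, is_garp: bool = False) -> None:
        # A GARP always updates the table in this model; ordinary IP packets
        # update it only when IP data-plane learning is enabled.
        if is_garp or self.ip_dataplane_learning:
            self.ip_to_port[src_ip] = port


def simulate_failover(ip_dataplane_learning: bool) -> str:
    """Return the port on which the VIP is learned after a failover."""
    vip = "192.0.2.10"  # hypothetical VIP
    table = LeafEndpointTable(ip_dataplane_learning=ip_dataplane_learning)
    table.receive("former-primary-port", vip, is_garp=True)  # before failover
    table.receive("new-primary-port", vip, is_garp=True)     # GARP after failover
    table.receive("former-primary-port", vip)                # stale RST/ACK from old unit
    return table.ip_to_port[vip]


if __name__ == "__main__":
    print("data-plane learning on: ", simulate_failover(True))
    print("data-plane learning off:", simulate_failover(False))
```

With data-plane learning enabled, the stale packet from the former Primary moves the VIP back to the old port; with it disabled, the location set by the GARP is retained.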

Issue/Introduction

This issue relates to Citrix ADC failovers performed in a Cisco ACI (Application Centric Infrastructure) environment.

Additional Information

More information on Endpoint Learning is available below: 
https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-739989.html#_Toc519714805