Citrix ADC Networking and VLAN Best Practices

Article ID: CTX214033

Description

The ADC uses VLANs to determine which interface should carry which traffic. In addition, the ADC does not participate in Spanning Tree. Without proper VLAN configuration, the ADC cannot determine which interface to use, and in these instances it can function more like a hub than a switch or router (it will try to use all interfaces for each conversation).

Symptoms of VLAN Misconfiguration

This type of issue can manifest in many forms, including performance problems, inability to establish connections, randomly disconnected sessions, and, in severe situations, network disruptions seemingly unrelated to the ADC itself. Depending on the exact nature of the interaction with your network, the ADC may also report one or more of the following:

MAC Moves
Muted interfaces
Interface buffer overflows
Orphan ACKs
High rates of retransmissions / retransmit giveups
High Availability Split Brain
Promiscuous VLAN Tag Drops
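
Most of these symptoms correspond to a counter named in the descriptions that follow, so one way to watch for them is to compare two snapshots of those counters. The sketch below is only an illustration: the counter names come from this article, while the `flag_symptoms` helper, the snapshot dictionaries, and the sample values are hypothetical, and how the snapshots are collected from the appliance is deliberately left out.

```python
# Counter names are taken from this article; everything else is illustrative.
SYMPTOM_COUNTERS = {
    "nic_tot_bdg_mac_moved": "MAC Moves",
    "nic_err_bdg_muted": "Muted interfaces",
    "nic_err_tx_overflow": "Interface buffer overflows",
    "tcp_err_orphan_ack": "Orphan ACKs",
    "tcp_err_retransmit_giveups": "Retransmit giveups",
    "tcp_err_7th_retransmit": "Retransmit giveups",
    "nic_err_vlan_promisc_tag_drops": "Promiscuous VLAN Tag Drops",
}


def flag_symptoms(before, after):
    """Return {symptom: total increase} for counters that grew between snapshots."""
    flagged = {}
    for counter, symptom in SYMPTOM_COUNTERS.items():
        # A counter missing from a snapshot is treated as zero.
        delta = after.get(counter, 0) - before.get(counter, 0)
        if delta > 0:
            flagged[symptom] = flagged.get(symptom, 0) + delta
    return flagged


# Hypothetical sample values, not real appliance output.
before = {"nic_tot_bdg_mac_moved": 10, "tcp_err_orphan_ack": 5}
after = {"nic_tot_bdg_mac_moved": 42, "tcp_err_orphan_ack": 5, "nic_err_bdg_muted": 1}
print(flag_symptoms(before, after))
```

Comparing deltas rather than absolute values matters because these counters are cumulative; a large but static value may only reflect a problem that has already been fixed.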


MAC Moves (counter nic_tot_bdg_mac_moved): This indicates that the ADC is using more than one interface to communicate with the same device (MAC address) because it could not properly determine which interface to use.

Muted interfaces (counter nic_err_bdg_muted): This indicates that the ADC has detected that it is creating a routing loop due to VLAN configuration issues and has therefore shut down one or more of the offending interfaces to prevent a network outage.

Interface buffer overflows, typically on management interfaces (counter nic_err_tx_overflow): This can occur when too much traffic is transmitted over a management interface. Management interfaces on the ADC are not designed to handle large volumes of traffic, which can result when network and VLAN misconfigurations cause the ADC to use a management interface for production data traffic. This often occurs because the ADC has no way to differentiate traffic on the VLAN/subnet of the NSIP (NSVLAN) from regular production traffic. It is highly recommended that the NSIP be on a separate VLAN and subnet from any production devices such as workstations and servers.

Orphan ACKs (counter tcp_err_orphan_ack): This counter indicates that the ADC received an ACK packet that it was not expecting, typically on a different interface than the one from which the ACK'd traffic originated. This situation can be caused by VLAN misconfigurations where the ADC transmits on a different interface than the target device would typically use to communicate with the ADC (often seen in conjunction with MAC moves).

High rates of retransmissions / retransmit giveups (counters tcp_err_retransmit_giveups, tcp_err_7th_retransmit, and various other retransmit counters): The ADC will attempt to retransmit a TCP packet a total of 7 times before it gives up and terminates the connection. While this situation can be caused by network conditions, it often occurs as a result of VLAN and interface misconfiguration.

High Availability Split Brain: Split brain is a condition where both HA nodes believe they are Primary, leading to duplicate IP addresses and loss of ADC functionality. It occurs when the two HA nodes cannot communicate with each other using HA heartbeats (UDP port 3003, using the NSIP) across any interface. This is typically caused by VLAN misconfigurations where the native VLANs on the ADC interfaces do not have connectivity between the two ADCs.

Promiscuous VLAN Tag Drops (counter nic_err_vlan_promisc_tag_drops): The network should be configured to never deliver packets tagged on VLANs that the ADC does not require. Because the ADC processes all packets in software (rather than in hardware, as a network switch does), examining inbound packets that it does not require, only to drop them, has a performance cost. It is common practice to configure a network switch with a range of VLANs to deliver to a specific port (e.g., "allowed vlan 1-100"). This can cause significant traffic that the ADC does not require to be delivered to it. Instead, specify an explicit set of VLANs at the switch (e.g., "allowed vlan 1,2,5,10,20").
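
To see how much unneeded traffic a range-style allowed list can admit, the following sketch expands a switch-style VLAN list and subtracts the VLANs the ADC actually requires. The `expand_vlans` and `unneeded_vlans` helpers are hypothetical names written for this illustration; the two VLAN lists are the examples quoted above, and the required set is assumed to match the second one.

```python
def expand_vlans(spec):
    """Expand a switch-style VLAN list such as "1-3,10" into a set of VLAN IDs."""
    vlans = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range such as "1-100" is inclusive at both ends.
            low, high = (int(x) for x in part.split("-", 1))
            vlans.update(range(low, high + 1))
        else:
            vlans.add(int(part))
    return vlans


def unneeded_vlans(allowed_spec, required):
    """Return the sorted VLAN IDs the switch would deliver that the ADC does not need."""
    return sorted(expand_vlans(allowed_spec) - set(required))


# Assumption: the ADC only has VLANs 1, 2, 5, 10 and 20 configured.
required = {1, 2, 5, 10, 20}
print(len(unneeded_vlans("1-100", required)))   # 95
print(unneeded_vlans("1,2,5,10,20", required))  # []
```

With the range-style list, 95 of the 100 allowed VLANs carry traffic the ADC would have to inspect and drop.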
 

Best Practices for VLAN and Network Configurations

  1. Each subnet should be associated with a VLAN.

  2. More than one subnet can be associated with the same VLAN (depending on your network design).

  3. EACH VLAN SHOULD BE ASSOCIATED WITH ONLY ONE INTERFACE (for purposes of this discussion, an LA channel counts as a single interface).

  4. If you require more than one subnet to be associated with an interface, the VLAN must be tagged.

  5. Contrary to popular belief, the MAC-Based Forwarding (MBF) feature on the ADC is not designed to mitigate this type of issue. MBF is designed primarily for the DSR (Direct Server Return) mode of the ADC, which is rarely used in most environments (it allows traffic to purposely bypass the ADC on the return path from the backend servers). MBF may hide VLAN issues in some instances, but it should not be relied upon to resolve this type of problem.

  6. Every interface on ADC requires a native VLAN (unlike Cisco, where native VLANs are optional), although the TagAll setting on an interface can be used so that no untagged traffic will leave the interface in question.

  7. The native VLAN can be tagged if necessary for your network design (this is the TagAll option for the interface).

  8. The VLAN for the subnet of your ADC's NSIP is a special case. This is called the NSVLAN. The concepts are the same, but the commands to configure it are different, and changes to the NSVLAN require a reboot of the NetScaler to take effect. If you attempt to bind a VLAN to a SNIP that shares the same subnet as the NSIP, you will get “Operation not permitted.” This is because you have to use the NSVLAN commands instead. See CTX123172 for details. Also, on some firmware versions, you cannot set an NSVLAN if that VLAN number already exists via the “add VLAN” command. In that case, remove the VLAN and then set the NSVLAN again.

  9. HA Heartbeats always use the Native VLAN of the respective interface (optionally tagged if the TagAll option is set on the interface).

  10. There must be communication between at least one set of native VLANs on the two nodes of an HA pair (this can be direct or via a router). The native VLANs are used for HA heartbeats. If the ADCs cannot communicate between native VLANs on any interface, this will lead to HA failovers and possibly a split-brain situation where both ADCs think they are primary (leading to duplicate IP addresses, amongst other things).

  11. The ADC does not participate in spanning tree. As such, it is not possible to use spanning tree to provide interface redundancy when using an ADC. Instead, use a form of Link Aggregation (LACP or manual LAG) for this purpose.
    Note: If you wish to have link aggregation between multiple physical switches, you must have the switches configured as a virtual switch, using a feature such as Cisco's Switch Stack.

  12. HA Synchronization and Command Propagation use the NSIP/NSVLAN by default. To move them to a different VLAN, you can use the SyncVLAN option of the set HA node command.

  13. Nothing in the ADC's default configuration restricts a management interface (0/1 or 0/2) to management traffic only. This must be enforced by the end user through VLAN configuration. The management interfaces are not designed to handle data traffic, so your network design should take this into account. Management interfaces, which are located on the ADC motherboard, lack various offloading features such as CRC offload, larger packet buffers, and other optimizations, making them much less efficient at handling large amounts of traffic. To separate production data and management traffic, the NSIP should not be on the same subnet/VLAN as your data traffic.

  14. If you want to use a management interface to carry management traffic, it is best practice for the default route to be on a subnet other than the subnet of the NSIP (NSVLAN). In many configurations, the default route will be relied upon for workstation communications (in an internet scenario). If the default route is on the same subnet as the NSIP, such traffic will use the management interface, which can cause the interface to be overloaded.

  15. In addition to #10, on an SDX, the SVM, XenServer, and all ADC instance NSIPs should be on the same VLAN and subnet. There is no "backplane" inside the SDX that allows for communication between SVM/Xen/instances. If they are not on the same VLAN/subnet/interface, traffic between them must leave the physical hardware, be routed on your network, and return. This can lead to connectivity issues between the instances and the SVM and, as such, is not recommended. A common symptom of this is a yellow Instance State indicator in the SVM for the VPX instance in question and the inability to use the SVM to reconfigure a VPX instance.

  16. If some subnets are bound to VLANs and some are not, then during an HA failover, GARP packets will not be sent for any IP addresses on the subnets that are not bound to a VLAN. This can cause dropped connections and connectivity issues during HA failovers, because an ADC that is not configured with VMACs cannot notify the network of the change of MAC ownership of those IP addresses. A symptom of this is that during or after an HA failover, the ip_tot_floating_ip_err counter increments on the former primary ADC for more than a few seconds, indicating that the network did not receive or process GARP packets and is continuing to transmit data to the new secondary ADC.
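
Some of the rules above can be checked mechanically against a planned configuration before it is applied. The sketch below is a minimal illustration, not a Citrix tool: `check_vlan_plan`, the `(vlan_id, interface, tagged)` tuples, and the sample addresses are all hypothetical. It covers rule 3 (one interface per VLAN), one reading of rule 4 (at most one untagged VLAN per interface, with any additional VLANs tagged), and rule 14 (default route outside the NSIP subnet).

```python
import ipaddress
from collections import defaultdict


def check_vlan_plan(bindings, nsip_subnet, default_gateway):
    """Check a planned VLAN layout and return a list of warning strings.

    bindings is a list of (vlan_id, interface, tagged) tuples; an LA channel
    is treated as a single interface name such as "LA/1".
    """
    warnings = []
    ifaces_per_vlan = defaultdict(set)
    untagged_per_iface = defaultdict(set)
    for vlan_id, iface, tagged in bindings:
        ifaces_per_vlan[vlan_id].add(iface)
        if not tagged:
            untagged_per_iface[iface].add(vlan_id)

    # Rule 3: each VLAN should be associated with only one interface.
    for vlan_id, ifaces in sorted(ifaces_per_vlan.items()):
        if len(ifaces) > 1:
            warnings.append(f"VLAN {vlan_id} is bound to multiple interfaces: {sorted(ifaces)}")

    # Rule 4 (as interpreted here): extra VLANs on an interface must be tagged.
    for iface, vlans in sorted(untagged_per_iface.items()):
        if len(vlans) > 1:
            warnings.append(f"Interface {iface} carries multiple untagged VLANs: {sorted(vlans)}")

    # Rule 14: the default route should not be on the NSIP subnet.
    if ipaddress.ip_address(default_gateway) in ipaddress.ip_network(nsip_subnet):
        warnings.append(f"Default gateway {default_gateway} is inside NSIP subnet {nsip_subnet}")
    return warnings


# Hypothetical plan: VLAN 30 is spread across two interfaces, and the
# default gateway sits on the NSIP subnet.
plan = [(10, "1/1", False), (20, "1/1", True), (30, "1/2", False), (30, "1/3", False)]
for warning in check_vlan_plan(plan, "192.168.100.0/24", "192.168.100.1"):
    print(warning)
```

A check like this only validates the plan as written down; it does not confirm that the switch side delivers the same VLANs, so the counters described earlier remain the authoritative signal.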

Issue/Introduction

This article contains ADC networking and VLAN best practices.

Additional Information

CTX236843 - Comparison Between NetScaler and Cisco VLAN Types
CTX136926 - How to Associate an IP Subnet with a NetScaler Interface by Using VLANs
CTX122921 - Citrix NetScaler Interface Tagging and Flow of High Availability Packets
CTX115575 - FAQ: The "trunk" or "tagall" Option of NetScaler Appliance
CTX123172 - NetScaler nsvlan Command
CTX115504 - How to Configure and Verify Link Aggregation Control Protocol (LACP) on NetScaler Appliance
CTX134962 - Link Aggregation on a NetScaler SDX Appliance
CTX226652 - Basic Design Guidelines and Principles on NetScaler Routing, Default Routes, Interfaces and Channels, VLANs, and GARP