After installing Citrix NetScaler VPX for KVM on Nutanix Acropolis Hypervisor (AHV), connections to the NetScaler VPX NSIP over HTTP or SSH fail when they originate from a guest VM on the same AHV host. Connections from a guest VM on a different AHV host are not affected.
The TCP/UDP Checksum Offload (IPv4) function lets the adapter port compute the checksum of transmitted IPv4 packets and verify the checksum of received IPv4 packets, taking that load off the CPU. This is the default behaviour for AHV. NetScaler currently does not offload checksums; it computes and verifies them internally. NetScaler works with other VMs, the only prerequisite being that those VMs do not use the checksum offloading feature.
The workaround is to turn off Tx checksum offloading for all traffic between VMs on the same host. Rx checksum offloading is not the setting to change.
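As an illustration of the workaround on a Linux guest, the sketch below checks whether Tx checksum offloading is on and builds the command that would turn it off. It assumes the guest has the standard `ethtool` utility; the interface name `eth0` and the helper names `tx_checksum_enabled` and `disable_tx_checksum_command` are hypothetical. Guests running other operating systems (for example Windows) expose the same setting through their own adapter configuration instead.

```python
from typing import List, Optional


def tx_checksum_enabled(ethtool_k_output: str) -> Optional[bool]:
    """Parse the text printed by `ethtool -k <iface>`.

    Returns True/False for the "tx-checksumming" feature, or None if the
    line is not present in the output.
    """
    for line in ethtool_k_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "tx-checksumming":
            # The value may carry a suffix such as "[fixed]"; only the
            # first token ("on" or "off") matters here.
            return value.split()[:1] == ["on"]
    return None


def disable_tx_checksum_command(interface: str) -> List[str]:
    """Build (but do not run) the ethtool command that disables Tx checksum offload."""
    return ["ethtool", "-K", interface, "tx", "off"]


if __name__ == "__main__":
    # "eth0" is an assumed interface name; substitute the guest's real NIC.
    print(" ".join(disable_tx_checksum_command("eth0")))
```

The command is only built, not executed, so the sketch can be reviewed safely; running it on a guest requires root privileges, and the setting typically does not persist across reboots unless it is added to the guest's network configuration.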
1. NetScaler's packet engine does checksum verification, which cannot be turned off.
2. When virtio-net is configured with receive checksum offloading, AHV passes in packets with a "partial" checksum if the source of the packet is another VM on the same host. This is common practice to save CPU cycles, since there is no reason to compute checksums if the packet never hits the wire. The assumption is also that a guest that asked the virtual NIC to do rxcsum will not verify the checksum again in software, so it is acceptable to deliver a packet with a "partial" checksum.
3. In order to work properly, NetScaler turns off receive checksum offload and then does its own checksum verification. When rxcsum is turned off by a VM, AHV must finish the checksum process so that the "partial" checksum becomes "complete".
4. However, the virtio-net device does not support turning off rxcsum. It is always forced on, so NetScaler is fundamentally incompatible with virtio-net.
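The failure described in the steps above can be sketched with the standard Internet checksum (RFC 1071). The example builds a minimal TCP segment, fills in either a "complete" or a "partial" checksum, and runs the kind of software verification a receiver such as NetScaler's packet engine performs. The addresses, ports, payload, and all function names are hypothetical, and modelling the "partial" value as the pseudo-header sum alone is an assumption about how Linux-style offload behaves, not something stated by AHV or NetScaler documentation.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """Folded 16-bit ones' complement sum of the data (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: bytes, dst: bytes, tcp_length: int) -> bytes:
    """IPv4 pseudo-header for TCP (protocol number 6)."""
    return src + dst + struct.pack("!BBH", 0, 6, tcp_length)


def build_segment(payload: bytes) -> bytes:
    """Minimal 20-byte TCP header (SYN, checksum field zero) plus payload."""
    header = struct.pack("!HHIIBBHHH", 40000, 80, 1, 0, 5 << 4, 0x02, 65535, 0, 0)
    return header + payload


def set_checksum(segment: bytes, value: int) -> bytes:
    """Return a copy of the segment with the checksum field (bytes 16-17) replaced."""
    return segment[:16] + struct.pack("!H", value) + segment[18:]


def complete_checksum(src: bytes, dst: bytes, segment: bytes) -> bytes:
    """Fill in the full checksum over pseudo-header, TCP header and payload."""
    zeroed = set_checksum(segment, 0)
    total = ones_complement_sum(pseudo_header(src, dst, len(zeroed)) + zeroed)
    return set_checksum(zeroed, ~total & 0xFFFF)


def partial_checksum(src: bytes, dst: bytes, segment: bytes) -> bytes:
    """Assumed model of a "partial" checksum: only the pseudo-header is summed."""
    seed = ones_complement_sum(pseudo_header(src, dst, len(segment)))
    return set_checksum(segment, seed)


def verify(src: bytes, dst: bytes, segment: bytes) -> bool:
    """Software verification: the sum over everything must be 0xFFFF."""
    return ones_complement_sum(pseudo_header(src, dst, len(segment)) + segment) == 0xFFFF


if __name__ == "__main__":
    src, dst = bytes([10, 0, 0, 10]), bytes([10, 0, 0, 20])  # hypothetical VM and NSIP
    segment = build_segment(b"GET / HTTP/1.1\r\n\r\n")
    print("complete verifies:", verify(src, dst, complete_checksum(src, dst, segment)))
    print("partial verifies: ", verify(src, dst, partial_checksum(src, dst, segment)))
```

In this model, a segment that carries only the partial value fails verification, which matches the same-host symptom, while finishing the checksum (as AHV would do if rxcsum could be turned off, or as the sending VM does once Tx offload is disabled) makes the same segment verify.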
This issue is not seen when the client machine is on the same subnet but on a different host (example: Windows machine).
- A machine on the same subnet as the NSIP and on the same host (example: Linux) is not able to reach the NSIP.
- A trace was taken from that same client.
- Client on the same host as the NetScaler (non-working setup).
- The same client machine migrated to a different host (working setup).