Target Device running on VMware going into hung state with Event - Too many retries Initiate reconnect


Article ID: CTX272248


Description

Consider the following scenario and symptoms:

  • VMware ESXi 6.7 hosts the Target Devices.
  • A VMXNET3 adapter is attached to the Target Devices.
  • The Target Devices intermittently go into a hung state.
  • Review of a Wireshark trace shows that the Target Device is constantly reconnecting to the Citrix Provisioning Server and the Server is responding properly, but the connection does not proceed any further to the IO operation.
  • Review of the Event logs on the Target Device at the time of the issue shows the following events logged by BNISTACK in the System event log (see the query sketch after this list):
Event ID 155 [IosReconnectHA] HA Reconnect in progress
Event ID 84 [MIoWorkerThread] Too many retries Initiate reconnect
  • The Citrix Provisioning Server does not have any resource bottleneck.
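A quick way to confirm these events on an affected Target Device is to query the System event log from an elevated PowerShell prompt. The snippet below is a minimal sketch and is not part of the original article; the provider name 'BNIStack' is an assumption and should be adjusted to whatever source name Event Viewer shows for these entries.

# Minimal sketch: list recent BNISTACK reconnect events on the Target Device.
# Assumption: the events are written under the provider name 'BNIStack';
# change the name if Event Viewer shows a different event source.
Get-WinEvent -FilterHashtable @{
    LogName      = 'System'
    ProviderName = 'BNIStack'
    Id           = 84, 155
} -MaxEvents 50 |
    Sort-Object TimeCreated |
    Format-Table TimeCreated, Id, Message -AutoSize -Wrap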
 

Environment

Citrix is not responsible for and does not endorse or accept any responsibility for the contents or your use of these third party Web sites. Citrix is providing these links to you only as a convenience, and the inclusion of any link does not imply endorsement by Citrix of the linked Web site. It is your responsibility to take precautions to ensure that whatever Web site you use is free of viruses or other harmful items.

Resolution

Since the issue occurs because of a receive buffer overflow, the ring buffer size of the VMXNET3 adapter needs to be increased from within the guest Operating System. Please refer to the article below from VMware:

The output of esxtop shows dropped receive packets at the virtual switch (1010071)
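Inside a Windows Target Device, the VMXNET3 receive ring and buffer sizes are normally exposed as advanced properties of the network adapter. The PowerShell below is a minimal sketch and is not taken from the referenced article; the adapter name "Ethernet0" and the property display names "Rx Ring #1 Size" and "Small Rx Buffers" are assumptions that should be verified against the installed driver version. Because changing these properties typically resets the adapter, and with it the Provisioning stream, make the change on the master image in maintenance mode rather than on a live streamed target.

# Minimal sketch: inspect and raise the VMXNET3 receive ring and small buffer
# counts inside the Windows guest. Adapter name and property display names are
# assumptions; confirm them with the first command before changing anything.
Get-NetAdapterAdvancedProperty -Name "Ethernet0" |
    Where-Object { $_.DisplayName -match 'Rx Ring|Rx Buffers' }

# Higher values reduce the chance of the ring filling up under traffic bursts;
# the allowed maximums depend on the driver, so use values the driver accepts.
Set-NetAdapterAdvancedProperty -Name "Ethernet0" -DisplayName "Rx Ring #1 Size" -DisplayValue "4096"
Set-NetAdapterAdvancedProperty -Name "Ethernet0" -DisplayName "Small Rx Buffers" -DisplayValue "8192"

After the change, re-check the counters on the ESXi host (as shown in the Problem Cause section) to confirm that the "running out of buffers" value stops increasing.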

Engage VMware support if any further assistance is needed.

 

Problem Cause

This issue is observed due to an overflow of the receive ring buffer on the VMXNET3 adapter. To check whether the ring buffer is overflowing, follow the steps below:

1. Log on to the ESXi host and start esxtop.
2. Press 'n' for the network view and 't' to order the entries.
3. Note down the Port ID and the internal vSwitch (portset) name, for example 33554464 and DvsPortset-0.
4. Then query the vsish internal debugging shell to get the receive statistics for the VMXNET3 adapter on that port:

[root@esx0:~] vsish -e get /net/portsets/vSwitch0/ports/33554464/vmxnet3/rxSummary
stats of a vmxnet3 vNIC rx queue {
 LRO pkts rx ok:50314577
 LRO bytes rx ok:1670451542658
 pkts rx ok:50714621
 bytes rx ok:1670920359206
 unicast pkts rx ok:50714426
 unicast bytes rx ok:1670920332742
 multicast pkts rx ok:0
 multicast bytes rx ok:0
 broadcast pkts rx ok:195
 broadcast bytes rx ok:26464
 running out of buffers:10370
 pkts receive error:0
 # of times the 1st ring is full:7086
 # of times the 2nd ring is full:3284

 fail to map a rx buffer:0
 request to page in a buffer:0
 # of times rx queue is stopped:0
 failed when copying into the guest buffer:0
 # of pkts dropped due to large hdrs:0
 # of pkts dropped due to max number of SG limits:0
}

In the output above, the non-zero "running out of buffers" counter together with the "# of times the 1st ring is full" and "# of times the 2nd ring is full" counters indicate that the receive ring buffer is overflowing.

Additional Information

The output of esxtop shows dropped receive packets at the virtual switch (1010071)
Using esxtop to Troubleshoot Performance Problems
VMXNET3 RX Ring Buffer Exhaustion and Packet Loss