HighAvailability Failover: Heartbeats are getting missed between the NetScalers on Azure.

HighAvailability Failover: Heartbeats are getting missed between the NetScalers on Azure.

book

Article ID: CTX231147

calendar_today

Updated On:

Description

/var/nslog]$  
 
The following counters netio_tot_called and sys_cur_duration_sincestart are showing the delta value of more than 7 secs. Which confirms that the CPU has become idle.

reltime:mili second between two records Wed Oct 25 23:50:22 2017
  Index   rtime totalcount-val      delta rate/sec symbol-name&device-no&time
      0  397633     1936103865      31460     2663 netio_tot_called  Wed Oct 25 23:50:22 2017
      1   12020     1936135120      31255     2600 netio_tot_called  Wed Oct 25 23:50:34 2017
      2   12067     1936165171      30051     2490 netio_tot_called  Wed Oct 25 23:50:46 2017
      3   12102     1936194232      29061     2401 netio_tot_called  Wed Oct 25 23:50:58 2017
      4   12039     1936225768      31536     2619 netio_tot_called  Wed Oct 25 23:51:10 2017
      5   10828     1936250161      24393     2252 netio_tot_called  Wed Oct 25 23:51:21 2017
      6   11357     1936277112      26951     2373 netio_tot_called  Wed Oct 25 23:51:33 2017 >>>>> Here we can see the time difference of around 12 secs. By default it should  be 7 secs.
      7   11805     1936304710      27598     2337 netio_tot_called  Wed Oct 25 23:51:44 2017
 
 
reltime:mili second between two records Wed Oct 25 23:50:22 2017
Index   rtime totalcount-val      delta rate/sec symbol-name&device-no&time
      0  397633    19.01:10:11         13        1 sys_cur_duration_sincestart  Wed Oct 25 23:50:22 2017
      1   12020    19.01:10:23         12        0 sys_cur_duration_sincestart  Wed Oct 25 23:50:34 2017
      2   12067    19.01:10:35         12        0 sys_cur_duration_sincestart  Wed Oct 25 23:50:46 2017
      3   12102    19.01:10:47         12        0 sys_cur_duration_sincestart  Wed Oct 25 23:50:58 2017      
      4   12039    19.01:11:00         13        1 sys_cur_duration_sincestart  Wed Oct 25 23:51:10 2017   >>>>>>> Here we can see the delta value is showing more than 7 secs. Thus it is confirmed that there was a CPU Halt. Hence, the heartbeat packets were not processed during the CPU idle time and the HA failover has triggered.                                                 

                                                                                                   

Resolution

Upgrade to 12.0 build 56.20 and execute the below command to prevent the CPU yielding.

set ns vpxparam -cpuyield NO

Problem Cause

CPU is not dedicated to NetScalers from Azure. Hence, when the CPU is not utilized on NetScaler, NetScaler will yield those CPUs to Azure. In the meantime, when the NetScaler wants to send the Heartbeats, it couldn't process the Heartbeat packets due to the unvailability of CPUs. Thus the Heartbeats are getting missed between the NetScalers and ends with HA failover.

Issue/Introduction

This article shows one case with HA failoveron Azure environment