AES-GCM Cipher Causes Memory Leak on NetScaler VPX Devices

Article ID: CTX217918

Description

AES-GCM ciphers cause a memory leak on NetScaler VPX devices. The VIP to which the affected cipher group is bound goes down because of high memory usage.

$ nsconmsg -K newnslog.154 -g mem_err -d statswt0

Displaying current counter value excluding counters with 0 value information 
NetScaler V20 Performance Data 
NetScaler NS11.0: Build 67.12.nc, Date: Jul 13 2016, 18:59:25 

reltime:mili second between two records Thu Oct 6 14:17:22 2016 
Index reltime counter-value symbol-name&device-no 

1 0 2931 mem_err_alloc_failed MEM_TBUF 
3 0 2931 mem_err_alloc_failed mempool_SYSTEM(SYSTEM) 
5 0 6146752512 mem_err_alloc_failed_size MEM_TBUF 
7 0 6146752512 mem_err_alloc_failed_size mempool_SYSTEM(SYSTEM) 
9 0 2931 mem_err_allocfailed 
11 0 6146752512 mem_err_allocfailsize 
13 0 2931 mem_err_bm1024pages_allocfailed 
Done. 

$ nsconmsg -K newnslog.154 -d oldconmsg -sConMEM=1 | grep -i tbuf | tail -5

MEM_TBUF 4294967295 2292187136(53.37% 0.00%) 0 3668 0 
MEM_TBUF 4294967295 2292187136(53.37% 0.00%) 0 3671 0 

$ nsconmsg -K newnslog.153 -d memstats | grep -i tbuf 
MEM_TBUF 4294967295 2155872256(50.20% 0.00%) 0 0 0 

$ nsconmsg -K newnslog.154 -d memstats | grep -i tbuf 
MEM_TBUF 4294967295 2292187136(53.37% 0.00%) 0 2931 0 

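The output above shows high MEM_TBUF usage, and the logs indicate that it increases gradually: from 50.20% in newnslog.153 to 53.37% in newnslog.154, by which point 2931 allocation failures had been recorded. Every handshake that uses an AES-GCM cipher increases MEM_TBUF usage.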
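
The growth between two log files can be quantified with a short script. The following is a minimal Python sketch, not a Citrix tool. It assumes that the second field of a memstats line is the pool limit, the third is the amount in use, and the first percentage is usage relative to the limit; these column meanings are inferred from the sample output, and the function names are hypothetical.

```python
import re

# Assumed layout of a memstats line: pool name, pool limit, amount in use,
# then "(percent-of-limit ...)". The column meanings are inferred from the
# sample output in this article, not taken from nsconmsg documentation.
MEMSTATS_RE = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\((\d+(?:\.\d+)?)%")


def parse_memstats_line(line):
    """Return (pool, limit, used, percent) for one memstats line, or None."""
    match = MEMSTATS_RE.match(line.strip())
    if match is None:
        return None
    pool, limit, used, percent = match.groups()
    return pool, int(limit), int(used), float(percent)


def growth(older_line, newer_line):
    """Return (delta_used, delta_percentage_points) between two samples."""
    older = parse_memstats_line(older_line)
    newer = parse_memstats_line(newer_line)
    if older is None or newer is None or older[0] != newer[0]:
        raise ValueError("lines must be parseable samples of the same pool")
    return newer[2] - older[2], round(newer[3] - older[3], 2)


if __name__ == "__main__":
    sample_153 = "MEM_TBUF 4294967295 2155872256(50.20% 0.00%) 0 0 0"
    sample_154 = "MEM_TBUF 4294967295 2292187136(53.37% 0.00%) 0 2931 0"
    delta_used, delta_points = growth(sample_153, sample_154)
    print(f"MEM_TBUF grew by {delta_used} ({delta_points} percentage points)")
```

Comparing consecutive newnslog files this way makes a steady upward trend easier to spot than reading individual samples.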

Resolution

This is a known issue. The fix is available in NetScaler 11.0 build 68.12 and later, and in NetScaler 11.1 build 49 and later.

This is also mentioned in the release notes of NetScaler 11.0 build 68.12 - https://www.citrix.com/content/dam/citrix/en_us/documents/downloads/netscaler-adc/NS_11_0_68_12.html 

The following is an excerpt from the release notes:
NetScaler virtual appliance sometimes fails because of a memory leak if you use GCM-based ciphers on a VPX platform. The ciphers can eventually exhaust memory, causing the appliance to fail if the memory exhaustion error is not gracefully handled.
[# 652477, 654559, 656035, 657343]
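
To check whether an appliance already runs a fixed build, the version banner printed by nsconmsg (for example, "NetScaler NS11.0: Build 67.12.nc") can be compared against the builds named above. The following Python sketch is illustrative only. The function names are hypothetical, and treating "11.1 build 49" as 49.0 is an assumption, because the article gives no sub-build number for that release.

```python
import re

# Fixed builds as stated in the Resolution section. The sub-build 0 for
# release 11.1 is an assumption; the article only says "build 49".
FIXED_BUILDS = {(11, 0): (68, 12), (11, 1): (49, 0)}

# Matches banners such as "NetScaler NS11.0: Build 67.12.nc, Date: ..."
VERSION_RE = re.compile(r"NS(\d+)\.(\d+):\s*Build\s+(\d+)(?:\.(\d+))?")


def parse_version(banner):
    """Return ((major, minor), (build, sub_build)) parsed from a banner."""
    match = VERSION_RE.search(banner)
    if match is None:
        raise ValueError("no NetScaler version found in banner")
    major, minor, build, sub_build = match.groups()
    return (int(major), int(minor)), (int(build), int(sub_build or 0))


def has_gcm_leak_fix(banner):
    """True or False for releases 11.0 and 11.1; None for any other release."""
    release, build = parse_version(banner)
    fixed = FIXED_BUILDS.get(release)
    if fixed is None:
        return None  # this article does not cover the release
    return build >= fixed
```

For the banner shown in the Description (11.0 build 67.12), the function returns False, which is consistent with the leak observed on that appliance.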

To work around the issue, unbind all cipher groups that contain GCM ciphers.

For example:

> show run | grep GCM
bind ssl cipher gcm-cipher -cipherName TLS1.2-ECDHE-RSA-AES256-GCM-SHA384
bind ssl cipher gcm-cipher -cipherName TLS1.2-ECDHE-RSA-AES128-GCM-SHA256
bind ssl cipher gcm-cipher -cipherName TLS1.2-DHE-RSA-AES256-GCM-SHA384
bind ssl cipher gcm-cipher -cipherName TLS1.2-DHE-RSA-AES128-GCM-SHA256

If any of the above ciphers appear in the output, do not use them on an affected build, to avoid memory issues. To use these ciphers, upgrade to the latest 11.1 build.
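
On appliances with large configurations, the same check can be run against a saved copy of the running configuration. The following Python sketch lists the cipher groups that contain GCM ciphers. It assumes the "bind ssl cipher <group> -cipherName <cipher>" line format shown in the example above and ignores any other kind of line; the function name is hypothetical.

```python
def gcm_bindings(config_text):
    """Return (cipher_group, cipher_name) pairs for GCM ciphers in cipher groups."""
    found = []
    for line in config_text.splitlines():
        tokens = line.split()
        # Only consider lines shaped like: bind ssl cipher <group> -cipherName <cipher>
        if len(tokens) < 4 or tokens[:3] != ["bind", "ssl", "cipher"]:
            continue
        if "-cipherName" not in tokens:
            continue
        name_index = tokens.index("-cipherName") + 1
        if name_index >= len(tokens):
            continue  # option present but no cipher name follows it
        cipher_name = tokens[name_index]
        if "GCM" in cipher_name.upper():
            found.append((tokens[3], cipher_name))
    return found


if __name__ == "__main__":
    sample = """\
bind ssl cipher gcm-cipher -cipherName TLS1.2-ECDHE-RSA-AES256-GCM-SHA384
bind ssl cipher gcm-cipher -cipherName TLS1.2-DHE-RSA-AES128-GCM-SHA256
"""
    for group, cipher in gcm_bindings(sample):
        print(f"{group}: {cipher}")
```

Each reported group is a candidate for the workaround described above.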

Additional Information

A few things to keep in mind regarding memory leak issues:

  • If you suspect a memory leak on your NetScaler (a steady increase to high memory usage, ~60% utilization), the immediate workaround is to force an HA failover to the secondary appliance.
  • The former primary appliance, where the leak occurred, will still show high memory usage. The leaked memory is not freed even after the appliance becomes the secondary.
  • The only way to free the leaked memory is to reboot the affected appliance. Be sure to save the configuration before rebooting.