NetScaler USER Monitor Troubleshooting Aid


Article ID: CTX123661



Description

This article describes troubleshooting methods for diagnosing problems with USER-based monitors on NetScaler.

Use USER monitors when an application needs health verification beyond what the monitors built into the NetScaler can provide. To track the status of the monitored server, the monitor sends an HTTP POST request to the configured dispatcher. The request contains the IP address and port of the server and the name of the script to execute. The dispatcher runs the script as a child process, passing any user-defined parameters. The script sends a probe to the server and reports the probe status (a response code) back to the dispatcher. The dispatcher converts the response code into an HTTP response and returns it to the monitor, which marks the service Up or Down accordingly. When USER monitor probes fail, the NetScaler logs error messages to the /var/nslog/nsumond.log file.
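
The request side of this flow can be sketched as follows. This is only an illustration modeled on the working request in the packet capture in this article: the field names nsumon_ip, nsumon_port, and nsumon_args and the Nsmonitor-responsetimeout header are copied from that capture, while the function name build_probe_request is hypothetical and not part of any NetScaler API.

```python
def build_probe_request(script, server_ip, server_port, script_args, timeout=2):
    """Return the raw HTTP POST text a monitor would send to the dispatcher.

    Hypothetical helper; the layout mirrors the capture in this article.
    """
    # The body carries the monitored server and the script arguments.
    body = (
        f"nsumon_ip={server_ip}"
        f"&nsumon_port={server_port}"
        f"&nsumon_args={script_args}"
    )
    headers = [
        f"POST /{script} HTTP/1.1",
        f"Nsmonitor-responsetimeout: {timeout}",
        f"Content-Length: {len(body)}",
        "Host: 127.0.0.1",
        "Connection: Close",
    ]
    return "\r\n".join(headers) + "\r\n\r\n" + body


request = build_probe_request(
    "ntlm_mon.pl", "10.12.56.130", 80,
    "domain=johndani-dc;user=administrator;password=citrix",
)
print(request)
```

With the values from the capture, the body is 103 bytes and the whole request is 220 bytes, which agrees with the Content-Length header and the segment length shown in the working request.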


Verify the Dispatcher Socket

Capture traffic on the dispatcher port (3013) to compare the requests sent by the monitor with the responses returned by the dispatcher.

root@GA-NS1# nstcpdump.sh -s0 -XXnn 'port 3013'

Setting 1000 pages (4000 KB) of trace buffers ... Done.
Enabling all nic trace mode=6 ... Done.
Changing trace packet length from 0 to 0 ... Done.
Saving current trace data in file 'pipe' for '3600' seconds ... in TCPDUMP format

Working Request:

11:17:12.505340 IP 127.0.0.2.3104 > 127.0.0.1.3013: P 1:221(220) ack 1 win 8190
        0x0000:  00d0 6809 d486 00d0 6809 d486 0800 4500  ..h.....h.....E.
        0x0010:  0104 d86d 0000 ff06 a29d 7f00 0002 7f00  ...m............
        0x0020:  0001 0c20 0bc5 14d0 7dd4 f72a 956f 5018  ........}..*.oP.
        0x0030:  1ffe 0000 0000 504f 5354 202f 6e74 6c6d  ......POST./ntlm
        0x0040:  5f6d 6f6e 2e70 6c20 4854 5450 2f31 2e31  _mon.pl.HTTP/1.1
        0x0050:  0d0a 4e73 6d6f 6e69 746f 722d 7265 7370  ..Nsmonitor-resp
        0x0060:  6f6e 7365 7469 6d65 6f75 743a 2032 0d0a  onsetimeout:.2..
        0x0070:  436f 6e74 656e 742d 4c65 6e67 7468 3a20  Content-Length:.
        0x0080:  3130 330d 0a48 6f73 743a 2031 3237 2e30  103..Host:.127.0
        0x0090:  2e30 2e31 0d0a 436f 6e6e 6563 7469 6f6e  .0.1..Connection
        0x00a0:  3a20 436c 6f73 650d 0a0d 0a6e 7375 6d6f  :.Close....nsumo
        0x00b0:  6e5f 6970 3d31 302e 3132 2e35 362e 3133  n_ip=10.12.56.13
        0x00c0:  3026 6e73 756d 6f6e 5f70 6f72 743d 3830  0&nsumon_port=80
        0x00d0:  266e 7375 6d6f 6e5f 6172 6773 3d64 6f6d  &nsumon_args=dom
        0x00e0:  6169 6e3d 6a6f 686e 6461 6e69 2d64 633b  ain=johndani-dc;
        0x00f0:  7573 6572 3d61 646d 696e 6973 7472 6174  user=administrat
        0x0100:  6f72 3b70 6173 7377 6f72 643d 6369 7472  or;password=citr
        0x0110:  6978                                     ix

Working Response:

11:17:12.520414 IP 127.0.0.1.3013 > 127.0.0.2.3104: P 1:139(138) ack 221 win 65535
        0x0000:  00d0 6809 d48b 00d0 6809 d486 0800 4500  ..h.....h.....E.
        0x0010:  00b2 9a38 4000 4006 a20a 7f00 0001 7f00  ...8@.@.........
        0x0020:  0002 0bc5 0c20 f72a 956f 14d0 7eb0 5018  .......*.o..~.P.
        0x0030:  ffff c8e3 0000 4854 5450 2f31 2e31 2032  ......HTTP/1.1.2
        0x0040:  3030 204f 4b0d 0a4d 6f72 652d 496e 666f  00.OK..More-Info
        0x0050:  3a20 2054 6869 7320 6973 2061 204b 4153  :..This.is.a.KAS
        0x0060:  2072 6573 756c 740d 0a53 6572 7665 723a  .result..Server:
        0x0070:  204e 6574 7363 616c 6572 2049 6e74 6572  .Netscaler.Inter
        0x0080:  6e61 6c20 4d6f 6e69 746f 7220 4469 7370  nal.Monitor.Disp
        0x0090:  6174 6368 6572 0d0a 436f 6e74 656e 742d  atcher..Content-
        0x00a0:  4c65 6e67 7468 3a20 300d 0a43 6f6e 6e65  Length:.0..Conne
        0x00b0:  6374 696f 6e3a 2043 6c6f 7365 0d0a 0d0a  ction:.Close....

Failed Request:

0000  50 4f 53 54 20 2f 6e 74 6c 6d 2e 70 6c 20 48 54   POST /ntlm.pl HT
0010  54 50 2f 31 2e 31 0d 0a 4e 73 6d 6f 6e 69 74 6f   TP/1.1..Nsmonito
0020  72 2d 72 65 73 70 6f 6e 73 65 74 69 6d 65 6f 75   r-responsetimeou
0030  74 3a 20 32 0d 0a 43 6f 6e 74 65 6e 74 2d 4c 65   t: 2..Content-Le
0040  6e 67 74 68 3a 20 31 30 33 0d 0a 48 6f 73 74 3a   ngth: 103..Host:
0050  20 31 32 37 2e 30 2e 30 2e 31 0d 0a 43 6f 6e 6e    127.0.0.1..Conn
0060  65 63 74 69 6f 6e 3a 20 43 6c 6f 73 65 0d 0a 0d   ection: Close...
0070  0a 6e 73 75 6d 6f 6e 5f 69 70 3d 31 30 2e 31 32   .nsumon_ip=10.12
0080  2e 35 36 2e 31 33 30 26 6e 73 75 6d 6f 6e 5f 70   .56.130&nsumon_p
0090  6f 72 74 3d 38 30 26 6e 73 75 6d 6f 6e 5f 61 72   ort=80&nsumon_ar
00a0  67 73 3d 64 6f 6d 61 69 6e 3d 6a 6f 68 6e 64 61   gs=domain=johnda
00b0  6e 69 2d 64 63 3b 75 73 65 72 3d 61 64 6d 69 6e   ni-dc;user=admin
00c0  69 73 74 72 61 74 6f 72 3b 70 61 73 73 77 6f 72   istrator;passwor
00d0  64 3d 63 69 74 72 69 78                           d=citrix

Failed Response:

0000  48 54 54 50 2f 31 2e 31 20 35 30 32 20 42 61 64   HTTP/1.1 502 Bad
0010  20 47 61 74 65 77 61 79 0d 0a 46 61 69 6c 75 72    Gateway..Failur
0020  65 2d 52 65 61 73 6f 6e 3a 20 54 65 72 6d 69 6e   e-Reason: Termin
0030  61 74 69 6e 67 20 63 6f 6e 6e 65 63 74 69 6f 6e   ating connection
0040  0d 0a 53 65 72 76 65 72 3a 20 4e 65 74 73 63 61   ..Server: Netsca
0050  6c 65 72 20 49 6e 74 65 72 6e 61 6c 20 4d 6f 6e   ler Internal Mon
0060  69 74 6f 72 20 44 69 73 70 61 74 63 68 65 72 0d   itor Dispatcher.
0070  0a 43 6f 6e 74 65 6e 74 2d 4c 65 6e 67 74 68 3a   .Content-Length:
0080  20 30 0d 0a 43 6f 6e 6e 65 63 74 69 6f 6e 3a 20    0..Connection:
0090  43 6c 6f 73 65 0d 0a 0d 0a                        Close....
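
When only a hex dump is available, the HTTP text can be recovered offline. The following is a minimal sketch that assumes the dump layout used for the failed request and response above (a four-digit offset, two spaces, then up to 16 space-separated bytes); it does not handle the tcpdump -XX layout used for the working capture. The names decode_hexdump and parse_response are hypothetical helpers, and the sample is the first four lines of the failed response.

```python
DUMP = """\
0000  48 54 54 50 2f 31 2e 31 20 35 30 32 20 42 61 64   HTTP/1.1 502 Bad
0010  20 47 61 74 65 77 61 79 0d 0a 46 61 69 6c 75 72    Gateway..Failur
0020  65 2d 52 65 61 73 6f 6e 3a 20 54 65 72 6d 69 6e   e-Reason: Termin
0030  61 74 69 6e 67 20 63 6f 6e 6e 65 63 74 69 6f 6e   ating connection
"""


def decode_hexdump(dump):
    """Rebuild the payload text from 'offset  hex bytes   ascii' lines."""
    payload = bytearray()
    for line in dump.splitlines():
        if not line.strip():
            continue
        # Columns 6 to 52 hold up to 16 space-separated hex bytes.
        payload += bytes.fromhex(line[6:53])
    return payload.decode("ascii")


def parse_response(text):
    """Split a dispatcher response into (status_code, reason, headers)."""
    head = text.split("\r\n\r\n", 1)[0]
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return int(code), reason, headers


code, reason, headers = parse_response(decode_hexdump(DUMP))
print(code, reason, headers)
```

The status code and the Failure-Reason header are usually the two values worth extracting first, because the status code maps directly to the table that follows.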

Status Code Table

HTTP Response Code            Meaning
200 - success                 Probe success.
503 - service unavailable     Probe failure.
404 - not found               Script not found or cannot execute.
500 - internal server error   Internal error/resource constraints in dispatcher (out of memory, too many connections, unexpected system error, or too many processes). The service does not go down.
400 - bad request             Error parsing HTTP request.
502 - bad gateway             Error decoding script's response.
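
For scripted analysis of captures, the status code table can be expressed as a lookup. This is a small sketch rather than anything shipped with NetScaler; DISPATCHER_STATUS and explain_status are hypothetical names, and the meanings are copied from the table.

```python
# Meanings taken from the status code table in this article.
DISPATCHER_STATUS = {
    200: "Probe success.",
    503: "Probe failure.",
    404: "Script not found or cannot execute.",
    500: "Internal error/resource constraints in dispatcher. "
         "The service does not go down.",
    400: "Error parsing HTTP request.",
    502: "Error decoding script's response.",
}


def explain_status(code):
    """Return the documented meaning of a dispatcher HTTP response code."""
    return DISPATCHER_STATUS.get(code, "Not listed in the status code table.")


# The failed response captured in this article returned 502.
print(explain_status(502))
```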

Verify the Dispatcher Process

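The dispatcher is a process on the NetScaler that listens for monitoring requests. It can listen on the loopback IP address (127.0.0.1) and port 3013. The following commands confirm that the process is running and that the socket is in the LISTEN state.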

root@GA-NS1# ps axl | grep 'nsumond'

    0   470     1 177  10  0   924  296 wait   Is    ??    0:00.00 /netscaler/nsumond
65532   471   470 177   2  0   924  272 kqread I     ??    0:00.00 /netscaler/nsumond

root@GA-NS1# netstat -an -f inet | grep 3013

tcp4       0      0  127.0.0.1.3013         *.*                    LISTEN
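
The netstat check can also be automated against saved output, for example when reviewing a support bundle off the appliance. The sketch below assumes the six-column layout shown in the output above (protocol, receive queue, send queue, local address, foreign address, state); dispatcher_listening is a hypothetical helper name.

```python
def dispatcher_listening(netstat_output, address="127.0.0.1", port=3013):
    """Return True if a LISTEN socket for address.port appears in the output."""
    wanted = f"{address}.{port}"
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected fields: proto, recv-q, send-q, local, foreign, state.
        if len(fields) >= 6 and fields[3] == wanted and fields[5] == "LISTEN":
            return True
    return False


sample = "tcp4       0      0  127.0.0.1.3013         *.*                    LISTEN"
print(dispatcher_listening(sample))
```

Matching the whole local-address field avoids false positives from other ports that merely contain the digits 3013, which a plain grep would also match.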

Examine the nsumond.log for Errors

root@ns# cd /var/nslog
root@ns# more /var/nslog/nsumond.log

./ntlm_mon.pl Script failed. Exit code : 1
./ntlm_mon.pl Exit Reason : Invalid ScriptArg format
./ntlm_mon.pl Script failed. Exit code : 1
./ntlm_mon.pl Script failed. Exit code : 1
./ntlm_mon.pl Exit Reason : Failed request - 401
./ntlm_mon.pl Script failed. Exit code : 1
./ntlm_mon.pl Exit Reason : Failed request - 401
./ntlm_mon.pl Script failed. Exit code : 1
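
On a busy appliance the log can be long, so it may help to tally failures and exit reasons per script. The following sketch assumes every line has the shape shown in the excerpt above (script path, then either "Script failed." or "Exit Reason : ..."); summarize_nsumond_log is a hypothetical helper, and other line formats are ignored.

```python
from collections import Counter


def summarize_nsumond_log(lines):
    """Count failures per script and occurrences of each exit reason."""
    failures = Counter()
    reasons = Counter()
    for line in lines:
        # The first token is the script path; the rest is the message.
        script, _, message = line.strip().partition(" ")
        if message.startswith("Script failed."):
            failures[script] += 1
        elif message.startswith("Exit Reason :"):
            reason = message.split(":", 1)[1].strip()
            reasons[(script, reason)] += 1
    return failures, reasons


log_excerpt = """\
./ntlm_mon.pl Script failed. Exit code : 1
./ntlm_mon.pl Exit Reason : Invalid ScriptArg format
./ntlm_mon.pl Script failed. Exit code : 1
./ntlm_mon.pl Script failed. Exit code : 1
./ntlm_mon.pl Exit Reason : Failed request - 401
./ntlm_mon.pl Script failed. Exit code : 1
./ntlm_mon.pl Exit Reason : Failed request - 401
./ntlm_mon.pl Script failed. Exit code : 1
""".splitlines()

failures, reasons = summarize_nsumond_log(log_excerpt)
print(failures)
print(reasons)
```

In this excerpt the tally separates an argument formatting problem (Invalid ScriptArg format) from authentication failures against the server (Failed request - 401), which point to different fixes.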
