Error:nsmap[1384]: Working socket got closed with the error: Connection reset by peer
Article ID: CTX209480
Description
Issue:
The customer is running NetScaler build 10.1 132.x and continuously observes the following entries in ns.log. After removing the location file, the errors stop for some time. Note that no errors are reported for nsmap with the location entry. The customer would like to know the reason for this behavior. The CPE team recommended OB to move forward with the investigation.
Environment:
show hardware
Platform: NSMPX-11500 12*CPU+8*IX+4*E1K+2*E1K+2*CVM N3 1400210
Manufactured on: 12/17/2014
CPU: 2400MHZ
Host Id: 872841350
Serial no: D1DH52EZ3U
Encoded serial no: D1DH52EZ3U
Configuration in place:
add locationFile "/var/netscaler/locdb/GeoIPCountryWhois.csv" -format geoip-country
add cs policy mapi-CN-switch -rule "CLIENT.IP.SRC.MATCHES_LOCATION(\"*.CN.*.*.*.*\") || CLIENT.IP.SRC.MATCHES_LOCATION(\"*.TW.*.*.*.*\")"
add cs policy www-CN-switch -rule "CLIENT.IP.SRC.MATCHES_LOCATION(\"*.CN.*.*.*.*\") || CLIENT.IP.SRC.MATCHES_LOCATION(\"*.TW.*.*.*.*\")"
ns.log:Dec 25 10:17:35 <local0.info> L41 nsmap[1384]: Working socket got closed with the error: Connection reset by peer
ns.log:Dec 25 10:26:59 <local0.info> L41 nsmap[1384]: Connection to PPE 4 closed on timeout. Idle time: 622909816
Temporary workaround:
------------------------------
1. Remove the location file (rm locationfile), delete GeoIPCountryWhois.csv from the device, upload a new version of the CSV file, then add the location file again (add locationfile).
2. Test the location file (nsmap -d -t / sh location para).
3. Reboot the MPX.
This works for a few days without errors, after which the errors return:
ns.log:Dec 25 10:17:35 <local0.info> L41 nsmap[1384]: Working socket got closed with the error: Connection reset by peer
ns.log:Dec 25 10:26:59 <local0.info> L41 nsmap[1384]: Connection to PPE 4 closed on timeout. Idle time: 622909816
ns.log:Dec 25 10:55:59 <local0.info> L41 nsmap[1384]: Connection to PPE 2 closed on timeout. Idle time: 637019591
ns.log:Dec 25 10:57:59 <local0.info> L41 nsmap[1384]: Connection to PPE 5 closed on timeout. Idle time: 645399764
ns.log:Dec 25 11:26:20 <local0.info> L41 nsmap[1384]: Working socket got closed with the error: Connection reset by peer
ns.log:Dec 25 11:45:59 <local0.info> L41 nsmap[1384]: Connection to PPE 5 closed on timeout. Idle time: 617648437
ns.log:Dec 25 11:56:59 <local0.info> L41 nsmap[1384]: Connection to PPE 0 closed on timeout. Idle time: 639441617
ns.log:Dec 25 12:04:01 <local0.info> L41 nsmap[1384]: Working socket got closed with the error: Connection reset by peer
ns.log:Dec 25 12:09:59 <local0.info> L41 nsmap[1384]: Connection to PPE 0 closed on timeout. Idle time: 612231470
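The idle times in the timeout entries can be compared with the 10-minute idle cleanup described under Problem Cause. The following is a minimal Python sketch, not a NetScaler tool. It assumes the `Idle time` value is in microseconds (the log does not state the unit; this is inferred from the magnitudes), and `idle_minutes` is a hypothetical helper name.

```python
import re

# Matches the timeout entries shown above, for example:
# "... nsmap[1384]: Connection to PPE 4 closed on timeout. Idle time: 622909816"
TIMEOUT_RE = re.compile(
    r"Connection to PPE (\d+) closed on timeout\. Idle time: (\d+)"
)

def idle_minutes(line, units_per_second=1_000_000):
    """Return (ppe, idle_minutes) for a timeout entry, or None for other lines.

    ASSUMPTION: the idle time is reported in microseconds; the log itself
    does not name the unit.
    """
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    ppe = int(match.group(1))
    minutes = int(match.group(2)) / units_per_second / 60
    return ppe, minutes

sample = [
    "ns.log:Dec 25 10:17:35 <local0.info> L41 nsmap[1384]: Working socket got closed with the error: Connection reset by peer",
    "ns.log:Dec 25 10:26:59 <local0.info> L41 nsmap[1384]: Connection to PPE 4 closed on timeout. Idle time: 622909816",
    "ns.log:Dec 25 10:55:59 <local0.info> L41 nsmap[1384]: Connection to PPE 2 closed on timeout. Idle time: 637019591",
]

for entry in sample:
    parsed = idle_minutes(entry)
    if parsed is not None:
        print("PPE %d idle for %.1f minutes" % parsed)
```

Under the microsecond assumption, every timeout entry listed above works out to slightly more than 10 minutes of idle time.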
Resolution
Although this behavior is as per the current design, an enhancement request has been filed to address the issue:
ENH0629590 NSMAP: Implement keep alive mechanism between NSMAP to PPE communication
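The keep-alive idea behind the enhancement request can be illustrated with a small model. This is a sketch only and does not describe the design of ENH0629590. It assumes a connection is closed once it has been idle for 600 seconds (the 10-minute cleanup described under Problem Cause) and that a hypothetical keep-alive is sent after `keepalive_interval_s` seconds of inactivity. The names `max_idle_s` and `connection_survives`, and the request times, are illustrative.

```python
IDLE_TIMEOUT_S = 600  # 10-minute idle cleanup described in this article

def max_idle_s(request_times_s, keepalive_interval_s=None):
    """Longest idle gap (seconds) seen on the connection.

    request_times_s: sorted times at which real requests use the connection.
    keepalive_interval_s: hypothetical keep-alive period (None = no keep-alive).
    """
    gaps = [later - earlier
            for earlier, later in zip(request_times_s, request_times_s[1:])]
    if keepalive_interval_s is not None:
        # A keep-alive fires after this much inactivity, capping every gap.
        gaps = [min(gap, keepalive_interval_s) for gap in gaps]
    return max(gaps, default=0)

def connection_survives(request_times_s, keepalive_interval_s=None):
    """True if the connection never reaches the idle timeout."""
    return max_idle_s(request_times_s, keepalive_interval_s) < IDLE_TIMEOUT_S

sparse_requests = [0, 100, 1000]  # made-up times for a PE that rarely queries NSMAP
print(connection_survives(sparse_requests))                            # False
print(connection_survives(sparse_requests, keepalive_interval_s=300))  # True
```

In this model, any keep-alive interval shorter than the idle timeout keeps a rarely used connection open.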
Problem Cause
As per the current design, a PE does not close the connection once a request is served, so the connection can be reused for subsequent search requests. If the connection stays idle for 10 minutes, it is closed by the PCB zombie cleanup routine. In this case, PE-3 makes most of the NSMAP requests; on all other PEs the request frequency is very low, so their connections to NSMAP can sit idle long enough to be closed, which matches the timeout entries logged for the other PPEs.
One point to note is that location entries have an owner, and only the owner PE makes requests to NSMAP. Requests from non-owner PEs go via the owner PE. Ownership is decided based on the IP hash. For that reason, we suspect that most of the requests come from one IP range whose ownership falls on PE-3.
The customer is using an MPX with 7 PEs. The NetScaler receives traffic on all PEs (the pcb_hits counter provides the supporting evidence), but only PE-3 receives all the DB requests. There are two possible reasons for PE-3 to receive all the DB requests:
- All the source (client) IP addresses hash to PE-3.
- The requests are landing on PE-3.
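The effect of hash-based ownership can be sketched as follows. The real NetScaler hash is not given in this article, so `owner_pe` below is a purely hypothetical function that hashes the /24 prefix of the client IP modulo the PE count. It only illustrates how clients from a narrow IP range can all map to a single owner PE. The addresses come from a documentation range and are not customer data.

```python
import ipaddress

NUM_PES = 7  # PE-0 .. PE-6, as on the MPX in this article

def owner_pe(client_ip, num_pes=NUM_PES):
    """HYPOTHETICAL ownership function: /24 prefix modulo the PE count.

    The actual hash used by the appliance is not documented here; this is
    an assumption made for illustration only.
    """
    prefix = int(ipaddress.ip_address(client_ip)) >> 8
    return prefix % num_pes

# Clients from one narrow range all resolve to the same owner PE.
clients = ["203.0.113.%d" % host for host in (5, 17, 200)]
owners = {owner_pe(ip) for ip in clients}
print(len(owners))
```

With such a scheme, only the owner PE queries NSMAP for that range, no matter which PE the client traffic lands on.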
We believe all the client traffic comes from the same IP range. The counters below show pcb_hits increasing on every PE, while gslb_tot_sp_db_req_search increases only on PE-3:
reltime:mili second between two records Mon Dec 28 00:01:44 2015
Index  rtime  totalcount-val  delta  rate/sec  symbol-name&device-no&time
378    0      166054          1      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:44 2015 (PE-3)
379    0      166048          2      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:44 2015 (PE-4)
380    0      166239          2      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:44 2015 (PE-5)
381    0      165744          1      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:44 2015 (PE-6)
382    0      1118374         7      1         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:44 2015 (Aggr)
383    0      434367          1      0         pcb_hits cs_pol(mapi-CN-switch)(tmon-mapi-80) Mon Dec 28 00:01:44 2015 (Aggr)
384    0      30314986        158    22        gslb_tot_sp_db_req_search Mon Dec 28 00:01:44 2015 (Aggr)
385    7000   122385          2      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:51 2015 (PE-0)
386    0      165614          4      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:51 2015 (PE-1)
387    0      166297          1      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:51 2015 (PE-2)
388    0      30304784        136    19        gslb_tot_sp_db_req_search Mon Dec 28 00:01:51 2015 (PE-3)
389    0      166056          2      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:51 2015 (PE-3)
390    0      166051          3      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:51 2015 (PE-4)
391    0      64547           2      0         pcb_hits cs_pol(mapi-CN-switch)(tmon-mapi-80) Mon Dec 28 00:01:51 2015 (PE-4)
392    0      166240          1      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:51 2015 (PE-5)
393    0      165746          2      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:51 2015 (PE-6)
394    0      1118389         15     2         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:51 2015 (Aggr)
395    0      30315122        136    19        gslb_tot_sp_db_req_search Mon Dec 28 00:01:51 2015 (Aggr)
396    0      434369          2      0         pcb_hits cs_pol(mapi-CN-switch)(tmon-mapi-80) Mon Dec 28 00:01:51 2015 (Aggr)
397    7000   122387          2      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:58 2015 (PE-0)
398    0      165615          1      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:58 2015 (PE-1)
399    0      166299          2      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:58 2015 (PE-2)
400    0      64576           1      0         pcb_hits cs_pol(mapi-CN-switch)(tmon-mapi-80) Mon Dec 28 00:01:58 2015 (PE-2)
401    0      30304946        162    23        gslb_tot_sp_db_req_search Mon Dec 28 00:01:58 2015 (PE-3)
402    0      166057          1      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:58 2015 (PE-3)
403    0      166054          3      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:58 2015 (PE-4)
404    0      166243          3      0         pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:58 2015 (PE-5)
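The per-PE distribution in these counters can be checked with a short script. The following Python sketch assumes the counter records have been split one per line in the column order given by the header (index, rtime, total count, delta, rate/sec, counter name with optional device, timestamp, PE tag). The helper names `parse_record` and `deltas_per_pe` are hypothetical, and the sample is the 00:01:51 interval from this article.

```python
from collections import defaultdict

def parse_record(line):
    """Split one counter record into (counter, device, tag, delta), or None."""
    tokens = line.split()
    # 5 numeric fields + counter + 5 timestamp tokens + tag = at least 12 tokens.
    if len(tokens) < 12 or not all(t.isdigit() for t in tokens[:5]):
        return None
    tag = tokens[-1].strip("()")      # "PE-3" or "Aggr"
    counter = tokens[5]
    device = " ".join(tokens[6:-6])   # empty for counters without a device
    return counter, device, tag, int(tokens[3])

def deltas_per_pe(lines):
    """Sum the delta column per (counter, device) and PE, skipping Aggr rows."""
    totals = defaultdict(lambda: defaultdict(int))
    for line in lines:
        record = parse_record(line)
        if record is None or record[2] == "Aggr":
            continue
        counter, device, tag, delta = record
        totals[(counter, device)][tag] += delta
    return totals

SAMPLE = """\
385 7000 122385 2 0 pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:51 2015 (PE-0)
386 0 165614 4 0 pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:51 2015 (PE-1)
387 0 166297 1 0 pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:51 2015 (PE-2)
388 0 30304784 136 19 gslb_tot_sp_db_req_search Mon Dec 28 00:01:51 2015 (PE-3)
389 0 166056 2 0 pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:51 2015 (PE-3)
390 0 166051 3 0 pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:51 2015 (PE-4)
392 0 166240 1 0 pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:51 2015 (PE-5)
393 0 165746 2 0 pcb_hits cs_pol(www-CN-switch)(tmon-www-80) Mon Dec 28 00:01:51 2015 (PE-6)
395 0 30315122 136 19 gslb_tot_sp_db_req_search Mon Dec 28 00:01:51 2015 (Aggr)
""".splitlines()

totals = deltas_per_pe(SAMPLE)
db_search = dict(totals[("gslb_tot_sp_db_req_search", "")])
www_hits = dict(totals[("pcb_hits", "cs_pol(www-CN-switch)(tmon-www-80)")])
print(db_search)
print(sorted(www_hits.items()))
```

For this interval, the policy hits are spread across all seven PEs, while every location database search request is attributed to PE-3.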