Error: "Power Supply Failure detected" erroneously reported by LOM

Article ID: CTX215590

Description

Diagnostics available from the Lights Out Management (LOM) dashboard can report an erroneous power supply failure on Citrix ADC (formerly NetScaler) MPX and SDX appliances. When this occurs, the Sensor Readings page of the LOM dashboard may report the following error: "Power supply failure detected"

Screenshot of the error "Power supply failure detected".

Output from ipmitool may also contain erroneous entries, such as the following:

607 | 01/28/2016 | 16:09:57 | Power Supply #0x84 | Failure detected | Asserted
608 | 01/28/2016 | 16:09:57 | Power Supply #0x85 | Failure detected | Asserted
609 | 01/28/2016 | 17:09:27 | Power Supply #0x84 | Failure detected | Deasserted
60a | 01/28/2016 | 17:09:27 | Power Supply #0x85 | Failure detected | Deasserted
60b | 01/28/2016 | 17:11:11 | Power Supply #0x84 | Failure detected | Asserted
60c | 01/28/2016 | 17:11:11 | Power Supply #0x85 | Failure detected | Asserted
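
To see which sensors are currently flagged, entries like those above can be reduced to the last reported state for each sensor. The following is a minimal sketch, assuming the six-field, pipe-delimited layout shown above and that the entries are in chronological order. The function name last_states is hypothetical and is not part of ipmitool.

```python
def last_states(lines):
    """Return {sensor: direction} using the last record seen for each sensor."""
    states = {}
    for line in lines:
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 6:
            continue  # skip anything that is not a six-field event record
        _record_id, _date, _time, sensor, _event, direction = fields
        states[sensor] = direction  # later records overwrite earlier ones
    return states


sample = """\
607 | 01/28/2016 | 16:09:57 | Power Supply #0x84 | Failure detected | Asserted
608 | 01/28/2016 | 16:09:57 | Power Supply #0x85 | Failure detected | Asserted
609 | 01/28/2016 | 17:09:27 | Power Supply #0x84 | Failure detected | Deasserted
60a | 01/28/2016 | 17:09:27 | Power Supply #0x85 | Failure detected | Deasserted
60b | 01/28/2016 | 17:11:11 | Power Supply #0x84 | Failure detected | Asserted
60c | 01/28/2016 | 17:11:11 | Power Supply #0x85 | Failure detected | Asserted
""".splitlines()

for sensor, direction in last_states(sample).items():
    print(sensor, "->", direction)
```

For the sample entries, both sensors end in the Asserted state. Repeated assert and deassert cycles while the power supply LEDs stay green are consistent with the false reporting described in this article.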

On the ADC SDX Management Service VM, a false error may also be logged in /var/mps/log/mps_config.log as a pair of messages similar to the following:

507 | 01/17/2018 | 13:35:53 | Power Supply PS_2 Status | Failure detected | Asserted
508 | 01/17/2018 | 13:35:53 | Power Supply PS_2 Fan Status | Failure detected | Asserted

Additionally, on the ADC SDX Management Service VM, a false error is logged in /var/mps/log/mps_stat.log as messages similar to the following:
Wednesday, 18 Jan 17 10:10:55.428 +0100 [Debug] Sending Message to EVENT /tmp/mps/ipc_sockets/mps_event_sock
{ "errorcode": 0, "message": "Done", "message_id": -1, "resrc_driven": true, "login_session_id":
"", "username": "***********", "tenant_name": "Owner", "mps_ip_address": "", "client_ip_address": "", "client_protocol":
"http", "client_port": 0, "mpsSessionId": "", "source": "", "target": "EVENT", "version": "v1", "messageType":
"MESSAGE_TYPE_INTERNAL", "client_type": "INTERNAL", "resourceType": "mps_internal_event", "orignal_resourceType":
"mps_internal_event", "resourceName": "", "operation": "", "asynchronous": true, "params":
{ "pageno": 0, "pagesize": 0, "detailview": true, "compression": false, "count": false, "total_count": 0, "action":
"", "type": "", "onerror": "EXIT", "is_db_driven": false, "order_by": "", "asc": false, "duration":
"", "duration_summary": 0 }, "mps_internal_event": [ { "EVENT_FORCE_SEND": "false", "EVENT_TRAP_VAL":
"", "EVENT_CATEGORY": "HealthMonitoring", "EVENT_THRESHOLD_VAL": "", "EVENT_FAILURE_OBJ":
"PS_2 Status", "EVENT_MSG": "PS_2 Status: Power Supply failure detected", "EVENT_SEVERITY":
"Critical", "EVENT_TRAP_ID": "27", "EVENT_SOURCE": "169.254.0.1" } ] } 

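The health-monitoring event can be pulled out of a message like the one above programmatically. This is a minimal sketch, assuming the wrapped message has first been joined into a single string and that everything from the first opening brace onward is valid JSON. The sample is abbreviated from the message above, and health_events is a hypothetical helper name.

```python
import json


def health_events(message):
    """Parse the JSON payload of one log message and return its event records."""
    start = message.find("{")
    if start == -1:
        return []  # no JSON payload in this message
    payload = json.loads(message[start:])
    return payload.get("mps_internal_event", [])


# Abbreviated sample: only a few of the fields shown in the full message.
message = (
    'Wednesday, 18 Jan 17 10:10:55.428 +0100 [Debug] Sending Message to EVENT '
    '/tmp/mps/ipc_sockets/mps_event_sock '
    '{ "errorcode": 0, "message": "Done", "mps_internal_event": [ '
    '{ "EVENT_CATEGORY": "HealthMonitoring", "EVENT_FAILURE_OBJ": "PS_2 Status", '
    '"EVENT_MSG": "PS_2 Status: Power Supply failure detected", '
    '"EVENT_SEVERITY": "Critical" } ] }'
)

for event in health_events(message):
    print(event["EVENT_SEVERITY"], "-", event["EVENT_MSG"])
```
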
PS_2 Status      | 0x1        | discrete   | 0x0100| na        | na        | na        | na        | na        | na
PS_2 Fan Status  | 0x1        | discrete   | 0x0100| na        | na        | na        | na        | na        | na
PS_2 Temp Status | 0x1        | discrete   | 0x0100| na        | na        | na        | na        | na        | na
PS_2 Temp        | 41.000     | degrees C  | ok    | na        | na        | na        | na        | na        | na
In contrast, on ADC MPX or SDX, the status LED on each power supply remains green, indicating that the power supply is operating properly. Further, on ADC MPX, output from the stat system detail CLI command does not report any failures for the counters that reflect power supply status. For example:
> stat system detail | grep -i power
Power supply 1 status                         NORMAL
Power supply 2 status                         NORMAL
Power supply 3 status                    NOT SUPPORTED
Power supply 4 status                    NOT SUPPORTED

Resolution

LOM firmware 3.39 is needed to properly address multiple power supplies for correct status monitoring.

1. Verify that the external status LED of each power supply shows as green.

2. On ADC MPX, verify that output from the "stat system detail" command reports NORMAL for each expected power supply. On an example ADC MPX with two power supplies, the output is as follows:

> stat system detail | grep -i power
Power supply 1 status                         NORMAL
Power supply 2 status                         NORMAL
Power supply 3 status                    NOT SUPPORTED
Power supply 4 status                    NOT SUPPORTED

On ADC MPX with four power supplies, the output is as follows:

> stat system detail | grep -i power
Power supply 1 status                         NORMAL
Power supply 2 status                         NORMAL
Power supply 3 status                         NORMAL
Power supply 4 status                         NORMAL
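
The check in step 2 can be scripted when the command output has been captured as text. The following is a minimal sketch, assuming the line layout shown above ("Power supply <n> status <STATUS>"). The function names power_supply_status and all_normal are hypothetical.

```python
def power_supply_status(output):
    """Map power supply number to status text from 'stat system detail' lines."""
    statuses = {}
    for line in output.splitlines():
        words = line.split()
        # Expected shape: Power supply <n> status <STATUS ...>
        if (
            len(words) >= 5
            and words[:2] == ["Power", "supply"]
            and words[2].isdigit()
            and words[3] == "status"
        ):
            statuses[int(words[2])] = " ".join(words[4:])
    return statuses


def all_normal(output, expected_supplies):
    """True if supplies 1..expected_supplies all report NORMAL."""
    statuses = power_supply_status(output)
    return all(
        statuses.get(number) == "NORMAL"
        for number in range(1, expected_supplies + 1)
    )


two_supply_output = """\
> stat system detail | grep -i power
Power supply 1 status                         NORMAL
Power supply 2 status                         NORMAL
Power supply 3 status                    NOT SUPPORTED
Power supply 4 status                    NOT SUPPORTED
"""

print(all_normal(two_supply_output, expected_supplies=2))
```

The expected number of power supplies is passed in explicitly because NOT SUPPORTED is a normal result for slots that the appliance model does not have.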

3. Once the power supply status is verified, check whether the LOM firmware version is older than 3.39 by running ipmitool from the UNIX shell.
Note: On ADC SDX, log in as "root" in order to run the ipmitool executable successfully.
Example syntax for checking the firmware version:
> ipmitool mc info | grep -i firmware
Firmware Revision         : 3.21
If the reported firmware is lower/older than 3.39, then a LOM firmware upgrade is necessary for the LOM dashboard to properly report power supply status.
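
The version comparison in step 3 can be sketched as follows, assuming the "Firmware Revision" line has the major.minor form shown above and that each component should be compared numerically. The function names parse_firmware_revision and needs_lom_upgrade are hypothetical.

```python
def parse_firmware_revision(output):
    """Return the firmware revision as a tuple of ints, or None if not found."""
    for line in output.splitlines():
        name, separator, value = line.partition(":")
        if separator and name.strip() == "Firmware Revision":
            return tuple(int(part) for part in value.strip().split("."))
    return None


def needs_lom_upgrade(output, minimum=(3, 39)):
    """True if the reported revision is older than the minimum (default 3.39)."""
    revision = parse_firmware_revision(output)
    if revision is None:
        raise ValueError("no 'Firmware Revision' line found")
    return revision < minimum  # tuples compare component by component


sample_output = "Firmware Revision         : 3.21"
print(needs_lom_upgrade(sample_output))
```

Comparing tuples of integers avoids the pitfalls of comparing version strings character by character.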
 

Problem Cause

The ADC MPX and ADC SDX appliances use internal addresses to distinguish multiple power supplies for subsequent status monitoring. The addressing methodology for multiple power supplies changed, so LOM firmware older than 3.39 can misreport their status, making the firmware update necessary.

Issue/Introduction

Diagnostics available from the LOM dashboard can potentially report an erroneous power supply failure on ADC MPX and ADC SDX.

Additional Information

For further information regarding how to upgrade the LOM firmware, please see the following:
CTX218264 - How to Upgrade the LOM Firmware on Any NetScaler MPX Platform
CTX140270 - How to Upgrade LOM Firmware on NetScaler 115xx and CloudBridge 4xxx/5xxx Model Families Using LOM Web Interface
Product Documentation (MPX) - Lights Out Management Port of the NetScaler MPX Appliance
Product Documentation (SDX) - Lights Out Management Port of the NetScaler SDX Appliance
CTX200734 - NetScaler LOM Version and Support Matrix

For further information regarding troubleshooting of a NetScaler appliance power supply, please see the following:
CTX202340 - How to Troubleshoot Power Supply Voltage Warnings on NetScaler SDX