In the event of a XenServer host power failure, any Virtual Machines (VMs) running on that host might not be displayed in XenCenter. This is the expected behavior without High Availability (HA) enabled.
The following is a XenCenter screen shot of the pool prior to the failure of the host named "xenserver2":
This is a screen shot of the same pool after the failure of xenserver2:
Note: VMs that were running on xenserver2 are not displayed in XenCenter.
To recover from this issue, complete the following procedure:
Confirm whether the host has actually failed and determine whether it can be recovered.
With certain hardware failures, the host cannot be recovered and must be removed from the pool altogether.
If the host is not recoverable, run the following command to obtain a list of VMs that are running on the failed host:
xe vm-list resident-on=<UUID_of_failed_host> is-control-domain=false params=uuid
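If the UUID of the failed host is not at hand, it can be looked up by name first. The following is a minimal sketch, assuming the host name "xenserver2" from the example above. The function names are hypothetical helpers for illustration and are not part of the xe CLI. The --minimal option makes xe print only the values, as a comma-separated list, which is easier to reuse in later commands.

```shell
# Hypothetical helper: print the UUID of the host with the given name-label.
failed_host_uuid() {
    xe host-list name-label="$1" params=uuid --minimal
}

# Hypothetical helper: print the UUIDs of the VMs still recorded as resident
# on the given host, excluding the control domain. This is the same query as
# the command above, with --minimal added.
vms_on_host() {
    xe vm-list resident-on="$1" is-control-domain=false params=uuid --minimal
}

# Example usage:
#   HOST_UUID=$(failed_host_uuid xenserver2)
#   vms_on_host "$HOST_UUID"
```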
Run the following command for each of those VMs to reset its power-state:
xe vm-reset-powerstate uuid=<UUID_of_the_VM_to_recover> --force
a) To reset the power-state of all the VMs that were left locked on the failed slave server in a single step, run the following command:
xe vm-reset-powerstate resident-on=<UUID_of_failed_host> --multiple --force
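As an alternative to --multiple, the per-VM command can be run in a loop so that each VM is reported individually. This is a sketch only: reset_vms_on_host is a hypothetical function name, and it assumes that a VM reports a power-state of "halted" once the reset has succeeded.

```shell
# Hypothetical helper: reset the power-state of every VM recorded as resident
# on the failed host, then print the state each VM reports afterwards.
reset_vms_on_host() {
    host_uuid="$1"
    vm_uuids=$(xe vm-list resident-on="$host_uuid" is-control-domain=false params=uuid --minimal)
    # --minimal returns a comma-separated list; turn the commas into spaces
    # so the shell can iterate over the UUIDs.
    for vm in $(printf '%s' "$vm_uuids" | tr ',' ' '); do
        xe vm-reset-powerstate uuid="$vm" --force
        state=$(xe vm-param-get uuid="$vm" param-name=power-state)
        echo "$vm: $state"
    done
}

# Example usage:
#   reset_vms_on_host <UUID_of_failed_host>
```

Any VM that reports a state other than "halted" is worth checking by hand before you continue.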
After all the VMs are recovered, the failed host can be forgotten.
Before you can start the VMs on another XenServer host, you must release the "locks" on the VM storage.
Each disk in a Storage Repository (SR) can be used by only one host at a time, so it is essential to make the disks accessible to other XenServer hosts after a host has failed.
To do so, run the following script on the pool master for each SR that contains disks of any affected VMs:
/opt/xensource/sm/resetvdis.py all <UUID_of_failed_host> <UUID_of_SR> [--master]
Note: Supply the third string ("--master") only in the following cases:
. When the SR is not shared (i.e. local storage).
. When the SR is a shared SR and the failed host is the pool master.
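To work out which SRs hold disks of an affected VM, and whether each SR is shared, the xe CLI can be queried as sketched below. The function names are hypothetical. The sketch also assumes that filtering VBDs with type=Disk is enough to skip empty CD drives, which have no VDI.

```shell
# Hypothetical helper: print the UUID of each SR that holds a disk of the
# given VM, one per line, without duplicates.
srs_for_vm() {
    vm_uuid="$1"
    vdis=$(xe vbd-list vm-uuid="$vm_uuid" type=Disk params=vdi-uuid --minimal)
    for vdi in $(printf '%s' "$vdis" | tr ',' ' '); do
        xe vdi-list uuid="$vdi" params=sr-uuid --minimal
    done | sort -u
}

# Hypothetical helper: print "true" if the SR is shared, "false" if it is
# local storage.
sr_is_shared() {
    xe sr-param-get uuid="$1" param-name=shared
}

# Example usage:
#   for sr in $(srs_for_vm <UUID_of_the_VM_to_recover>); do
#       echo "$sr shared=$(sr_is_shared "$sr")"
#   done
```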
Warning! Incorrect use of this command can lead to data corruption. Before running the preceding command for an SR, ensure that the following conditions are true:
. The failed XenServer host is unrecoverable
. The SR is not attached
. The VDIs on the SR are not in use
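Because the script has to be run once per SR, a small wrapper can help when several SRs are affected. The following is a sketch under stated assumptions: reset_vdis, run and the DRY_RUN variable are hypothetical names introduced here, and the wrapper defaults to printing the commands rather than running them so that they can be reviewed against the conditions above first.

```shell
# Hypothetical helper: print the command when DRY_RUN is unset or 1,
# run it for real only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Hypothetical helper.
# Usage: reset_vdis <UUID_of_failed_host> <"" or --master> <UUID_of_SR>...
reset_vdis() {
    host_uuid="$1"
    master_flag="$2"
    shift 2
    for sr in "$@"; do
        # $master_flag is deliberately unquoted so that an empty value
        # adds no extra argument to the command line.
        run /opt/xensource/sm/resetvdis.py all "$host_uuid" "$sr" $master_flag
    done
}

# Example usage (prints the commands only):
#   reset_vdis <UUID_of_failed_host> "" <UUID_of_SR_1> <UUID_of_SR_2>
# To execute them after review:
#   DRY_RUN=0 reset_vdis <UUID_of_failed_host> "" <UUID_of_SR_1> <UUID_of_SR_2>
```

Passing the flag per call, rather than per SR, means that shared and local SRs need separate calls whenever only one group requires --master.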
If you attempt to start a VM on another XenServer host before running this command, you might receive the following error message:
VDI <UUID> already attached RW.
This is the expected behavior without High Availability (HA) enabled. When HA is not enabled, XenServer cannot confirm that a host has failed and therefore cannot recover the VMs from that host. It is unsafe to restart the VMs on other hosts in the pool if the problem host is unresponsive rather than completely failed, because doing so might cause data corruption.
CTX119717 – XenServer High Availability
CTX130821 – How to Clean the Xapi Database after Running the host-forget Command