XenMobile Server is in recovery mode "application failed to start"


Article ID: CTX219246



Description

This issue can be observed in several scenarios:

  1. One of the nodes in the cluster cannot be accessed.
  2. The server went into recovery mode while upgrading or applying a patch.
  3. The server went into recovery mode because the database is not accessible.
  4. The SQL Server ran out of space on the log drive.
The node goes into recovery mode, and you may see the error message "application failed to start".
After three reboots, the server goes into recovery mode and displays the error "System is in recovery mode. Please log in to console".

Resolution

The server can go into recovery mode for several reasons: while it is being upgraded, while a rolling patch is being applied, or simply after a reboot.

If the server goes into recovery mode while you are configuring a cluster, make sure you follow this process:

1.) Shut down and clone the primary VM.
2.) By default, the new VM has the same configuration as the primary, including the IP address.
3.) Power on the cloned node.
4.) Change the IP address on the secondary server from the CLI.
5.) Once done, bring up the primary server and validate the cluster settings from the CLI.

The issue can also be identified from the logs. Look for an error message such as the following:
+ OUTPUT='{"rc":8,"message":"update failed: SQL exception","details":"script:v10.10_v10.11\nline: 6\nquery: \n-- CXM-22815 Required Apps DB Changes\nALTER TABLE \"ENROLLMENT\" ADD \"DEVICE_ENROLLMENT_STATUS\" VARCHAR(64) DEFAULT  '\''NOT_STARTED'\'' CHECK (\"DEVICE_ENROLLMENT_STATUS\" IN('\''MDM_ENROLL_DONE'\'', '\''MAM_ENROLL_DONE'\'', '\''XME_ENROLL_DONE'\'', '\''NOT_STARTED'\''));\n\nSQL exception:Cannot find the object \"ENROLLMENT\" because it does not exist or you do not have permissions.\nSQL state:S1000\nSQL error code:1088"}'

This happens when the SA account configured on XenMobile does not have sufficient permissions on the database. Use an administrator account, or make sure the account has the DBcreator permission.
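When triaging several failed upgrades, it can help to pull the key fields out of the OUTPUT payload rather than read the raw line. The following is a minimal sketch, assuming the payload has already been extracted from the log and shell-unquoted into plain JSON text. The function name `summarize_upgrade_failure` and the `permission_hint` field are illustrative and not part of XenMobile.

```python
import json
import re


def summarize_upgrade_failure(payload: str) -> dict:
    """Extract the key fields from an upgrade OUTPUT payload (JSON text).

    Hypothetical helper: assumes the payload has the rc/message/details
    shape shown in the log snippet above.
    """
    data = json.loads(payload)
    details = data.get("details", "")

    def find(pattern: str):
        # Return the first capture group, or None if the field is absent.
        match = re.search(pattern, details)
        return match.group(1) if match else None

    return {
        "rc": data.get("rc"),
        "message": data.get("message"),
        "script": find(r"script:(\S+)"),
        "sql_state": find(r"SQL state:(\S+)"),
        "sql_error_code": find(r"SQL error code:(\d+)"),
        # True when the SQL error text mentions missing permissions.
        "permission_hint": "do not have permissions" in details,
    }


if __name__ == "__main__":
    sample = json.dumps({
        "rc": 8,
        "message": "update failed: SQL exception",
        "details": (
            "script:v10.10_v10.11\nline: 6\n"
            'SQL exception:Cannot find the object "ENROLLMENT" because it '
            "does not exist or you do not have permissions.\n"
            "SQL state:S1000\nSQL error code:1088"
        ),
    })
    print(summarize_upgrade_failure(sample))
```

A `permission_hint` of true together with SQL error code 1088 points at the permissions cause described above.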



You may also see log snippets such as the following:
 | FATAL | localhost-startStop-1 | com.sparus.nps.spring.DBPropertyManager | Error while loading properties configuration into database.
java.sql.SQLException: Unknown server host name 'XXXXXXXXXXX'.
at net.sourceforge.jtds.jdbc.JtdsConnection.<init>(JtdsConnection.java:427)
at net.sourceforge.jtds.jdbc.Driver.connect(Driver.java:184)
| ERROR | localhost-startStop-1 | org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/] | Exception sending context initialized event to listener instance of class org.springframework.web.context.ContextLoaderListener
org.springframework.beans.factory.BeanDefinitionStoreException: Invalid bean definition with name 'localInstance.plainHttpAddress' defined in class path resource [awareness.xml]: Cannot load configuration properties from database (driver: "net.sourceforge.jtds.jdbc.Driver", url: "jdbc:jtds:sqlserver://xxxx.xxxxxxxx.com:1433/XenMobile10;ssl=off;selectMethod=cursor;", login: "sa"): 
java.lang.Exception: Stack trace
 | ERROR | localhost-startStop-1 | com.citrix.cg.util.ImagCipher | Exception occured..null
 | ERROR | localhost-startStop-1 | com.citrix.cg.bo.GenericCertStoreMgr | Exception occured while reading key keystore from database: null
 | ERROR | localhost-startStop-1 | org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/] | Exception sending context initialized event to listener instance of class com.apere.int500.util.OCAInitListener
java.lang.NullPointerException

This happens when XenMobile cannot resolve the FQDN of the SQL Server configured on the XenMobile Server (XMS). Make sure that DNS can resolve the FQDN of the SQL Server, or use its IP address instead.
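To separate a DNS problem from a network or firewall problem, you can run a quick check from a machine that uses the same DNS servers as the XMS node. The sketch below is illustrative only: the function name `check_sql_endpoint` is hypothetical, port 1433 is taken from the JDBC URL in the log above, and the `resolver` and `connector` parameters exist only so the check can be exercised without a live network.

```python
import socket


def check_sql_endpoint(host: str, port: int = 1433, timeout: float = 5.0,
                       resolver=socket.getaddrinfo,
                       connector=socket.create_connection) -> str:
    """Return 'dns_failure', 'tcp_failure', or 'ok' for host:port."""
    try:
        # Step 1: name resolution, the failure shown in the log above.
        resolver(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"
    try:
        # Step 2: plain TCP connect; this does not log in to SQL Server.
        with connector((host, port), timeout=timeout):
            return "ok"
    except OSError:
        return "tcp_failure"


if __name__ == "__main__":
    # Replace the placeholder with the FQDN configured on the XMS node.
    print(check_sql_endpoint("sql.example.invalid"))
```

A result of `dns_failure` matches the "Unknown server host name" error. A result of `tcp_failure` suggests the name resolves but the SQL Server port is not reachable.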


You can also work through the following steps to troubleshoot this issue:

1.      Ensure that the hypervisor network adapter in use is the correct one for communication with the database.

2.      Specify the database server settings again, then reboot the XMS node to ensure that the XenMobile Server can communicate with the database.

3.      Make sure the SQL Server service account used on XMS is not expired or locked out, and that it has sufficient permissions on the database (at least the DBcreator role; see
https://docs.citrix.com/en-us/xenmobile/server/system-requirements.html).

4.      If the steps above did not help, check the SQL Server event viewer and the SQL Server logs for connection attempts against the database. Also confirm that the SQL Server has enough free drive space.

5.      You can also check the XMS support bundle from the CLI for additional information.

6.      Add the values below on every node in the cluster:
Go to Settings > Server Properties, click Add, choose Custom Key Template, and enter the following parameters:
  • Key: hibernate.c3p0.timeout Value: 100
  • Display Name: hibernate.c3p0.timeout=100
  • Description: DB connections to SQL
  • Key: hibernate.c3p0.idle_test_period Value: 30
  • Display Name: hibernate.c3p0.idle_test_period=30
  • Description: DB connections to SQL


The most efficient way to recover the environment is to add the above parameters to the current working XMS server and then clone it. Perform the following steps:
1. Back up the SQL database and take snapshots of all nodes in the configuration.
2. Shut down the node that is in recovery mode.
3. Add the parameters to the main node, then shut it down.
4. Clone the XMS VM. Once the clone process is complete, power on the virtual appliance. Log on to the virtual appliance, make the relevant network changes (modify the IP address), commit, and reboot to apply them.
5. After the VM reboots, log on, verify the cluster and Hazelcast cluster status, and add the above parameters.
6. Once completed, power on the initial XMS VM.
7. After validating that all nodes are correctly joined and functional, discard the original cluster members and delete the associated snapshots.
 

Problem Cause

This issue could be caused by one of the reasons listed below:
  • The XenMobile Server configuration or the database is not reachable.
  • The FQDN is not resolvable through DNS.
  • The server node has become unresponsive with high CPU utilization. The following exception, repeated over and over in the logs, suggests this:

Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'sessionFactory' defined in class path resource [ew-dao.xml]: Invocation of init method failed; nested exception is org.hibernate.cache.CacheException: com.hazelcast.config.InvalidConfigurationException: Premature end of file.
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1482)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:521)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:458)
at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:295)
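Because the Hazelcast exception is significant mainly when it repeats, counting occurrences can be quicker than reading the log by eye. The following is a minimal sketch that counts the error signatures quoted in this article. The function name, the signature labels, and the log file path are hypothetical, and the log must first be extracted from the support bundle.

```python
from collections import Counter

# Substrings copied from the log excerpts in this article; the labels
# on the left are illustrative names, not XenMobile terminology.
SIGNATURES = {
    "hazelcast_config": (
        "com.hazelcast.config.InvalidConfigurationException: "
        "Premature end of file"
    ),
    "sql_host_unresolved": "java.sql.SQLException: Unknown server host name",
    "db_properties_load": (
        "Error while loading properties configuration into database"
    ),
}


def count_signatures(lines) -> Counter:
    """Count how many lines contain each known error signature."""
    counts = Counter()
    for line in lines:
        for name, needle in SIGNATURES.items():
            if needle in line:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical path: point this at a log file from the support bundle.
    with open("xms_debug.log", errors="replace") as handle:
        for name, total in count_signatures(handle).most_common():
            print(f"{name}: {total}")
```

A high count for `hazelcast_config` is consistent with the unresponsive node cause, while `sql_host_unresolved` points back to the DNS cause.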
 

Issue/Introduction

This article describes the causes and troubleshooting steps when a XenMobile Server goes into recovery mode and displays "application failed to start" or "System is in Recovery Mode. Please Log in to Console".