Restoration of HA Backups on Standalone NetScaler MAS Fails at Restoring of the Database

Article ID: CTX220968

Description

Restoring a NetScaler MAS HA backup on a standalone unit stalls at the database restore stage, as shown in the following screenshot:

User-added image

The restore remains at this point and does not continue. Checking the processes with the top or ps commands shows low CPU usage, which indicates that the restore process has stopped.

The following process will also be in an aborted state:

bash-2.05b# ps -auxww | grep aborted
mpspostgres 16442  0.0  2.2 1230068 169296  ??  Is    8:09PM   0:00.02 postgres: mpspostgres mpsdb 127.0.0.1(44182) idle in transaction (aborted)
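
The check above can be wrapped in a small helper that reports only the affected PostgreSQL backends and skips the grep process itself. This is a minimal sketch, not part of the product: the function name find_aborted_backends is hypothetical, and it assumes the ps -auxww column layout shown above, with the PID in the second column.

```shell
# Hypothetical helper: reads `ps -auxww` output on stdin and prints the PID
# of every postgres backend reported as "idle in transaction (aborted)".
# Assumes the PID is the second whitespace-separated column.
find_aborted_backends() {
    awk '/postgres:/ && /idle in transaction \(aborted\)/ { print $2 }'
}
```

It could be used as `ps -auxww | find_aborted_backends`. No output means that no backend is in the aborted state.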

Resolution

The following steps apply to NetScaler MAS release 11.1 build 51.21. A fix for restoring HA backups on standalone units is expected in build 52.12 and later. If that build is available, upgrading to it is highly recommended.

  1.  Ensure that the backup was taken from the same version that is running on the standalone unit.

  2.  Open an SSH connection to the unit.

  3.  Upload the backup and restore as normal.

  4.  Wait until the restore is stuck at restoring the database (allow a few minutes to confirm that this is the case).

    User-added image

  5. From the shell, check whether the following processes are running.

    bash-2.05b# ps -auxww | grep sql
    mpspostgres 14657  0.0  2.2 1228020 167940  ??  S     8:01PM   0:28.26 /mps/db_pgsql/bin/postgres -D /var/mps/db_pgsql/data
    root        16441  0.0  0.0  6992  1780  ??  I     8:09PM   0:00.02 /mps/db_pgsql/bin/psql --single-transaction mpsdb -p 5454 -U mpspostgres -h localhost -f /var/mps/backup//restore/backup/mpsdb/mpsdb_dump.sql
    root        35933  0.0  0.0  9100  1428   1  S+   12:18PM   0:00.00 grep sql
     
    The issue is partly due to the --single-transaction option, which is used as part of the HA pair restore.
    When the restore fails, a process in the "idle in transaction (aborted)" state is likely to be present:
    bash-2.05b# ps -auxww | grep aborted
    mpspostgres 16442  0.0  2.2 1230068 169296  ??  Is    8:09PM   0:00.02 postgres: mpspostgres mpsdb 127.0.0.1(44182) idle in transaction (aborted) (postgres)
    root        35941  0.0  0.0  9100  1432   1  S+   12:18PM   0:00.00 grep aborted
  6. At this point we want to copy the database backup to another location so we can restore this correctly later.

    bash-2.05b# cp /var/mps/backup/restore/backup/mpsdb/mpsdb_dump.sql /var/tmp/
  7. Then force kill the psql process that was started with the --single-transaction option (PID 16441 in the example output in step 5).

    bash-2.05b# kill -9 16441
  8. The restore should then continue through to initializing NetScaler MAS:

    NetScaler MAS Restore Continued

  9. At this point we will stop the masd process.

    bash-2.05b# masd stop
    Stopping masd .
    Stopped masd .
  10. Then we can restore the database via psql (this time without the --single-transaction option and from the file we copied earlier). Some errors may be shown, but these are expected because of the differences between the HA node and standalone databases.

    bash-2.05b# /mps/db_pgsql/bin/psql mpsdb -p 5454 -U mpspostgres -h localhost -f /var/tmp/mpsdb_dump.sql
  11. Once this is complete we can then restart the masd service.

    bash-2.05b# masd start
    Initing
    10.90.144.1 (10.90.144.1) deleted
    10.90.148.225 (10.90.148.225) deleted
    10.90.148.238 (10.90.148.238) deleted
    default              10.90.144.1          done
    add net default: gateway 10.90.144.1
     
    Master Deployment
    Starting masd.
    Started masd.
  12. This will take a few minutes after which we should be able to log back in.

    NetScaler MAS Restore Completed
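
Steps 5 to 11 can be collected into a short helper that finds the stuck psql process and prints the recovery commands for review, without running them. This is a sketch under stated assumptions rather than a supported tool: the function names are hypothetical, the paths, port, and user are copied from the steps above, and the PID is assumed to be the second column of the ps -auxww output.

```shell
# Paths taken from the steps above.
DUMP=/var/mps/backup/restore/backup/mpsdb/mpsdb_dump.sql
COPY=/var/tmp/mpsdb_dump.sql
PSQL=/mps/db_pgsql/bin/psql

# Hypothetical helper: reads `ps -auxww` output on stdin and prints the PID
# of the first psql restore started with --single-transaction.
find_stuck_restore_pid() {
    awk '/psql/ && /--single-transaction/ && /mpsdb_dump\.sql/ { print $2; exit }'
}

# Hypothetical helper: prints the commands from steps 6 to 11 for the given
# PID. Nothing is executed; the output is meant to be reviewed and run by hand.
print_recovery_commands() {
    pid=$1
    case $pid in
        ''|*[!0-9]*) echo "error: PID must be numeric" >&2; return 1 ;;
    esac
    cat <<EOF
cp $DUMP $COPY
kill -9 $pid
# wait until the restore reaches initializing NetScaler MAS, then:
masd stop
$PSQL mpsdb -p 5454 -U mpspostgres -h localhost -f $COPY
masd start
EOF
}
```

For example, `print_recovery_commands "$(ps -auxww | find_stuck_restore_pid)"` prints the command list for the stuck restore, or an error if no matching process is found. Printing instead of executing is deliberate, because step 8 requires waiting for the restore to move on before masd is stopped.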


Problem Cause

The issue occurs because the HA nodes in NetScaler MAS have some additional fields in the database. When the database is restored on a standalone unit, the process fails because it cannot add these additional fields to the database.
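
The statements that fail because of these differences are reported by psql during the manual restore in the Resolution steps. To review them afterwards, the error output of that psql command can be saved to a file and summarized. The following is a minimal sketch: the function name summarize_restore_errors and the log file name used below are hypothetical, and it assumes psql's usual "psql:<file>:<line>: ERROR:  <message>" line format.

```shell
# Hypothetical helper: reads saved psql error output on stdin and prints each
# distinct error message with the number of times it occurred, most frequent
# first. Lines that do not contain "ERROR:" are ignored.
summarize_restore_errors() {
    sed -n 's/^.*ERROR: *//p' | sort | uniq -c | sort -rn
}
```

For example, appending `2> /var/tmp/restore_errors.log` to the manual psql restore command and then running `summarize_restore_errors < /var/tmp/restore_errors.log` gives a compact list of the failures to check against the expected HA-related differences.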

Issue/Introduction

Restoring an HA backup on a standalone NetScaler MAS fails at the database restore stage.