Restoring a NetScaler MAS backup fails at the database restoration stage.
The restore stays at this point and does not continue. Checking the processes with the top or ps command shows low CPU usage, which indicates that the restore process has stopped.
The following process will also be in an aborted state:
bash-2.05b# ps -auxww | grep aborted
mpspostgres 16442 0.0 2.2 1230068 169296 ?? Is 8:09PM 0:00.02 postgres: mpspostgres mpsdb 127.0.0.1(44182) idle in transaction (aborted)
The following steps apply to NetScaler MAS release 11.1 build 51.21. A fix for restoring HA backups on standalone units is expected in build 52.12 and later. If that build is available, upgrading to it is highly recommended.
Ensure that the backup was taken on the same version that is running on the standalone unit.
Open an SSH connection to the unit.
Upload the backup and restore as normal.
Wait until the restore is stuck at restoring the database (allow a few minutes to confirm that this is the case).
From the shell, check whether the processes below are running.
bash-2.05b# ps -auxww | grep sql
mpspostgres 14657 0.0 2.2 1228020 167940 ?? S 8:01PM 0:28.26 /mps/db_pgsql/bin/postgres -D /var/mps/db_pgsql/data
root 16441 0.0 0.0 6992 1780 ?? I 8:09PM 0:00.02 /mps/db_pgsql/bin/psql --single-transaction mpsdb -p 5454 -U mpspostgres -h localhost -f /var/mps/backup//restore/backup/mpsdb/mpsdb_dump.sql
root 35933 0.0 0.0 9100 1428 1 S+ 12:18PM 0:00.00 grep sql
The issue is partly caused by the --single-transaction option, which is used as part of the HA pair restore. When the restore fails, we will likely see the process below reported as "idle in transaction (aborted)":
bash-2.05b# ps -auxww | grep aborted
mpspostgres 16442 0.0 2.2 1230068 169296 ?? Is 8:09PM 0:00.02 postgres: mpspostgres mpsdb 127.0.0.1(44182) idle in transaction (aborted) (postgres)
root 35941 0.0 0.0 9100 1432 1 S+ 12:18PM 0:00.00 grep aborted
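The two checks above can be combined into small helpers. This is a minimal sketch, assuming the ps -auxww column layout shown in the listings (PID in the second column). The function names find_stuck_psql_pid and has_aborted_backend are hypothetical and are not part of NetScaler MAS.

```shell
# Hypothetical helper: print the PID of the psql restore that was started
# with --single-transaction. Reads `ps -auxww` output on stdin and assumes
# the PID is the second column. The !/awk/ test keeps this awk process
# from matching its own command line.
find_stuck_psql_pid() {
    awk '/psql/ && /--single-transaction/ && !/awk/ { print $2 }'
}

# Hypothetical helper: succeed if any backend is reported as
# "idle in transaction (aborted)". The bracketed parentheses stop the
# grep process from matching its own entry in the ps listing.
has_aborted_backend() {
    grep -q 'idle in transaction [(]aborted[)]'
}
```

With the listing above as input, ps -auxww | find_stuck_psql_pid would print 16441, and ps -auxww | has_aborted_backend would return success.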
At this point we want to copy the database dump to another location so that we can restore it correctly later.
bash-2.05b# cp /var/mps/backup/restore/backup/mpsdb/mpsdb_dump.sql /var/tmp/
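Because the later manual restore depends entirely on this copy, it may be worth confirming that the copy is complete before killing anything. The following is a minimal sketch, assuming cmp is available on the unit and that the destination has enough free space. The function name copy_and_verify is hypothetical.

```shell
# Hypothetical helper: copy a dump file into a directory and confirm that
# the copy is byte-identical to the source. Returns non-zero if the copy
# fails or the two files differ.
copy_and_verify() {
    src=$1
    dest_dir=$2
    cp "$src" "$dest_dir"/ || return 1
    cmp -s "$src" "$dest_dir/$(basename "$src")"
}
```

For example, copy_and_verify /var/mps/backup/restore/backup/mpsdb/mpsdb_dump.sql /var/tmp && echo "copy verified" would print the message only if both files match.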
Then force-kill the psql process that was started with the --single-transaction option (PID 16441 in the listing above; use the PID reported on your unit).
bash-2.05b# kill -9 16441
The restore should then proceed to the "Initializing NetScaler MAS" stage.
At this point we will stop the masd process.
bash-2.05b# masd stop
Stopping masd .
Stopped masd .
Then we can restore the database via psql, this time without the --single-transaction option and from the file we copied earlier. Some errors may be shown, but these are expected because of the differences between the HA node and standalone databases.
bash-2.05b# /mps/db_pgsql/bin/psql mpsdb -p 5454 -U mpspostgres -h localhost -f /var/tmp/mpsdb_dump.sql
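Since some errors are expected here, it can help to capture the psql output and review it afterwards rather than watching it scroll past. The following is a minimal sketch, assuming the output of the command above was redirected to a log file by appending "> /var/tmp/mpsdb_restore.log 2>&1" (the log path is an assumption). The function names are hypothetical.

```shell
# Hypothetical helper: count the lines in a captured psql log that
# contain "ERROR:".
count_restore_errors() {
    grep -c 'ERROR:' "$1"
}

# Hypothetical helper: list each distinct error message with the number
# of times it occurred, most frequent first.
list_restore_errors() {
    grep 'ERROR:' "$1" | sed 's/^.*ERROR: *//' | sort | uniq -c | sort -rn
}
```

Running count_restore_errors /var/tmp/mpsdb_restore.log gives a quick total, and list_restore_errors on the same file shows which statements failed, which can be compared against the HA-specific differences described at the end of this article.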
Once this is complete we can then restart the masd service.
bash-2.05b# masd start
Initing
10.90.144.1 (10.90.144.1) deleted
10.90.148.225 (10.90.148.225) deleted
10.90.148.238 (10.90.148.238) deleted
default 10.90.144.1 done
add net default: gateway 10.90.144.1
Master Deployment
Starting masd.
Started masd.
This will take a few minutes after which we should be able to log back in.
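Rather than retrying the login by hand, the wait can be scripted with a generic retry loop. The following is a minimal sketch. The function name wait_for is hypothetical, and the command used to decide that the service is ready is left to the reader, because this article does not specify one.

```shell
# Hypothetical helper: run a command repeatedly until it succeeds.
# $1 = maximum number of attempts, $2 = delay in seconds between attempts,
# remaining arguments = the command to run.
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, wait_for 30 10 some_readiness_check would try the (hypothetical) some_readiness_check command up to 30 times, 10 seconds apart, and return non-zero if it never succeeds.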
The issue occurs because the HA nodes in NetScaler MAS have some additional fields in the database. When the database is restored on a standalone unit, the process fails because it cannot add these additional fields to the database.