Restoration of HA Backups on Standalone NetScaler MAS Fails at Restoring of the Database

Article ID: CTX220968

Description

Restoring a NetScaler MAS backup stalls at the database restore stage, as shown in the following screenshot:

User-added image

The restore remains at this stage and does not continue. Checking the processes with the top or ps commands shows low CPU usage, which indicates that the restore process has stopped.

A postgres process is also left in the "idle in transaction (aborted)" state:

bash-2.05b# ps -auxww | grep aborted
mpspostgres 16442  0.0  2.2 1230068 169296  ??  Is    8:09PM   0:00.02 postgres: mpspostgres mpsdb 127.0.0.1(44182) idle in transaction (aborted)

Resolution

The following steps apply to NetScaler MAS release 11.1 build 51.21. A fix for restoring HA backups on standalone units is expected in build 52.12 and later. If that build is available, upgrading to it is highly recommended.

Ensure that the backup was taken on the same version as the one running on the standalone unit.
Open an SSH connection to the unit.
Upload the backup and restore as normal.
When the restore appears to be stuck at restoring the database (wait a few minutes to confirm this), check from the shell whether the following processes are running.

bash-2.05b# ps -auxww | grep sql
mpspostgres 14657  0.0  2.2 1228020 167940  ??  S     8:01PM   0:28.26 /mps/db_pgsql/bin/postgres -D /var/mps/db_pgsql/data
root        16441  0.0  0.0  6992  1780  ??  I     8:09PM   0:00.02 /mps/db_pgsql/bin/psql --single-transaction mpsdb -p 5454 -U mpspostgres -h localhost -f /var/mps/backup//restore/backup/mpsdb/mpsdb_dump.sql
root        35933  0.0  0.0  9100  1428   1  S+   12:18PM   0:00.00 grep sql
 
The issue is partly caused by the --single-transaction option, which is implemented as part of the HA pair restore.
When the restore fails, a process in the "idle in transaction (aborted)" state is likely to be present:
bash-2.05b# ps -auxww | grep aborted
mpspostgres 16442  0.0  2.2 1230068 169296  ??  Is    8:09PM   0:00.02 postgres: mpspostgres mpsdb 127.0.0.1(44182) idle in transaction (aborted) (postgres)
root        35941  0.0  0.0  9100  1432   1  S+   12:18PM   0:00.00 grep aborted
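
The check above can also be wrapped in a small shell function so that it gives a yes/no answer. This is only a sketch: the name has_aborted_txn is hypothetical and not a NetScaler MAS command, and it assumes grep on the appliance supports the -q option.

```shell
# Sketch only: has_aborted_txn is a hypothetical helper, not a MAS command.
# It reads `ps` output on stdin and succeeds if any postgres backend is in
# the "idle in transaction (aborted)" state.
has_aborted_txn() {
    # The bracket expressions keep this grep's own command line from matching.
    grep -q 'postgres[:].*idle in transaction [(]aborted[)]'
}

# Example use on the appliance shell:
#   ps -auxww | has_aborted_txn && echo "restore transaction has aborted"
```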

At this point, copy the database dump to another location so that it can be restored correctly later.
```
bash-2.05b# cp /var/mps/backup/restore/backup/mpsdb/mpsdb_dump.sql /var/tmp/
```
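
Because the later manual restore depends on this copy, it may be worth confirming that the copy is intact before going further. The following is a minimal sketch: verify_copy is a hypothetical helper name, and it assumes the standard cmp utility is available on the appliance.

```shell
# Sketch only: verify_copy is a hypothetical helper, not part of NetScaler MAS.
# It succeeds only if both files exist and are byte-for-byte identical.
verify_copy() {
    src=$1
    dst=$2
    [ -f "$src" ] && [ -f "$dst" ] && cmp -s "$src" "$dst"
}

# Example use with the paths from this article:
#   verify_copy /var/mps/backup/restore/backup/mpsdb/mpsdb_dump.sql \
#       /var/tmp/mpsdb_dump.sql && echo "copy verified"
```
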
Then force-kill the psql process that is running with the --single-transaction option (PID 16441 in this example).
```
bash-2.05b# kill -9 16441
```
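
The PID will differ on each system, so it has to be read from the ps output rather than copied from this article. As a rough sketch, the lookup could be scripted as below. The function name restore_psql_pids is hypothetical, and the sketch assumes awk is available on the appliance and that the PID is the second column of the ps output, as in the listings above.

```shell
# Sketch only: restore_psql_pids is a hypothetical helper name.
# It reads `ps -auxww` output on stdin and prints the PID (second column)
# of every psql client that was started with --single-transaction.
restore_psql_pids() {
    awk '/bin\/psql / && /--single-transaction/ { print $2 }'
}

# Example use on the appliance shell (review the PID before killing it):
#   ps -auxww | restore_psql_pids
```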
The restore should then proceed to the "initializing NetScaler MAS" stage.

At this point, stop the masd process.

bash-2.05b# masd stop
Stopping masd .
Stopped masd .

Then we can restore the database via psql, this time without the --single-transaction option and from the file we copied earlier. Some errors may be shown, but this is expected because of the differences between the HA node and standalone databases.
```
bash-2.05b# /mps/db_pgsql/bin/psql mpsdb -p 5454 -U mpspostgres -h localhost -f /var/tmp/mpsdb_dump.sql
```
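
Since some errors are expected during this restore, it can help to save the psql output to a file and review it afterwards instead of watching it scroll past. The following is a minimal sketch under stated assumptions: count_restore_errors is a hypothetical helper name, and the log path /var/tmp/mpsdb_restore.log is only an example.

```shell
# Sketch only: count_restore_errors is a hypothetical helper name.
# It prints the number of lines in a saved psql log that contain "ERROR:".
count_restore_errors() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps that from being treated as a failure.
    grep -c 'ERROR:' "$1" || true
}

# Example use (the log path is an assumption, not from the article):
#   /mps/db_pgsql/bin/psql mpsdb -p 5454 -U mpspostgres -h localhost \
#       -f /var/tmp/mpsdb_dump.sql > /var/tmp/mpsdb_restore.log 2>&1
#   count_restore_errors /var/tmp/mpsdb_restore.log
```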

Once this is complete we can then restart the masd service.

bash-2.05b# masd start
Initing
10.90.144.1 (10.90.144.1) deleted
10.90.148.225 (10.90.148.225) deleted
10.90.148.238 (10.90.148.238) deleted
default              10.90.144.1          done
add net default: gateway 10.90.144.1
 
Master Deployment
Starting masd.
Started masd.

This will take a few minutes, after which we should be able to log back in.

Problem Cause

The issue occurs because the databases of HA nodes in NetScaler MAS contain some additional fields. When such a database is restored on a standalone unit, the process fails because it cannot add these additional fields to the database.

Issue/Introduction

The restoration of HA backups on a standalone NetScaler MAS fails at the database restore stage.