Understanding Garbage Collection and Coalesce Process Troubleshooting


Article ID: CTX201296


Description

Summary

In most cases, customers experience issues with snapshots coalescing after snapshot deletion. At times the reclaimed space is not seen once a snapshot is deleted, and even an SR rescan to manually make Garbage Collection (GC) kick in does not reclaim the space.

This article discusses the post-snapshot-deletion process, potential bottlenecks in coalescing, and how to identify and address coalesce-related issues in a XenServer environment.

Before we begin, let us look at which VDIs are included in a snapshot.

Active VDI – As the name suggests, this is the VDI that holds the current writes, and it is therefore set to read-write.

Base Copy – The deflated size of your data as it was before the snapshot. This VDI is set to read-only. If the active VDI needs to read data written before the snapshot, it reads from the base copy.

Snapshot Metadata – This holds the snapshot metadata such as the time the snapshot was taken, the parent, and so on. The default size is 8 MB for LVM-based SRs and a few KB for file-based SRs.

Background

The Garbage Collection (GC) thread reclaims space by deleting VDIs that are no longer required or referenced and that would otherwise cause the environment to run out of storage. A VDI is not garbage collected if a reference to it exists. The main objective of GC is to identify all VDIs that have no references and return their used space to free space.

GC starts on an SR scan, on VDI activity such as deletion, or on a snapshot delete.

When a VDI is deleted, it is first detached and the VHD metadata is updated. The garbage collector then runs in the background and is responsible for actually removing the VHD file or volume: it looks for VHDs that have no corresponding VDI metadata and deletes them, reclaiming their space.

When a VDI is deleted as part of a snapshot delete, the copy-on-write VDI (also known as the active VDI) in the snapshot chain gets coalesced, or merged, into its parent. This is also handled by the garbage collection process for the SR.

There are two types of coalesce. Inline or non-leaf coalesce merges the intermediate links of the chain, leaving a parent and an active leaf. Leaf coalesce then merges the active leaf into the parent, leaving a single VDI.

Inline or non-leaf coalesce is guaranteed to be safe and does not require any active tapdisk participation in the copying of blocks on XenServer. There is no need to pause disk activity, since everything happens in the background on disks that are read-only.

In most cases it is the coalesce of an active VM's snapshot that fails; an offline coalesce normally succeeds. When the VM is active and needs to be coalesced, the current writes to the VM must be held. This is handled by single-snapshotting the VM.

Single-snapshotting continues until the delta between the leaf and the parent VHD is less than 20 MB: while the delta is larger, the GC takes a single snapshot and deletes it, repeatedly. When the loop exits, a leaf coalesce (reducing the chain to a single disk) is performed in any case. The loop keeps repeating when the VM is I/O intensive, or when the time required exceeds the default threshold for pausing the tapdisk process, which is 10 seconds.
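The loop described above can be sketched as follows. This is an illustrative model only, not XenServer source code; the starting delta and the rate at which it shrinks are made up.

```shell
# Illustrative sketch (not XenServer code) of the leaf-coalesce loop:
# keep single-snapshotting until the leaf/parent delta drops below
# 20 MB, then perform the final leaf coalesce.
delta_mib=120      # made-up starting delta between leaf and parent
iterations=0
while [ "$delta_mib" -ge 20 ]; do
  # stand-in for "take a single snapshot, delete it": the delta shrinks
  delta_mib=$((delta_mib / 4))
  iterations=$((iterations + 1))
done
echo "loop exited after $iterations passes; final leaf coalesce runs now"
```

On an I/O-intensive VM, new writes keep the delta from shrinking, so the loop never exits and the leaf coalesce times out.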

Commands

To find out which VMs have snapshots:

#for x in `xe snapshot-list params=all |grep children |cut -d : -f 2 `; do xe vm-list uuid=$x; done

To check the snapshot Chain:

#vhd-util scan -m "VHD-*" -f -c -l VG_XenStorage-<SR-UUID> -p -v

To find the number of chains in a snapshot:

For LVHD:

#vhd-util query -vsfd -p -n /dev/VG_XenStorage-<sr_uuid>/VHD-<vdi_uuid>

For File VHD, like NFS/ext3:

#vhd-util query -vsfd -p -n /var/run/sr-mount/<sr_uuid>/<vdi_uuid>.vhd

If there is no parent for the VDI, then there is no snapshot.

Note: Replace the UUIDs with your respective VDI UUID and SR UUID.
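The two path forms above can be wrapped in a small helper so the right path is built for either SR type. The function name `vhd_path` and its `lvm`/`file` labels are hypothetical; the path layouts are the ones shown in the commands above.

```shell
# Hypothetical helper: build the path to pass to vhd-util query for
# either an LVHD SR or a file-based (NFS/ext3) SR. The path layouts
# match the commands above; name and arguments are placeholders.
vhd_path() {
  sr_type="$1"; sr_uuid="$2"; vdi_uuid="$3"
  if [ "$sr_type" = "lvm" ]; then
    echo "/dev/VG_XenStorage-${sr_uuid}/VHD-${vdi_uuid}"
  else
    echo "/var/run/sr-mount/${sr_uuid}/${vdi_uuid}.vhd"
  fi
}

vhd_path lvm  1111-2222 3333-4444   # → /dev/VG_XenStorage-1111-2222/VHD-3333-4444
vhd_path file 1111-2222 3333-4444   # → /var/run/sr-mount/1111-2222/3333-4444.vhd
```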

Troubleshooting Issues

Most coalescing issues are reported clearly in /var/log/SMlog. Refer to the following cases and how they are fixed.

Problem 1:

If the coalesce raises an exception because of VHD corruption:

Example:

<1511> 2014-05-16 09:48:33.237614       *~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*

<1511> 2014-05-16 09:48:33.237694                ***********************

<1511> 2014-05-16 09:48:33.237766                *  E X C E P T I O N  *

<1511> 2014-05-16 09:48:33.237836                ***********************

<1511> 2014-05-16 09:48:33.237921       coalesce: EXCEPTION util.SMException, VHD *437da53e[VHD](900.000G//99.184G|a) corrupted

To get the corrupted VHD fixed:

#vhd-util repair -n /path-to-vhd

To check if the repair worked:

#vhd-util check -n /path-to-vhd

Do an SR rescan to start the GC again.

Problem 2:

Coalesce failing due to a timeout:

Example:

Nov 16 23:25:14 vm6 SMGC: [15312] raise util.SMException("Timed out")

Nov 16 23:25:14 vm6 SMGC: [15312]

Nov 16 23:25:14 vm6 SMGC: [15312] *

Nov 16 23:25:14 vm6 SMGC: [15312] *** UNDO LEAF-COALESCE

This occurs when the VDI is under significant I/O stress. If possible, take the VM offline and do an offline coalesce, or coalesce when the VM has less load. Upcoming versions of XenServer will address this issue more efficiently.

Problem 3:

Error: "The Snapshot Chain is too Long"

Check the /var/log/SMlog file. If SMGC is failing, the snapshot chain continues to grow even after snapshots are deleted.
It might be necessary to repair or remove corrupted VHDs. Please open a support case for further investigation.

Example:
Jan  8 20:06:49 xen SMGC: [15859] gc: EXCEPTION <class 'util.SMException'>, Parent VDI 8de22437-4344-4a49-9b2e-8ee98ce94ab3 of ecc33936-8624-4a73-b266-7c1212866cf2 not found

Problem 4:

SR_BACKEND_FAILURE_44 insufficient space:

The process of taking snapshots requires additional overhead on your SR, so you need sufficient room to perform the operation. For a running VM with a single snapshot to be coalesced, you need twice the space in the case of LVM SRs (active VDI + single-snapshotting VDI). If the SR is short of space, the following error is raised.

Either do an offline coalesce or increase the SR size to accommodate online coalescing.
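The rule of thumb above can be expressed as a quick headroom check. The function `needs_space` and the byte-based interface are illustrative; the 2x factor is this article's guidance for an online coalesce of a VM with one snapshot on an LVM SR.

```shell
# Rough headroom check implied above: an online coalesce of a VM with
# one snapshot on an LVM SR can briefly need about twice the active
# VDI's size. Sizes in bytes; the 2x factor is this article's rule of thumb.
needs_space() {
  vdi_bytes="$1"; sr_free_bytes="$2"
  if [ "$sr_free_bytes" -ge $((2 * vdi_bytes)) ]; then
    echo "ok"
  else
    echo "insufficient space"
  fi
}

needs_space $((10 * 1024 * 1024 * 1024)) $((15 * 1024 * 1024 * 1024))
# → insufficient space (15 GiB free is less than 2 x 10 GiB)
```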

Problem 5:

In rare cases with third-party backup tools, we have seen the base copy VDI become corrupted so that it must be repaired before the coalesce can complete. Remember that the base copy is always set to read-only and its LV is offline. To repair the base copy VDI, change the permissions and bring it online, repair it, and then revert the changes. Here are the steps:

  1. Check if the volume is active: #lvscan | grep <vhd-uuid>

  2. Check if the permission is read-write: #lvdisplay | grep <vhd-uuid> -A10

  3. If required, activate the VDI: #lvchange -ay /dev/VG_XenStorage-<SR-UUID>/VHD-<VDI-UUID> (in the case of ext/nfs the path changes, i.e. /var/run/sr-mount/<sr-uuid>/<vhd-uuid>)

  4. If required, set the permission to read-write: #lvchange -p rw /dev/VG_XenStorage-<SR-UUID>/VHD-<VDI-UUID>

  5. Repair the VHD: #vhd-util repair -n /dev/VG_XenStorage-<SR-UUID>/VHD-<VDI-UUID>

  6. Return the volume to its original state: #lvchange -an <path to vdi> and #lvchange -p r <path to vdi>

After completing the steps, rescan the SR and let the GC do the coalesce.
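For LVHD SRs, the steps can be collected into a dry-run helper that prints the commands rather than executing them, so the sequence can be reviewed before touching the LV. The function name is hypothetical and both UUID arguments are placeholders.

```shell
# Dry-run sketch of steps 3-6 above: print the commands instead of
# executing them. Uses the LVHD path form shown earlier.
base_copy_repair_plan() {
  lv="/dev/VG_XenStorage-$1/VHD-$2"
  echo "lvchange -ay $lv"        # step 3: activate the volume
  echo "lvchange -p rw $lv"      # step 4: make it writable
  echo "vhd-util repair -n $lv"  # step 5: repair the VHD
  echo "lvchange -an $lv"        # step 6: deactivate again
  echo "lvchange -p r $lv"       # step 6: restore read-only
}

base_copy_repair_plan SR-UUID VDI-UUID
```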

Problem 6:

GC/coalesce does not run if a host in the pool is in maintenance mode.

Bring the XenServer out of maintenance mode and then do an SR rescan for the GC to start.

Problem 7:

In cases where you do not want leaf coalesce running because it might impact your storage performance, you can turn it off for a particular storage repository using the following command. Inline coalesce will still run, but leaf coalesce (merging everything into a single VDI) will not happen.

#xe sr-param-set uuid=<SR-UUID> other-config:leaf-coalesce=false

Turn it back on when you want leaf coalesce for that SR:

#xe sr-param-remove uuid=<SR-UUID> param-name=other-config param-key=leaf-coalesce
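To confirm the current state of the flag, the other-config field of the SR can be inspected. Here `other_config` holds a sample of that field as it appears in xe parameter listings; in practice you would capture it from the real SR (for example via `xe sr-param-list uuid=<SR-UUID>`).

```shell
# Hypothetical check of the flag set above; other_config is a sample
# string, not live output from an SR.
other_config='other-config (MRW): leaf-coalesce: false; auto-scan: false'
case "$other_config" in
  *"leaf-coalesce: false"*) state="leaf coalesce disabled" ;;
  *)                        state="leaf coalesce enabled" ;;
esac
echo "$state"   # → leaf coalesce disabled
```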

Problem 8:

Even for an offline coalesce on an LVM SR, free space is required, the amount depending on the coalesced size. If your LVM SR does not reclaim free space after deleting a snapshot, check /var/log/SMlog. If a message such as "No space to coalesce" appears, as below, provide free disk space larger than the coalesced size.
Example:
Nov  5 18:22:10 xs62sp1 SMGC: [32262] Coalesced size = 14.992G
Nov  5 18:22:10 xs62sp1 SMGC: [32262] No space to coalesce *e9667dbc[VHD](24.000G//6.328G|n) (free space: 109051904)
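Note that the "free space" figure in the SMGC message is in bytes, while the coalesced size is printed in GiB. Converting the example above shows why the coalesce was refused:

```shell
# Extract the byte count from the SMGC message above and convert to MiB.
line='No space to coalesce *e9667dbc[VHD](24.000G//6.328G|n) (free space: 109051904)'
bytes=$(echo "$line" | sed 's/.*free space: \([0-9]*\).*/\1/')
echo "$((bytes / 1024 / 1024)) MiB free"   # → 104 MiB free
```

104 MiB of free space is far below the 14.992 GiB coalesced size, hence the refusal.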

 

Do’s and Don’ts

Delete snapshots that are not required. A VM running with snapshots has a performance impact: an excessive number of delta files in a chain (caused by an excessive number of snapshots), or large delta files, can decrease virtual machine and host performance. Never use snapshots as backups.

If there are VMs with a long-pending coalesce, try an offline coalesce if possible rather than an online one, which can take more time or loop, depending on the delta changes.

VM backup uses snapshots under the hood, so when the backup finishes, the snapshot should also be deleted. It is good to check that the snapshotted VDI gets coalesced; this helps future coalesces of VDIs in the next backup.

Delta files can grow to the same size as the original base disk file, which is why the provisioned storage of a virtual machine can increase by up to the original size of the virtual machine multiplied by the number of snapshots on it.
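The worst case above works out as a simple product. The helper below is purely illustrative; sizes are in GiB.

```shell
# Worst case from the paragraph above: provisioned size can approach
# base size x (1 + number of snapshots). Sizes in GiB, illustrative only.
worst_case_gib() { echo $(( $1 * (1 + $2) )); }

worst_case_gib 50 3   # 50 GiB base disk with 3 snapshots → 200
```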

Regularly monitor systems configured for backups to ensure that no snapshots remain active for extensive periods of time.

More Information

We are aware of the impact of coalesce issues and are constantly working to improve XenServer with each updated release, and we plan to address these issues in future versions.
