Lossless Compression: Lowering the Cost of Pixel Perfection

Article ID: CTX218314

Description

In XenApp and XenDesktop, 99% of the images that end up on a user’s screen have been compressed (and eventually decompressed) by one means or another. 

Our most recent addition to Thinwire, Selective H.264, enables us to identify regions of the screen that are changing rapidly, as in a video. Being a context-based codec, H.264 delivers high-quality video when bandwidth is limited.

In a typical example, Thinwire detects a region for Selective H.264 encoding (Adaptive Display v2). The use of H.264 is negotiated with the endpoint Receiver; if it is not supported, variable-quality JPEGs are used instead.

Next, Thinwire efficiently decomposes the rest of the screen, using JPEG for photographic or complex imagery and RLE for text and simple imagery. In the latter case (which can often form a large part of what the user actually sees in day-to-day work), bitmaps are compressed with the Citrix lossless codec, known as 2DRLE. As the name suggests, this is an implementation of the well-known and well-understood compression scheme “Run Length Encoding”.
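
The decision flow described above can be summarised in a short Python sketch. This is only an illustration of the prose, not Thinwire's actual logic: the names `Region`, `Codec` and `choose_codec` are hypothetical, and because the article does not say how Thinwire classifies a region, the classification is taken as an input rather than computed.

```python
from dataclasses import dataclass
from enum import Enum


class Codec(Enum):
    H264 = "Selective H.264"
    JPEG = "JPEG"
    RLE_2D = "2DRLE"


@dataclass(frozen=True)
class Region:
    # Hypothetical flags: how Thinwire derives them is not described
    # in the article, so they are supplied by the caller here.
    rapidly_changing: bool
    photographic: bool


def choose_codec(region: Region, receiver_supports_h264: bool) -> Codec:
    """Pick a codec for one screen region, following the article's prose."""
    if region.rapidly_changing:
        # H.264 is negotiated with the Receiver; JPEG is the fallback.
        return Codec.H264 if receiver_supports_h264 else Codec.JPEG
    if region.photographic:
        # Photographic or complex imagery goes to JPEG.
        return Codec.JPEG
    # Text and simple imagery are compressed losslessly.
    return Codec.RLE_2D


video = Region(rapidly_changing=True, photographic=True)
text = Region(rapidly_changing=False, photographic=False)
print(choose_codec(video, receiver_supports_h264=False).value)
print(choose_codec(text, receiver_supports_h264=True).value)
```

Keeping the classification outside the function makes the negotiated fallback explicit: only the rapidly changing case depends on what the Receiver supports.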

Run Length Encoding (RLE)

The general idea behind RLE, when applied to two-dimensional image data (i.e., a bitmap), is to express runs (repetitions) of the same pixel in a more efficient manner. For example, consider the following sequence of pixels:
ABABABCCCCCCCBBBBBABABAB

Assuming 8-bits (1 byte) per pixel, 24 bytes are needed to express this sequence.

The pixel “C” repeats 7 times, followed by a run of 5 “B”s. The same string of pixels can be encoded as:

ABABAB<7>C<5>BABABAB

Assuming the counts can also be stored as a byte, this version only needs 16 bytes of memory. That’s a saving of 8 bytes. With ultra-high resolutions becoming more common, the savings RLE offers can easily run into the megabytes.

A simple implementation of RLE
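
The encoding walked through above can be sketched in a few lines of Python. This is a minimal illustration of basic RLE, not the 2DRLE codec itself. The function names and the `min_run` threshold are assumptions made for the example, and the size calculation follows the article's simplified accounting (one byte per literal pixel, two bytes per run). A real byte format would also need a way to tell counts apart from pixel values.

```python
from itertools import groupby


def rle_encode(pixels, min_run=3):
    """Encode a sequence of pixels as literals and (count, pixel) runs.

    Runs shorter than min_run (an illustrative threshold) stay literal,
    because a two-byte run token would not make them any smaller.
    """
    tokens = []
    for pixel, group in groupby(pixels):
        length = sum(1 for _ in group)
        if length >= min_run:
            tokens.append((length, pixel))
        else:
            tokens.extend([pixel] * length)
    return tokens


def rle_decode(tokens):
    """Expand tokens back into the original pixel string."""
    out = []
    for token in tokens:
        if isinstance(token, tuple):
            count, pixel = token
            out.extend([pixel] * count)
        else:
            out.append(token)
    return "".join(out)


def encoded_size(tokens):
    """Bytes needed, assuming 1 byte per pixel and 1 byte per count."""
    return sum(2 if isinstance(t, tuple) else 1 for t in tokens)


def render(tokens):
    """Format tokens in the article's <count>pixel notation."""
    return "".join(
        f"<{t[0]}>{t[1]}" if isinstance(t, tuple) else t for t in tokens
    )


pixels = "ABABABCCCCCCCBBBBBABABAB"
tokens = rle_encode(pixels)
print(render(tokens))                     # ABABAB<7>C<5>BABABAB
print(len(pixels), encoded_size(tokens))  # 24 16
```

Decoding the tokens returns the original 24 pixels, which is what makes the scheme lossless.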

Various techniques further improve the compression ratios achieved by the basic implementation. To show how 2DRLE performs, we ran a quick benchmark on a sample set of 8,947 bitmaps (ranging from 4 x 4 pixels all the way up to 1280 x 1024), captured from a live desktop session and destined for lossless compression, and compared the results against another popular lossless image file format, PNG:

Lossless Compression | Total uncompressed size (bytes) | Total compressed size (bytes) | Time taken (milliseconds)
2DRLE                | 360,580,256                     | 14,088,798                    | 7,700
PNG*                 | 360,580,256                     | 16,357,146                    | 20,200

Note: libpng was used to compress sample data.
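
A benchmark of this shape (total input bytes, total output bytes, elapsed time) can be sketched as follows. This is not the harness used for the figures above. Because 2DRLE is not publicly available, the sketch uses Python's `zlib` as a stand-in lossless codec, and the sample bitmaps are synthetic. The names `benchmark` and `synthetic_bitmap` are hypothetical.

```python
import time
import zlib


def benchmark(compress, bitmaps):
    """Return (total raw bytes, total compressed bytes, elapsed ms)."""
    total_raw = 0
    total_packed = 0
    start = time.perf_counter()
    for bitmap in bitmaps:
        packed = compress(bitmap)
        total_raw += len(bitmap)
        total_packed += len(packed)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return total_raw, total_packed, elapsed_ms


def synthetic_bitmap(width, height):
    """8-bit bitmap: flat background with a bright row every 8 lines."""
    background = bytes([0] * width)
    stripe = bytes([255] * width)
    return b"".join(
        stripe if y % 8 == 0 else background for y in range(height)
    )


# Stand-in sample set; the article's set was captured from a live session.
samples = [synthetic_bitmap(w, h) for w, h in [(4, 4), (64, 64), (320, 200)]]

raw, packed, ms = benchmark(lambda data: zlib.compress(data, 6), samples)
print(f"raw={raw} bytes, compressed={packed} bytes, time={ms:.2f} ms")
```

Passing the codec in as a function means the same loop can time any compressor over the same sample set, which keeps the comparison like for like.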

As well as achieving a better compression ratio than PNG, 2DRLE is over 2.5x quicker. Compression speed is absolutely critical for session interactivity and performance. PNG will probably compress better on more complex photographic data; however, 2DRLE expects to see very little of this in reality.

Note: 2DRLE is already highly evolved. Further work sometimes yields very good compression ratios, but usually at the expense of CPU or memory.

One experiment, shown as MD_COMPRESS in the results below, achieved a 12% improvement over 2DRLE with next to no increase in CPU usage.

Lossless Compression | Total uncompressed size (bytes) | Total compressed size (bytes) | Time taken (milliseconds)
2DRLE                | 360,580,256                     | 14,088,798                    | 7,700
PNG                  | 360,580,256                     | 16,357,146                    | 20,200
MD_COMPRESS          | 360,580,256                     | 12,398,102                    | 7,750
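
The headline figures quoted in this article (a speed advantage of over 2.5x against PNG, and a 12% size improvement for MD_COMPRESS) can be checked directly from the table. The short Python sketch below does that arithmetic; the helper names are illustrative and the numbers are copied from the rows above.

```python
# (uncompressed bytes, compressed bytes, time in milliseconds)
RESULTS = {
    "2DRLE": (360_580_256, 14_088_798, 7_700),
    "PNG": (360_580_256, 16_357_146, 20_200),
    "MD_COMPRESS": (360_580_256, 12_398_102, 7_750),
}


def compression_ratio(name):
    """Uncompressed size divided by compressed size."""
    raw, packed, _ = RESULTS[name]
    return raw / packed


def size_saving_vs(name, baseline="2DRLE"):
    """Fractional reduction in compressed size relative to the baseline."""
    return 1 - RESULTS[name][1] / RESULTS[baseline][1]


def time_ratio(name, baseline="2DRLE"):
    """Time taken relative to the baseline (1.0 means equal)."""
    return RESULTS[name][2] / RESULTS[baseline][2]


for name in RESULTS:
    print(
        f"{name}: ratio {compression_ratio(name):.1f}:1, "
        f"size vs 2DRLE {size_saving_vs(name):+.1%}, "
        f"time vs 2DRLE {time_ratio(name):.2f}x"
    )
```

A negative saving means the output is larger than 2DRLE's, which is the case for PNG on this sample set.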

Experiments with other publicly available lossless image formats, such as FLIF and Google’s WebP, reveal that FLIF generally achieves a better compression ratio than MD_COMPRESS; however, it is 12x slower. Similarly, WebP did marginally better but again was far too slow for real-time use.

Note: In addition to end results, we pay close attention to all the resources required during compression (server) and decompression (client). This allows us to make clear, data-driven decisions and ensures HDX performs at its best as a remote access protocol.

Issue/Introduction

image compression, HDX