In XenApp and XenDesktop, 99% of the images that end up on a user’s screen have been compressed (and eventually decompressed) by one means or another.
Our most recent addition to Thinwire, Selective H.264, enables us to identify regions of the screen that are changing rapidly, such as video. Being a context-based codec, H.264 delivers high-quality video when bandwidth is limited.
In a typical example, Thinwire detects a region for Selective H.264 encoding (Adaptive Display v2). The use of H.264 is negotiated with the end-point Receiver; if it is not supported, variable-quality JPEGs are used instead.
Next, Thinwire efficiently decomposes the rest of the screen and uses JPEG for photographic or complex imagery and RLE for text and simple imagery. In the latter case (which can often form a large part of what the user actually sees in day-to-day work), bitmaps are compressed with the Citrix lossless codec, known as 2DRLE. As the name suggests, this is an implementation of the well-known and well-understood compression scheme “Run Length Encoding”.
The general idea behind RLE, when applied to two-dimensional image data (i.e., a bitmap), is to express runs (repetitions) of the same pixel more efficiently. For example, consider the following sequence of pixels:
ABABABCCCCCCCBBBBBABABAB
Assuming 8-bits (1 byte) per pixel, 24 bytes are needed to express this sequence.
The pixel “C” repeats 7 times, followed by a 5-long run of “B”s. The same string of pixels can be encoded as:
ABABAB<7>C<5>BABABAB
Assuming the counts can also be stored as a byte, this version only needs 16 bytes of memory. That’s a saving of 8 bytes. With ultra-high resolutions becoming more common, the savings RLE offers can easily run into the megabytes.
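The toy encoding above can be sketched in a few lines of Python. This is only an illustration of basic run-length encoding, not the actual 2DRLE codec: the function names and the `MIN_RUN` threshold are assumptions made for the example. Like the example above, it also glosses over two things a real byte format must handle: how a decoder tells a count byte from a literal pixel, and what happens to runs longer than 255.

```python
from itertools import groupby

# Assumed threshold: a run costs 2 bytes (count + pixel), so runs shorter
# than 3 pixels are left as literals.
MIN_RUN = 3

def rle_encode(pixels, min_run=MIN_RUN):
    """Return tokens: a bare pixel for a literal, (count, pixel) for a run."""
    tokens = []
    for pixel, group in groupby(pixels):
        count = len(list(group))
        if count >= min_run:
            tokens.append((count, pixel))
        else:
            tokens.extend([pixel] * count)
    return tokens

def rle_decode(tokens):
    """Expand the tokens back into the original pixel string."""
    out = []
    for token in tokens:
        if isinstance(token, tuple):
            count, pixel = token
            out.extend([pixel] * count)
        else:
            out.append(token)
    return "".join(out)

def encoded_size(tokens):
    """Bytes needed: 1 per literal pixel, 2 (count + pixel) per run."""
    return sum(2 if isinstance(t, tuple) else 1 for t in tokens)

def render(tokens):
    """Show the tokens in the <count>pixel notation used in the text."""
    return "".join(f"<{t[0]}>{t[1]}" if isinstance(t, tuple) else t for t in tokens)

if __name__ == "__main__":
    source = "ABABABCCCCCCCBBBBBABABAB"
    tokens = rle_encode(source)
    print(render(tokens), len(source), encoded_size(tokens))
```

For the 24-pixel sequence above, `render` gives `ABABAB<7>C<5>BABABAB` and `encoded_size` gives 16, matching the figures in the text, and `rle_decode` recovers the original sequence exactly.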
Various techniques further improve the compression ratios achieved by the basic implementation. To show how 2DRLE performs, we ran a quick benchmark on a sample set of 8,947 bitmaps (ranging from 4 x 4 pixels all the way up to 1280 x 1024), captured from a live desktop session and destined for lossless compression, and compared the results against another popular lossless image file format, PNG:
Lossless Compression | Total uncompressed size (bytes) | Total compressed size (bytes) | Time taken (milliseconds) |
2DRLE | 360,580,256 | 14,088,798 | 7,700 |
PNG* | 360,580,256 | 16,357,146 | 20,200 |
* libpng was used to compress the sample data as PNG.
As well as achieving a better compression ratio than PNG, 2DRLE is over 2.5x quicker. Compression speed is critical for session interactivity and performance. PNG will probably compress better on more complex photographic data; however, 2DRLE expects to see very little of this in practice.
Note: Further refinements to 2DRLE can sometimes yield very good compression ratios, but typically at the expense of CPU or memory.
In one experiment, however, a codec listed as MD_COMPRESS in the table below produced output about 12% smaller than 2DRLE with next to no increase in CPU usage:
Lossless Compression | Total uncompressed size (bytes) | Total compressed size (bytes) | Time taken (milliseconds) |
2DRLE | 360,580,256 | 14,088,798 | 7,700 |
PNG | 360,580,256 | 16,357,146 | 20,200 |
MD_COMPRESS | 360,580,256 | 12,398,102 | 7,750 |
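The headline figures quoted in the text (over 2.5x quicker than PNG, a 12% improvement over 2DRLE) can be re-derived from the table. The short sketch below does that arithmetic; the dictionary layout and function names are illustrative choices, and the numbers are copied directly from the table rows.

```python
# Figures copied from the benchmark table:
# (uncompressed bytes, compressed bytes, time in milliseconds)
RESULTS = {
    "2DRLE": (360_580_256, 14_088_798, 7_700),
    "PNG": (360_580_256, 16_357_146, 20_200),
    "MD_COMPRESS": (360_580_256, 12_398_102, 7_750),
}

def compression_ratio(name):
    """Uncompressed size divided by compressed size."""
    raw, packed, _ = RESULTS[name]
    return raw / packed

def size_reduction(name, baseline):
    """Fractional reduction in compressed size of `name` relative to `baseline`."""
    return 1 - RESULTS[name][1] / RESULTS[baseline][1]

def speedup(name, baseline):
    """How many times faster `name` compressed the sample set than `baseline`."""
    return RESULTS[baseline][2] / RESULTS[name][2]

def throughput_mb_per_s(name):
    """Uncompressed megabytes (10^6 bytes) processed per second."""
    raw, _, ms = RESULTS[name]
    return raw / 1e6 / (ms / 1000)

if __name__ == "__main__":
    for name in RESULTS:
        print(f"{name}: ratio {compression_ratio(name):.2f}:1, "
              f"{throughput_mb_per_s(name):.1f} MB/s")
    print(f"2DRLE vs PNG speedup: {speedup('2DRLE', 'PNG'):.2f}x")
    print(f"MD_COMPRESS vs 2DRLE size reduction: "
          f"{size_reduction('MD_COMPRESS', '2DRLE'):.1%}")
```

Run against the table, this gives compression ratios of roughly 25.6:1 for 2DRLE, 22.0:1 for PNG and 29.1:1 for MD_COMPRESS, a 2DRLE speedup over PNG of about 2.6x, and an MD_COMPRESS size reduction of about 12% relative to 2DRLE.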
Experiments with other publicly available lossless image formats, such as FLIF and Google’s WebP, reveal that FLIF generally achieves a better compression ratio than MD_COMPRESS; however, it is 12x slower. Similarly, WebP did marginally better but was again far too slow for real-time use.
Note: In addition to end results, we pay close attention to all the resources required during compression (server) and decompression (client). This allows us to make clear, data-driven decisions and helps ensure HDX performs at its best as a remote access protocol.