Citrix

Understanding Heterogeneous CPU Pooling in XenServer

  • CTX127059
  • Created on Mar 26, 2014
  • Updated on Dec 12, 2014
  • Article Topic: Performance

Information

This article explains how the new Heterogeneous CPU Pooling feature in XenServer works and how to leverage it to extend your XenServer host pooling capabilities. 

Background

To ensure successful live virtual machine migrations, XenServer 5.5 and earlier hosts were only allowed to join a pool if they had identical CPU vendor, model number, family, and feature flag values. However, most system vendors add and discontinue CPU offerings within the life cycle of a server model, making it difficult to purchase servers with identical CPUs over time.
XenServer (starting with version 5.6) contains two changes to simplify adding hosts to pools over time:

  • When joining a host to a pool, only the features exposed by the CPU are considered to determine CPU compatibility.

  • Support was added for the Intel (FlexMigration) and AMD (Extended Migration) technologies, which provide CPU "masking" or "leveling".

These CPU masking features allow a CPU to be configured to appear as providing different features than it actually does, enabling CPU models with different features to appear identical.
This combination allows disparate host hardware to be joined into a single resource pool, known as a heterogeneous resource pool.

Heterogeneous Pool Types

There are four types of heterogeneous pools:

  • Type 1: Adding more capable host hardware to a less capable pool

  • Type 2: Adding less capable host hardware to a more capable pool

  • Type 3: Combining different and mutually exclusive host hardware into a pool

  • Type 4: Combining different CPU models that have identical features

Type 1 requires applying a CPU mask on the joining server to make it compatible with the existing pool, which remains unmasked, and is supported automatically using XenCenter. Type 2 requires applying a CPU mask on existing pool hosts to make them compatible with the joining host. Type 3 requires applying a common CPU mask on both the joining host and existing pool hosts. Types 2 and 3 are supported using XenAPI and the xe CLI. Type 4 represents CPUs with different marketing model names but identical model number, family, and feature flag attributes. Because they have identical attributes, these combinations have always been supported and do not require a mask, but their compatibility has not been obvious when comparing the marketing model names.
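
The four types can be told apart by comparing the two features values as sets of bits. The following Python sketch is illustrative only: the function names are hypothetical, the features strings are the ones reported for the E5502, X3353, and E5420 elsewhere in this article, and the check deliberately ignores CPU vendor, masking capability, licensing, and the Hardware Compatibility List, all of which also matter in practice.

```python
def parse_features(value: str) -> int:
    """Turn 'xxxxxxxx-xxxxxxxx-xxxxxxxx-xxxxxxxx' into one 128-bit integer."""
    fields = value.split("-")
    if len(fields) != 4 or any(len(f) != 8 for f in fields):
        raise ValueError(f"unexpected features value: {value!r}")
    return int("".join(fields), 16)


def pool_type(joining: str, pool: str) -> int:
    """Return the heterogeneous pool type (1-4) for a joining host and a pool."""
    j, p = parse_features(joining), parse_features(pool)
    if j == p:
        return 4  # identical features: no mask needed
    if j & p == p:
        return 1  # joining host has a superset: mask the joining host
    if j & p == j:
        return 2  # joining host has a subset: mask the existing pool hosts
    return 3      # each side has features the other lacks: common mask


E5502 = "009ce3bd-bfebfbff-00000001-28100800"
X3353 = "000ce3bd-bfebfbff-00000001-20000800"
E5420 = "040ce3bd-bfebfbff-00000001-20100800"

print(pool_type(E5502, X3353))  # E5502 joining a pool of X3353 hosts
print(pool_type(X3353, E5502))  # X3353 joining a pool of E5502 hosts
print(pool_type(E5502, E5420))  # each CPU has features the other lacks
```

Treating the whole features value as one integer keeps the subset and superset tests to a single bitwise AND each.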

An additional nuance is that newer and older CPUs do not always translate to more capable and less capable. New CPU models often discontinue features that are present in older CPUs, which can result in mutually exclusive feature sets. As a result, it is not guaranteed that applying a mask to a host with a newer CPU is sufficient to join it to a pool containing older CPU models.
In all cases, Citrix supports only the CPU combinations listed in the XenServer Hardware Compatibility List for use with the heterogeneous pools features.

The heterogeneous pool types that require applying a CPU mask to hosts in an existing pool (types 2 and 3) also require that any existing virtual machines in the pool be shut down until all hosts in the pool have the new CPU configuration in effect. A rolling approach, where virtual machines are consolidated through migration while hosts are rebooted in turn, cannot be used because virtual machine migration is not supported across hosts with disparate CPU configurations.

Host CPU compatibility in XenServer, effective with version 5.6 and later

The attributes of a XenServer 5.6+ host CPU can be viewed using the xe host-cpu-info CLI command. Output from that command running on hosts with Intel E5502 and X3353 CPUs looks like: 

[host_a] # xe host-cpu-info
cpu_count                : 4
                   vendor: GenuineIntel
                    speed: 1866.734
                modelname: Intel(R) Xeon(R) CPU           E5502  @ 1.87GHz
                   family: 6
                    model: 26
                 stepping: 5
                    flags: fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc pni vmx est ssse3 sse4_1 sse4_2 popcnt
                 features: 009ce3bd-bfebfbff-00000001-28100800
    features_after_reboot: 009ce3bd-bfebfbff-00000001-28100800
        physical_features: 009ce3bd-bfebfbff-00000001-28100800
                 maskable: full
[host_b] # xe host-cpu-info
cpu_count                : 4
                   vendor: GenuineIntel
                    speed: 2666.668
                modelname: Intel(R) Xeon(R) CPU           X3353  @ 2.66GHz
                   family: 6
                    model: 23
                 stepping: 6
                    flags: fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht constant_tsc pni vmx est ssse3 sse4_1
                 features: 000ce3bd-bfebfbff-00000001-20000800
    features_after_reboot: 000ce3bd-bfebfbff-00000001-20000800
        physical_features: 000ce3bd-bfebfbff-00000001-20000800
                 maskable: base
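
For scripted comparisons, output in this form can be read into a dictionary. The sketch below is a minimal example that assumes every field is a single "name: value" line, as in the output above. The helper name is hypothetical, and the field names base_ecx, base_edx, ext_ecx, and ext_edx follow the order used in the manual comparison later in this article.

```python
def parse_host_cpu_info(text: str) -> dict[str, str]:
    """Parse 'name : value' lines from xe host-cpu-info output into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank lines and lines without a colon (e.g. the prompt)
        info[key.strip()] = value.strip()
    return info


sample = """\
[host_b] # xe host-cpu-info
cpu_count                : 4
                   vendor: GenuineIntel
                 features: 000ce3bd-bfebfbff-00000001-20000800
        physical_features: 000ce3bd-bfebfbff-00000001-20000800
                 maskable: base
"""

info = parse_host_cpu_info(sample)
names = ("base_ecx", "base_edx", "ext_ecx", "ext_edx")
fields = dict(zip(names, info["features"].split("-")))
print(info["vendor"], info["maskable"], fields)
```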

In XenServer 5.5 and earlier, hosts were only allowed to join a pool if they had identical vendor, model number, family, and feature flag values¹. With heterogeneous pool support in XenServer 5.6, only the set of features exposed by the joining host's CPU must be identical to that of the pool master², which allows CPU masking to be used to configure identical sets of features.

Note:
¹ In XenServer 5.5, the “est” flag was ignored to ensure compatibility with XenServer 5.0.
² See "The pool.other-config:cpuid_feature_mask setting".

Using the preceding examples, the Intel E5502 CPU supports masking and has a superset of the features supported by the Intel X3353 CPU, so joining it to a pool of X3353 hosts is a type 1 case. Run the following command on the host with the E5502 to apply a mask equivalent to the X3353:
xe host-set-cpu-features features=000ce3bd-bfebfbff-00000001-20000800 uuid=<host_uuid>

After restarting the host with the masked E5502 CPU, its CPU configuration has a features value identical to the X3353, allowing the E5502 to successfully join a pool of X3353 hosts. After a mask is set, the CPU's unmasked features are retained in the physical_features parameter:
# xe host-cpu-info

cpu_count                : 4
                   vendor: GenuineIntel
                    speed: 1866.734
                modelname: Intel(R) Xeon(R) CPU           E5502  @ 1.87GHz
                   family: 6
                    model: 26
                 stepping: 5
                    flags: fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc pni vmx est ssse3 sse4_1 sse4_2 popcnt
                 features: 000ce3bd-bfebfbff-00000001-20000800
    features_after_reboot: 000ce3bd-bfebfbff-00000001-20000800
        physical_features: 009ce3bd-bfebfbff-00000001-28100800
                 maskable: full 
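
A mask can only hide features, so the features value should never contain a bit that is absent from physical_features. The following sketch, with a hypothetical helper name, checks that property and lists which bit positions the mask hides in each field, using the two values from the masked E5502 output above.

```python
FIELDS = ("base_ecx", "base_edx", "ext_ecx", "ext_edx")


def hidden_bits(physical: str, features: str) -> dict[str, list[int]]:
    """Per field, list bits set in physical_features but cleared in features."""
    result = {}
    for name, phys_hex, feat_hex in zip(FIELDS, physical.split("-"), features.split("-")):
        phys, feat = int(phys_hex, 16), int(feat_hex, 16)
        if feat & ~phys:
            # features claims a bit the CPU does not physically provide
            raise ValueError(f"{name}: features is not a subset of physical_features")
        hidden = phys & ~feat
        result[name] = [bit for bit in range(32) if hidden >> bit & 1]
    return result


masked = hidden_bits("009ce3bd-bfebfbff-00000001-28100800",
                     "000ce3bd-bfebfbff-00000001-20000800")
print(masked)
# {'base_ecx': [20, 23], 'base_edx': [], 'ext_ecx': [], 'ext_edx': [20, 27]}
```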

Determining CPU Compatibility

There are several methods to determine CPU compatibility.

  • Use the XenServer HCL. If you do not have a host that contains the CPU in question, such as when considering purchase of a new host to add to an existing pool, use the XenServer HCL (Hardware Compatibility List) to verify if the CPU model in the host being considered for purchase has been certified as compatible with the CPU model in the pool master host.

  • Attempt to join the host using XenCenter. If you already have the host to be added, attempt the pool join using XenCenter. When XenCenter detects that a joining host's CPU has a different features value than the pool master, it evaluates the two values to determine if type 1 applies and CPU masking can be used. If all of the following conditions are true, XenCenter offers to automatically calculate and apply the appropriate mask on the joining host. If any condition is not met, the pool join fails with a "This server's hardware is incompatible with the master's" message:

      • The existing pool and joining host both have an Advanced, Enterprise, or Platinum license

      • The joining host's CPU has FlexMigration or Extended Migration support

      • The CPU vendor (Intel/AMD) of the joining host is the same as the pool master

      • The features of the joining host's CPU are a superset of the pool master host's CPU features

  • Use the compare-cpu script. The compare-cpu script (included in the Heterogeneous CPU Pool self-test kit) uses the output of the xe host-cpu-info command from the joining host and the existing pool master host to compare the feature values and masking capabilities, and returns which type applies. With the examples above saved as E5502.txt and X3353.txt respectively, compare-cpu provides the following output:

# ./compare-cpu E5502.txt X3353.txt -v
                file1: E5502.txt
                file2: X3353.txt
            pool_mask: ffffff7f-ffffffff-ffffffff-ffffffff
                CPU 1:
           model name: Intel(R) Xeon(R) CPU           E5502  @ 1.87GHz
             features: 009ce3bd-bfebfbff-00000001-28100800
        masking level: full
                CPU 2:
           model name: Intel(R) Xeon(R) CPU           X3353  @ 2.66GHz
             features: 000ce3bd-bfebfbff-00000001-20000800
        masking level: base
               Result: CPU 1 and CPU 2 are compatible for masking
            Mask type: 1  CPU 1 has a superset of features to CPU 2
                 Mask: 000ce3bd-bfebfbff-00000001-20000800
# ./compare-cpu X3353.txt E5502.txt -v
                file1: X3353.txt
                file2: E5502.txt
            pool_mask: ffffff7f-ffffffff-ffffffff-ffffffff
                CPU 1:
           model name: Intel(R) Xeon(R) CPU           X3353  @ 2.66GHz
             features: 000ce3bd-bfebfbff-00000001-20000800
        masking level: base
                CPU 2:
           model name: Intel(R) Xeon(R) CPU           E5502  @ 1.87GHz
             features: 009ce3bd-bfebfbff-00000001-28100800
        masking level: full
               Result: CPU 1 and CPU 2 are compatible for masking
            Mask type: 2  CPU 1 has a subset of features to CPU 2
                 Mask: 000ce3bd-bfebfbff-00000001-20000800

Joining the E5502 to a pool of X3353s represents type 1. The reverse (joining an X3353 to a pool of E5502s) therefore represents type 2, and for that case, the mask required on the E5502 hosts for compatibility with the X3353 is simply the X3353's features value.
It is also possible to have CPU combinations with mutually exclusive features (type 3):

# ./compare-cpu X5560.txt E5420.txt -v
                file1: X5560.txt
                file2: E5420.txt
            pool_mask: ffffff7f-ffffffff-ffffffff-ffffffff
                CPU 1:
           model name: Intel(R) Xeon(R) CPU           X5560  @ 2.80GHz
             features: 009ce3bd-bfebfbff-00000001-28100800
        masking level: full
                CPU 2:
           model name: Intel(R) Xeon(R) CPU           E5420  @ 2.50GHz
             features: 040ce3bd-bfebfbff-00000001-20100800
        masking level: base
               Result: CPU 1 and CPU 2 are compatible for masking
            Mask type: 3  CPU 1 and CPU 2 have a mutually exclusive set of features but support a common mask
                 Mask: 000ce3bd-bfebfbff-00000001-20100800
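
Whether a given mask is achievable also depends on each CPU's masking level. The sketch below rests on an assumption rather than documented behavior: that "base" allows masking only the two base fields, "full" allows masking all four, and "no" allows none. The function and table names are hypothetical.

```python
FIELDS = ("base_ecx", "base_edx", "ext_ecx", "ext_edx")

# Assumed meaning of the maskable value: which fields the CPU can mask.
MASKABLE_FIELDS = {
    "no": set(),
    "base": {"base_ecx", "base_edx"},
    "full": set(FIELDS),
}


def can_apply(features: str, maskable: str, target: str) -> bool:
    """True if a CPU with this masking level can be masked down to target."""
    allowed = MASKABLE_FIELDS[maskable]
    for name, have_hex, want_hex in zip(FIELDS, features.split("-"), target.split("-")):
        have, want = int(have_hex, 16), int(want_hex, 16)
        if want & ~have:
            return False  # a mask can hide features, never add them
        if have != want and name not in allowed:
            return False  # this field needs masking the CPU cannot do
    return True


common = "000ce3bd-bfebfbff-00000001-20100800"
print(can_apply("009ce3bd-bfebfbff-00000001-28100800", "full", common))  # X5560
print(can_apply("040ce3bd-bfebfbff-00000001-20100800", "base", common))  # E5420
```

Under these assumptions both calls return True, which is consistent with the "compatible for masking" result above: the E5420 only needs to hide a base_ecx bit, while the X5560 must also hide an ext_edx bit and therefore relies on full masking.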

The common mask must be applied to all hosts to address the mutually exclusive differences. Applying a CPU mask to hosts in an existing pool requires that any existing virtual machines in the pool be shut down until all hosts in the pool have the new CPU configuration in effect. A rolling approach, where virtual machines are consolidated through migration while hosts are rebooted in turn, cannot be used because virtual machine migration is not supported across hosts with disparate CPU configurations.

  • Compare the features manually. There are two sets of CPU features: base features and extended features. Each set is split into two halves, known as ecx and edx. Determining the masking options for a given pair of CPUs requires comparing the feature values in combination with the relative masking capability of each CPU. XenServer stores the feature bits in hexadecimal for brevity.

The two CPUs in the type 3 example above have the following features and masking support:

|| CPU model || base_ecx || base_edx || ext_ecx || ext_edx || Masking level ||
| X5560       | 009ce3bd  | bfebfbff | 00000001  | 28100800 | full           |
| E5420       | 040ce3bd  | bfebfbff | 00000001  | 20100800 | base           |

The differences can be observed to be in base_ecx and ext_edx. Converting to binary shows the specific variance in supported feature bits: 

|| CPU model || base_ecx (bin)             || ext_edx (bin)                 ||
| X5560       | 000100111001110001110111101 |101000000100000000100000000000  |
| E5420       | 100000011001110001110111101 |100000000100000000100000000000  |
|             | x  x  x                     |  x                             |
|             | 26 23 20                  0 |  27                         0  | 
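
The binary rows and the marker row in this table can be reproduced with an exclusive OR of the two hexadecimal values. The sketch below uses a hypothetical helper name and pads both values to the width of the larger one, which is how the table above is laid out.

```python
def compare_bits(a_hex: str, b_hex: str) -> tuple[str, str, str, list[int]]:
    """Return both values in binary, an 'x' marker row, and the differing bits."""
    a, b = int(a_hex, 16), int(b_hex, 16)
    width = max(a.bit_length(), b.bit_length())
    diff = a ^ b  # a bit is 1 here exactly where the two values disagree
    marker = "".join("x" if diff >> i & 1 else " " for i in range(width - 1, -1, -1))
    bits = [i for i in range(width) if diff >> i & 1]
    return format(a, f"0{width}b"), format(b, f"0{width}b"), marker, bits


for field, x5560, e5420 in (("base_ecx", "009ce3bd", "040ce3bd"),
                            ("ext_edx", "28100800", "20100800")):
    row_a, row_b, marker, bits = compare_bits(x5560, e5420)
    print(field, "differs at bits", bits)
    print("  X5560", row_a)
    print("  E5420", row_b)
    print("       ", marker.rstrip())
```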

In base_ecx, bits 20 and 23 are present in the X5560 but not in the E5420, and bit 26 is present in the E5420 but not in the X5560. In ext_edx, bit 27 is present in the X5560 but not in the E5420. Because both CPUs support base masking and the X5560 supports full masking (base and extended), a joint mask is possible. The joint mask can be calculated by performing a bitwise AND to turn off the mutually exclusive feature bits: 

|| CPU model || base_ecx (bin)             || ext_edx (bin)                 ||
| X5560       | 000100111001110001110111101 | 101000000100000000100000000000 |
| E5420       | 100000011001110001110111101 | 100000000100000000100000000000 |
|             | x  x  x                     |   x                            |
| Joint       | 000000011001110001110111101 | 100000000100000000100000000000 |

Converting the joint base_ecx and ext_edx values back to hexadecimal and padding to eight digits gives: 

|| base_ecx || ext_edx ||
| 000ce3bd   | 20100800 |

Combining those values with the unchanged ext_ecx and base_edx values provides the joint mask: 

000ce3bd-bfebfbff-00000001-20100800
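
The same calculation can be done directly on the hexadecimal fields without converting to binary by hand. The following is a minimal sketch with a hypothetical function name. It computes only the bitwise AND and does not check whether each CPU's masking level permits the result.

```python
def joint_mask(features_a: str, features_b: str) -> str:
    """Bitwise AND of two features values, field by field."""
    pairs = zip(features_a.split("-"), features_b.split("-"))
    return "-".join(f"{int(a, 16) & int(b, 16):08x}" for a, b in pairs)


x5560 = "009ce3bd-bfebfbff-00000001-28100800"
e5420 = "040ce3bd-bfebfbff-00000001-20100800"
print(joint_mask(x5560, e5420))
# 000ce3bd-bfebfbff-00000001-20100800
```

The `08x` format specifier pads each field back to eight hexadecimal digits, matching the padding step described above.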

The pool.other-config:cpuid_feature_mask setting

Each XenServer (starting with version 5.6) pool contains a pool.other-config setting that is used during the evaluation of CPU compatibility. The cpuid_feature_mask value represents a set of feature bits to ignore while comparing CPU features. By default, this value is ffffff7f-ffffffff-ffffffff-ffffffff, which, after converting to binary, shows that only base_ecx bit 7 (the "EST" feature flag) is ignored to provide compatibility with XenServer 5.0 and 5.5.
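
One way to picture the effect of this setting is a comparison that looks only at bits set to 1 in the mask. The sketch below is an assumption about how such a comparison could work, not a description of XenServer's internal code. The function names are hypothetical, and the second features value is a made-up variant of the X3353 value with base_ecx bit 7 cleared.

```python
FIELDS = ("base_ecx", "base_edx", "ext_ecx", "ext_edx")
DEFAULT_MASK = "ffffff7f-ffffffff-ffffffff-ffffffff"


def ignored_bits(mask: str) -> dict[str, list[int]]:
    """Per field, list the bit positions that are 0 in the mask."""
    return {name: [b for b in range(32) if not int(m, 16) >> b & 1]
            for name, m in zip(FIELDS, mask.split("-"))}


def features_match(a: str, b: str, mask: str = DEFAULT_MASK) -> bool:
    """Compare two features values, ignoring bits that are 0 in the mask."""
    for fa, fb, fm in zip(a.split("-"), b.split("-"), mask.split("-")):
        if (int(fa, 16) ^ int(fb, 16)) & int(fm, 16):
            return False
    return True


with_est = "000ce3bd-bfebfbff-00000001-20000800"
without_est = "000ce33d-bfebfbff-00000001-20000800"  # hypothetical: bit 7 cleared

print(ignored_bits(DEFAULT_MASK))
print(features_match(with_est, without_est))
print(features_match(with_est, without_est, "ffffffff-ffffffff-ffffffff-ffffffff"))
```

With the default mask the two values compare as equal. With a mask of all ones the difference in bit 7 is detected.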

Modifications to the cpuid_feature_mask should be done with great caution because it allows hosts with different features to be joined within a pool. If a virtual machine’s operating system or application detects and relies upon the presence of a specific feature, it might become unstable if migrated from a host that has the feature to one that does not.

Submitting new CPU pool HCL entries

Additional CPU combinations can be certified using the XenServer Server CPU Pooling Self-Test kit. Download the self-test kit from here. The kit includes details on testing requirements and how to submit results.

Disclaimer

The above mentioned sample code is provided to you as is with no representations, warranties or conditions of any kind. You may use, modify and distribute it at your own risk. CITRIX DISCLAIMS ALL WARRANTIES WHATSOEVER, EXPRESS, IMPLIED, WRITTEN, ORAL OR STATUTORY, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NONINFRINGEMENT. Without limiting the generality of the foregoing, you acknowledge and agree that (a) the sample code may exhibit errors, design flaws or other problems, possibly resulting in loss of data or damage to property; (b) it may not be possible to make the sample code fully functional; and (c) Citrix may, without notice or liability to you, cease to make available the current version and/or any future versions of the sample code. In no event should the code be used in support of ultra-hazardous activities, including but not limited to life support or blasting activities. NEITHER CITRIX NOR ITS AFFILIATES OR AGENTS WILL BE LIABLE, UNDER BREACH OF CONTRACT OR ANY OTHER THEORY OF LIABILITY, FOR ANY DAMAGES WHATSOEVER ARISING FROM USE OF THE SAMPLE CODE, INCLUDING WITHOUT LIMITATION DIRECT, SPECIAL, INCIDENTAL, PUNITIVE, CONSEQUENTIAL OR OTHER DAMAGES, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. Although the copyright in the code belongs to Citrix, any distribution of the code should include only your own standard copyright attribution, and not that of Citrix. You agree to indemnify and defend Citrix against any and all claims arising from your use, modification or distribution of the code.
