How to troubleshoot common SR-IOV related issues

Article ID: CTX235045

Description

This article explains how to troubleshoot common SR-IOV network issues.
 


Instructions

Common error scenario #1: An SR-IOV network cannot be assigned to a VM.

  1. Make sure enough VFs are available to assign to the VM by checking "remaining-capacity". You can confirm this from XenCenter or the xe CLI.
  • XenCenter

User-added image

  • xe CLI
    1) Get the UUID of the SR-IOV network:

xe network-sriov-list

    2) Check the remaining-capacity of the network:

xe network-sriov-param-list uuid=<SR-IOV_Network_UUID>

Sample output

User-added image
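
The capacity check above can also be scripted. The following is a minimal sketch, not an official tool: the function name is hypothetical, and it assumes the param-list output prints each parameter as a "name ( RO): value" line, in the same style as the recommendations sample later in this article.

```shell
#!/bin/sh
# Hypothetical helper: read `xe network-sriov-param-list` output on stdin
# and print the value of the remaining-capacity parameter.
# Assumption: the parameter appears as "remaining-capacity ( RO): <number>".
sriov_remaining_capacity() {
    awk '/^[ \t]*remaining-capacity[ \t]*\(/ {
        sub(/^[^:]*:[ \t]*/, "")   # drop everything up to and including the first colon
        print
        exit
    }'
}

# Intended use on the host (not executed here):
#   xe network-sriov-param-list uuid=<SR-IOV_Network_UUID> | sriov_remaining_capacity
```

A value of 0 would mean that no VF is left to assign on that network.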

  2. If the legacy NIC driver has been updated in Dom0, check whether the following parameters have changed in the /etc/modprobe.d/<driver_name>.conf configuration file. If any value is incorrect, set it again manually to match your requirements.

param name
default max_vfs
sysfs interface support
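
To review the driver parameters after an update, it can help to print only the option lines of the configuration file. This is a minimal sketch under assumptions: the function name is hypothetical, and it assumes the file uses the standard modprobe "options <module> <param>=<value>" syntax.

```shell
#!/bin/sh
# Hypothetical helper: print the "options" lines of a modprobe configuration
# file so that values such as max_vfs can be compared with what you expect.
show_driver_options() {
    grep -E '^[[:space:]]*options[[:space:]]' "$1"
}

# Intended use in Dom0 (not executed here):
#   show_driver_options /etc/modprobe.d/<driver_name>.conf
```

Comment lines and other directives are skipped, so only the parameter settings remain to be checked.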


Common error scenario #2: When assigning an SR-IOV network to a VM from XenCenter, the “Add Virtual Interface” wizard shows no SR-IOV Network option for the VM.

If the VM was newly created in XenServer 7.6, check whether its recommendations contain the <restriction field="allow-network-sriov" value="1"/> tag by running the xe vm-param-list uuid=<vm_uuid> command.
If the tag is absent, the guest OS of the VM does not support SR-IOV, and you should not assign an SR-IOV network to that VM.

Output sample for a guest OS that supports SR-IOV Network:
recommendations ( RO): <restrictions><restriction field="memory-static-max" max="1649267441664"/><restriction field="vcpus-max" max="32"/><restriction field="has-vendor-device" value="false"/><restriction field="allow-gpu-passthrough" value="1"/><restriction field="allow-vgpu" value="1"/><restriction field="allow-network-sriov" value="1"/><restriction max="255" property="number-of-vbds"/><restriction max="7" property="number-of-vifs"/></restrictions>

Note: a VM exported from a previous XenServer version does not carry the restriction field above, even if its guest OS supports SR-IOV. You need to assign the SR-IOV network to such a VM with the xe CLI.
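
The tag check described in this scenario can be sketched as a small filter. The function name is hypothetical, and the sketch assumes the restriction is written exactly as in the sample output above (same attribute order and quoting).

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the recommendations XML read
# from stdin contains the allow-network-sriov restriction with value "1".
vm_allows_sriov() {
    grep -qF '<restriction field="allow-network-sriov" value="1"/>'
}

# Intended use on the host (not executed here):
#   xe vm-param-get uuid=<vm_uuid> param-name=recommendations | vm_allows_sriov \
#       && echo "SR-IOV tag present" || echo "SR-IOV tag missing"
```

A missing tag is not conclusive for VMs exported from a previous XenServer version, as noted above.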


Common error scenario #3: An SR-IOV network has been assigned to a VM with the xe CLI, but the network is not visible inside the VM.

In this case, check the following:

  1. The guest OS supports SR-IOV. Refer to scenario #2 to check whether the OS supports SR-IOV.
  2. The NIC driver is already installed in the guest.

Example for Windows Server OS:

User-added image

Example for Linux Server:

ethtool -i eth1
driver: ixgbevf
version: 3.2.2-k-rh7.4
firmware-version:
expansion-rom-version:
bus-info: 0000:00:05.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

  3. Whether the VM was running when the SR-IOV network was assigned to it. On a Windows Server guest, the SR-IOV network is not visible until the VM has been shut down and started again.
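
For a Linux guest, the driver check can be reduced to a single value. This is a minimal sketch with a hypothetical function name; it assumes the "key: value" layout shown in the ethtool sample above.

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool -i <interface>` output on stdin and
# print only the driver name.
guest_nic_driver() {
    awk -F': ' '$1 == "driver" { print $2; exit }'
}

# Intended use inside the guest (not executed here):
#   ethtool -i eth1 | guest_nic_driver
```

For the sample above this yields ixgbevf, the VF driver in that example. An empty result would suggest that no driver is bound to the interface.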

Note that the xe CLI lets you assign an SR-IOV network to any VM, so make sure that you assign it only to a guest that supports this feature. Even if an SR-IOV network happens to work on an unsupported guest, Citrix does NOT support this use case if a problem occurs.


Common error scenario #4: VM fails to start after SR-IOV Network has been assigned.

In this case, search kern.log and xensource.log for the keyword “sriov” to find more information.
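
The keyword search can be sketched as follows. The function name is hypothetical, and the /var/log paths in the usage comment are an assumption about where the two logs live on the host.

```shell
#!/bin/sh
# Hypothetical helper: print every line that mentions "sriov" in the given
# log files, prefixed with the file name and line number.
# The match is case-insensitive so that both "NetSriovVf" and "net-sriov-vf"
# are found.
find_sriov_messages() {
    grep -i -n -H 'sriov' "$@"
}

# Intended use on the host (paths assumed, not executed here):
#   find_sriov_messages /var/log/kern.log /var/log/xensource.log
```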

Reference:
For reference, the following messages are logged in a working scenario.

xensource.log output when a VF is attached and active for the VM:
broadman xenopsd-xc: [debug|localhost.localdomain|16 |Async.VM.start R:34ed6712dab6|xenops] Device.NetSriovVf.add domid=90 devid=2 pci=0000:03:10.1 vlan=none mac=fa:75:91:01:17:41 carrier=true rate=none other_config=[] extra_private_keys=[net-sriov-vf-id=2; xenopsd-backend=classic] extra_xenserver_keys=[static-ip-setting/mac=fa:75:91:01:17:41; static-ip-setting/error-code=0; static-ip-setting/error-msg=; static-ip-setting/enabled=0; static-ip-setting/enabled6=0; mac=fa:75:91:01:17:41]
...
broadman xenopsd-xc: [debug|localhost.localdomain|16 |Async.VM.start R:34ed6712dab6|xenops] adding device  B0[/local/domain/0/xenserver/backend/net-sriov-vf/90/2]  F90[/local/domain/90/xenserver/device/net-sriov-vf/2]  H[/xapi/8f93c150-0fe5-a053-8748-47a22d1ef023/hotplug/90/net-sriov-vf/2]


xensource.log output indicating that SR-IOV is not yet active on the VM:

broadman xenopsd-xc: [debug|localhost.localdomain|4361 |org.xen.xapi.xenops.classic events D:de40a7d84a1f|xenops] VM = 8f93c150-0fe5-a053-8748-47a22d1ef023; domid = 34; Device is not active: kind = net-sriov-vf; id = 2; active devices = [  ]
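
To compare a failing start against the reference messages above, the two message types can be counted in the log. This is a minimal sketch with a hypothetical function name; it matches only the message fragments shown in this article and assumes their wording is unchanged in your version.

```shell
#!/bin/sh
# Hypothetical helper: count the VF "add" messages and the "not active"
# messages in xensource.log content (files given as arguments, or stdin).
sriov_vf_status() {
    awk '
        /Device\.NetSriovVf\.add/                    { added++ }
        /Device is not active: kind = net-sriov-vf/  { inactive++ }
        END { printf "vf_add=%d not_active=%d\n", added, inactive }
    ' "$@"
}

# Intended use on the host (path assumed, not executed here):
#   sriov_vf_status /var/log/xensource.log
```

The counts cover the whole file, so filter the log by VM UUID or time range first if several VMs use SR-IOV.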

 
Common error scenario #5: A host with SR-IOV network enabled cannot join an existing pool.
If SR-IOV network is enabled on a host, joining that host to an existing pool fails. A host can join an existing pool only when it has no SR-IOV network configured. The possible combinations are listed below.

Host (Network SR-IOV enabled), pool (Network SR-IOV enabled):
Joining the existing pool is blocked because the new host is not considered a clean host.

Host (Network SR-IOV enabled), pool (Network SR-IOV not configured):
Joining the existing pool is blocked because the new host is not considered a clean host.

Host (Network SR-IOV not configured), pool (Network SR-IOV enabled):
The host can join the pool, with the following notes:
1. The SR-IOV network of the newly joined host is enabled AUTOMATICALLY if and only if the NIC on the host and the NIC on the pool master have the same NIC type and the same NIC position. XAPI checks this.
2. If the PCI type at the same NIC position differs between the pool master and the newly joined host, the SR-IOV network is not enabled for that NIC on the joined host, even though the host can join the pool.
3. SR-IOV physical PIFs of different types can NOT be put into one network.
4. In the pool, you can enable an SR-IOV network for the newly joined host with the xe CLI if the SR-IOV PIF has the same type as the pool master's PIF in that network, even if they are in different positions.

Host (Network SR-IOV not configured), pool (Network SR-IOV not configured):
The host can join the existing pool.

Common error scenario #6: A VM is configured with both an SR-IOV VF and a vGPU, but when the VM starts, it has no SR-IOV VF assigned.
If a VM is configured with both an SR-IOV VF and a vGPU, host selection for starting the VM follows the vGPU requirements only. After a host has been selected based on vGPU resources, XAPI only checks the remaining SR-IOV network capacity on that selected host. This is because a vGPU is a more expensive and scarcer resource than an SR-IOV VF.

Additional Information

How to Use SR-IOV Feature in XenServer 
How To Assign SR-IOV Network to a VM exported from a previous XenServer Version