Update driver to map the targeted address for SR-IOV PCI devices
This patch checks the QEMU and libvirt versions to ensure support for VFIO SR-IOV device migration. It also updates the _live_migration_operation() path, specifically get_updated_guest_xml(), to map source PCI addresses to destination addresses in the destination XML, using the data provided by the LiveMigrateData object.

The goal of this patch series is to enable migration of VFIO devices with kernel variant drivers.

Partially-Implements: blueprint migrate-vfio-devices-using-kernel-variant-drivers
Change-Id: I62ec475988eab8de948498f50d8d4c0d47321102
parent b227efd967
commit fd656f3943
@@ -36,7 +36,24 @@ to use move operations, for each ``nova-compute`` service.
 
 Possible Values:
 
-* A dictionary of JSON values which describe the aliases. For example::
+* A JSON dictionary which describe a PCI device. It should take
+  the following format::
+
+    alias = {
+      "name": "<name>",
+      ["product_id": "<id>"],
+      ["vendor_id": "<id>"],
+      "device_type": "<type>",
+      ["numa_policy": "<policy>"],
+      ["resource_class": "<resource_class>"],
+      ["traits": "<traits>"]
+      ["live_migratable": "<live_migratable>"],
+    }
+
+  Where ``[`` indicates zero or one occurrences, ``{`` indicates zero or
+  multiple occurrences, and ``|`` mutually exclusive options.
+
+  For example::
 
     alias = {
       "name": "QuickAssist",
@@ -46,8 +63,17 @@ Possible Values:
       "numa_policy": "required"
     }
 
-  This defines an alias for the Intel QuickAssist card. (multi valued). Valid
-  key values are :
+  This defines an alias for the Intel QuickAssist card. (multi valued).
+
+  Another example::
+
+    alias = {
+      "name": "A16_16A",
+      "device_type": "type-VF",
+      resource_class: "CUSTOM_A16_16A",
+    }
+
+  Valid key values are :
 
   ``name``
     Name of the PCI alias.
@@ -97,6 +123,22 @@ Possible Values:
     scheduling the request. This field can only be used only if
     ``[filter_scheduler]pci_in_placement`` is enabled.
 
+  ``live_migratable``
+    Specify if live-migratable devices are desired.
+    May have boolean-like string values case-insensitive values:
+    "yes" or "no".
+
+    - ``live_migratable='yes'`` means that the user wants a device(s)
+      allowing live migration to a similar device(s) on another host.
+
+    - ``live_migratable='no'`` This explicitly indicates that the user
+      requires a non-live migratable device, making migration impossible.
+
+    - If not specified, the default is ``live_migratable=None``, meaning that
+      either a live migratable or non-live migratable device will be picked
+      automatically. However, in such cases, migration will **not** be
+      possible.
+
 * Supports multiple aliases by repeating the option (not by specifying
   a list value)::
 
@@ -112,7 +154,8 @@ Possible Values:
       "product_id": "0444",
       "vendor_id": "8086",
       "device_type": "type-PCI",
-      "numa_policy": "required"
+      "numa_policy": "required",
+      "live_migratable": "yes",
     }
 """),
     cfg.MultiStrOpt('device_spec',
@@ -165,7 +208,9 @@ Possible values:
   Supported ``<tag>`` values are :
 
   - ``physical_network``
+
   - ``trusted``
+
   - ``remote_managed`` - a VF is managed remotely by an off-path networking
     backend. May have boolean-like string values case-insensitive values:
     "true" or "false". By default, "false" is assumed for all devices.
@@ -174,6 +219,7 @@ Possible values:
     VPD capability with a card serial number (either on a VF itself on
     its corresponding PF), otherwise they will be ignored and not
     available for allocation.
+
   - ``managed`` - Specify if the PCI device is managed by libvirt.
     May have boolean-like string values case-insensitive values:
     "yes" or "no". By default, "yes" is assumed for all devices.
@@ -189,6 +235,18 @@ Possible values:
 
     Warning: Incorrect configuration of this parameter may result in compute
     node crashes.
+
+  - ``live_migratable`` - Specify if the PCI device is live_migratable by
+    libvirt.
+    May have boolean-like string values case-insensitive values:
+    "yes" or "no". By default, "no" is assumed for all devices.
+
+    - ``live_migratable='yes'`` means that the device can be live migrated.
+      Of course, this requires hardware support, as well as proper system
+      and hypervisor configuration.
+
+    - ``live_migratable='no'`` means that the device cannot be live migrated.
+
   - ``resource_class`` - optional Placement resource class name to be used
     to track the matching PCI devices in Placement when
     [pci]report_in_placement is True.
@@ -202,6 +260,7 @@ Possible values:
     device's ``vendor_id`` and ``product_id`` in the form of
     ``CUSTOM_PCI_{vendor_id}_{product_id}``.
     The ``resource_class`` can be requested from a ``[pci]alias``
+
   - ``traits`` - optional comma separated list of Placement trait names to
     report on the resource provider that will represent the matching PCI
     device. Each trait can be a standard trait from ``os-traits`` lib or can
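Reviewer note: the three states documented for the alias key ``live_migratable`` ("yes", "no", or unset) can be illustrated with a small stand-alone sketch. The helper name `parse_live_migratable` is hypothetical and is not part of Nova; it assumes that only the two documented spellings are valid, compared case-insensitively. Nova's own option parsing may accept other boolean-like strings.

```python
import json


def parse_live_migratable(alias_json):
    """Return True, False, or None for an alias JSON string.

    Hypothetical helper for illustration only:
      True  -> "yes": a live-migratable device is requested
      False -> "no":  a non-live-migratable device is required
      None  -> key absent: either kind of device may be picked
    """
    alias = json.loads(alias_json)
    value = alias.get("live_migratable")
    if value is None:
        return None
    normalized = str(value).strip().lower()
    if normalized == "yes":
        return True
    if normalized == "no":
        return False
    raise ValueError(f"unsupported live_migratable value: {value!r}")


requested = parse_live_migratable(
    '{"name": "A16_16A", "device_type": "type-VF", "live_migratable": "yes"}'
)
```

Keeping the unset case distinct from "no" matters because the documentation above gives it a different meaning: the device is picked automatically, but migration is still not possible.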
File diff suppressed because it is too large
@@ -35,6 +35,14 @@ from nova.virt.libvirt import host
 from nova.virt.libvirt import migration
 
 
+def _normalize(xml_str):
+    return etree.tostring(
+        etree.fromstring(xml_str),
+        pretty_print=True,
+        encoding="unicode",
+    ).strip()
+
+
 class UtilityMigrationTestCase(test.NoDBTestCase):
 
     def test_graphics_listen_addrs(self):
@@ -278,6 +286,193 @@ class UtilityMigrationTestCase(test.NoDBTestCase):
         self.assertRaises(exception.NovaException,
                           migration._update_mdev_xml, doc, data.target_mdevs)
 
+    def test_update_pci_dev_xml(self):
+
+        xml_pattern = """<domain>
+  <devices>
+    <hostdev mode='subsystem' type='pci' managed='no'>
+      <driver name='vfio'/>
+      <source>
+        <address domain='0x0000' bus='0x25' slot='0x00' function='0x4'/>
+      </source>
+      <alias name='hostdev0'/>
+      <address type='pci' domain='0x0000' bus='0x00'
+        slot='0x05' function='0x0'/>
+    </hostdev>
+  </devices>
+</domain>"""
+        expected_xml_pattern = """<domain>
+  <devices>
+    <hostdev mode='subsystem' type='pci' managed='no'>
+      <driver name='vfio'/>
+      <source>
+        <address domain='0x0000' bus='0x26' slot='0x01' function='0x5'/>
+      </source>
+      <alias name='hostdev0'/>
+      <address type='pci' domain='0x0000' bus='0x00'
+        slot='0x05' function='0x0'/>
+    </hostdev>
+  </devices>
+</domain>"""
+        data = objects.LibvirtLiveMigrateData(
+            pci_dev_map_src_dst={"0000:25:00.4": "0000:26:01.5"})
+        doc = etree.fromstring(xml_pattern)
+        res = migration._update_pci_dev_xml(doc, data.pci_dev_map_src_dst)
+        self.assertEqual(
+            _normalize(expected_xml_pattern),
+            etree.tostring(res, encoding="unicode", pretty_print=True).strip(),
+        )
+
+    def test_update_pci_dev_xml_with_2_hostdevs(self):
+
+        xml_pattern = """<domain>
+  <devices>
+    <hostdev mode='subsystem' type='pci' managed='no'>
+      <driver name='vfio'/>
+      <source>
+        <address domain='0x0000' bus='0x25' slot='0x00' function='0x4'/>
+      </source>
+      <alias name='hostdev0'/>
+      <address type='pci' domain='0x0000' bus='0x00'
+        slot='0x05' function='0x0'/>
+    </hostdev>
+    <hostdev mode='subsystem' type='pci' managed='no'>
+      <driver name='vfio'/>
+      <source>
+        <address domain='0x0000' bus='0x25' slot='0x01' function='0x4'/>
+      </source>
+      <alias name='hostdev0'/>
+      <address type='pci' domain='0x0000' bus='0x00'
+        slot='0x06' function='0x0'/>
+    </hostdev>
+  </devices>
+</domain>"""
+        expected_xml_pattern = """<domain>
+  <devices>
+    <hostdev mode='subsystem' type='pci' managed='no'>
+      <driver name='vfio'/>
+      <source>
+        <address domain='0x0000' bus='0x26' slot='0x01' function='0x5'/>
+      </source>
+      <alias name='hostdev0'/>
+      <address type='pci' domain='0x0000' bus='0x00'
+        slot='0x05' function='0x0'/>
+    </hostdev>
+    <hostdev mode='subsystem' type='pci' managed='no'>
+      <driver name='vfio'/>
+      <source>
+        <address domain='0x0000' bus='0x26' slot='0x01' function='0x4'/>
+      </source>
+      <alias name='hostdev0'/>
+      <address type='pci' domain='0x0000' bus='0x00'
+        slot='0x06' function='0x0'/>
+    </hostdev>
+  </devices>
+</domain>"""
+        data = objects.LibvirtLiveMigrateData(
+            pci_dev_map_src_dst={
+                "0000:25:00.4": "0000:26:01.5",
+                "0000:25:01.4": "0000:26:01.4",
+            }
+        )
+        doc = etree.fromstring(xml_pattern)
+        res = migration._update_pci_dev_xml(doc, data.pci_dev_map_src_dst)
+        self.assertEqual(
+            _normalize(expected_xml_pattern),
+            etree.tostring(res, encoding="unicode", pretty_print=True).strip(),
+        )
+
+    def test_update_pci_dev_xml_with_2_hostdevs_second_one_not_in_map(self):
+
+        xml_pattern = """<domain>
+  <devices>
+    <hostdev mode='subsystem' type='pci' managed='no'>
+      <driver name='vfio'/>
+      <source>
+        <address domain='0x0000' bus='0x25' slot='0x00' function='0x4'/>
+      </source>
+      <alias name='hostdev0'/>
+      <address type='pci' domain='0x0000' bus='0x00'
+        slot='0x05' function='0x0'/>
+    </hostdev>
+    <hostdev mode='subsystem' type='pci' managed='no'>
+      <driver name='vfio'/>
+      <source>
+        <address domain='0x0000' bus='0x25' slot='0x01' function='0x4'/>
+      </source>
+      <alias name='hostdev0'/>
+      <address type='pci' domain='0x0000' bus='0x00'
+        slot='0x06' function='0x0'/>
+    </hostdev>
+  </devices>
+</domain>"""
+        expected_xml_pattern = """<domain>
+  <devices>
+    <hostdev mode='subsystem' type='pci' managed='no'>
+      <driver name='vfio'/>
+      <source>
+        <address domain='0x0000' bus='0x26' slot='0x01' function='0x5'/>
+      </source>
+      <alias name='hostdev0'/>
+      <address type='pci' domain='0x0000' bus='0x00'
+        slot='0x05' function='0x0'/>
+    </hostdev>
+    <hostdev mode='subsystem' type='pci' managed='no'>
+      <driver name='vfio'/>
+      <source>
+        <address domain='0x0000' bus='0x25' slot='0x01' function='0x4'/>
+      </source>
+      <alias name='hostdev0'/>
+      <address type='pci' domain='0x0000' bus='0x00'
+        slot='0x06' function='0x0'/>
+    </hostdev>
+  </devices>
+</domain>"""
+        data = objects.LibvirtLiveMigrateData(
+            pci_dev_map_src_dst={
+                "0000:25:00.4": "0000:26:01.5",
+            }
+        )
+        doc = etree.fromstring(xml_pattern)
+        res = migration._update_pci_dev_xml(doc, data.pci_dev_map_src_dst)
+        self.assertEqual(
+            _normalize(expected_xml_pattern),
+            etree.tostring(res, encoding="unicode", pretty_print=True).strip(),
+        )
+
+    def test_update_pci_dev_xml_fails_not_found_src_address(self):
+        xml_pattern = """<domain>
+  <devices>
+    <hostdev mode='subsystem' type='pci' managed='no'>
+      <driver name='vfio'/>
+      <source>
+        <address domain='0x0000' bus='0x25' slot='0x00' function='0x4'/>
+      </source>
+      <alias name='hostdev0'/>
+      <address type='pci' domain='0x0000' bus='0x00'
+        slot='0x05' function='0x0'/>
+    </hostdev>
+  </devices>
+</domain>"""
+        data = objects.LibvirtLiveMigrateData(
+            pci_dev_map_src_dst={"0000:25:00.5": "0000:26:01.5"})
+        doc = etree.fromstring(xml_pattern)
+        exc = self.assertRaises(
+            exception.NovaException,
+            migration._update_pci_dev_xml,
+            doc,
+            data.pci_dev_map_src_dst,
+        )
+
+        norm = _normalize(xml_pattern)
+
+        self.assertIn(
+            'Unable to find the hostdev '
+            f'to replace for this source PCI address: 0000:25:00.5 '
+            f'in the xml: {norm}',
+            str(exc),
+        )
+
     def test_update_cpu_shared_set_xml(self):
         doc = etree.fromstring("""
 <domain>
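Reviewer note: the tests above compare XML by passing both sides through a pretty-printing `_normalize` helper based on lxml. As a rough stand-alone illustration of the same idea, the sketch below uses the standard library's `xml.etree.ElementTree.canonicalize` (Python 3.8+) instead of lxml. The name `normalize_xml` is hypothetical, and this is a different technique from the one in the diff: it canonicalizes (C14N) rather than pretty-prints.

```python
import xml.etree.ElementTree as ET


def normalize_xml(xml_str):
    """Return a canonical form so that formatting differences compare equal.

    strip_text=True drops the whitespace around text content, so indentation
    between elements does not affect the comparison.
    """
    return ET.canonicalize(xml_str, strip_text=True)


# Same document, written with different quoting, attribute order and layout.
indented = """<domain>
  <devices>
    <hostdev type='pci' managed='no'/>
  </devices>
</domain>"""
compact = '<domain><devices><hostdev managed="no" type="pci"/></devices></domain>'

same = normalize_xml(indented) == normalize_xml(compact)
```

Normalizing both sides before comparing keeps the assertions independent of how the expected XML literal happens to be laid out in the test file.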
@@ -97,6 +97,7 @@ from nova.objects import diagnostics as diagnostics_obj
 from nova.objects import fields
 from nova.objects import migrate_data as migrate_data_obj
 from nova.pci import utils as pci_utils
+from nova.pci import whitelist
 import nova.privsep.libvirt
 import nova.privsep.path
 import nova.privsep.utils
@@ -266,6 +267,10 @@ MIN_LIBVIRT_STATELESS_FIRMWARE = (8, 6, 0)
 MIN_IGB_LIBVIRT_VERSION = (9, 3, 0)
 MIN_IGB_QEMU_VERSION = (8, 0, 0)
 
+# Minimum versions supporting vfio-pci variant driver.
+MIN_VFIO_PCI_VARIANT_LIBVIRT_VERSION = (10, 0, 0)
+MIN_VFIO_PCI_VARIANT_QEMU_VERSION = (8, 2, 2)
+
 REGISTER_IMAGE_PROPERTY_DEFAULTS = [
     'hw_machine_type',
     'hw_cdrom_bus',
@@ -902,10 +907,35 @@ class LibvirtDriver(driver.ComputeDriver):
 
         self._check_multipath()
 
+        # Even if we already checked the whitelist at startup, this driver
+        # needs to check specific hypervisor versions
+        self._check_pci_whitelist()
+
         # Set REGISTER_IMAGE_PROPERTY_DEFAULTS in the instance system_metadata
         # to default values for properties that have not already been set.
         self._register_all_undefined_instance_details()
 
+    def _check_pci_whitelist(self):
+
+        need_specific_version = False
+
+        if CONF.pci.device_spec:
+            pci_whitelist = whitelist.Whitelist(CONF.pci.device_spec)
+            for spec in pci_whitelist.specs:
+                if spec.tags.get("live_migratable"):
+                    need_specific_version = True
+
+        if need_specific_version and not self._host.has_min_version(
+            lv_ver=MIN_VFIO_PCI_VARIANT_LIBVIRT_VERSION,
+            hv_ver=MIN_VFIO_PCI_VARIANT_QEMU_VERSION,
+            hv_type=host.HV_DRIVER_QEMU,
+        ):
+            msg = _(
+                "PCI device spec is configured for "
+                "live_migratable but it's not supported by libvirt."
+            )
+            raise exception.InvalidConfiguration(msg)
+
     def _update_host_specific_capabilities(self) -> None:
         """Update driver capabilities based on capabilities of the host."""
         # TODO(stephenfin): We should also be reporting e.g. SEV functionality
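Reviewer note: the version gate added in `_check_pci_whitelist()` can be sketched without any Nova or libvirt dependency. The sketch assumes versions are plain integer tuples and that device specs are plain dicts of tags; `has_min_versions` and `check_device_specs` are hypothetical names, and `ValueError` stands in for Nova's `InvalidConfiguration`. Only the two minimum versions are taken from the diff.

```python
# Minimum versions copied from the constants added in the diff.
MIN_LIBVIRT_VERSION = (10, 0, 0)
MIN_QEMU_VERSION = (8, 2, 2)


def has_min_versions(libvirt_version, qemu_version):
    """Return True when both versions meet the minimums.

    Python compares tuples element by element, so (10, 1, 0) >= (10, 0, 0).
    """
    return (libvirt_version >= MIN_LIBVIRT_VERSION and
            qemu_version >= MIN_QEMU_VERSION)


def check_device_specs(device_specs, libvirt_version, qemu_version):
    """Raise ValueError if a spec sets live_migratable on a too-old host."""
    # Like the diff, this treats any non-empty tag value as a request,
    # so a spec tagged "no" also triggers the version check.
    needs_check = any(spec.get("live_migratable") for spec in device_specs)
    if needs_check and not has_min_versions(libvirt_version, qemu_version):
        raise ValueError(
            "PCI device spec is configured for live_migratable "
            "but the host libvirt/QEMU versions are too old."
        )


# A host at exactly the minimum versions passes the check.
check_device_specs([{"live_migratable": "yes"}], (10, 0, 0), (8, 2, 2))
```

Specs without the tag never trigger the check, so hosts that do not use the feature are unaffected by the new minimums.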
@@ -16,7 +16,6 @@
 """Utility methods to manage guests migration
 
 """
-
 from collections import deque
 
 from lxml import etree
@@ -88,6 +87,11 @@ def get_updated_guest_xml(instance, guest, migrate_data, get_volume_config,
         xml_doc = _update_numa_xml(xml_doc, migrate_data)
     if 'target_mdevs' in migrate_data:
         xml_doc = _update_mdev_xml(xml_doc, migrate_data.target_mdevs)
+    if "pci_dev_map_src_dst" in migrate_data:
+        xml_doc = _update_pci_dev_xml(
+            xml_doc, migrate_data.pci_dev_map_src_dst
+        )
+
     if new_resources:
         xml_doc = _update_device_resources_xml(xml_doc, new_resources)
     return etree.tostring(xml_doc, encoding='unicode')
@@ -149,6 +153,77 @@ def _update_mdev_xml(xml_doc, target_mdevs):
     return xml_doc
 
 
+def _update_pci_dev_xml(xml_doc, pci_dev_map_src_dst):
+    hostdevs = xml_doc.findall('./devices/hostdev')
+
+    for src_addr, dst_addr in pci_dev_map_src_dst.items():
+        src_fields = _get_pci_address_fields_with_prefix(src_addr)
+        dst_fields = _get_pci_address_fields_with_prefix(dst_addr)
+
+        if not _update_hostdev_address(hostdevs, src_fields, dst_fields):
+            _raise_hostdev_not_found_exception(xml_doc, src_addr)
+
+    LOG.debug(
+        '_update_pci_xml output xml=%s',
+        etree.tostring(xml_doc, encoding='unicode', pretty_print=True)
+    )
+    return xml_doc
+
+
+def _get_pci_address_fields_with_prefix(addr):
+    (domain, bus, slot, func) = nova.pci.utils.get_pci_address_fields(addr)
+    return (f"0x{domain}", f"0x{bus}", f"0x{slot}", f"0x{func}")
+
+
+def _update_hostdev_address(hostdevs, src_fields, dst_fields):
+    src_domain, src_bus, src_slot, src_function = src_fields
+    dst_domain, dst_bus, dst_slot, dst_function = dst_fields
+
+    for hostdev in hostdevs:
+        if hostdev.get('type') != 'pci':
+            continue
+
+        address_tag = hostdev.find('./source/address')
+        if address_tag is None:
+            continue
+
+        if _address_matches(
+            address_tag, src_domain, src_bus, src_slot, src_function
+        ):
+            _set_address_fields(
+                address_tag, dst_domain, dst_bus, dst_slot, dst_function
+            )
+            return True
+
+    return False
+
+
+def _address_matches(address_tag, domain, bus, slot, function):
+    return (
+        address_tag.get('domain') == domain and
+        address_tag.get('bus') == bus and
+        address_tag.get('slot') == slot and
+        address_tag.get('function') == function
+    )
+
+
+def _set_address_fields(address_tag, domain, bus, slot, function):
+    address_tag.set('domain', domain)
+    address_tag.set('bus', bus)
+    address_tag.set('slot', slot)
+    address_tag.set('function', function)
+
+
+def _raise_hostdev_not_found_exception(xml_doc, src_addr):
+    xml = etree.tostring(
+        xml_doc, encoding="unicode", pretty_print=True
+    ).strip()
+    raise exception.NovaException(
+        'Unable to find the hostdev to replace for this source PCI '
+        f'address: {src_addr} in the xml: {xml}'
+    )
+
+
 def _update_cpu_shared_set_xml(xml_doc, migrate_data):
     LOG.debug('_update_cpu_shared_set_xml input xml=%s',
               etree.tostring(xml_doc, encoding='unicode', pretty_print=True))
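Reviewer note: the source-to-destination address rewrite performed by `_update_pci_dev_xml()` can be tried outside Nova with the stand-alone sketch below. It makes several assumptions: it uses the standard library's `xml.etree.ElementTree` instead of lxml, it splits the `DDDD:BB:SS.F` address itself rather than calling `nova.pci.utils.get_pci_address_fields`, and it raises `LookupError` where the diff raises `NovaException`. The names `split_pci_address` and `remap_hostdev_addresses` are hypothetical.

```python
import xml.etree.ElementTree as ET

ADDRESS_KEYS = ("domain", "bus", "slot", "function")


def split_pci_address(addr):
    """Split 'DDDD:BB:SS.F' into libvirt-style fields such as '0x0000'."""
    domain, bus, slot_func = addr.split(":")
    slot, func = slot_func.split(".")
    return tuple(f"0x{part}" for part in (domain, bus, slot, func))


def remap_hostdev_addresses(domain_xml, src_to_dst):
    """Rewrite PCI hostdev source addresses according to src_to_dst."""
    root = ET.fromstring(domain_xml)
    for src, dst in src_to_dst.items():
        src_fields = dict(zip(ADDRESS_KEYS, split_pci_address(src)))
        dst_fields = dict(zip(ADDRESS_KEYS, split_pci_address(dst)))
        for hostdev in root.findall("./devices/hostdev"):
            if hostdev.get("type") != "pci":
                continue
            address = hostdev.find("./source/address")
            if address is None:
                continue
            if all(address.get(k) == v for k, v in src_fields.items()):
                for key, value in dst_fields.items():
                    address.set(key, value)
                break  # first match wins, as in the diff
        else:
            # No hostdev carried this source address.
            raise LookupError(f"no hostdev found for source address {src}")
    return root


SAMPLE_XML = (
    "<domain><devices>"
    "<hostdev mode='subsystem' type='pci' managed='no'><source>"
    "<address domain='0x0000' bus='0x25' slot='0x00' function='0x4'/>"
    "</source></hostdev>"
    "</devices></domain>"
)
remapped = remap_hostdev_addresses(SAMPLE_XML, {"0000:25:00.4": "0000:26:01.5"})
```

Only the `<source>/<address>` element is touched; the guest-side `<address type='pci'>` of the hostdev stays as it was, which matches what the unit tests in this commit expect.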
@@ -0,0 +1,8 @@
+---
+features:
+  - |
+    This release adds support for migrating SR-IOV devices
+    using the new kernel VFIO SR-IOV variant driver interface.
+    See the `OpenStack configuration documentation`__ for more details.
+
+    .. __: https://docs.openstack.org/nova/latest/configuration/config.html#pci