diff --git a/doc/source/contributor/index.rst b/doc/source/contributor/index.rst index f16eb55d6033..662373aceebe 100644 --- a/doc/source/contributor/index.rst +++ b/doc/source/contributor/index.rst @@ -162,6 +162,8 @@ diving in. * :doc:`/contributor/evacuate-vs-rebuild`: Describes the differences between the often-confused evacuate and rebuild operations. + * :doc:`/contributor/resize-and-cold-migrate`: Describes the differences and + similarities between resize and cold migrate operations. .. # NOTE(amotoki): toctree needs to be placed at the end of the secion to # keep the document structure in the PDF doc. @@ -169,3 +171,4 @@ diving in. :hidden: evacuate-vs-rebuild + resize-and-cold-migrate diff --git a/doc/source/contributor/resize-and-cold-migrate.rst b/doc/source/contributor/resize-and-cold-migrate.rst new file mode 100644 index 000000000000..b2b267cae9ce --- /dev/null +++ b/doc/source/contributor/resize-and-cold-migrate.rst @@ -0,0 +1,131 @@ +======================= +Resize and cold migrate +======================= + +The `resize API`_ and `cold migrate API`_ are commonly confused in nova because +the internal `API code`_, `conductor code`_ and `compute code`_ use the same +methods. This document explains some of the differences in what +happens between a resize and cold migrate operation. + +High level +~~~~~~~~~~ + +:doc:`Cold migrate ` is an operation performed by an +administrator to power off and move a server from one host to a **different** +host using the **same** flavor. Volumes and network interfaces are disconnected +from the source host and connected on the destination host. The type of file +system between the hosts and image backend determine if the server files and +disks have to be copied. If copy is necessary then root and ephemeral disks are +copied and swap disks are re-created. + +:doc:`Resize ` is an operation which can be performed by a +non-administrative owner of the server (the user) with a **different** flavor. +The new flavor can change certain aspects of the server such as the number of +CPUS, RAM and disk size. Otherwise for the most part the internal details are +the same as a cold migration. + +Scheduling +~~~~~~~~~~ + +Depending on how the API is configured for +:oslo.config:option:`allow_resize_to_same_host`, the server may be able to be +resized on the current host. *All* compute drivers support *resizing* to the +same host but *only* the vCenter driver supports *cold migrating* to the same +host. Enabling resize to the same host is necessary for features such as +strict affinity server groups where there are more than one server in the same +affinity group. + +Starting with `microversion 2.56`_ an administrator can specify a destination +host for the cold migrate operation. Resize does not allow specifying a +destination host. + +Flavor +~~~~~~ + +As noted above, with resize the flavor *must* change and with cold migrate the +flavor *will not* change. + +Resource claims +~~~~~~~~~~~~~~~ + +Both resize and cold migration perform a `resize claim`_ on the destination +node. Historically the resize claim was meant as a safety check on the selected +node to work around race conditions in the scheduler. Since the scheduler +started `atomically claiming`_ VCPU, MEMORY_MB and DISK_GB allocations using +Placement the role of the resize claim has been reduced to detecting the same +conditions but for resources like PCI devices and NUMA topology which, at least +as of the 20.0.0 (Train) release, are not modeled in Placement and as such are +not atomic. + +If this claim fails, the operation can be rescheduled to an alternative +host, if there are any. The number of possible alternative hosts is determined +by the :oslo.config:option:`scheduler.max_attempts` configuration option. + +Allocations +~~~~~~~~~~~ + +Since the 16.0.0 (Pike) release, the scheduler uses the `placement service`_ +to filter compute nodes (resource providers) based on information in the flavor +and image used to build the server. Once the scheduler runs through its filters +and weighers and picks a host, resource class `allocations`_ are atomically +consumed in placement with the server as the consumer. + +During both resize and cold migrate operations, the allocations held by the +server consumer against the source compute node resource provider are `moved`_ +to a `migration record`_ and the scheduler will create allocations, held by the +instance consumer, on the selected destination compute node resource provider. +This is commonly referred to as `migration-based allocations`_ which were +introduced in the 17.0.0 (Queens) release. + +If the operation is successful and confirmed, the source node allocations held +by the migration record are `dropped`_. If the operation fails or is reverted, +the source compute node resource provider allocations held by the migration +record are `reverted`_ back to the instance consumer and the allocations +against the destination compute node resource provider are dropped. + +Summary of differences +~~~~~~~~~~~~~~~~~~~~~~ + +.. list-table:: + :header-rows: 1 + + * - + - Resize + - Cold migrate + * - New flavor + - Yes + - No + * - Authorization (default) + - Admin or owner (user) + + Policy rule: ``os_compute_api:servers:resize`` + - Admin only + + Policy rule: ``os_compute_api:os-migrate-server:migrate`` + * - Same host + - Maybe + - Only vCenter + * - Can specify target host + - No + - Yes (microversion >= 2.56) + +Sequence Diagram +~~~~~~~~~~~~~~~~ + +.. todo:: Add something like the :doc:`/reference/live-migration` diagram. + +.. _resize API: https://docs.openstack.org/api-ref/compute/#resize-server-resize-action +.. _cold migrate API: https://docs.openstack.org/api-ref/compute/#migrate-server-migrate-action +.. _API code: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/compute/api.py#L3568 +.. _conductor code: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/conductor/manager.py#L297 +.. _compute code: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/compute/manager.py#L4445 +.. _microversion 2.56: https://docs.openstack.org/nova/latest/reference/api-microversion-history.html#id52 +.. _resize claim: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/compute/resource_tracker.py#L248 +.. _atomically claiming: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/scheduler/filter_scheduler.py#L239 +.. _moved: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/conductor/tasks/migrate.py#L28 +.. _placement service: https://docs.openstack.org/placement/latest/ +.. _allocations: https://docs.openstack.org/api-ref/placement/#allocations +.. _migration record: https://docs.openstack.org/api-ref/compute/#migrations-os-migrations +.. _migration-based allocations: https://specs.openstack.org/openstack/nova-specs/specs/queens/implemented/migration-allocations.html +.. _dropped: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/compute/manager.py#L4048 +.. _reverted: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/compute/manager.py#L4233