Use VMware Admission Control to reserve resources, control slot policy and ensure VM failover is possible.
A high availability system relies on failover to restart a failed node on an available host. But in a VMware environment, the success of a failover depends on tools, such as Admission Control, to make sure adequate resources are available in the failover target. If the target host doesn't have enough resources, the failover won't occur, which might compromise the integrity of the high availability (HA) cluster.
You can use reservation techniques on a target host system to reserve the resources required for a VM failover, which ensures that a VM cluster node can successfully fail over to the desired host. VMware vSphere calls this Admission Control. The tradeoff is that when an HA reservation tool such as VMware Admission Control holds back a host's resources, it can block other tasks that would leave too few resources for the failover.
The best way to reserve HA resources with VMware Admission Control is to first determine the amount of resources necessary for the failover tasks. Generally, the decision is a little more involved than simply carving out resources for a single VM. vSphere HA uses three approaches to determine resource reservations on a host system.
The first VMware Admission Control approach reserves a portion of the HA cluster's processor and memory resources for failovers -- the cluster resource percentage. vSphere HA totals the processor and memory resources that the VMs in the cluster use, totals the processor and memory resources that the hosts in the cluster provide, and then calculates the percentage of each resource that remains available.
vSphere HA then compares those percentages to the percentage of processor and memory capacity reserved for failover. If more resources are available than are reserved, Admission Control allows the requested operation.
For example, suppose an HA cluster has 63% of its processor capacity and 60% of its memory capacity remaining with all of the cluster nodes running. If the reservation is set to 20%, then 43% of the cluster's processor capacity and 40% of the cluster's memory capacity are still available to start new VMs and expand VM resource allocations.
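The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the percentage comparison described above, not VMware's implementation. The function names and the capacity figures are hypothetical, chosen so that the results match the 63% and 60% in the example.

```python
def free_percent(total_capacity, used_by_vms):
    """Percentage of a cluster resource that is still unused."""
    return 100 * (total_capacity - used_by_vms) / total_capacity


def headroom_percent(free_pct, reserved_pct):
    """Percentage left for new VMs once the failover reservation is held back."""
    return max(free_pct - reserved_pct, 0)


def allows_operation(free_cpu_pct, free_mem_pct, reserved_pct):
    """Allow an operation only if both resources stay above the reservation."""
    return free_cpu_pct > reserved_pct and free_mem_pct > reserved_pct


# Hypothetical cluster totals (assumed values, not taken from a real cluster).
cpu_free = free_percent(total_capacity=100_000, used_by_vms=37_000)  # MHz
mem_free = free_percent(total_capacity=500, used_by_vms=200)         # GB
reserved = 20  # failover reservation, in percent

print(cpu_free, mem_free)
print(headroom_percent(cpu_free, reserved), headroom_percent(mem_free, reserved))
print(allows_operation(cpu_free, mem_free, reserved))
```

With these assumed totals, the script reproduces the figures in the example: 63% and 60% free, which leaves 43% and 40% after the 20% reservation.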
Use the slot policy option to calculate and reserve resources
The second approach that vSphere HA and VMware Admission Control use to determine resource reservations for HA is a slot policy. This approach basically calculates and reserves the resources necessary to fail over all the VMs from a required number of cluster hosts. In other words, a slot policy can determine the resources necessary to recover from one, two or more cluster host failures or figure out how many of the hosts can fail given the available resources in the cluster.
Essentially, a slot policy calculates the size of a slot -- the reserved processor and memory resources necessary to fail over any VM in the cluster. The slot policy then determines the number of such slots available in the cluster and compares the number of available slots to the number of VMs in the cluster.
The goal is to determine the number of host systems in the cluster that can fail, while still having enough available slots in the cluster to fail over all of the involved VMs. This is the failover capacity calculation under the slot policy.
In general terms, there must be at least enough slots to accommodate all of the VMs on any one host system to create a failover capacity of one. If there are at least enough slots to accommodate all of the VMs on any two host systems, the failover capacity is two.
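The slot calculation can be sketched as follows. This is a simplified model under stated assumptions, not the vSphere HA algorithm itself: it sizes the slot from the largest processor and memory reservation among the VMs, assumes every VM has a nonzero reservation, and tests the worst case by failing the hosts with the most slots first. All names and numbers are hypothetical.

```python
def slot_size(vms):
    """Slot = largest CPU and largest memory reservation among the VMs."""
    return (max(vm["cpu_mhz"] for vm in vms), max(vm["mem_mb"] for vm in vms))


def slots_on_host(host, slot):
    """Whole slots a host can hold, limited by its scarcer resource."""
    slot_cpu, slot_mem = slot
    return min(host["cpu_mhz"] // slot_cpu, host["mem_mb"] // slot_mem)


def failover_capacity(hosts, vms):
    """Number of hosts that can fail while every VM still gets a slot."""
    slot = slot_size(vms)
    # Worst case: assume the hosts holding the most slots fail first.
    slots = sorted((slots_on_host(h, slot) for h in hosts), reverse=True)
    capacity = 0
    for failed in range(1, len(slots)):
        if sum(slots[failed:]) >= len(vms):
            capacity = failed
        else:
            break
    return capacity


# Hypothetical three-host cluster running five VMs.
hosts = [
    {"cpu_mhz": 9000, "mem_mb": 9000},
    {"cpu_mhz": 9000, "mem_mb": 6000},
    {"cpu_mhz": 6000, "mem_mb": 6000},
]
vms = [
    {"cpu_mhz": 2000, "mem_mb": 1000},
    {"cpu_mhz": 2000, "mem_mb": 1000},
    {"cpu_mhz": 1000, "mem_mb": 2000},
    {"cpu_mhz": 1000, "mem_mb": 1000},
    {"cpu_mhz": 1000, "mem_mb": 1000},
]

print(slot_size(vms))
print(failover_capacity(hosts, vms))
```

In this sketch, the hosts hold four, three and three slots. Losing the largest host leaves six slots for five VMs, so the failover capacity is one. Losing two hosts leaves at most three slots, which is not enough.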
Finally, VMware Admission Control and vSphere HA can use dedicated failover hosts within the cluster. When a host fails, vSphere HA will first try to fail over the VMs to a designated failover host. If that doesn't work, vSphere HA will try to place the VMs elsewhere in the cluster.
If you designate hosts for failover, you can't power on VMs on those hosts or migrate VMs to them, and Distributed Resource Scheduler will avoid using them.
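The placement order for dedicated failover hosts can be pictured with a short sketch. It only models the ordering described above, trying the designated failover hosts first and the rest of the cluster second. The host records, field names and the simple fit check are assumptions made for illustration and do not reflect how vSphere HA actually selects a host.

```python
def fits(host, vm):
    """True if the host has enough free CPU and memory for the VM."""
    return (host["free_cpu_mhz"] >= vm["cpu_mhz"]
            and host["free_mem_mb"] >= vm["mem_mb"])


def pick_restart_host(vm, failover_hosts, other_hosts):
    """Try designated failover hosts first, then the rest of the cluster."""
    for host in list(failover_hosts) + list(other_hosts):
        if fits(host, vm):
            return host["name"]
    return None  # No host can restart the VM.


# Hypothetical hosts: one designated failover host and one regular host.
failover_hosts = [{"name": "spare-01", "free_cpu_mhz": 4000, "free_mem_mb": 8000}]
other_hosts = [{"name": "esx-02", "free_cpu_mhz": 8000, "free_mem_mb": 32000}]

small_vm = {"cpu_mhz": 2000, "mem_mb": 4000}
large_vm = {"cpu_mhz": 6000, "mem_mb": 16000}

print(pick_restart_host(small_vm, failover_hosts, other_hosts))
print(pick_restart_host(large_vm, failover_hosts, other_hosts))
```

Here the small VM lands on the designated failover host, while the large VM does not fit there and falls back to another host in the cluster.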