In physical server clusters, high availability can be compromised if you don't reserve enough cluster resources for the technology to function properly. Problems arise when you don't purchase enough hardware to absorb the resource demands of virtual machines (VMs) displaced by a failed host. Without sufficient cluster resources held back for those VMs to use during failover, high availability suffers.
Setting aside high-availability cluster resources
Every infrastructure that supports high availability needs cluster resources set aside specifically to handle a failover. Typically, that quantity -- called a cluster reserve -- should be equal to the full amount of resources that one server supplies. If a server can supply 10 GHz of processing power and 1 GB of RAM, for example, then you should reserve that same quantity of cluster resources for failover.
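The full-host sizing rule above can be sketched in a few lines of Python. This is an illustrative calculation only -- the function name and the host specs are assumptions, not taken from any product:

```python
# Hypothetical sketch of full-host cluster-reserve sizing: the reserve
# equals one host's capacity, so the cluster can absorb the loss of
# any single host. Numbers below are illustrative.

def cluster_reserve(host_cpu_ghz: float, host_ram_gb: float, n_hosts: int):
    """Return (usable, reserve) capacity tuples of (GHz, GB) for an
    n-host cluster that sets aside one full host's resources."""
    total_cpu = host_cpu_ghz * n_hosts
    total_ram = host_ram_gb * n_hosts
    reserve = (host_cpu_ghz, host_ram_gb)          # one full host held back
    usable = (total_cpu - reserve[0], total_ram - reserve[1])
    return usable, reserve

# A four-host cluster of 10 GHz / 64 GB hosts keeps one host's worth idle.
usable, reserve = cluster_reserve(10.0, 64.0, 4)
```

Running the example leaves three hosts' worth of capacity (30 GHz, 192 GB) available for active VMs, with one host's worth (10 GHz, 64 GB) held in reserve.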
Most admins consider unused cluster resources to be wasted resources. That's understandable, because letting those resources sit idle goes against virtualization's touted resource-optimization storyline. But what too many administrators don't recognize is that the cluster reserve must be set aside and left unused at all times. If active VMs are using those cluster resources, then they can't be used by VMs during a failover.
When reserving cluster resources is a good idea
In larger server clusters, the idea of unused cluster resources isn't typically a big deal. If your data center can afford to buy 100 servers, then setting aside one of them for a high-availability cluster reserve only "wastes" 1% of your total investment. That's an acceptable loss.
In smaller server clusters, however, a cluster reserve can be tricky to justify. If your cluster comprises only four servers, it's difficult to explain to the boss that you need to set aside one full server -- 25% of your total investment -- just in case a VM fails. Luckily, cluster hosts don't really die all that often. Today's server hardware tends to be built with enough redundancy that wholesale failures rarely happen.
Other ways to ensure high availability
One solution for minimizing wasted cluster resources is to set aside less than a server's full capacity, an option that all the major hypervisors support. For example, VMware vSphere's admission control lets you reserve a percentage of cluster resources as failover spare capacity.
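A percentage-based reserve works as an admission check: a VM is allowed to power on only if enough spare capacity remains afterward. The sketch below is a simplified illustration in the spirit of that policy -- the function name and thresholds are assumptions, not any hypervisor's actual API:

```python
# Hypothetical admission check for a percentage-based cluster reserve.
# A VM may power on only if doing so still leaves reserve_pct percent
# of total capacity unused for failover.

def can_power_on(total_ghz: float, used_ghz: float,
                 vm_ghz: float, reserve_pct: float) -> bool:
    """Return True if starting a VM of vm_ghz keeps the reserve intact."""
    reserve = total_ghz * reserve_pct / 100
    return (used_ghz + vm_ghz) <= (total_ghz - reserve)

# With a 25% reserve on a 40 GHz cluster, only 30 GHz may be committed:
# a 5 GHz VM fits at 25 GHz used, but not at 28 GHz used.
```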
No matter which hypervisor you use, reconfigure your failover options to prioritize workloads in an appropriate failover order. Every infrastructure has its tier-one workloads, as well as tier-two and tier-three workloads that aren't as important. If you choose to reduce the cluster reserve, configure your failover options so your tier-one VMs restart first.
Configuring failover order ensures that your high-value workloads automatically reboot after a host failure. This alternative approach is a good way to boost high availability, especially in small shops that can't reserve cluster resources.
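Tiered restart ordering amounts to sorting the displaced VMs by priority before bringing them back up. A minimal sketch, with made-up VM names and tier numbers:

```python
# Hypothetical restart ordering by tier: lower tier numbers restart
# first after a host failure. VM names and tiers are illustrative.

def restart_order(vms):
    """Given (name, tier) pairs, return VM names in restart order,
    tier one first."""
    return [name for name, tier in sorted(vms, key=lambda vm: vm[1])]

displaced = [("batch-report", 3), ("sql-prod", 1), ("test-web", 2)]
order = restart_order(displaced)   # tier-one "sql-prod" comes up first
```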
Whenever you plan to purchase hardware for virtualization, buy enough servers for your current needs, your expected needs until the next purchasing cycle, and a reserve of cluster resources. That way, you'll have all the hardware you need to fully protect your infrastructure against misbehaving virtual hosts.
About the author:
Greg Shields is an independent author, instructor, Microsoft MVP and IT consultant based in Denver. He is a co-founder of Concentrated Technology LLC and has nearly 15 years of experience in IT architecture and enterprise administration. Shields specializes in Microsoft administration, systems management and monitoring, and virtualization. He is the author of several books, including Windows Server 2008: What's New/What's Changed, available from Sapien Press.