In a virtual infrastructure, capacity management and performance management go hand in hand. Performance management prevents resource bottlenecks, and capacity management helps administrators stay on top of resource needs. But finding these bottlenecks can be difficult without the right tools.
You can use your virtual platform’s performance monitoring tools to help with the capacity management process for all resources: processing, network, storage and memory. The difficult part is determining what to look for and what to do with the information you gather.
There are four resource bottlenecks to consider in the capacity management process. Some of these chokepoints are obvious, as are their fixes. Others aren’t. Some can even be exacerbated if you don’t understand the tools or metrics at your disposal. So as you begin the capacity management process, which is truly ongoing in a virtual infrastructure, consider these four bottlenecks that can affect virtual machine (VM) performance.
Excessive resource consumption
The capacity management process is ongoing, and you should rely on both short- and long-term metrics to make capacity decisions. Long-term metrics assume that virtual servers exist in a steady state during their daily operation. Once they’re turned on and performing their assigned duties, most servers settle into a predictable pattern of resource usage, which removes much of the guesswork from the capacity management process.
But bottlenecks occur when a VM breaks that pattern and begins to consume more resources than expected -- either because more users are working with the application, administrators are doing maintenance on the VM or an unknown (or unwanted) process on the server is performing an action that consumes resources.
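One way to spot a VM that breaks its usual pattern is to compare a current reading against a baseline built from long-term samples. The following is a minimal sketch, not a feature of any particular platform: the function name `breaks_pattern`, the threshold of three standard deviations and the sample CPU figures are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def breaks_pattern(history, current, threshold=3.0):
    """Return True if `current` sits more than `threshold` standard
    deviations above the mean of the historical samples.

    `history` is a list of past readings for one metric (hypothetical data);
    the threshold of 3.0 is an assumed default, not a vendor recommendation.
    """
    if len(history) < 2:
        return False  # not enough samples to define a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any increase counts as a break.
        return current > baseline
    return (current - baseline) / spread > threshold


# Hypothetical percent-CPU samples for a VM in its steady state.
cpu_history = [22, 25, 24, 23, 26, 24, 25]

print(breaks_pattern(cpu_history, 27))  # a small rise, within the usual pattern
print(breaks_pattern(cpu_history, 60))  # a large jump, well outside it
```

A check like this only flags that consumption has changed. Deciding whether the cause is more users, maintenance work or an unwanted process still requires looking at the VM itself.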
Every virtualization platform includes controls to limit VM consumption when resources are in high demand. You can set minimum and maximum levels on any VM, but it’s not easy to determine where to set them.
Setting a maximum level, for example, establishes an upper boundary on that VM’s performance. If a VM suddenly requires additional resources, that ceiling can negatively affect performance. An alternative solution is to set minimum levels -- not on the problematic machine, but on the others. Doing so ensures that a minimum amount of resources is always available to a VM. It also protects the performance of high-value VMs but doesn’t inhibit VMs that need extra resources for a short-term situation.
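The effect of minimum levels can be illustrated with a simplified allocation model. This sketch assumes a single resource, a hypothetical `allocate` function and made-up VM names and figures. Real hypervisors use their own scheduling algorithms, so treat this only as an illustration of the idea: guaranteed minimums are satisfied first, and whatever is left over is shared among VMs that want more.

```python
def allocate(capacity, vms):
    """Distribute `capacity` units of one resource among VMs.

    `vms` maps a VM name to a (minimum, demand) pair. Each VM first receives
    its demand up to its guaranteed minimum. Any remaining capacity is then
    shared in proportion to unmet demand (an assumed policy for this sketch).
    """
    if sum(minimum for minimum, _ in vms.values()) > capacity:
        raise ValueError("guaranteed minimums exceed host capacity")

    # Step 1: honor the guaranteed minimums.
    alloc = {name: float(min(minimum, demand))
             for name, (minimum, demand) in vms.items()}
    remaining = capacity - sum(alloc.values())

    # Step 2: share what is left according to unmet demand.
    unmet = {name: demand - alloc[name] for name, (_, demand) in vms.items()}
    total_unmet = sum(unmet.values())
    if total_unmet > 0:
        share = min(1.0, remaining / total_unmet)
        for name in alloc:
            alloc[name] += unmet[name] * share
    return alloc


# Hypothetical host with 10 GHz of CPU. The "batch" VM suddenly demands 12 GHz.
vms = {
    "db": (4, 4),      # high-value VM with a 4 GHz minimum
    "web": (2, 3),     # 2 GHz minimum, would like 3
    "batch": (0, 12),  # no minimum, spiking demand
}
result = allocate(10, vms)
for name, ghz in result.items():
    print(f"{name}: {ghz:.2f} GHz")
```

In this model the spiking VM is not capped by a maximum, yet it cannot push the other VMs below their minimums.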
Insufficient host resources
When there aren’t enough resources on a host to go around, bottlenecks can result -- either because there are too many VMs on a host, or because VMs have demanded too many resources. In the capacity management process, it’s useful to know the supply-and-demand economics of virtual infrastructures.
When this bottleneck occurs, many administrators build a cluster of virtual hosts. Clusters make it easier to rebalance VMs and to allocate and distribute resources according to what the servers can supply. The quickest solution, unfortunately, is to just buy more hardware. You could also decommission a few VMs, but that’s usually not an acceptable solution.
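The supply-and-demand view of a host can be expressed as a simple comparison of total VM demand against host capacity for each resource. The sketch below assumes hypothetical resource names (`cpu_ghz`, `memory_gb`) and invented numbers. It does not query any real monitoring API.

```python
def constrained_resources(host_capacity, vm_demands):
    """Return the resources where combined VM demand exceeds host supply.

    `host_capacity` maps a resource name to the amount the host provides.
    `vm_demands` is a list of per-VM dicts using the same resource names.
    The result maps each oversubscribed resource to its demand/supply ratio.
    """
    totals = {
        resource: sum(vm.get(resource, 0) for vm in vm_demands)
        for resource in host_capacity
    }
    return {
        resource: totals[resource] / host_capacity[resource]
        for resource in host_capacity
        if totals[resource] > host_capacity[resource]
    }


# Hypothetical host and three VMs.
host = {"cpu_ghz": 20, "memory_gb": 64}
demands = [
    {"cpu_ghz": 6, "memory_gb": 24},
    {"cpu_ghz": 8, "memory_gb": 32},
    {"cpu_ghz": 4, "memory_gb": 16},
]

overcommitted = constrained_resources(host, demands)
print(overcommitted)
```

Here CPU demand fits within the host, but memory demand does not, which suggests either moving a VM to another host or adding memory.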
Insufficient cluster resources
Host clusters are still susceptible to capacity problems, though. Even the most powerful cluster in the world can become resource-constrained if you just keep adding VMs.
Host clusters combine the resources of multiple hosts to create an aggregate quantity of resources that can be assigned to virtual machines. The capacity management process in a resource-constrained cluster requires careful monitoring and alerting to ensure that there are enough resources available to meet the demands of the collective VMs.
An important capacity management method in clusters is reserving cluster resources. Every virtual platform vendor recommends a maximum load for virtual hosts, usually somewhere between 75% and 85%. That buffer of unused capacity ensures that a cluster can meet resource demands if a VM requires more than its expected quantity of resources. The cluster itself also requires extra resources -- called a cluster reserve -- which represents the amount of resources needed to re-host VMs in case a host goes down.
Those reserves are important because, without them, a host failure or resource spike will affect VM performance. Again, if you still find yourself without enough resources, the easiest solution is usually to buy more hardware.
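The arithmetic behind these two buffers can be sketched as follows. This example makes two assumptions that are not stated in the article: the cluster reserve is sized to tolerate the loss of the single largest host, and the maximum load is applied to the capacity that remains after that loss. The function names and the 128 GB host sizes are hypothetical.

```python
def usable_capacity(host_capacities, max_load=0.75):
    """Estimate how much capacity a cluster can safely hand out to VMs.

    Assumption: the cluster reserve equals the largest single host, so the
    cluster can re-host its VMs if any one host fails. The `max_load`
    fraction is then applied to the surviving capacity.
    """
    if not host_capacities:
        return 0.0
    surviving = sum(host_capacities) - max(host_capacities)
    return surviving * max_load


def has_headroom(host_capacities, total_vm_demand, max_load=0.75):
    """Return True if total VM demand fits within the usable capacity."""
    return total_vm_demand <= usable_capacity(host_capacities, max_load)


# Hypothetical cluster: four hosts with 128 GB of memory each.
hosts_gb = [128, 128, 128, 128]

print(usable_capacity(hosts_gb))      # capacity left after both buffers
print(has_headroom(hosts_gb, 250))    # demand that fits
print(has_headroom(hosts_gb, 320))    # demand that eats into the reserves
```

A check like this can feed the monitoring and alerting mentioned above, raising a warning before new VMs consume the reserves.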
With a solid capacity management process, you can prevent these resource bottlenecks and dynamically allocate resources to the VMs that need them most.
This was first published in January 2011