What is capacity planning?
First, what is capacity planning in virtual environments? It has little or nothing to do with how much storage you are going to need or how that storage is going to be deployed. In virtual environments we are considering how much processing power your current applications require, how much processing power you have available and how to distribute that load across the virtual environment. In addition, it takes into account the allocation of "room" for the additional compute requirements of workloads that may be redistributed through your virtualization software's ability to move virtual machines (VMs) between physical hosts.
Capacity planning is not a one-time event, but rather an ongoing process. It can be done manually, but done that way it is not an exact science and relies heavily on the instincts of the IT staff. The automated solutions, for a fee, try to make capacity planning more of a science.
Even if you are going to use an automated tool, an understanding of what needs to be analyzed is critical so that you can make sure it
For the most part, there is no order of priority in capacity planning, because the entire environment operates as one and each component affects the others. A holistic view of the infrastructure is required, and each area of interest must be examined as part of that whole.
The manual process described here focuses on getting the job done simply and quickly. As stated earlier, it relies heavily on the instincts of the IT staff, which are usually very accurate.
The first step in capacity planning is an inventory. Here you will collect information about what physical hardware you have and what applications are running on it. On the application side you want to record the application, its average processor utilization and, if possible, its peak utilization, along with how often and at what times that peak load occurs. Don't just record a utilization percentage, though. Be sure to relate that percentage to the capacity of the CPU it was measured on.
For example, if you have an application that is using 10% of a 3 gigahertz processor, it really is only using 300 megahertz. Also make sure that the inventory accounts for 32-bit or 64-bit processors. What you want to capture as accurately as possible is a measurement by application workload (what the utilization is as it relates to the application), not the capacity of the processor.
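The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration: the function name and the optional cores parameter are assumptions made for the sketch, not part of any particular monitoring tool, and it assumes the utilization figure was averaged across all of the cores counted.

```python
def absolute_cpu_demand_mhz(utilization_pct, clock_mhz, cores=1):
    """Convert a utilization percentage into the MHz actually consumed.

    Hypothetical helper: assumes utilization_pct is the average across
    `cores` cores that each run at `clock_mhz`.
    """
    if not 0 <= utilization_pct <= 100:
        raise ValueError("utilization_pct must be between 0 and 100")
    return utilization_pct * clock_mhz * cores / 100


# The article's example: 10% of a 3 GHz (3000 MHz) processor.
demand = absolute_cpu_demand_mhz(10, 3000)
print(demand)  # 300.0
```

Recording the result in megahertz rather than as a percentage is what lets you compare a workload measured on an old, slow server with the capacity of a newer host.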
Depending on your operating system and the resources available, you may have to check in regularly at the times when you expect these peak loads to occur. Additionally, note how much memory is required and used by the application, and develop a sense of the storage and network I/O bandwidth it requires. The tools available may limit how accurate this measurement is. Sometimes the rather vague measurements of heavy, medium, or light are all that can be captured. Again, while not a perfect science, it is certainly a start.
The next step in the inventory process is to account for your available physical compute resources by analyzing your servers. A shortcut here is to only inventory servers that you are going to consider as part of your virtual infrastructure. There certainly is the risk that you might miss a heavily underutilized server so you will need to make the decision between completion time and not wasting any resources. This is an ongoing process; an acceptable goal may be to get the initial work done first on the obvious servers, and then broaden the scope of the inventory over time. Unlike the inventory of applications, compute resource inventory requires you to capture the raw capacity in terms of number of processors, number of cores, processor speed and physical memory available. In addition, capture the I/O capabilities of the server, number of network interface cards and the number of storage interface cards (if any).
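As a rough sketch of what one row of that compute resource inventory might hold, the record below captures the raw-capacity fields listed above. The class name, field names and sample hosts are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class HostRecord:
    """One physical server in the inventory (hypothetical layout)."""
    name: str
    sockets: int              # number of processors
    cores_per_socket: int     # number of cores per processor
    clock_mhz: float          # processor speed
    memory_gb: float          # physical memory available
    nics: int                 # network interface cards
    storage_adapters: int = 0 # storage interface cards, if any

    @property
    def total_cores(self):
        return self.sockets * self.cores_per_socket

    @property
    def raw_cpu_mhz(self):
        # Raw capacity only; it says nothing about current utilization.
        return self.total_cores * self.clock_mhz


# Sample data, invented for the example.
hosts = [
    HostRecord("host-a", sockets=2, cores_per_socket=4, clock_mhz=2660.0,
               memory_gb=32, nics=4, storage_adapters=2),
    HostRecord("host-b", sockets=1, cores_per_socket=2, clock_mhz=3000.0,
               memory_gb=8, nics=2),
]

total_mhz = sum(h.raw_cpu_mhz for h in hosts)
print(total_mhz)  # 27280.0
```

A spreadsheet with the same columns does the same job; the point is simply that the server side of the inventory records raw capacity, while the application side records measured demand.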
How and where to consolidate
As discussed in the first paragraph, where you happen to be in the server virtualization process will determine part of the next step, but the basics of allocation are the same.
If you have not virtualized yet, you want to begin mapping applications to available compute resources. Most virtualization processes start with a few new physical servers and then will add older servers as VMs are created and workloads are shifted off of them. Also, most virtualization projects will start by virtualizing low-importance, low-workload applications that have minimal I/O requirements. As a result these initial installations go surprisingly well.
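One simple way to make a first pass at that mapping is a first-fit-decreasing heuristic: sort the workloads from largest to smallest CPU demand and put each one on the first host that still has room. This is a generic bin-packing technique used here purely for illustration, not a method prescribed by the article, and the function name, data layout and sample figures are assumptions.

```python
def place_workloads(workloads, hosts):
    """First-fit-decreasing placement sketch.

    workloads: {name: (cpu_mhz, memory_gb)} measured demand
    hosts:     {name: (cpu_mhz, memory_gb)} capacity available to VMs
    Returns (placement, unplaced): a {workload: host} dict and a list of
    workloads that fit on no host.
    """
    free = {host: list(cap) for host, cap in hosts.items()}
    placement, unplaced = {}, []
    # Largest CPU consumers first, so they get first pick of the hosts.
    ordered = sorted(workloads.items(), key=lambda kv: kv[1][0], reverse=True)
    for name, (cpu, mem) in ordered:
        for host, (free_cpu, free_mem) in free.items():
            if cpu <= free_cpu and mem <= free_mem:
                free[host][0] -= cpu
                free[host][1] -= mem
                placement[name] = host
                break
        else:
            unplaced.append(name)
    return placement, unplaced


# Invented sample data.
hosts = {"host-a": (6000, 16), "host-b": (3000, 8)}
workloads = {"mail": (2500, 6), "web": (300, 2), "db": (4000, 12), "file": (600, 4)}

placement, unplaced = place_workloads(workloads, hosts)
print(placement)
print(unplaced)
```

The sketch ignores storage and network I/O, which the inventory also tracks, so treat its output as a starting point to be checked against the judgment of the IT staff.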
As the virtual infrastructure matures and more servers are added to the environment, having the capacity inventory in place allows some organizations to move further into the process. Capturing the inventory is step one, but additional data for each workload has to be captured as well. Is this an application whose virtual server you will want to migrate between hosts? If so, you should assign its primary migration target right then and allocate, at least logically, the resources that the shifted workload will consume. You don't want a situation where moving an application workload because of a physical server failure causes the migration target to fail as well because of unavailable resources.
While this allocation does lessen some of the benefits of consolidation, it does not drive the project to the point that the cost savings of server consolidation are eliminated. Servers are so heavily underutilized that accounting for twice the compute allocation still does not typically exhaust the available resources. In addition, server virtualization enables further enhancements, like infrastructure virtualization, that deliver the ability to power on and allocate additional compute workloads as described in a prior SearchServerVirtualization.com article.
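The check described above can be sketched as a small script: for each host that might fail, add the workloads that would move to each migration target and compare the total with that target's capacity. The function name, record layout and numbers are assumptions made for the example, and only CPU is considered.

```python
from collections import defaultdict


def failover_shortfalls(workloads, host_capacity_mhz):
    """Find migration targets that a single host failure would overload.

    workloads: list of dicts with keys "name", "host", "target" (or None)
               and "cpu_mhz"
    host_capacity_mhz: {host: usable CPU capacity in MHz}
    Returns {(failed_host, target_host): shortfall_mhz}.
    """
    # Normal load already running on each host.
    load = defaultdict(float)
    for w in workloads:
        load[w["host"]] += w["cpu_mhz"]

    shortfalls = {}
    for failed in host_capacity_mhz:
        # Load that would arrive at each target if `failed` went down.
        incoming = defaultdict(float)
        for w in workloads:
            if w["host"] == failed and w["target"] is not None:
                incoming[w["target"]] += w["cpu_mhz"]
        for target, extra in incoming.items():
            needed = load[target] + extra
            if needed > host_capacity_mhz[target]:
                shortfalls[(failed, target)] = needed - host_capacity_mhz[target]
    return shortfalls


# Invented sample data.
capacity = {"host-a": 8000, "host-b": 6000}
workloads = [
    {"name": "db", "host": "host-a", "target": "host-b", "cpu_mhz": 4000},
    {"name": "mail", "host": "host-b", "target": "host-a", "cpu_mhz": 2500},
    {"name": "web", "host": "host-b", "target": None, "cpu_mhz": 300},
]

result = failover_shortfalls(workloads, capacity)
print(result)  # {('host-a', 'host-b'): 800.0}
```

Here a failure of host-a would push host-b 800 MHz past its capacity, which is exactly the situation the logical allocation is meant to prevent.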
Capacity planning and DR
If you are leveraging disaster recovery (DR) as part of your virtual infrastructure, you need to make sure you allocate the appropriate resources there as well. You do not want to be in a situation where a disaster has been declared and you don't have enough horsepower at the DR site to drive all of the virtualized servers. As part of the capacity plan, you should indicate whether a server is part of the DR plan and what its criticality in a disaster would be.
In a DR setup that leverages virtualization, there are typically three groups of servers that you will have to plan capacity for. As a safety measure, I would have all virtual images available offsite. That being said, the first group is pre-allocated using a DR tool so that the moment there is a site failure, these servers in the DR site take over almost instantly. The second group is virtual servers that have resources pre-allocated for spin-up but that you start manually as time allows. The final group is virtual machines that are ready to deploy but cannot be activated until additional hardware arrives at the DR site.
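A minimal sketch of how those three groups might be tallied against the DR site follows. It assumes that the first two groups must both fit within the hardware already at the DR site and that the third is what additional hardware will have to cover; the tier names, function name and figures are all invented for the example.

```python
from enum import Enum


class DrTier(Enum):
    INSTANT = 1         # pre-allocated by a DR tool, takes over at once
    MANUAL = 2          # pre-allocated, started by hand as time allows
    AWAIT_HARDWARE = 3  # image ready, waits for additional hardware


def dr_capacity_summary(servers, dr_site_mhz):
    """servers: iterable of (name, DrTier, cpu_mhz) tuples."""
    demand = {tier: 0.0 for tier in DrTier}
    for _name, tier, cpu_mhz in servers:
        demand[tier] += cpu_mhz
    preallocated = demand[DrTier.INSTANT] + demand[DrTier.MANUAL]
    return {
        "preallocated_mhz": preallocated,
        "headroom_mhz": dr_site_mhz - preallocated,  # negative means short
        "deferred_mhz": demand[DrTier.AWAIT_HARDWARE],
    }


# Invented sample data.
servers = [
    ("db", DrTier.INSTANT, 4000),
    ("mail", DrTier.MANUAL, 2500),
    ("web", DrTier.MANUAL, 300),
    ("test", DrTier.AWAIT_HARDWARE, 1200),
]

summary = dr_capacity_summary(servers, dr_site_mhz=8000)
print(summary)
```

A negative headroom figure means the DR site cannot run even the servers you have promised to bring up without waiting for new hardware.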
Be vigilant with capacity planning
The final step is to keep this capacity plan up to date. As workloads migrate to the virtual environment, the plan can be made more accurate because educated guesses can be compared against reality. Then, as new workload requests come in or additional physical servers are examined for their virtualization readiness, they can be integrated into the capacity plan.
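Comparing educated guesses against reality can be as simple as the sketch below, which flags any workload whose observed CPU demand differs from the planned figure by more than a chosen tolerance. The function name, the 20% default and the sample numbers are assumptions made for illustration.

```python
def estimate_drift(estimated_mhz, observed_mhz, tolerance=0.20):
    """Return {workload: relative_error} for estimates that are off by
    more than `tolerance` (0.20 means 20%).

    Workloads with no observation yet, or with a non-positive estimate,
    are skipped.
    """
    flagged = {}
    for name, estimate in estimated_mhz.items():
        observed = observed_mhz.get(name)
        if observed is None or estimate <= 0:
            continue
        relative_error = (observed - estimate) / estimate
        if abs(relative_error) > tolerance:
            flagged[name] = relative_error
    return flagged


# Invented sample data.
estimated = {"db": 4000, "mail": 2500, "web": 300}
observed = {"db": 4200, "mail": 3500, "web": 150}

drift = estimate_drift(estimated, observed)
print(drift)  # mail is about 40% over plan, web about 50% under
```

Flagged workloads are the ones whose entries in the capacity plan, and any migration targets that depend on them, should be revisited first.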
It's not just new virtual server requests or the addition of physical servers that should cause you to review the plan. As you explore new technologies like 10 Gigabit Ethernet cards, especially those with I/O virtualization, storage virtualization or the aforementioned infrastructure virtualization, you can work those into the plan as well. These technologies can further expand the consolidation effort or allow greater flexibility in consuming available compute resources.
ABOUT THE AUTHOR: George Crump is President and Founder of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. With 25 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS and SAN. Prior to founding Storage Switzerland he was CTO at one of the nation's largest storage integrators where he was in charge of technology testing, integration and product selection.
This was first published in December 2008