Strategies for coping with and preventing VM sprawl

VM sprawl is a difficult problem to solve, but there are strategies administrators can use to help track down unneeded VMs and prevent further sprawl.

When it comes to server virtualization, VM sprawl can be one of the most difficult problems to deal with. Unfortunately, there is no easy answer for VM sprawl, but there are tools and strategies that, when implemented together, will help you get a handle on it.

There are two main aspects to dealing with VM sprawl. The first of these aspects is purging existing VMs that are no longer needed. The second aspect involves preventing sprawl from recurring.

Getting rid of unneeded VMs

Weeding out existing VMs is by far the more difficult of the two tasks, especially in organizations that have large numbers of VMs. To eliminate unused VMs, you have to identify each VM, its purpose and its owner. This is a tall order. Some VMs are easy to identify, especially those that have been created recently or those that are running well-known, mission-critical services. Others are more of an enigma.

If your goal is to reduce sprawl, then I recommend starting by creating an inventory of current VMs. As you do, document each VM’s name, purpose, and owner. It’s a good idea to include contact information for the owner within your documentation.
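An inventory like this can live in a spreadsheet, but a small script makes it easy to see which entries are still incomplete. The following is only a minimal sketch: the VMRecord fields mirror the items listed above, while the sample VM names, owner and the unidentified helper are hypothetical and not tied to any particular hypervisor or management tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VMRecord:
    """One inventory entry: the VM's name, purpose, owner and owner contact."""
    name: str
    purpose: str = ""
    owner: str = ""
    owner_contact: str = ""


def unidentified(inventory: List[VMRecord]) -> List[VMRecord]:
    """Return the records that still lack a documented purpose or owner."""
    return [
        vm for vm in inventory
        if not vm.purpose.strip() or not vm.owner.strip()
    ]


# Hypothetical sample data, for illustration only.
inventory = [
    VMRecord("web-01", "Public website", "J. Smith", "jsmith@example.com"),
    VMRecord("test-17"),  # nobody has claimed this one yet
]

for vm in unidentified(inventory):
    print(f"Needs investigation: {vm.name}")
```

The VMs this reports are the ones that need the detective work described next.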

More often than not, some of the VMs will be really tough to identify. I have heard of people identifying these VMs by “powering them down and waiting to see who screams.” I recommend saving this approach as an absolute last resort because it risks business disruption. A better approach is to look for clues as to the VM’s origin. Depending on the hypervisor and the management tools you are using, you may be able to determine when the VM was created, the VM’s domain membership, who created it, what services are running on the VM, the last time it was updated, the last time someone logged into the VM, and the level of CPU activity and storage I/O being generated by the VM. None of these things are likely to positively identify the VM’s owner, but they can provide clues as to why the VM was created and if it is still being used.
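If you can export some of these clues from your management tools, a simple filter can shortlist VMs that look abandoned. The sketch below assumes you already have the last login time and average CPU and storage I/O figures for each VM. The VMActivity structure, the looks_idle function and every threshold in it are illustrative assumptions, not values recommended by any vendor. A match is only a prompt for further investigation, not proof that a VM is unused.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class VMActivity:
    """Activity clues for one VM, as exported from your own tooling."""
    name: str
    last_login: Optional[datetime]  # None if no login was ever recorded
    avg_cpu_percent: float
    avg_storage_iops: float


def looks_idle(
    vm: VMActivity,
    now: datetime,
    max_idle_days: int = 90,      # assumed threshold, tune for your site
    cpu_threshold: float = 2.0,   # assumed threshold, in percent
    iops_threshold: float = 5.0,  # assumed threshold, in I/O operations per second
) -> bool:
    """Flag a VM that has had no recent login AND shows very little activity."""
    no_recent_login = (
        vm.last_login is None
        or now - vm.last_login > timedelta(days=max_idle_days)
    )
    quiet = (
        vm.avg_cpu_percent < cpu_threshold
        and vm.avg_storage_iops < iops_threshold
    )
    return no_recent_login and quiet
```

Requiring both conditions is a deliberately conservative choice. A busy batch server may never see an interactive login, and a quiet VM that someone logged into last week is probably still wanted.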

Prevent sprawl before it happens

Once you begin to make an impact on the existing VM sprawl, it is important to take measures to prevent sprawl from recurring. There are a few different things that you can do to keep VM growth in check.

My first recommendation is to accept the idea that there are costs associated with each VM. These costs might be real costs such as software licenses, or they might be intangible costs such as hardware resource consumption or management costs. Some organizations use chargebacks to bill VM costs to the department that owns the VM. Even if chargeback isn’t a good fit for your organization, it is still worth getting users to accept the idea that there is a cost associated with each VM.

Think back to the days when all workloads ran on physical servers. Nobody deployed a physical server on a whim because there was significant cost involved in server acquisition and deployment. Instead, most organizations required business justification before approving a new server. This same approach works well in virtual data centers. Requiring business justification before a VM is created can help to keep sprawl in check.

Another way to keep sprawl in check is to adopt VM lifecycle management policies. Some of the major virtualization vendors offer products that allow such policies to be integrated with the VM creation process. For example, this software might require the person creating the VM to enter information about who requested the VM and why. More importantly, the VM can be configured to automatically expire unless the VM’s owner explicitly authorizes a renewal of the VM lease. Some of the available products will automatically send an e-mail to the VM owner telling them that the VM is about to expire unless they renew it.
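To make the lease idea concrete, here is a minimal sketch of the logic involved. It does not represent any vendor's product. The VMLease class, the lease_status function, the 90-day lease length and the 14-day warning window are all assumptions chosen for illustration. A real lifecycle tool would also handle the e-mail notification and whatever happens to the VM once its lease lapses.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class VMLease:
    """A VM lease that lapses unless the owner renews it."""
    vm_name: str
    owner_email: str
    expires_on: date

    def renew(self, today: date, days: int = 90) -> None:
        """Owner explicitly authorizes a renewal (90 days is an assumed default)."""
        self.expires_on = today + timedelta(days=days)


def lease_status(lease: VMLease, today: date, warn_days: int = 14) -> str:
    """Classify a lease as 'expired', 'expiring_soon' or 'active'."""
    if today >= lease.expires_on:
        return "expired"
    if lease.expires_on - today <= timedelta(days=warn_days):
        return "expiring_soon"  # the point at which to notify the owner
    return "active"
```

A scheduled job could run lease_status against every lease once a day, notify the owners of leases that are expiring soon and queue expired VMs for review.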

Yet another thing that administrators can do to control growth is to put resource quotas in place. Consider for a moment how mailbox quotas work. Administrators have long known that if left unchecked, some users will allow their mailboxes to grow to excessive sizes. The same basic concept also applies to VMs. Some people simply do not care how many resources they consume. As such, administrators may find it useful to implement resource quotas as a way of preventing excessive VM creation or overprovisioning.
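The check behind a resource quota is simple arithmetic. The sketch below assumes quotas are tracked per department in terms of virtual CPUs, memory and storage. The Resources type, the within_quota function and the sample figures are hypothetical. In practice, you would rely on the quota features built into your virtualization platform rather than a standalone script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """A bundle of resources: already used, requested, or allowed."""
    vcpus: int
    memory_gb: int
    storage_gb: int


def within_quota(used: Resources, requested: Resources, quota: Resources) -> bool:
    """Return True only if the request fits inside every quota limit."""
    return (
        used.vcpus + requested.vcpus <= quota.vcpus
        and used.memory_gb + requested.memory_gb <= quota.memory_gb
        and used.storage_gb + requested.storage_gb <= quota.storage_gb
    )


# Hypothetical figures for one department.
quota = Resources(vcpus=32, memory_gb=128, storage_gb=2000)
used = Resources(vcpus=28, memory_gb=96, storage_gb=1500)
new_vm = Resources(vcpus=8, memory_gb=16, storage_gb=200)

if not within_quota(used, new_vm, quota):
    print("Request denied: it would exceed the department's quota.")
```

Because every limit is checked, the same test catches both excessive VM creation and a single overprovisioned VM.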

Ultimately there is no single method that organizations can use to prevent VM sprawl. In almost every case administrators will have to deal with existing sprawl manually and then use an automated approach to prevent new sprawl.
