A smart organization's guide to VM sprawl management

VM sprawl plagues the virtualized environments of many organizations. The key to overcoming it is incorporating automation and orchestration in your sprawl management plan.

The ease with which servers can be virtualized is a blessing and a curse. Unneeded VMs waste an organization's resources and create licensing complications. To combat these inefficiencies, IT teams should put controls in place and adopt sensible prevention strategies.

Many organizations have adopted sound practices for deploying VMs but have not managed VM sprawl effectively. As a result, they can't realize the full benefit of their virtualization investments.

Products exist to address this issue; some come from major virtualization vendors and others from third-party vendors. The most important tool, though, is a commitment to policies that require users to justify the need for a VM and state how long it will be needed.

Automation improves configuration management

The temptation is to throw technology at the problem. Virtual server sprawl, however, isn't a technology problem as much as it is the result of broken or nonexistent processes. Many organizations simply overlook the basics of VM sprawl management when it comes to controlling server inventory.

A strong server inventory depends on having a complete and up-to-date configuration management database (CMDB). It helps to have an automated tool for this. Still, before implementing such a product, an organization needs to know exactly what it is tracking. Beyond the technical specifics of a server, such as hostname, network addressing and OS version, the CMDB should include metadata that identifies where each server is in its lifecycle.

A subset of metadata that's collected in an effective CMDB includes:

Using a combination of the above metadata allows for a broad range of policy-enforcement actions. For example, an operations team could create a policy that requires all test systems to be validated by the system owner every three months; if that validation doesn't occur, the resources are released back to the infrastructure pool. Once a process for collecting and maintaining system metadata is in place, IT infrastructure teams can introduce technology tools for VM sprawl management.
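The three-month validation policy described above can be sketched in a few lines of Python. This is a minimal illustration, not a feature of any CMDB product: the `ServerRecord` fields, the environment labels and the 90-day approximation of "three months" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "every three months" is approximated as 90 days.
VALIDATION_WINDOW = timedelta(days=90)


@dataclass
class ServerRecord:
    """Hypothetical CMDB entry holding only the lifecycle metadata used here."""
    hostname: str
    owner: str
    environment: str      # e.g. "test" or "production" (illustrative labels)
    last_validated: date  # last date the owner confirmed the VM is still needed


def overdue_test_systems(records, today):
    """Return test systems whose owner validation is older than the window."""
    return [
        r for r in records
        if r.environment == "test"
        and today - r.last_validated > VALIDATION_WINDOW
    ]


inventory = [
    ServerRecord("test-web-01", "alice", "test", date(2024, 1, 10)),
    ServerRecord("test-db-01", "bob", "test", date(2024, 5, 20)),
    ServerRecord("prod-web-01", "carol", "production", date(2023, 6, 1)),
]

release_candidates = overdue_test_systems(inventory, today=date(2024, 6, 1))
print([r.hostname for r in release_candidates])
```

Only the test system with a lapsed validation is flagged; the production server is out of scope for this particular policy no matter how old its validation date is.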

The most useful tools are those that help keep the CMDB updated, while enabling some level of policy enforcement. Technology can help, but only if you've established clear processes for notification and reclamation. It is much more critical to change the culture that creates virtual server sprawl.

Technology's role in battling VM sprawl

No one product or technology will be able to help you win the battle against VM sprawl, but certain tools can act as useful allies.

With new VM requests, many cloud-centric products include options for returning resources to a pool at the end of a virtual machine's life. VM life expectancy is a data point established at VM creation, and products such as VMware vRealize Automation will shut down or delete a VM at the expiration date.
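To make the idea of a life expectancy set at creation concrete, here is a small Python sketch. It does not use the vRealize Automation API or any other vendor interface; `VmLease`, `request_vm` and `is_expired` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class VmLease:
    """Hypothetical record pairing a VM with its agreed expiration date."""
    name: str
    expires_on: date


def request_vm(name, created_on, lease_days):
    """Fix the expiration date at creation time, as part of the request."""
    return VmLease(name, created_on + timedelta(days=lease_days))


def is_expired(lease, today):
    """A lease counts as expired on its expiration date and afterward."""
    return today >= lease.expires_on


lease = request_vm("demo-vm-01", created_on=date(2024, 3, 1), lease_days=30)
print(lease.expires_on)
print(is_expired(lease, date(2024, 3, 31)))
```

Because the expiration date is captured with the request, a scheduled job only has to compare dates to decide which VMs to shut down or delete.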

In established environments, these cloud-focused products do little to address existing sprawl. Instead, you'll need a combination of tools to automate VM recovery. With VM decommissioning, IT teams need to understand the technical requirements. These are the steps to reclaim a VM:

  1. Identification
  2. Notification
  3. Approval/recertification
  4. Physical deletion
  5. Return of resources
  6. CMDB cleanup

To automate the process, an organization needs to invest in an orchestration tool. Most of these products will be up to the task. Options include VMware vRealize Orchestrator, ServiceNow's Orchestration and Microsoft System Center 2012 Orchestrator.

The orchestration tool controls the workflow of the tasks needed to decommission a VM. It first builds a list of VMs eligible for decommissioning by running a query against the metadata collected in the CMDB. For example, a query would identify any VM without a valid recertification flag set.
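As a rough illustration of such a query, the sketch below uses an in-memory SQLite database as a stand-in for the CMDB. The `cmdb` table, its columns and the idea of storing the recertification flag as a `recertified_until` date are assumptions for the example, not the schema of any real product.

```python
import sqlite3

# Stand-in CMDB: an in-memory table with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cmdb ("
    "hostname TEXT PRIMARY KEY, owner TEXT, recertified_until TEXT)"
)
conn.executemany(
    "INSERT INTO cmdb VALUES (?, ?, ?)",
    [
        ("app-01", "alice", "2025-01-31"),  # recertification still valid
        ("app-02", "bob", None),            # never recertified
        ("app-03", "carol", "2023-12-31"),  # recertification lapsed
    ],
)


def decommission_candidates(conn, today_iso):
    """Return hostnames with a missing or lapsed recertification date.

    Dates are stored as ISO 8601 strings, so string comparison matches
    chronological order.
    """
    rows = conn.execute(
        "SELECT hostname FROM cmdb "
        "WHERE recertified_until IS NULL OR recertified_until < ? "
        "ORDER BY hostname",
        (today_iso,),
    )
    return [hostname for (hostname,) in rows]


candidates = decommission_candidates(conn, "2024-06-01")
print(candidates)
```

The query is only as good as the metadata behind it: a VM that was never entered in the CMDB will not appear in the candidate list at all.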

Once you've identified potential servers for decommissioning, notifications go out to the system owners, who either approve decommissioning or recertify the resource. If recertified, the VM remains active for another year. If not, the orchestration product instructs an automation tool to physically delete the VM. Once deletion is complete, resources such as IP addresses return to the IP address management tool for reassignment, and the VM's CMDB record is cleaned up.
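The branch between recertification and reclamation can be sketched as a single Python function. Everything here is a simplified assumption: the CMDB is a dictionary, the IP address management tool is a set of free addresses, and the owner's response arrives as a boolean. A real workflow would call the orchestration and automation products where the comments indicate.

```python
from datetime import date, timedelta

# "Another year" from the text, approximated as 365 days.
RECERTIFICATION_PERIOD = timedelta(days=365)


def process_candidate(hostname, owner_recertifies, today, ip_pool, cmdb):
    """Handle one decommissioning candidate after its owner has responded.

    cmdb is a dict keyed by hostname (hypothetical record shape);
    ip_pool is a set standing in for the IP address management tool.
    """
    record = cmdb[hostname]
    if owner_recertifies:
        record["recertified_until"] = today + RECERTIFICATION_PERIOD
        return "recertified"
    # A real workflow would ask the automation tool to delete the VM here.
    ip_pool.add(record["ip"])  # return the address for reassignment
    del cmdb[hostname]         # CMDB cleanup
    return "decommissioned"


cmdb = {
    "app-02": {"ip": "10.0.0.12", "recertified_until": None},
    "app-03": {"ip": "10.0.0.13", "recertified_until": None},
}
ip_pool = set()
today = date(2024, 6, 1)

kept = process_candidate("app-02", True, today, ip_pool, cmdb)
removed = process_candidate("app-03", False, today, ip_pool, cmdb)
print(kept, removed, sorted(ip_pool), sorted(cmdb))
```

Keeping the resource return and the CMDB cleanup in the same step as the deletion helps avoid orphaned addresses and stale inventory records.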

This process demonstrates why virtual server sprawl is such a big problem. Automation can help only when an organization already understands its environment.

Critical system metadata, such as system owner and last date of validation, will help to identify which of your resources are recoverable. Smart organizations can then take advantage of this data and implement automation and orchestration tools. Other IT teams will find that they can combat VM sprawl by simply compiling periodic reports and manually reclaiming resources.
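For teams taking the manual route, a periodic report can be as simple as grouping stale VMs by system owner. The following Python sketch assumes the inventory is available as `(hostname, owner, last_validated)` tuples; the record shape and the cutoff date are illustrative choices, not a prescribed format.

```python
from collections import defaultdict
from datetime import date


def stale_vms_by_owner(records, cutoff):
    """Group VMs last validated before `cutoff` by system owner."""
    report = defaultdict(list)
    for hostname, owner, last_validated in records:
        if last_validated < cutoff:
            report[owner].append(hostname)
    # Sort owners and hostnames so the report is stable from run to run.
    return {owner: sorted(hosts) for owner, hosts in sorted(report.items())}


records = [
    ("app-01", "alice", date(2024, 5, 1)),
    ("app-02", "bob", date(2023, 2, 14)),
    ("app-03", "bob", date(2023, 9, 30)),
    ("app-04", "alice", date(2022, 11, 3)),
]

report = stale_vms_by_owner(records, cutoff=date(2024, 1, 1))
for owner, hosts in report.items():
    print(f"{owner}: {', '.join(hosts)}")
```

Each owner receives one list of VMs to review, which keeps the manual reclamation conversation focused on the systems that owner is responsible for.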
