Virtual lab management planning best practices

Given the low risk, companies often embark on virtualization projects by virtualizing a test and development environment. But even virtualized lab environments can pose problems.

It’s little wonder that IT shops often begin a foray into virtualization by deploying the technology in a test and development environment. Not only is a virtual lab a low-risk environment in which to begin a deployment, it offers other benefits, including automating lab system provisioning and reducing the number of servers needed to run a lab environment. Still, virtualizing test environments comes with some important caveats, including virtualization and lab management platform nuances, backup and recovery concerns, networking considerations, and high availability issues. This article outlines the benefits of and foremost considerations in virtualizing test labs.

Since the advent of x86 virtualization, deploying virtual labs to support development, test and training has been the most frequent starting point for organizations headed down the virtualization path. Virtualizing a test lab is an easy choice, because you don’t put production resources at risk, and infrastructure management benefits from virtualization’s numerous advantages, including the following:

  • Self-service system provisioning
  • Virtual machine (VM) mobility across physical hosts
  • Single-image management

In many lab environments, system provisioning can be extremely time consuming. Often multiple images must be maintained for each given OS, depending on differences in physical system hardware in a lab. Even with imaging software, creating a new lab system can take anywhere from several minutes to several hours. If you add in custom network or storage requirements, the provisioning time only worsens. In addition, IT staff is almost always involved, which adds to the cost of ownership of a lab environment and typically slows the agility of test, development and training operations.

A properly deployed and managed virtual lab can eliminate the traditional pain points associated with IT lab management. For starters, users can self-provision lab resources within minutes. Users can also share clones of lab environments with one another, which is especially helpful should they discover a problem that needs analysis by several sources or should a complex training or test environment need to be duplicated and shared among several users.

The key to virtualization’s benefits in a testing environment is abstraction. The abstraction provided by server virtualization removes the traditional hardware dependencies that complicate lab system image management. Instead of dozens of images required for each OS or application type, single gold (or base) images are all that’s necessary. In addition, abstraction enables VMs to move between physical hosts as needed. This level of mobility eases the dependencies associated with running a lab environment; instead of having to duplicate specific hardware platforms, in general, all that is required is consistency with hypervisor selection. Of course, you still can’t mix and match Intel- and AMD-based hardware, because of their processor differences combined with the fact that the hypervisor does not fully virtualize a CPU, but there is little problem mixing white-box, Hewlett-Packard-based and Dell-based Intel platforms in the same lab environment.

Platform selection

While there are obvious pluses to virtualizing lab environments, there is plenty about virtual lab management that isn’t as obvious. When it comes to managing a virtual lab, you have two choices:

  • Do-it-yourself methods
  • Packaged software

Do-it-yourself methods: Dating back to 2000, organizations with which I have worked used tools like VMware Inc.’s Workstation to stage and manage single-host development and test environments. Today, many IT shops still allow tools like VMware Workstation, VMware Player, Microsoft Virtual PC and Sun VirtualBox to run locally on user systems.

While at first blush such a policy may not seem like a big deal, enabling users to locally run VMs on their desktop or laptop systems may also introduce unmanaged server OSes to a network infrastructure.
You can download practically any OS as a VM appliance from VMware’s Virtual Appliance Marketplace, and while that’s great for quickly evaluating software, you need to be careful about allowing servers that leverage well-known administrative accounts and are typically not at the most recent patch level to connect directly to a production LAN. I don’t know of an organization that has a policy in place that allows users to build white-box servers at home, bring them to work and plug them into a LAN, but I know of several that allow users to connect unmanaged VMs to a LAN, either directly or via the virtualization software’s network address translation, or NAT, feature.

The bottom line is that a do-it-yourself policy that enables users to locally run virtualization software on their systems has drawbacks, primarily for security and centralized management. When a lab environment is confined to a local system, there’s only so much you can do. A better approach is to centrally host a virtual lab environment on server systems running a VMware-, Microsoft- or Xen-based hypervisor. Centrally locating a lab environment gives IT staff full control of VM images, allowing them to ensure that VM guest OSes are at the correct patch level and to enable compliance with an organization’s security policies.

Centrally storing the shared virtual lab environment on one or more physical servers also results in reduced storage requirements, compared with having users locally run virtual labs on their own systems. When a test environment is centrally located, VMs can be configured to share “golden image” (or master template) root disks, which could store a common Windows Server 2003 OS, for example. Application-specific disks can be linked to the root disk to build out an application-specific VM image, a technique known as linked cloning.

Packaged software: Today, select lab management tools such as VMware Lab Manager and VMLogix LabManager support linked cloning on ESX, which makes linked cloning support a key differentiator of lab management software. I always recommend virtual lab management software to clients because the self-service provisioning, ease of use and ease of management of these products make the investment an easy sell. Today, there are three major virtual lab management platforms:

  • Surgient Inc.’s Virtual Automation Platform
  • VMLogix LabManager
  • VMware Lab Manager

Microsoft’s forthcoming Visual Studio 2010 lab management will add yet another virtual lab management platform to the mix.
As I’ve already noted, select lab management platforms support linked cloned virtual disks, which can save considerable storage space and provide centralized OS and application management. Cost is always a consideration, but you will also want to weigh other platform differentiators, including the following:

  • Supported hypervisors
  • Scalability
  • Virtualization management integration
  • User interface
  • Directory service integration
  • Policy-based VM lifecycle management and scavenging
  • Reporting
  • Customization capabilities

Hypervisor support is often one of the most important distinguishing features, especially for organizations considering lower-cost hypervisors for some lab environments. VMware Lab Manager, for example, supports VMware-based virtualization platforms, while competitors Surgient (VMware, Microsoft) and VMLogix (VMware, Microsoft, Citrix Systems Inc.) support multiple virtualization platforms.

Depending on the size of a lab environment, scalability is another concern. VMware Lab Manager, for example, integrates with VMware VirtualCenter Server, which supports a maximum of 2,000 VMs. In comparison, Microsoft’s System Center Virtual Machine Manager 2008 supports up to 8,000 VMs. Scalability is important, because large training and test environments may require multiple VirtualCenter servers.
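The scalability arithmetic can be sketched in a few lines of Python. This is only a rough illustration: the per-server limits are the figures quoted above, while the function name and the 5,000-VM lab size are hypothetical, and real sizing depends on far more than VM count.

```python
import math

# Per-server VM limits as quoted in this article; treat them as
# illustrative figures rather than current vendor specifications.
MAX_VMS_PER_SERVER = {
    "VMware VirtualCenter Server": 2000,
    "System Center Virtual Machine Manager 2008": 8000,
}


def management_servers_needed(total_vms: int, max_vms_per_server: int) -> int:
    """Return the minimum number of management servers for total_vms."""
    if total_vms < 0 or max_vms_per_server <= 0:
        raise ValueError("total_vms must be >= 0 and the limit must be > 0")
    # Round up: a partially filled server still has to be deployed.
    return math.ceil(total_vms / max_vms_per_server)


if __name__ == "__main__":
    lab_size = 5000  # hypothetical lab size, not a figure from the article
    for platform, limit in MAX_VMS_PER_SERVER.items():
        count = management_servers_needed(lab_size, limit)
        print(f"{platform}: {count} management server(s) for {lab_size} VMs")
```

A count above one is a signal to check how well the lab management platform handles multiple management servers.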

You should also determine how well the lab management platform can integrate with your virtualization management layer (e.g., VirtualCenter, XenCenter or System Center), if at all. Beyond the virtualization management layer, consider whether the platform can integrate with enterprise orchestration tools from BMC Software, CA, HP and IBM, and with software test management tools from vendors such as Borland Software Corp. and HP.

The user interface is another important consideration, especially if your organization has users with different Web browsers as well as different client OSes (e.g., Windows and Mac). You should also evaluate how easily users can perform the following kinds of tasks:

  • Provision VMs
  • Share VM environments (several VMs) with other users
  • Switch between multiple VMs

A selected product should also integrate with your organization’s directory service (e.g., Active Directory) for user authentication and role assignment.

In lab environments, it’s easy for VMs to be rapidly provisioned and forgotten about, so it’s important for a lab management platform to have a solid policy-based system in place to scavenge VMs that are no longer used. Otherwise, storage growth can quickly get out of control, adding unnecessary cost to a lab environment’s primary storage requirements, as well as adding excessive costs to supporting operations such as backup. Chargeback, compliance and security audits also require a solid reporting engine. Because the way in which organizations need to tag lab resources often varies, customization is another important factor.
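To make the idea of policy-based scavenging concrete, here is a minimal Python sketch of an idle-time policy. It does not reflect any particular product’s API: the `LabVM` record, the 30-day threshold and the sample inventory are all hypothetical, and a real platform would pull this data from its own inventory and would typically notify owners before reclaiming anything.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class LabVM:
    """Hypothetical inventory record for one lab VM."""
    name: str
    owner: str
    last_used: datetime
    disk_gb: int


def find_stale_vms(vms: Iterable[LabVM], now: datetime,
                   max_idle: timedelta) -> List[LabVM]:
    """Return the VMs that have been idle for longer than max_idle."""
    return [vm for vm in vms if now - vm.last_used > max_idle]


def reclaimable_gb(stale_vms: Iterable[LabVM]) -> int:
    """Total disk space that scavenging the given VMs would free."""
    return sum(vm.disk_gb for vm in stale_vms)


if __name__ == "__main__":
    now = datetime(2009, 6, 1)  # fixed date so the example is repeatable
    inventory = [
        LabVM("build-test-01", "alice", datetime(2009, 5, 28), 20),
        LabVM("training-07", "bob", datetime(2009, 3, 15), 40),
        LabVM("repro-bug-112", "carol", datetime(2009, 1, 10), 60),
    ]
    stale = find_stale_vms(inventory, now, max_idle=timedelta(days=30))
    for vm in stale:
        print(f"candidate for scavenging: {vm.name} (owner: {vm.owner})")
    print(f"reclaimable storage: {reclaimable_gb(stale)} GB")
```

The same per-owner data could also feed the chargeback and audit reports mentioned above.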

Deployment considerations

Users often ask me about isolation of the lab environment, and I still recommend physical isolation. Sure, you can mix production and lab environments on the same cluster and isolate them via separate virtual networks. But in such scenarios, you’re literally just a click away from disaster. A few years ago, for example, I watched an administrator corrupt his production Exchange Server information store by making a small network error (forgetting to change the hosts file settings prior to testing a restore). Without physical isolation between production and test environments, other mistakes are possible. In my view, the peace of mind you get by physically isolating development, test and training networks is worth the added cost.

Keep in mind that integration and staging environments for production resources should be as close in configuration to the production environment as possible. So if you’re running a Web server farm on a VMware Virtual Infrastructure 3.5-based virtual infrastructure, your staging environment should include the same physical and virtual characteristics (e.g., clustered ESX hosts, High Availability and Distributed Resource Scheduler [DRS] enabled, VirtualCenter Server and so on). A training lab may not need to duplicate the production environment, and thus using a hypervisor from another vendor may be a consideration.

When architecting a virtual lab environment, storage is another major factor. Considering the high degree of file redundancy in a typical lab environment, storage arrays that offer thin provisioning and data deduplication can result in substantial storage savings. A given training environment, for example, may use 1,200 Windows Server 2003 VMs and 450 Red Hat Enterprise Linux 5 VMs. Deduplication reduces the amount of storage required by the redundant OS files in each VM. Alternatively, remember that you can architect a VM environment using linked cloned disks, so one golden Windows Server 2003 base image could be linked to 1,200 unique VMs, for example.
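The linked clone savings can be estimated with simple arithmetic, sketched below in Python. The 1,200-VM count comes from the example above; the 10 GB base image and the 1 GB of unique data per VM are assumptions chosen only for illustration, and actual delta disk growth varies widely with workload.

```python
def full_clone_storage_gb(vm_count: int, base_image_gb: int) -> int:
    """Storage needed when every VM carries its own full copy of the OS."""
    return vm_count * base_image_gb


def linked_clone_storage_gb(vm_count: int, base_image_gb: int,
                            delta_gb_per_vm: int) -> int:
    """Storage needed for one shared golden image plus a delta disk per VM."""
    return base_image_gb + vm_count * delta_gb_per_vm


if __name__ == "__main__":
    vm_count = 1200       # Windows Server 2003 VMs, from the example above
    base_image_gb = 10    # assumed size of the golden image
    delta_gb_per_vm = 1   # assumed unique data written by each VM

    full = full_clone_storage_gb(vm_count, base_image_gb)
    linked = linked_clone_storage_gb(vm_count, base_image_gb, delta_gb_per_vm)
    print(f"full clones:   {full} GB")
    print(f"linked clones: {linked} GB")
    print(f"saved:         {full - linked} GB")
```

The savings shrink as the per-VM delta grows, which is one more reason to scavenge long-lived VMs.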

Backup and recovery choices should be made based on the criticality of the lab data and may not lend themselves to a one-size-fits-all solution. The simplest approach is to leverage array-level snapshots to secure full-image copies of VM data. VMs requiring frequent backups can be stored on dedicated logical unit numbers, or LUNs, enabling you to set a different snapshot schedule for a select VM. Note that many arrays do not coordinate snapshots with the VM guest and thus a VM’s applications may not be properly quiesced or its I/O cache fully flushed before the snapshot occurs. Without proper coordination, you’re left with crash-consistent snapshots, which are equivalent to hard-power failures. Typically, a VM guest OS and applications can recover from such an event. Still, if application consistency is required, you should leverage an array that offers application-consistent snapshots for your particular virtual environment, guest OSes and applications.

Alternatively, you could run backup agents inside VMs that require consistent backup. Such granularity is often not needed in test environments, but in some development or training environments it may be necessary.

Networking can pose another architectural hurdle, because you need to ensure that the isolation mandated by your organization’s security policy is maintained and enforced within the virtual infrastructure. Ensuring this isolation may require separate VLANs and virtual switches by department or security zone, or it may require isolation by physical network interface. Complex test environments consisting of several VMs could span multiple physical hosts, so in such environments it’s important to ensure that VM-to-VM communication is maintained, along with proper security isolation. VLAN tagging is often sufficient; separate clusters, however, may be needed in cases where strict isolation is required for one or more security zones.
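One way to audit VLAN-based isolation is to check that no VLAN carries VMs from more than one security zone. The Python sketch below assumes you can export a list of (VM name, security zone, VLAN ID) tuples from your own inventory; the function name, the zone names and the VLAN IDs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (vm_name, security_zone, vlan_id), exported from a hypothetical inventory
Assignment = Tuple[str, str, int]


def find_shared_vlans(assignments: Iterable[Assignment]) -> Dict[int, List[str]]:
    """Return {vlan_id: [zones]} for every VLAN used by more than one zone."""
    zones_by_vlan = defaultdict(set)
    for _vm_name, zone, vlan_id in assignments:
        zones_by_vlan[vlan_id].add(zone)
    return {
        vlan_id: sorted(zones)
        for vlan_id, zones in zones_by_vlan.items()
        if len(zones) > 1
    }


if __name__ == "__main__":
    assignments = [
        ("web-test-01", "test", 110),
        ("db-test-01", "test", 110),
        ("class-vm-01", "training", 120),
        ("class-vm-02", "training", 110),  # misconfigured: lands on the test VLAN
    ]
    for vlan_id, zones in find_shared_vlans(assignments).items():
        print(f"VLAN {vlan_id} is shared by zones: {', '.join(zones)}")
```

A check like this only catches tagging mistakes; it says nothing about whether a zone needs its own cluster or physical interface.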

Availability is another important consideration. While some virtual lab environments can be hosted on single physical hosts, many require a high availability cluster. This is where hypervisor features such as high availability, live migration and dynamic workload balancing (e.g., VMware’s DRS) are beneficial.

You can also consider features like VMware’s Distributed Power Management (DPM), which shuts down unused servers when their resources are no longer needed to save on power and cooling costs. To date, vendors have not published sufficient data on how dynamically shutting down and starting up servers on a daily basis affects system or component mean time between failures, so for a production environment, there are no forms of dynamic power management that I would recommend today. In lab or training environments, dynamic power management technologies may make sense if you feel the good (i.e., improved energy efficiency) outweighs the bad (i.e., reduced server or component life).

The physical location of IT staff that needs to access virtual lab resources should dictate resource placement. Latency and security are often the greatest factors when it comes to resource placement. Many organizations, for example, can get by with centrally located hypervisors, virtualization management and lab management servers. On the other hand, the latency associated with unreliable bandwidth links and geographically dispersed locations calls for VMs, hypervisors and lab management servers that are housed in regional locations. In such cases, it’s important to select lab management software capable of scaling to multiple sites, with multiple management servers, each of which can be managed centrally at corporate headquarters.

Well worth the effort

This article may give you more to worry about than you planned, but that wasn’t my intent (well, maybe it was). As you select a lab management platform and architect the virtual lab infrastructure, you have to be careful. But that due diligence is well worth the effort. And indeed, while virtual test labs pose substantial management, security, storage, network and scalability issues, once you have an automated, self-service-driven lab infrastructure in place, you can practically sit back and watch it run. IT staff can focus on other issues instead of constantly imaging lab systems. Since creating new lab environments and taking snapshots of existing environments becomes so easy, you’ll likely find that users will employ lab resources more than ever before.

In the physical world, performing a risky test could take down a system and set the tester back several hours, but in the virtual world a user can turn back the clock on his virtual lab and, within minutes, restart a test. The security risks associated with locally running VMs on user desktops and laptops alone should be justification enough to consolidate and centrally manage a shared virtual lab infrastructure. The automation and service-oriented architecture that results from a migration to a virtual lab management platform may initially seem like an added bonus. But soon you’ll wonder how you ever lived without these features.

About the Author

Chris Wolf, an analyst in the Data Center Strategies service at Midvale, Utah-based Burton Group, has more than 15 years of experience in the IT trenches and nine years of experience with enterprise virtualization technologies. Wolf provides enterprise clients with practical research and advice about server virtualization, data center consolidation, business continuity and data protection. He authored Virtualization: From the Desktop to the Enterprise, the first book published on the topic, and has published dozens of articles on advanced virtualization topics, high availability and business continuity.
