Published: 03 Mar 2008
Ongoing maintenance and upgrades to host systems can be a challenge to the virtualization administrator. Making significant changes to a virtual environment is a tactical process that requires planning. Whether it's additional storage or enhancements to virtualization platform operating systems, this tip offers a managed upgrade strategy for uneventful implementations.
The system landscape is always in a state of flux. Changed your mind on your storage connectivity, or added network interfaces for additional connectivity? Increased memory on your host systems, or rethought your virtual machine provisioning process? The dynamics of system administration are multiplied by virtual machines (VMs).
What makes virtual environments unique is that any change to a host system affects every virtual machine it runs. In my environment, each virtual host system currently houses around 25 virtual machines. Therefore, if maintenance is required on a host system, accommodations need to be made for those virtual machines. Migrating virtual machines to other hosts is a key piece of functionality for successful ongoing maintenance. With a host system exclusively available for maintenance, the following strategies can be employed to deliver the best availability.
Plan for N+x host systems
If you need N virtual host systems, having x additional systems allows for that many systems to be in maintenance mode at once. If removing one virtual host system from your environment puts processor or memory usage above 100%, maintenance becomes a complicated decision about which virtual machines need to be taken offline. Also, having at least one more host than the planned implementation will cover the virtual environment in the event of a critical system failure. While there is an increased cost for effectively unused capacity in the virtual host environment, having the option to take one host offline is critical to account for host failures, implement enhancements and perform regular maintenance.
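The 100% test described above can be sketched in a few lines of Python. This is a minimal illustration, not a vendor tool: the `Host` fields, the units and the function names are all assumptions, and it only compares total load with total remaining capacity, so it does not prove that every individual virtual machine will fit on a specific host.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical inventory record; field names and units are assumptions."""
    name: str
    cpu_capacity: float  # e.g. GHz available on the host
    mem_capacity: float  # e.g. GB of RAM on the host
    cpu_used: float      # load placed on the host by its virtual machines
    mem_used: float


def utilization_after_removal(hosts, removed):
    """Return (cpu_pct, mem_pct) if the named hosts go offline and
    their virtual machines move to the remaining hosts."""
    remaining = [h for h in hosts if h.name not in removed]
    if not remaining:
        raise ValueError("no hosts would remain online")
    cpu_capacity = sum(h.cpu_capacity for h in remaining)
    mem_capacity = sum(h.mem_capacity for h in remaining)
    # The load of every host, including the removed ones, must still be served.
    cpu_load = sum(h.cpu_used for h in hosts)
    mem_load = sum(h.mem_used for h in hosts)
    return 100 * cpu_load / cpu_capacity, 100 * mem_load / mem_capacity


def can_enter_maintenance(hosts, removed):
    """True when neither processor nor memory usage would exceed 100%."""
    cpu_pct, mem_pct = utilization_after_removal(hosts, removed)
    return cpu_pct <= 100 and mem_pct <= 100


# Four identical hosts, each 70% busy: one can leave the cluster, two cannot.
cluster = [Host(f"host{i}", 100.0, 100.0, 70.0, 70.0) for i in range(1, 5)]
one_out = can_enter_maintenance(cluster, {"host1"})
two_out = can_enter_maintenance(cluster, {"host1", "host2"})
```

In this made-up cluster, removing one host leaves the remaining three at roughly 93% utilization, while removing two would require 140%, which is the situation where virtual machines would have to be taken offline.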
Reserve the right to add storage any time
Adding shared storage to a virtual host system is generally a non-event. Depending on the growth mode of a virtual environment, storage may be added regularly. Because I am able to make a host available for maintenance, I have taken the stance that a virtual host can go into maintenance mode as needed when adding storage. I add storage from the storage area network (SAN) while the virtual host system is in maintenance mode. Once the logical unit number (LUN) is received by the first host system, the process of entering maintenance mode is repeated on each subsequent host system and the LUN is introduced there. This process may seem cumbersome, but performing it on a regular basis, even for mundane tasks, results in the following benefits:
- Familiarity with entering maintenance mode
- Confidence in the process, knowing that it will work
- Issues with virtual machines not migrating are identified in a non-critical situation
Host software upgrades
Implementing new versions of virtualization host software requires testing, and mixed-version migrations are a smart way to go about this process. Take, for example, a collection of eight virtual host systems. The host systems cannot all be upgraded at once, so for a time the environment will run both versions side by side. Proper pre-implementation testing of a rolling mixed-version migration from the older host operating system to the newer host operating system is the way to proceed. Simply testing the new version of the virtual host operating system is not enough.
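To see which mixed-version states need testing, it can help to write the rolling order down before starting. The following Python sketch is illustrative only: the host names, the `spare` parameter (the x in N+x) and the `rolling_upgrade_plan` function are assumptions, and it models nothing beyond the order in which hosts leave the old version.

```python
def rolling_upgrade_plan(hosts, spare=1):
    """Split hosts into upgrade steps of at most `spare` hosts each.

    Each step records which hosts are in maintenance mode, which already
    run the new version, and which still run the old version.
    """
    if spare < 1:
        raise ValueError("at least one host must be free for maintenance")
    plan = []
    for start in range(0, len(hosts), spare):
        plan.append({
            "in_maintenance": hosts[start:start + spare],
            "already_new": hosts[:start],
            "still_old": hosts[start + spare:],
        })
    return plan


# Eight hosts with one spare: eight steps, seven of them mixed-version.
hosts = [f"host{i}" for i in range(1, 9)]
plan = rolling_upgrade_plan(hosts, spare=1)
mixed_steps = [s for s in plan if s["already_new"] and s["still_old"]]
```

Every step in which both `already_new` and `still_old` are non-empty is a point where virtual machines may migrate between versions, which is why testing the new version alone does not cover the upgrade.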
Update documentation in advance
Another key component of a successful administration strategy is keeping documentation current. Here's an example: I am currently in the middle of a significant enhancement to my virtual environment. I'll add a four-port network interface controller to each virtual host, add three VLANs and network addresses to the host systems, and reposition cabling to the hosts. These tasks individually are not overwhelming, but as a whole they constitute significant changes to the virtual environment when performed all at once. It was important to update our internal virtual host documentation during the planning phase of these enhancements to ensure that our standing redundancy rules are met. One of these rules is that each group of connectivity must span at least two physical network interfaces on at least two separate controllers. In this way, if a cable becomes disconnected, a physical switch port goes offline or an entire network controller fails, an additional path is maintained. This configuration is used in Fibre Channel storage controller connectivity as well.
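A rule like this is easy to check mechanically against the proposed connectivity map before any cable is moved. The sketch below assumes the documentation can be expressed as a mapping from connectivity group to a list of (controller, port) pairs. That layout, the group names and the `redundancy_violations` function are hypothetical and not taken from any particular product.

```python
def redundancy_violations(groups):
    """Return the groups that break the rule of at least two physical
    interfaces spread over at least two separate controllers.

    `groups` maps a group name to a list of (controller, port) pairs.
    """
    violations = {}
    for name, nics in groups.items():
        interfaces = set(nics)  # ignore accidental duplicate entries
        controllers = {controller for controller, _port in interfaces}
        if len(interfaces) < 2 or len(controllers) < 2:
            violations[name] = {
                "interfaces": len(interfaces),
                "controllers": len(controllers),
            }
    return violations


# Hypothetical proposed layout for one host.
proposed = {
    "vm_network": [("nic0", 0), ("nic1", 0)],  # two controllers: compliant
    "storage": [("nic0", 1), ("nic0", 2)],     # same controller: violation
}
problems = redundancy_violations(proposed)
```

Here `problems` flags only the `storage` group, because both of its ports sit on the same controller and a single controller failure would remove every path.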
By updating our internal documentation, which includes a map of connectivity to all network interfaces that are connected to the virtual host environment, we are able to visualize the proposed configuration. We have had success in this situation by updating the documentation of the virtual environment in advance (as a revision update) and using it as the starting point for the procedure to be followed during implementation. Because the documentation is updated first, the entire staff reviews and agrees upon the connectivity changes in a controlled way, and there is no confusion about exactly how the enhancements are to be rolled into the virtual environment.
Implementing virtual environment enhancements
Once the parameters for performing maintenance and enhancements are laid out, implementation should proceed smoothly. Some virtualization administrators suspend certain configurations as part of a maintenance mode. For example, in VMware environments you may wish to make the Distributed Resource Scheduler (DRS) configuration less aggressive. Depending on your utilization, the moment you exit maintenance mode, DRS may automatically migrate a series of virtual machines to the virtual host recently made available. While a host is in maintenance mode, certain tests cannot be performed with live virtual machines. To address this issue, have a virtual machine available simply to test connectivity and functionality from the virtual machine perspective. When the DRS rules are relaxed, this single virtual machine can provide adequate testing in a live situation outside of maintenance mode. Reinstate the previous DRS configuration once the maintenance or enhancements are completed.
Maintenance should not be unattainable
Virtual environments require more planning prior to performing maintenance or enhancements. The silver lining is that critical failures become more manageable to the administrator who is familiar with the maintenance process. Of course, if your organization requires change control procedures or notification to other groups, these should be followed.
About the author: Rick Vanover is an MCSA-certified system administrator for Belron US in Columbus, Ohio. Rick has been working with information technology for over 10 years and with virtualization technologies for over seven years.