Hyper-V cluster performance problems and how to fix them

This four-part series focuses on improving Hyper-V cluster performance. Part one covered how firmware, drivers, patches and updates affect virtual host cluster stability. Part two offered personal workarounds for two Hyper-V problems; those fixes have improved the overall stability of my virtual environment. Here, in part three, I present more personal fixes to address Hyper-V cluster performance issues.

Hyper-V cluster performance issue No. 3: Volume GUID changes
Because workloads naturally grow, it is sometimes necessary to resize the logical unit number (LUN) where a VM resides. After you extend a LUN in a Hyper-V cluster, however, the volume GUID can change. This causes Quick Migration problems, and System Center Virtual Machine Manager (SCVMM) reports an "unsupported cluster configuration" for the VM.

Figure 1

This problem occurs because the LUN has changed its volume GUID, but the Hyper-V setting has the old volume GUID.
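As a rough illustration of that mismatch (not a tool from this article), the Python sketch below compares the volume GUID embedded in each VM's stored path against the GUIDs the volumes currently report. The VM names, paths and GUIDs are hypothetical inputs that you would collect yourself; the only thing taken as given is the standard Windows volume GUID path form, \\?\Volume{GUID}\.

```python
import re

# Matches the GUID inside a Windows volume GUID path such as
# \\?\Volume{01234567-89ab-cdef-0123-456789abcdef}\
VOLUME_GUID_RE = re.compile(
    r"Volume\{([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\}"
)


def extract_volume_guid(path):
    """Return the lowercase volume GUID found in path, or None if absent."""
    match = VOLUME_GUID_RE.search(path)
    return match.group(1).lower() if match else None


def find_stale_vms(vm_paths, current_guids):
    """Return the names of VMs whose stored path names a GUID no volume has now.

    vm_paths: hypothetical mapping of VM name -> path recorded in its settings.
    current_guids: GUIDs the volumes report today (gathered by hand).
    """
    current = {guid.lower() for guid in current_guids}
    stale = []
    for vm_name, path in sorted(vm_paths.items()):
        guid = extract_volume_guid(path)
        # Paths without a volume GUID (drive letters, for example) are skipped.
        if guid is not None and guid not in current:
            stale.append(vm_name)
    return stale
```

Any VM this returns is a candidate for the symptoms described here; the sketch only flags the mismatch and does not change any configuration.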

Figure 2

Figure 3

In most cases, the VM runs fine on its current cluster node. When you attempt to move the VM to another node, however, the LUN fails to mount there. Eventually, the VM returns to the node it was on before the attempted move.

Once a VM enters this state, there is a creative workaround that involves shutting down the VM and using the cluster.exe command to re-register the VM's configuration. I have used this method with some success. Generally, though, I shut down the VM in Failover Cluster Manager, delete it in Hyper-V Manager and re-provision the VM (pointing to the new volume GUID and attaching the existing virtual hard disks). My method requires reconfiguring the VM's network settings, but it gets the VM running.

To keep this from recurring, install KB970529 on every Hyper-V cluster node. The hotfix addresses the volume GUID changes, so you won't have to use workarounds to correct the problem. Unfortunately, it does not repair VMs that are already affected.

(Note: I delete the VM in Hyper-V Manager rather than in SCVMM because Hyper-V Manager does not delete the VM files.)
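Because the KB970529 hotfix has to be on every cluster node, a quick audit can help. The Python sketch below is only an illustration under stated assumptions: the node names and the per-node hotfix lists are hypothetical data you would gather yourself, and the function simply reports which nodes lack the hotfix.

```python
REQUIRED_HOTFIX = "KB970529"


def nodes_missing_hotfix(installed_by_node, hotfix=REQUIRED_HOTFIX):
    """Return the sorted names of nodes whose hotfix list lacks the hotfix.

    installed_by_node: hypothetical mapping of node name -> iterable of
    installed hotfix IDs, collected by whatever means you already use.
    """
    wanted = hotfix.upper()
    return sorted(
        node
        for node, hotfixes in installed_by_node.items()
        # Compare case-insensitively so "kb970529" also counts.
        if wanted not in {item.upper() for item in hotfixes}
    )
```

An empty result means every listed node reports the hotfix; it says nothing about nodes left out of the input.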

Hyper-V cluster performance issue No. 4: IT administrative errors
Some Hyper-V cluster performance problems are not the vendor's fault or the result of unexpected failures. At times, IT administrative errors happen, and you need to take the blame.

In Hyper-V R1, there are complex requirements for Quick Migration, such as having a LUN for each VM. In one cluster, I have more than 100 VMs, meaning there are more than 100 LUNs of varying sizes. On top of that, each LUN is presented to six nodes, so the VM LUNs can mount on any node.

A problem occurs, however, if a LUN isn't presented to every node. One time, I had a handful of VMs that would not move to a particular host. The host was new, so I suspected a firmware or driver issue. A VM would go into a saved state and unmount its disk. Then, when the cluster tried to move the LUN to the new host, the mount would fail and the VM would bounce to another cluster node.

After the firmware and drivers checked out, I investigated the server configuration. It turned out that I had forgotten to present the older, existing VM LUNs to the new host. Because there was no Fibre Channel path to those LUNs, the new node could not mount them.
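A simple cross-check can catch this kind of oversight before a migration fails. The Python sketch below is a minimal illustration, not the process used here: it assumes you have exported, by hand or from your storage tooling, a hypothetical inventory of which LUN identifiers each node can see, and it reports every LUN that some node sees but another does not.

```python
def missing_lun_presentations(luns_by_node):
    """Return {node: [missing LUN IDs]} for nodes that lack any known LUN.

    luns_by_node: hypothetical mapping of node name -> iterable of LUN IDs
    presented to that node. A LUN counts as "known" if any node sees it.
    """
    # Union of every LUN seen anywhere in the cluster.
    all_luns = set()
    for luns in luns_by_node.values():
        all_luns.update(luns)

    gaps = {}
    for node, luns in luns_by_node.items():
        missing = all_luns - set(luns)
        if missing:
            gaps[node] = sorted(missing)
    return gaps
```

In the situation described above, the new host would show up in the result with the older VM LUNs listed against it. Running a check like this after adding a node or a LUN fits the re-certification step recommended below in spirit, though the inventory format here is purely an assumption.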

Luckily, this issue can be avoided with Hyper-V R2's Cluster Shared Volumes (CSV) or with a third-party product such as Melio FS, because these solutions do not rely on the one-LUN-per-VM architecture. Until there is a product that can catch everything that slips my mind, careful assessment and re-certification of virtual cluster environments after every change are necessary to prevent IT administrative errors.

Ultimately, for all the stability and redundancy that a Hyper-V cluster can add to a virtual environment, it does create a significant level of complexity as well. In my opinion, the trade-off is definitely worth it. But there are bound to be implementation shortcomings because of bugs or IT administrative errors. Knowing how to quickly stabilize your environment is a skill that needs to be developed.

In part four, I will focus on some strange virtual network issues and explain when it's necessary to take drastic action to recover from virtual network problems. Until then, send me any feedback or issues you have seen.

About the expert

Rob McShinsky is a senior systems engineer at Dartmouth Hitchcock Medical Center in Lebanon, N.H., and has more than 12 years of experience in the industry -- including a focus on server virtualization since 2004. He has been closely involved with Microsoft as an early adopter of Hyper-V and System Center Virtual Machine Manager 2008, as well as a customer reference. In addition, he blogs at VirtuallyAware.com, writing tips and documenting experiences with various virtualization products.


This was first published in December 2009
