Windows Failover Cluster: The importance of the coordinator node

The Windows Failover Cluster architecture is complex, but running certain tasks from the coordinator node can cut the time they take.

I'm a big fan of Hyper-V but not its underlying clustering technology. Windows Failover Clustering was originally designed as general-purpose clustering infrastructure, and its management concepts, such as the Windows Failover Cluster coordinator node, are hardly set-it-and-forget-it.

Hyper-V's clustered storage requirements for Live Migration bring the complexity of a Windows Failover Cluster into high relief. For virtual machine (VM) disk file storage, Hyper-V -- like its competitors -- requires a shared storage area network connection between virtual hosts. Unlike its competitors, however, Hyper-V has more than one shared-storage configuration: with or without Cluster Shared Volumes (CSV).

This tip explains how a Windows Failover Cluster works with disk resources and how running certain tasks from the coordinator node reduces the time they take.

How Cluster Shared Volumes works
To understand the coordinator node's role, you must understand CSV -- and to grasp CSV, you must also comprehend how a Windows Failover Cluster creates and works with disk resources.

Prior to the release of Windows Server 2008 R2 and CSV, disks that were part of a cluster had to be created as a disk resource within the Failover Cluster Manager console. In this configuration, each disk resource behaved as an individually managed failover unit. Disks, network names, IP addresses, and VM resources were linked together to create a chain of dependencies. When a problem occurred, this linkage allowed each VM resource to simultaneously fail over to another cluster node.

Without CSV, the cluster's failover unit was the disk resource. With the failover boundary set at the disk resource, the data on a disk failed over along with the disk itself. Consequently, any failover of a disk resource also moved every VM stored on that disk. For most IT shops, this behavior was not acceptable. To circumvent the problem, most administrators created a separate disk resource and logical unit number (LUN) for each VM.

CSV eliminates this issue by relocating the unit of failover to the files on a disk instead of to the entire disk. This process enables every VM's disk file to be the unit of failover. By placing VMs atop a CSV-enabled disk resource, it's possible to host multiple VM disk files on a single LUN and enjoy the benefits of individual VM failovers.
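The difference between the two failover units can be illustrated with a small model. This is only a conceptual sketch in Python, not anything the cluster service exposes: the `Disk` class, the node names and the VM names are all hypothetical, and a real failover involves far more than reassigning an owner.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """A clustered LUN and the VMs whose disk files live on it (hypothetical model)."""
    name: str
    owner: str                                # node that owns the disk resource
    vms: dict = field(default_factory=dict)   # VM name -> node currently running it


def fail_over_without_csv(disk, target_node):
    """Pre-CSV model: the disk is the failover unit, so every VM moves with it."""
    disk.owner = target_node
    for vm in disk.vms:
        disk.vms[vm] = target_node
    return sorted(disk.vms)                   # names of all VMs that moved


def fail_over_with_csv(disk, vm, target_node):
    """CSV model: a single VM moves; the disk owner and the other VMs stay put."""
    disk.vms[vm] = target_node
    return [vm]


# Same three VMs on one LUN, failed over under each model.
legacy = Disk("LUN1", "NodeA", {"vm1": "NodeA", "vm2": "NodeA", "vm3": "NodeA"})
moved_legacy = fail_over_without_csv(legacy, "NodeB")

shared = Disk("CSV1", "NodeA", {"vm1": "NodeA", "vm2": "NodeA", "vm3": "NodeA"})
moved_shared = fail_over_with_csv(shared, "vm2", "NodeB")

print("Without CSV, VMs moved:", moved_legacy)
print("With CSV, VMs moved:", moved_shared, "| disk owner:", shared.owner)
```

In the first case all three VMs follow the disk to the new node. In the second, only `vm2` moves, while the disk resource itself is still owned by the original node.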

The role of the coordinator node
At this point, the Windows Failover Cluster coordinator node becomes important. In a CSV-enabled configuration, individual files on a disk resource can be owned by different cluster members. But the disk resource that contains these files must also be owned by a cluster member. Microsoft calls the disk resource owner the coordinator node.

You won't use the coordinator node often. Nearly every VM-to-disk operation occurs from the cluster node that owns the VM directly to its disk. But certain operations must go through a coordinator node, such as copying Virtual Hard Disk (VHD) files to a LUN. This action tends to be disk-intensive, and it can take a long time.

Always copy VHD files to a LUN from the coordinator node. Any cluster node can start the copy, and it appears to complete transparently, but behind the scenes the work is handed off to the coordinator node. As a result, if you initiate the action on a noncoordinator node, it takes more time to complete.

You can alter which server operates as the coordinator node by changing the ownership of the disk resource. Normally, with CSV enabled, you won't necessarily need to do this task. If you have many file-copy operations, however, save yourself time by using the right node to start your work.
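The decision described above amounts to a simple lookup: find out which node owns the disk resource, then start the copy there. The Python sketch below models that logic only. The volume and node names are hypothetical, and the "extra hop" is a simplification of the handoff rather than a description of how the cluster actually moves the data.

```python
def coordinator_for(volume_owners, volume):
    """Return the node that owns the disk resource (the coordinator node)."""
    return volume_owners[volume]


def copy_path(volume_owners, volume, start_node):
    """List the nodes a VHD copy passes through when started on start_node."""
    coordinator = coordinator_for(volume_owners, volume)
    if start_node == coordinator:
        return [coordinator]              # started on the owner: no handoff
    return [start_node, coordinator]      # handed off: one extra hop


def move_ownership(volume_owners, volume, new_owner):
    """Changing the disk resource's owner changes the coordinator node."""
    volume_owners[volume] = new_owner


owners = {"Volume1": "NodeA"}             # hypothetical volume and node names

slow = copy_path(owners, "Volume1", "NodeB")   # started on a noncoordinator node
fast = copy_path(owners, "Volume1", "NodeA")   # started on the coordinator node

# Alternative: make NodeB the coordinator, then copy from NodeB.
move_ownership(owners, "Volume1", "NodeB")
after_move = copy_path(owners, "Volume1", "NodeB")

print("From NodeB before the move:", slow)
print("From NodeA before the move:", fast)
print("From NodeB after the move:", after_move)
```

On a real cluster, the FailoverClusters PowerShell module provides `Get-ClusterSharedVolume` to show each volume's owner node and `Move-ClusterSharedVolume` to change it; check the current owner before starting a large copy.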

Greg Shields

Greg Shields is an independent author, instructor, Microsoft MVP and IT consultant based in Denver. He is a co-founder of Concentrated Technology LLC and has nearly 15 years of experience in IT architecture and enterprise administration. Shields specializes in Microsoft administration, systems management and monitoring, and virtualization. He is the author of several books, including Windows Server 2008: What's New/What's Changed, available from Sapien Press.

