Server cluster sizing: How many VMs is too many?

Even server clustering has its limits. Hypervisors and hardware can restrict how many hosts and VMs a server cluster can hold.

As server virtualization hardware becomes more powerful, hosting numerous virtual machines (VMs) in a server cluster becomes increasingly practical.

Even so, one has to wonder whether it's better to create one massive server cluster to host all your VMs or a series of smaller server clusters with a more reasonable number of VMs per host. Hardware and network connectivity capabilities can affect the size of a server cluster. Plus, hypervisors themselves come with server clustering restrictions.

Server clustering limitations

As you create a server cluster, the first step is to determine your hypervisor’s server clustering capacity. Every virtualization platform has limitations, so ensure that your planned server cluster falls within these boundaries.

A Microsoft Hyper-V server cluster, for instance, can contain up to 16 nodes (host servers), and each host can support up to 384 VMs with a total cluster limit of 1,000 VMs. VMware vSphere 4.1 can accommodate up to 32 nodes in a high-availability server cluster, with up to 320 VMs per host or 3,000 VMs per cluster.
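Note that the per-host and per-cluster limits interact: the effective capacity is whichever ceiling you hit first. A minimal sketch, using only the figures quoted above, makes this concrete:

```python
# Illustrative sketch (not vendor code): a cluster's effective VM capacity
# is bounded by both the per-host limit and the overall cluster limit.

def effective_capacity(nodes: int, vms_per_host: int, cluster_limit: int) -> int:
    """Return the maximum number of VMs a cluster can hold given both limits."""
    return min(nodes * vms_per_host, cluster_limit)

# Hyper-V figures from the article: 16 nodes, 384 VMs per host, 1,000 per cluster.
print(effective_capacity(16, 384, 1000))   # 1000 -- the cluster cap, not 16 * 384 = 6,144

# vSphere 4.1 figures: 32 nodes, 320 VMs per host, 3,000 per cluster.
print(effective_capacity(32, 320, 3000))   # 3000
```

In both cases the cluster-wide cap, not the per-host maximum, is the binding constraint once the cluster has more than a handful of nodes.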

Hardware considerations

It’s equally important to consider host limitations, because the hardware often limits the size of a server cluster -- most notably the number of physical network interface cards (NICs) installed in each node.

NIC requirements vary from one virtualization platform to another. But as a general rule, each cluster node requires a minimum of four NICs. One NIC carries hypervisor management traffic and isn't used by the VMs at all. Another provides network connectivity to the VMs. The third NIC connects cluster nodes to one another so they can detect failures, and the fourth is dedicated to accessing shared storage devices via iSCSI or Fibre Channel.

But in many cases, a large server cluster requires additional NICs. To provide redundancy and avoid a single point of failure, you would typically provide extra NICs for VMs to use.

This approach can get tricky, however, because server hardware generally supports a limited number of NICs. Sure, you’ll get more bang for your buck with multi-port NICs or NICs that support 10 Gb speeds. But ultimately the number of NICs that each cluster node can accommodate will be a limiting factor.

Suppose you’re creating a Hyper-V server cluster and want to achieve the maximum density of 384 VMs per cluster node. To provide adequate performance, odds are you’ll need more network bandwidth for VMs than what a few 10 Gb NICs can deliver.
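A quick back-of-the-envelope calculation shows why. The sketch below assumes, hypothetically, that two 10 Gb NICs are dedicated to VM traffic; the NIC counts and speeds are illustrative, not a recommendation:

```python
# Back-of-the-envelope sketch: average bandwidth available per VM at
# maximum density, ignoring protocol overhead and traffic bursts.

def mbps_per_vm(nic_count: int, nic_gbps: int, vm_count: int) -> float:
    """Average Mbps per VM when vm_count VMs share nic_count NICs."""
    return nic_count * nic_gbps * 1000 / vm_count

# Hypothetical setup: two 10 Gb NICs shared by 384 VMs on one Hyper-V node.
print(round(mbps_per_vm(2, 10, 384), 1))  # 52.1
```

Roughly 52 Mbps per VM on average, and far less under contention, which is why maximum VM density often demands more NIC ports than the server chassis can physically hold.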

Other network considerations

Limited network connectivity can cause other server clustering issues as well. In virtualization clusters, all VMs’ virtual hard drive files must reside on shared storage that is accessible to all server cluster nodes. The problem is, if you connect to the shared storage through a single Fibre Channel adapter or a single NIC (using iSCSI), that adapter can become a single point of failure.

You can always use multiple adapters to provide redundant connections from a server cluster node to a shared storage device, but doing so can have consequences in large server clusters. In 2010, Hewlett-Packard Co. even issued an advisory, stating that failures could occur if the HP P4000 SAN was used in large Windows Server 2008 or Windows Server 2008 R2 clusters.

This server clustering problem stems from the number of iSCSI sessions that the device can accommodate. If cluster nodes, multipath I/O NIC ports and storage nodes collectively result in more than 31 iSCSI connections per volume, then failures can occur. When you consider that a VMware ESX cluster could potentially consume 32 iSCSI connections even without redundancy, you begin to see why this is such a problem.
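One plausible way to model that math is to multiply the contributing factors together; the formula below is an assumption for illustration, not HP's published sizing formula:

```python
# Hypothetical sketch of the iSCSI session math described above: sessions
# per volume grow with cluster nodes, MPIO NIC ports and storage nodes.

def iscsi_sessions_per_volume(cluster_nodes: int, mpio_ports: int,
                              storage_nodes: int) -> int:
    """Estimated iSCSI sessions a single volume must sustain (assumed model)."""
    return cluster_nodes * mpio_ports * storage_nodes

# A 32-node ESX cluster with a single path to a single storage node already
# needs 32 sessions per volume -- past the original 31-session limit.
print(iscsi_sessions_per_volume(32, 1, 1))  # 32

# Add a second MPIO port for redundancy and the count doubles.
print(iscsi_sessions_per_volume(32, 2, 1))  # 64
```

Even without any redundancy, a maximum-size cluster exhausts the original limit; adding redundant paths consumes the patched 64-session limit just as quickly.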

HP released a patch that increased the iSCSI session-per-volume limit from 31 to 64, but even this limitation can prove to be problematic. After all, many organizations store multiple virtual hard drives on each logical unit number (LUN) to avoid having to manage an excessive amount of LUNs. As a result, a single VM can potentially establish multiple iSCSI connections to a single volume.

The question of whether it's better to create a single large cluster or multiple smaller server clusters comes down to your infrastructure's hardware capabilities. Even though each hypervisor publishes server cluster sizing limits, hardware constraints often make it impractical to actually reach them.

More server clustering resources

  • High-availability and clustering solutions for vSphere VMs
  • Reserving cluster resources to boost high availability
  • High-availability clustering in virtual data centers
  • Tips for building a Hyper-V high availability cluster
  • Server cluster high-availability gotchas
