Create a high-availability strategy to prevent system failure

High availability is an incredibly valuable concept, one that helps safeguard systems and components against failure. Improve your high-availability strategy with these tips.

If you were to ask administrators to describe their worst nightmare, they'd probably tell you it's a system failure. Outages are costly: they waste both customers' and technicians' time, consume valuable resources and can result in lost business.

That's where high availability (HA) comes in. High availability ensures a system or service remains operational, even in the event of a system failure. HA can be achieved in any number of ways, including failover, redundant disk arrays, data storage, backup and more. When implemented correctly, it can give even the most restless administrator some peace of mind.

Figure out where HA fits into your infrastructure, how to create a high-availability strategy and how to test HA with these five tips.

Do you really need high availability?

Although HA is always useful, it isn't always necessary. In fact, there are a few situations in which the cost of HA outweighs its benefits. Though the functionality HA provides is valuable, products such as VMware High Availability and Distributed Resource Scheduler are expensive and can strain your organization's budget. To keep costs down, ask yourself before investing in an HA product: "Will high availability really make a difference to my bottom line?" You might be surprised: Some servers, including virtual desktops and servers with a long recovery time objective, don't really need HA at all.

Don't be afraid to be redundant

So, you've decided that HA is indispensable to your virtual infrastructure -- what now? Once you've implemented a high-availability strategy, the next step is to make sure it works to the best of its ability, and that means giving it a boost whenever and wherever possible. One way to ensure HA is to determine the right level of redundancy for your infrastructure.

In a nutshell, redundancy creates a backup of a component, increasing resiliency. Certain levels of redundancy, such as N+1, provide only a single independent backup component, while more complex permutations, such as N+2+1, can sustain a greater number of host failures. By determining the appropriate level of redundancy, you can make sure all systems remain online in the event of a failure, thereby increasing availability.
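The arithmetic behind N+M redundancy can be sketched in a few lines. This is a simplified illustration, not a sizing tool: it assumes the workload needs N healthy hosts and treats the extra M hosts as interchangeable spares.

```python
# Simplified sketch: does an N+M cluster survive a given number of host
# failures? The workload needs N hosts; M spares absorb failures.

def survives_failures(total_hosts: int, spare_hosts: int, failed_hosts: int) -> bool:
    """Return True if the workload stays online after failed_hosts go down.

    An N+M cluster has total_hosts = N + M; the workload survives as long
    as at least N hosts remain.
    """
    required = total_hosts - spare_hosts  # the "N" in N+M
    return (total_hosts - failed_hosts) >= required

# An N+1 cluster of 5 hosts (4 required, 1 spare) tolerates one failure:
print(survives_failures(5, 1, 1))  # True
print(survives_failures(5, 1, 2))  # False

# An N+2 cluster of 6 hosts tolerates two simultaneous failures:
print(survives_failures(6, 2, 2))  # True
```

The point of the exercise: picking a redundancy level is choosing how many simultaneous host failures you are willing to pay to survive.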

Microsoft SCVMM makes availability easy

If you're a Hyper-V user, you already know the Microsoft hypervisor makes achieving HA easy. However, for administrators looking for a more tailored high-availability strategy, Microsoft has designed System Center Virtual Machine Manager (SCVMM). SCVMM provides a few options for users to gain greater control over availability, including VM prioritization and availability sets.

SCVMM's VM prioritization feature allows the user to indicate which VMs are the most valuable; these selected VMs are given preferential treatment in a failure situation, guaranteeing they remain highly available. Availability sets come in handy when working with guest clusters, because they let you create partitions between guest cluster nodes. This prevents a single host-level failure from taking out multiple guest cluster nodes, protecting the highly available VMs within each guest cluster.
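Conceptually, VM prioritization just determines restart order after a failure. The sketch below illustrates the idea in plain Python; it is not SCVMM's actual API, and the VM names and priority values are made up for the example.

```python
# Illustrative sketch (not SCVMM's API): after a host failure, restart the
# most valuable VMs first. Lower priority number = more important.
from dataclasses import dataclass

@dataclass
class VM:
    name: str
    priority: int  # e.g. 1 = High, 2 = Medium, 3 = Low

def restart_order(vms: list[VM]) -> list[str]:
    """Return VM names in the order they should be brought back online."""
    return [vm.name for vm in sorted(vms, key=lambda v: v.priority)]

vms = [VM("file-server", 3), VM("sql-primary", 1), VM("web-frontend", 2)]
print(restart_order(vms))  # ['sql-primary', 'web-frontend', 'file-server']
```

In a real failure, host capacity may be scarce, so restarting in priority order means the business-critical VMs claim resources before anything else does.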

Guest clustering adds protection

A guest cluster is a type of failover cluster consisting of two or more VMs grouped together, and it's designed to implement HA at the VM level by safeguarding against system failure. Guest clustering utilizes Cluster Shared Volumes, shared storage that allows each guest cluster node to access the same storage resources. If one guest cluster node fails, another takes its place; because every node shares the same resources, business carries on as usual, eliminating costly downtime.

Guest clustering can also protect your VMs from physical host failure, because you can place VMs that are part of one guest cluster on multiple physical hosts. Again, should one physical host fail, the guest cluster will detect the failure and start the clustered role -- a clustered application deployed within a VM -- on another physical host, protecting the workload and ensuring availability. Guest clustering in Microsoft Hyper-V is a great addition to your high-availability strategy and can provide a valuable layer of protection against service interruptions.

Testing high-availability servers

HA is an asset to any virtual infrastructure, but maintaining it can be a real headache. Administrators face a difficult decision: regularly test server HA to make sure everything's running smoothly -- and risk service disruption in the process -- or put their faith in server HA reliability and risk a failure going undetected.

There are a few considerations every admin should take into account before making that decision: How much of an effect will an HA system failure have on your business? How much of an effect will a failed live test have on your users? What level of uptime do you require from your HA servers? The answers to these questions can help determine how frequently you need to test server HA, and how you should go about conducting those tests.
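The uptime question can be made concrete with simple arithmetic: a required uptime percentage translates directly into a yearly downtime budget, which both real outages and failed live tests draw from. A back-of-the-envelope helper, assuming a simple yearly SLA model:

```python
# Convert a required uptime percentage into the downtime budget (minutes
# per year) available for outages and failed HA tests combined.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 (ignoring leap years)

def downtime_budget_minutes(uptime_pct: float) -> float:
    """Minutes per year a service may be down while still meeting the SLA."""
    return MINUTES_PER_YEAR * (1 - uptime_pct / 100)

for sla in (99.0, 99.9, 99.99):
    print(f"{sla}% uptime -> {downtime_budget_minutes(sla):.1f} min/year")
```

A 99.9% SLA leaves roughly 526 minutes of downtime a year; at 99.99%, the budget shrinks to under an hour, which argues for testing failover in a lab or maintenance window rather than live.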

Next Steps

VM clusters and HA eliminate need for disaster recovery

Is your data center ready for high-performance computing?

Guarantee availability with redundant cloud storage 
