A major mistake in disaster recovery planning is to consider primary and secondary sites in isolation. Any effective business continuity strategy considers both disaster recovery sites and their resources in tandem.
In part one of this series on creating Microsoft Hyper-V disaster recovery sites, we covered various failures against which you should protect your virtualized environment. The next step -- architecting a Hyper-V disaster recovery site -- requires some careful planning.
Disaster recovery site storage and networking
First, you need another storage device at your disaster recovery site. It must be large enough to hold all the virtual machines (VMs) you intend to protect, along with their data. You may not need this storage for every VM in your environment, because some workloads may not be worth disaster-proofing. For each VM you do protect, however, the storage device must accommodate that VM's additional disks and data, such as databases or mail stores.
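As a rough illustration of that sizing exercise, the sketch below totals the disks of only the VMs you choose to protect and adds growth headroom. The VM names, disk sizes and the 25% headroom figure are hypothetical placeholders, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    system_disk_gb: float
    data_disks_gb: float  # additional disks: databases, mail stores, etc.
    protected: bool       # True if this VM must survive a disaster


def required_dr_storage_gb(vms, headroom=0.25):
    """Sum the disks of protected VMs and add headroom (assumed 25%)."""
    protected_total = sum(
        vm.system_disk_gb + vm.data_disks_gb for vm in vms if vm.protected
    )
    return protected_total * (1 + headroom)


# Hypothetical inventory; replace with figures from your own environment.
inventory = [
    VM("sql01", 80, 500, True),
    VM("mail01", 80, 1000, True),
    VM("test01", 60, 40, False),  # not worth disaster-proofing
]

print(f"DR storage needed: {required_dr_storage_gb(inventory):.0f} GB")
```

Leaving the unprotected test VM out of the total is what keeps the disaster recovery storage device smaller than the primary one.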
You also need the necessary networking infrastructure so that the cluster equipment at both sites can communicate. The link between sites must perform well enough that replication can occur without much data queuing. Most of today's replication vendors can monitor the rate of change of virtual workloads and estimate the necessary bandwidth. It's important to perform these calculations during the planning process. Otherwise, you may find that your available bandwidth can't handle your replication needs.
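If your vendor's tool is not yet in place, a back-of-the-envelope estimate can still be useful during planning. The sketch below converts a measured daily change rate into an average link speed. The 20% protocol overhead and the sample figure of 100 GB per day are assumptions for illustration only.

```python
def required_bandwidth_mbps(changed_gb_per_day,
                            replication_window_hours=24.0,
                            overhead=0.2):
    """Average link speed (megabits per second) needed to ship one day's
    changed data within the replication window.

    Uses decimal units (1 GB = 10**9 bytes). The overhead factor is an
    assumed allowance for protocol and retransmission costs.
    """
    bits_to_send = changed_gb_per_day * 8 * 10**9
    window_seconds = replication_window_hours * 3600
    return bits_to_send / window_seconds / 10**6 * (1 + overhead)


# Hypothetical workload: 100 GB of changed blocks per day.
print(f"{required_bandwidth_mbps(100):.1f} Mbps averaged over 24 hours")
print(f"{required_bandwidth_mbps(100, replication_window_hours=8):.1f} Mbps "
      "if replication is limited to an 8-hour window")
```

This is only an average. Change rates are rarely even across the day, so a link sized this way can still build up a queue during busy periods.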
Storage replication technologies tend to involve one of two mechanisms and are generally installed at one of two locations. With four possible combinations, choose the one that meets your needs. The list below summarizes each option to get you started:
- Synchronous replication. In this method, disk storage changes must be confirmed at both sites before a subsequent change is committed. This process, however, can slow disk operations (sometimes dramatically). Synchronous replication usually requires short distances between sites and extremely high bandwidth, but if your data preservation needs are high, this disaster recovery approach is for you.
- Asynchronous replication. Unlike synchronous replication, asynchronous replication allows a batch of changes to be queued and sent when conditions are optimal. If the primary site goes down, any changes still waiting in the queue are lost, but the loss tends to be small. The benefit is that you avoid the performance and distance limitations of synchronous replication.
- Installed to the storage. You also have two choices for where to host your storage replication software. The first is on the storage itself or on a hardware appliance that works in conjunction with the storage. The software usually exists on your hardware already, so you probably won't need to install it, but you will need to enable it. Be aware, however, that storage-based technologies can cause VM or application corruption if they are not properly integrated into each VM's OS. For that reason, storage replication solutions that add host- or VM-based agents can be important for maintaining application data integrity during replication.
- Installed to a host or VM. Alternatively, you can install solutions to a Hyper-V host or its VMs. These software-based technologies usually handle replication by inserting themselves into a host's or VM's file system. From there, the software can capture the changes to a host's or VM's disks and shuttle them off to the disaster recovery site. This approach verifiably ensures data and application integrity, but it may not scale well, because it involves multiple clients on multiple hosts or VMs sending data over the wire to your disaster recovery site. Also, depending on the technology, this approach can cost more because some vendors charge by the replicated VM.
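The tradeoff between the two replication mechanisms can be put into rough numbers. The sketch below is a simplified model, not a description of any vendor's product. It assumes signals travel through fiber at about 200 km per millisecond and ignores equipment delays and remote disk write time. All sample figures are hypothetical.

```python
FIBER_KM_PER_MS = 200.0  # approximate signal speed in optical fiber


def round_trip_ms(distance_km):
    """Best-case network round trip between sites (propagation delay only)."""
    return 2 * distance_km / FIBER_KM_PER_MS


def sync_write_latency_ms(local_write_ms, distance_km):
    """Synchronous: each write waits for the remote site to confirm it."""
    return local_write_ms + round_trip_ms(distance_km)


def async_data_at_risk_mb(change_rate_mb_per_s, seconds_since_last_send):
    """Asynchronous: changes queued since the last send are lost if the
    primary site fails before they are shipped."""
    return change_rate_mb_per_s * seconds_since_last_send


# Hypothetical figures: 0.5 ms local writes, 2 MB/s of changes,
# batches shipped every 5 minutes.
for km in (10, 100, 1000):
    print(f"{km:>5} km: synchronous write takes at least "
          f"{sync_write_latency_ms(0.5, km):.1f} ms")
print(f"Asynchronous exposure: up to {async_data_at_risk_mb(2, 300):.0f} MB")
```

Even in this best case, the per-write penalty of synchronous replication grows with distance, while the asynchronous exposure depends only on how much change accumulates between sends.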
Always remember that Microsoft's built-in Distributed File System Replication solution is not compatible with Windows Failover Clustering. You need a third-party solution to accomplish the necessary replication.
Choosing servers for a disaster recovery site
You need enough server equipment at your disaster recovery site to handle failed-over VMs. Remember that you may not want to fail over every VM in a disaster situation. Start by considering at least as many Hyper-V hosts for your disaster recovery site as you have at your primary site. Then work downward from this number as you identify which VMs you can afford to lose when a disaster hits -- but ensure that you plan for future expansion.
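One way to work downward from the primary site's host count is to size by memory, which is often the first resource to run out on a virtualization host. The sketch below makes that simplifying assumption and ignores CPU, storage and network limits. The growth factor, the per-host reserve and the VM sizes are hypothetical.

```python
import math


def dr_hosts_needed(protected_vm_memory_gb, host_memory_gb,
                    growth=0.2, host_reserve_gb=4.0):
    """Estimate Hyper-V hosts needed at the DR site, by memory alone.

    growth: assumed allowance for future expansion (20%).
    host_reserve_gb: assumed memory kept back for the host OS.
    """
    usable_gb = host_memory_gb - host_reserve_gb
    if usable_gb <= 0:
        raise ValueError("host has no memory left for VMs")
    demand_gb = sum(protected_vm_memory_gb) * (1 + growth)
    return max(1, math.ceil(demand_gb / usable_gb))


# Hypothetical: four protected VMs on hosts with 64 GB of memory each.
print(dr_hosts_needed([16, 32, 8, 8], host_memory_gb=64))
```

Dropping a VM from the protected list shrinks the demand figure, which is how the host count falls below what you run at the primary site.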
Finally, and arguably most important, remember that a disaster recovery site is for protecting yourself from a disaster. This means that your primary and secondary sites should be far enough apart that the disaster you're protecting against won't take out both simultaneously.
So, for example, if a bank in Kansas intends to protect against a tornado, it can probably put its disaster recovery site on the other side of town. On the other hand, a coastline business that wants to protect against a hurricane needs to look for options much farther inland.
When determining the location of a secondary disaster recovery site, consider the networking and performance needs between the two sites. Closer sites can use less expensive networking solutions than those that are farther away.
Planning for a disaster is absolutely key, but what you really want is a fully functional solution for protecting virtual machines. Once your planning exercises are done, you can implement Microsoft's solutions for Hyper-V disaster recovery fairly easily. And that's the subject of the next two articles in this series.