Disaster avoidance and recovery strategies

Virtualization capabilities can enhance a company’s disaster avoidance and recovery process, but companies must weigh business needs to decide which approach is right.

In almost all modern deployments, disaster planning involves the interaction of multiple data center sites. The main difference is how those multiple sites coexist. Disaster recovery deployments normally rely on a passive secondary site to support the active primary site (called an active/passive configuration). This means the secondary (backup) data center does almost no work until disasters happen. Secondary sites can be cold, warm or hot – all relying on backups and snapshots to preserve critical workload data.

A cold site typically includes little (if any) hardware set up and ready to go, and may not host current backups. When disasters occur, it can take considerable time to set up, configure, restore and activate a cold site. Given the recovery needs of modern businesses, cold sites are rarely deployed. In a warm secondary site, remote servers are typically set up but left powered off, and only the storage arrays are working to receive backups and snapshots. When disaster strikes, the server hardware can be loaded with the latest backups or VM snapshots and then brought online. This means it takes far less time to complete a restoration at a warm site than at a cold site. A hot secondary site keeps servers on and synchronized with the primary server workloads (though they’re not doing any computing). Hot sites support the shortest recovery time objectives because the secondary servers are essentially ready to go, but they also cost more because of the extra running hardware and the work of keeping the workloads synchronized.
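The trade-off between recovery time and cost across cold, warm and hot sites can be sketched as a simple selection rule. The snippet below is only an illustration: the `SiteTier` type, the recovery-hour figures and the cost ranks are hypothetical placeholders, not measurements, and any real decision would use figures from the organization's own environment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SiteTier:
    name: str
    assumed_recovery_hours: float  # hypothetical figure, for illustration only
    relative_cost: int             # hypothetical rank: 1 = least expensive


# Placeholder values; real recovery times and costs vary widely.
TIERS = [
    SiteTier("cold", 72.0, 1),
    SiteTier("warm", 8.0, 2),
    SiteTier("hot", 0.5, 3),
]


def cheapest_tier_meeting_rto(rto_hours: float) -> Optional[SiteTier]:
    """Return the least expensive tier whose assumed recovery time fits the RTO.

    Returns None when even a hot site cannot meet the objective, which is the
    point where an active/active (avoidance) design may be worth considering.
    """
    candidates = [t for t in TIERS if t.assumed_recovery_hours <= rto_hours]
    if not candidates:
        return None
    return min(candidates, key=lambda t: t.relative_cost)
```

With these placeholder numbers, a 12-hour recovery time objective would select the warm tier, while an objective shorter than the assumed hot-site recovery time returns `None`.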

Disaster avoidance deployments generally adopt an active secondary site in conjunction with the active primary site (an active/active configuration). This means the secondary data center can host active workloads and share computing tasks in real time. Disaster avoidance deployments normally rely on VM migration and clustering technologies (such as Stratus Technologies’ everRun) along with highly resilient server hardware. When disaster occurs, the secondary site can continue working without perceivable disruption.

Disaster avoidance technology
There is no single suite of hardware or software technologies in a disaster avoidance deployment, but there are several important practices that are commonly adopted for avoidance purposes.

In terms of physical hardware, a well-implemented disaster avoidance infrastructure usually starts with enterprise-class, fault-tolerant, highly resilient clustered servers supported by uninterruptible power supplies, emergency generators and backup cooling systems. Network connectivity between servers and switches is usually redundant and supports trunking and failover behaviors. Storage arrays are normally mirrored or replicated between sites. Disaster avoidance is also enhanced with carefully designed physical security and environmental monitoring measures to prevent theft or damage to facilities and systems. For example, a flood sensor or fire alarm can trigger a failover to a secondary site long before any actual disruption occurs – mitigating the possibility of data loss.
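The sensor-driven failover described above might look something like the following sketch. Everything here is an assumption made for illustration: the `SiteMonitor` class, the site names and the set of alarm types are hypothetical, and a real deployment would call into its clustering or orchestration software rather than swap two strings.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical set of alarms treated as early warnings of a site outage.
FAILOVER_TRIGGERS = {"flood", "fire"}


@dataclass
class SiteMonitor:
    active_site: str = "primary"
    standby_site: str = "secondary"
    log: List[str] = field(default_factory=list)

    def on_sensor_alert(self, site: str, sensor: str) -> str:
        """Fail over proactively if a trigger alarm fires at the active site.

        Alerts from the standby site, or from sensors outside the trigger
        set, are recorded but do not move workloads.
        """
        self.log.append(f"{sensor} alert at {site}")
        if site == self.active_site and sensor in FAILOVER_TRIGGERS:
            self.active_site, self.standby_site = self.standby_site, self.active_site
            self.log.append(f"failed over to {self.active_site}")
        return self.active_site
```

The check against the active site matters: once workloads have moved, further alarms from the evacuated site should not bounce them back.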

Disaster avoidance deployments also rely on virtualization platforms for fast, hardware-agnostic workload provisioning and convenient migration. They pair those platforms with comprehensive workload failover software, which maintains synchronization between duplicate VMs across data centers and supports failover when one of the workload instances is disrupted. Such VM clustering can also detect when failed hardware is restored and rebalance workloads to optimize performance.
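As a rough illustration of failover followed by rebalancing, the sketch below models workload placement as a mapping from VM name to site. The function names and the round-robin policy are assumptions chosen for simplicity; actual clustering products weigh load, affinity and capacity, and none of this reflects any particular vendor's behavior.

```python
from typing import Dict, List


def fail_over(placement: Dict[str, str], failed_site: str, surviving_site: str) -> Dict[str, str]:
    """Return a new placement with every VM from the failed site moved to the survivor."""
    return {
        vm: (surviving_site if site == failed_site else site)
        for vm, site in placement.items()
    }


def rebalance(placement: Dict[str, str], sites: List[str]) -> Dict[str, str]:
    """Spread VMs across the available sites round-robin, in sorted VM-name order.

    A deliberately naive policy, used here only to show the idea of
    redistributing workloads once a failed site is back in service.
    """
    return {vm: sites[i % len(sites)] for i, vm in enumerate(sorted(placement))}


# Example: site A fails, then is restored.
placement = {"vm1": "A", "vm2": "B", "vm3": "A"}
during_outage = fail_over(placement, failed_site="A", surviving_site="B")
after_restore = rebalance(during_outage, sites=["A", "B"])
```

During the outage every VM runs at site B; once site A returns, the rebalance step spreads them across both sites again.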

Remember that disaster preparedness and deployment decisions are not a one-time endeavor. It’s important to revisit disaster plans periodically and re-evaluate those plans against ever-changing technological, competitive and regulatory landscapes. For example, organizations with tightening recovery point and time objectives and regulatory demands might take advantage of improving Internet bandwidth and virtualization tools to implement an avoidance infrastructure that may have been technically impractical (or prohibitively expensive) just a few years ago.
