Despite slow adoption, vendors continue to pitch the value of fault-tolerant hardware for server virtualization environments.
Some three years after predictions that server virtualization would drive widespread purchases of fault-tolerant (FT) servers, virtualization architects have yet to migrate to FT hardware en masse.
Fault-tolerant servers contain dual sets of hardware components -- such as memory modules and processor buses -- within one box and keep the two sides in sync within the same server chassis. The approach is analogous to the way clusters of multiple servers synchronously replicate data between nodes. If a component inside an FT server fails, its counterpart takes over and the server continues to run without having to fail workloads over to another host.
FT hardware maker NEC Corp. of America has made the "more eggs in one basket" value proposition for FT hardware in server virtualization environments. Last month NEC certified the fault-tolerant Express 5800 series R320a server with VMware vSphere. Company officials acknowledge that the initial price for fault-tolerant hardware remains higher than that for off-the-shelf servers -- though less than the $250,000 asking price for some FT hardware earlier this decade. NEC's $53,000 list price for the R320a still dwarfs an estimated $18,000 for a commodity equivalent. Stratus Technologies has also issued a press release on benchmark testing that, according to the company, showed "fault tolerant server hardware is uniquely able to provide the virtualized applications full access to the underlying hardware resources, including multiple CPU cores, for unhampered performance even in the case of catastrophic component failure."
Fault tolerant server cost analysis
When it comes to cost, vendors like NEC argue that maintaining fault tolerance in one box costs less over the long term than building out a networked infrastructure for multinode failover. NEC officials also contend that in some cases VMware's native FT feature may leave users vulnerable, such as when host state information must fail over along with guests.
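A rough way to frame that argument is to ask how much extra infrastructure spending it would take for a commodity failover cluster to match the price of one FT server. The sketch below uses only the two list prices quoted above; the `extra_infrastructure` input is a hypothetical placeholder for shared storage, networking and failover software, for which no figures are given here, and long-term maintenance is not modeled at all.

```python
# List prices quoted in the article (USD). Every other input is hypothetical.
FT_SERVER_LIST_PRICE = 53_000      # NEC Express 5800 R320a
COMMODITY_SERVER_PRICE = 18_000    # estimated commodity equivalent


def cluster_hardware_cost(nodes: int, extra_infrastructure: int = 0) -> int:
    """Up-front cost of a commodity failover cluster.

    extra_infrastructure is an assumed placeholder for shared storage,
    networking and failover software; the article gives no such figures.
    """
    if nodes < 2:
        raise ValueError("failover needs at least two nodes")
    return nodes * COMMODITY_SERVER_PRICE + extra_infrastructure


def break_even_infrastructure(nodes: int) -> int:
    """Extra spend at which the cluster costs as much as one FT server.

    A negative result means the servers alone already exceed the FT price.
    """
    return FT_SERVER_LIST_PRICE - nodes * COMMODITY_SERVER_PRICE


if __name__ == "__main__":
    for nodes in (2, 3):
        print(nodes, cluster_hardware_cost(nodes), break_even_infrastructure(nodes))
```

On list price alone, a two-node commodity pair comes in $17,000 under the FT server, while a three-node cluster already exceeds it by $1,000. Under these assumptions, the vendors' case rests on whether the surrounding infrastructure and ongoing upkeep for a two-node pair cost more than that gap.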
In some shops, high-pressure use cases give FT hardware an advantage. Michael LaForge, a network administrator at Columbia Memorial Hospital in Albany, N.Y., said he used Stratus FT hardware when the hospital began hosting applications for outside physicians. "It put us into a different realm; we weren't just responsible for our clinical applications but for other doctors' applications," LaForge said. "At the time, ESX [VMware's virtualization hypervisor] didn't have high availability set up yet, so we went with the hardware fault-tolerance," which the hospital could also get up and running faster than a full virtual infrastructure.
It's also true, enterprise users say, that VMware FT is far from perfect, especially in its initial versions. "It's a 1.0 product," said Jason Boche, a senior systems engineer for a global media service. Today, VMware's mechanism for mirroring servers for failover, vLockstep, can be applied only to workloads using a single CPU; VMware's fault tolerance also can't be used within Distributed Resource Scheduler [DRS] clusters.
But that doesn't mean mainstream enterprise users have embraced hardware-based alternatives, either. "People are going to be looking at this from a cost-effectiveness standpoint: Does it cost more to buy a couple servers that have software-based fault tolerance or a single server that has hardware-based fault tolerance?" Boche said. "I personally haven't seen a piece of equipment yet that's indestructible. In my opinion, anything you plug into the wall and that has electronic components in it is subject to failure."
LaForge said the FT hardware used in the hospital's hosting environment hasn't been considered for the hospital's main virtualization environment. Instead, the hospital built out a separate infrastructure to host virtual servers and VDI for internal applications, and uses VMware's HA features for redundancy. "If it's our hospital applications where we can schedule stuff, we want to have as high availability as we can, but depending on what the application is, if it can be down for a few minutes or long enough to migrate from one virtualized host to another, we would go that route with it."
While server virtualization has made its mark on enterprises' lower-hanging fruit, it has yet to permeate the fault-tolerance market. For example, GE Healthcare, one of the largest partners of Hewlett-Packard Co. for NonStop FT servers, does not yet support virtualizing its Centricity Enterprise electronic-health-record application. While virtualization is part of the long-term roadmap for GE, "[the application] is not designed to be virtualized, nor do customers demand it," according to Gregory St. James, the company's director of international marketing.
Anemic FT uptake
Major analyst firms do not currently publish market research statistics on the rate of convergence between FT hardware and server virtualization. Still, analysts who track both markets say data center server virtualization has pushed buyers toward larger numbers of commodity servers rather than a smaller number of hardened physical devices.
"The overall trend in system designs involves more assumptions that failures will be dealt with in software," said Tony Iams, senior analyst at Ideas International. "Software designs are starting to make the assumption that hardware will fail."
Such products may not cover every failure, said IDC analyst Gary Chen, but for many users looking to cut costs by virtualizing their data centers, "good enough is good enough."
Beth Pariseau is a senior news writer for SearchServerVirtualization.com.