As part of its Virtual Datacenter Operating System (VDC-OS), VMware plans to deliver VMware Fault Tolerance, "zero downtime, zero data loss protection for all applications without the cost and complexity of clustering," VMware reported.
VMware Fault Tolerance works by taking a VM and making "an instruction mirror of the first" on a different physical box, explained Raghu Raghuram, VMware vice president of products and solutions.
That way, if the physical box experiences a failure, the mirrored virtual machine can continue normal operations. That's in contrast to VMware High Availability, which will reboot a VM in the event of a hardware failure.
"Think of it as applying the notion of VMotion to unplanned downtime," Raghuram said.
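The contrast between restarting a VM and mirroring it can be sketched with a toy model. The Python below is only an illustration under loud assumptions: a "VM" is reduced to a running total, "instructions" are integers fed to it in order, and the names ToyVM, run_with_restart and run_with_mirror are hypothetical. It is not VMware's implementation and uses no VMware API.

```python
from dataclasses import dataclass


@dataclass
class ToyVM:
    """Hypothetical stand-in for a VM: its only state is a running total."""
    total: int = 0

    def execute(self, value: int) -> None:
        # Deterministic "instruction": the same inputs always give the same state.
        self.total += value


def run_with_restart(inputs, fail_at):
    """Restart-style recovery: after the host fails, a fresh VM is booted,
    so in-memory state built up before the failure is lost."""
    vm = ToyVM()
    for step, value in enumerate(inputs):
        if step == fail_at:
            vm = ToyVM()  # reboot on another host: state starts from zero
        vm.execute(value)
    return vm.total


def run_with_mirror(inputs, fail_at):
    """Mirror-style recovery: a second VM on another host executes the same
    inputs in step, so it can take over with identical state."""
    primary, mirror = ToyVM(), ToyVM()
    for step, value in enumerate(inputs):
        if step == fail_at:
            primary = None  # host failure takes out the primary
        if primary is not None:
            primary.execute(value)
        mirror.execute(value)  # mirror always follows the same input stream
    return mirror.total if primary is None else primary.total


if __name__ == "__main__":
    work = [1, 2, 3, 4]
    print("restart:", run_with_restart(work, fail_at=2))
    print("mirror: ", run_with_mirror(work, fail_at=2))
```

In this sketch the restarted VM loses whatever it had accumulated before the failure, while the mirrored VM finishes with the same result it would have produced had nothing failed.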
At VMworld last year, former VMware chief scientist and co-founder Mendel Rosenblum first demonstrated what would become VMware Fault Tolerance and showed how it would give ordinary VMs levels of fault tolerance that have traditionally been reserved for only the most expensive server systems.
Currently in private beta, VMware Fault Tolerance will be production-ready in 2009, Raghuram said.
Citrix, Marathon team up
Meanwhile, in March 2008 Citrix announced its own fault-tolerance capability, which uses software from Marathon Technologies Corp.
Littleton, Mass.-based Marathon, a provider of fault-tolerant, high availability (HA) software for physical and virtual servers, also assisted Citrix in developing XenServer HA capabilities for the upcoming release of Citrix XenServer 5.
Through the combination of Citrix XenServer 5 and Marathon's everRun VM, Citrix XenServer Enterprise and Platinum Edition customers can get auto-restart high availability as a standard component at no extra charge.
"It is a failure detection technology that will move VMs as necessary to avoid downtime," said Jerry Melnick, the CTO from Marathon Technologies.
Figure 1: Maintaining the availability of Citrix XenServer VMs by creating redundant VMs
New features of everRun for XenServer 5 include support for more than two nodes in a pool, Melnick said. Marathon's everRun VM will also be compatible with OEM editions of XenServer software, he said.
Other benefits of the XenServer HA and everRun VM integration include the following:
- Streamlined setup and installation. IT administrators can install additional levels of availability from everRun VM within minutes.
- Seamless integration between different levels of availability. When administrators add everRun VM, it automatically identifies XenServer HA and provides them with a choice of availability levels.
- Simplified availability management. Configuration and management of all three levels are easily done from the everRun Availability Center (EAC) Web-based management interface.
EverRun VM comes in three levels.
- Level One, XenServer HA: Failover high availability comes standard with XenServer 5 Enterprise and Platinum Editions and offers failure detection, auto-restart and failover capabilities.
- Level Two, everRun VM: Customers get component-level fault tolerance.
- Level Three, the everRun VM Lockstep Option: This level, which will be available in Q1 2009, offers system-level fault tolerance for the most critical applications that require zero downtime and zero data loss.
Both the everRun VM availability upgrade for Citrix XenServer 5 customers and the everRun VM bundle (XenServer 5 Enterprise Edition and everRun VM with level-one and -two availability) will be available Oct. 30, 2008. In North America, suggested retail pricing for Marathon everRun VM is $2,000 per physical server. When bundled with Citrix XenServer 5 Enterprise Edition, pricing is $4,500 per physical server.
Fault tolerance goes soft
The need for fault-tolerant systems is nothing new. "Fault tolerance is essentially for customers that can't afford even short outages because they cost too much. This can be cost in monetary terms--such as casinos or financial institutions. Or it can be cost in human terms--such as medical or emergency dispatch systems," Illuminata Inc. analyst Gordon Haff said.
Fault-tolerant systems have long been offered by companies like Hewlett-Packard Co. and Maynard, Mass.-based Stratus Technologies, but Haff said all fault tolerance is moving in the direction of focusing entirely on software. For example, Stratus now has a software-based fault-tolerance product called Avance. The reason? "It's just too slow and expensive to develop [fault tolerant] customer hardware," Haff said.
Furthermore, fault tolerance becomes particularly important when many VMs reside on a single physical server.
"As people move into virtual environments, they are more concerned about downtime events," said George Hamilton, director of enterprise infrastructure at the Boston-based technology research and consulting firm Yankee Group Research Inc. "When one physical machine goes down, it affects many virtual machines running on it, so you want to make sure that infrastructure does not fail. If you are going to put all of your eggs in one basket, you'd better make sure that it is one hell of a basket."