
Marathon beats VMware to fault tolerance for SMP

In the event of an outage, Marathon's everRun MX now keeps applications that use symmetric multiprocessing in sync for failover. But as with VMware's FT, enterprises may approach it cautiously at first.

New software from Marathon Technologies promises fault tolerance for virtual machines that use multiple virtual CPUs, or symmetric multiprocessing. IT pros say the software may pave the way for virtualizing certain mission-critical applications.

Like VMware Fault Tolerance, or FT, Marathon's everRun software product line offers continuous availability for a virtual machine even in the event of a hardware failure. This is distinct from high availability (HA), which provides availability with minimal interruption. In the VMware universe, for example, VMware HA detects whether a virtual machine (VM) has gone down and reboots it on another host. In the case of FT, all aspects of the system and application are kept in lockstep, so that the identical secondary system can take over without interruption should the first fail.
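The difference between restart-based HA and lockstep FT can be illustrated with a toy model. This is only a sketch under stated assumptions: `CounterVM`, `run_with_ha` and `run_with_ft` are hypothetical names invented for illustration, the "VM" holds nothing but an in-memory running total, and none of this reflects how VMware or Marathon actually implement their products.

```python
class CounterVM:
    """Toy stand-in for a VM: its only state is an in-memory running total."""

    def __init__(self):
        self.total = 0

    def apply(self, value):
        self.total += value


def run_with_ha(inputs, fail_at):
    """HA-style recovery (assumed model): a fresh VM is booted after the failure."""
    vm = CounterVM()
    for step, value in enumerate(inputs):
        if step == fail_at:
            # The host fails; the VM is rebooted elsewhere and its
            # in-memory state from before the failure is gone.
            vm = CounterVM()
        vm.apply(value)
    return vm.total


def run_with_ft(inputs, fail_at):
    """FT-style recovery (assumed model): a secondary mirrors every input."""
    primary, secondary = CounterVM(), CounterVM()
    primary_alive = True
    for step, value in enumerate(inputs):
        if step == fail_at:
            primary_alive = False  # the primary's host fails
        if primary_alive:
            primary.apply(value)
        # The secondary has applied every input so far, so it can
        # continue from the same state without interruption.
        secondary.apply(value)
    return secondary.total


inputs = [1, 2, 3, 4, 5]
print(run_with_ha(inputs, fail_at=3))  # state built before the failure is lost
print(run_with_ft(inputs, fail_at=3))  # the full history survives the failure
```

In this simplified model the HA run only reflects inputs received after the reboot, while the FT run reflects all of them. A real HA restart would typically recover from persistent storage, so the sketch exaggerates the loss to make the contrast visible.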

Keeping two virtual CPUs in lockstep is difficult, and until now, both VMware FT and Marathon everRun have been limited to protecting VMs running on a single virtual CPU. This limits their use cases, since applications requiring FT, such as mission-critical databases, generally use more than one processor for performance. With the release of everRun MX, which can run on a Citrix Systems XenServer-based virtual appliance, Marathon says it has broken through the single-CPU barrier.

Meeting demand for fault tolerant systems
The availability of software-based FT has now pushed server virtualization into previously unreached corners of the world, such as the manufacturing facilities of PPG Industries Inc. in Oak Creek, Wis. The Pittsburgh, Pa.-based firm produces various chemical products such as paints and coatings, and the presence of flammable solvents imposes strict hardware requirements so that servers cannot create a spark.

This ... is something I've been asking them to do for at least the last five years.

Lief Morin,
president, Key Information Systems Inc.

According to Mike Rische, a senior electrical project engineer, his company looked at fault tolerance as a way to keep the GE iFix WebSpace visualization platform used by manufacturing machine operators from requiring manual intervention in the event of a physical server outage. But other FT products came with their own hardware that didn't meet the company's manufacturing floor criteria, while software-based FT wouldn't support iFix because it is multithreaded.

Rische said he signed on to beta-test everRun MX earlier this year. The software hasn't been put into production yet, since "we're the guinea pig for GE and Marathon to get their systems running together," he said. But so far, he said, he has seen no degradation in system performance. "We're in the process of converting our processes over to the new system."

Channel partners say that multiprocessor FT is a long-overdue feature. "This step is something I've been asking them to do for at least the last five years," said Lief Morin, the president of Woodland Hills, Calif.-based systems integration company Key Information Systems Inc. But he noted that support for FT locally between multiprocessor systems is the first step. "The next step," Morin said, "will be to replicate to a secondary facility … and how we can geographically distribute [FT]."

Meanwhile, given that Marathon's technology is new, enterprises will likely be cautious in evaluating everRun MX, just as they have been with VMware FT. "I'm not using [VMware's] FT outside the lab environment. Its many requirements place limitations on use cases, one of the biggest and most obvious being that FT only works with single vCPU virtual machines currently," said Jason Boche, a virtualization evangelist and a VMware user at a large enterprise. Likewise, he doesn't think Marathon's product will fill the gap without due diligence. "[A] new product and technology, plus critical workloads requiring FT, equals a lot of testing needed [for it] to be blessed."

Intel, XenServer, Windows prerequisites
Marathon worked with Intel and networking vendors to create the architecture for everRun MX, but is keeping deeper details close to the vest. "We're leveraging the memory management infrastructure on the processors, but the secret sauce is all Marathon," said Jim Welch, the president and CEO of Marathon Technologies. The process will also support SMP systems that are nondeterministic, meaning all processors in the system aren't necessarily running identical portions of the workload.

Because this "secret sauce" uses components of Intel's processors, the software works only with Intel platforms in its first release, though Welch said Marathon is working on a similar product with AMD. The first release is also limited to the XenServer virtual appliance running Windows-based applications, though Welch said Marathon is working on Linux and VMware-compatible versions of the product.

Beth Pariseau is a senior news writer.
