Cluster expert Don Becker: Virtualization doesn't solve any problems

Don Becker, co-founder of the Beowulf project and CTO of Scyld, describes the shortcomings of virtualization and how it compares to clustering.

Virtualization addresses the same problems, such as server glut and management complexity, as clustering does -- and it doesn't necessarily do a better job of solving them, according to cluster expert Don Becker.

That doesn't mean that clusters and virtualization are mutually exclusive and can't benefit from technology sharing, says Becker, co-founder of the Beowulf project -- a pioneering open source/Linux cluster development effort. He is also the CTO at Scyld Inc., the software division of Penguin Computing, a server vendor in San Francisco.

In this interview, Becker describes the shortcomings of virtualization, its kinship with clustering and the marriage of the two in Scyld ClusterWare HPC 4.0 cluster virtualization software.

What's behind your cluster virtualization approach?

Don Becker: What we're trying to do with the clusters is make a unified view of all the machines, a single virtual image of the cluster presented to the end user as a single system. We do that by creating an overall virtual system. Our cluster software creates that single virtual view of a large set of machines.

Anybody can do virtualization if they're willing to do it inefficiently. The key is to figure out how to do it efficiently in order to lose no performance. In our case, you even gain performance over the cluster.

What's the difference between clustering and paravirtualization?

Becker: Paravirtualization is running multiple independent environments on a single physical machine.

The interesting thing about that is people are creating a cluster problem. They're creating the same issue that we've been dealing with for years in clusters, but they're just doing it on a single physical machine.

We already understand how to manage large numbers of machines. Now we have multiple virtual machines within a single physical machine. We're joining those together. We're essentially just dealing with larger clusters. The rest of the world is now learning to deal with multiple virtual machines. We're putting them all together so that you still have independent environments, but you have that viewpoint of looking at all of them as a single unified virtual machine.

So, do you relate the dangers of virtual machine sprawl to aspects of managing clusters?

Becker: One of the elements of the cluster problem is version skewing. One of the things we learned in the first five years of the Beowulf project is that everything's great when you first do the installation on all the machines. Everything is consistent.

But, over time, things drift. Versions drift. Unless you have an explicit mechanism to manage the evolving versions of applications and configuration files, you're going to have many different installs, each of which has slightly different versions and becomes impossibly difficult to manage.

People who run large machine rooms already know this problem. People who haven't been running large machine rooms are now going to have that same problem of managing large numbers of installations.

My viewpoint is that you should manage every similar installation as a cluster. It should be a single-point, single machine-like administration. You should have an explicit consistency model to handle that.
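The version drift Becker describes can be made concrete with a small check. The following is only an illustrative sketch, not anything from Scyld's software: it assumes a hypothetical inventory mapping each node name to its installed package versions, and it treats the most common version of each package as the baseline.

```python
from collections import Counter


def find_drift(inventory):
    """Report packages whose version is not identical on every node.

    inventory: hypothetical mapping of node name -> {package: version}.
    Returns {package: {node: version}} listing only the nodes that differ
    from the most common version (None means the package is missing).
    """
    packages = set()
    for installed in inventory.values():
        packages.update(installed)

    drift = {}
    for package in sorted(packages):
        versions = {node: installed.get(package)
                    for node, installed in inventory.items()}
        counts = Counter(versions.values())
        if len(counts) > 1:
            # Assumption: the majority version is the intended baseline.
            baseline, _ = counts.most_common(1)[0]
            drift[package] = {node: version
                              for node, version in versions.items()
                              if version != baseline}
    return drift


if __name__ == "__main__":
    nodes = {
        "node1": {"glibc": "2.5", "mpi": "1.2"},
        "node2": {"glibc": "2.5", "mpi": "1.2"},
        "node3": {"glibc": "2.5", "mpi": "1.3"},
    }
    print(find_drift(nodes))
```

A check like this only detects drift. The explicit consistency model Becker argues for would also need a way to bring the outlying nodes back to the baseline.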

There are situations in which companies have multiple clusters that are managed individually. Why have clusters of clusters, and how does virtualization come in to make them more manageable?

Becker: When you have different installations, or you have distinct libraries that you need to have installed for different applications, those are two different installations that should be managed as different installations.

But, if you have a hundred machines that are all running similar installations, you should never think about managing that as a hundred separate installations, but instead as a single one.

With virtual machines, now you have to manage all of the contents of the internal machines. I say treat it like a cluster and manage it that way. If you do this right -- rather than having a full install on each virtual machine and not being able to share those nearly identical installs -- and you manage them as special purpose application nodes as we do on our cluster system, you identify which libraries and executables are identical. Now you have the opportunity to share those virtual memory regions, even though accomplishing that involves some extra system work.
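The first step Becker mentions, identifying which libraries and executables are identical across installs, can be sketched by comparing file contents. This is a rough illustration under assumptions, not Scyld's mechanism: the function names and the idea of passing a list of install root directories are hypothetical, and the actual sharing of memory regions would happen in the kernel or hypervisor, which this sketch does not attempt.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def shareable_files(install_roots):
    """Group byte-identical files found under the given install roots.

    install_roots: hypothetical list of directories, one per virtual
    machine install. Returns {digest: [paths]} for every file content
    that appears more than once.
    """
    by_digest = defaultdict(list)
    for root in install_roots:
        for path in sorted(Path(root).rglob("*")):
            if path.is_file():
                by_digest[file_digest(path)].append(path)
    return {digest: paths
            for digest, paths in by_digest.items()
            if len(paths) > 1}
```

Each group returned is a candidate for being stored or mapped once instead of once per virtual machine.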

I've been hearing a lot about I/O problems that may occur as virtualization use increases. Do you think it will be a big issue?

Becker: Before, only people running computational clusters really worried about distributed file systems. Now, everybody running virtual machines has to think about it and has to worry about whether or not they need to distribute the file system.

If you run separate full virtual machines or paravirtualized machines, you do have that problem. File system consistency issues might drive people to use more of a container-based approach rather than running separate kernels inside each virtual machine.

But I don't see it as a crisis. It's the same kind of problem people running clusters have faced for years. Admittedly, it's not been solved very well up to this point.

Are you saying that virtualization isn't going to solve management complexity problems, which IT managers face as a result of server glut? Will it actually create more issues?

Becker: Virtualization doesn't solve any problems.

For example, some people talk about virtualization in terms of being able to predict that a machine is about to fail and migrating the virtual machine off that physical machine. That completely ignores the I/O problem. If you have a virtual machine on the physical machine, you're probably using local file systems on that machine. If you migrate off that physical machine, you have to forward all of the I/O -- which is painfully difficult to do -- or have it already set up to use a network file system, remote I/O or network-attached storage.

A lot of people don't think this through and don't realize that you have to have one of those things, remote storage or network storage, if you want to migrate virtual machines. That's going to be a significant performance hit.
