Choosing a storage back end for open source virtualization environments

When setting up open source virtualization environments, users should weigh the benefits of both VM files and physical devices before choosing a storage back end.

When setting up virtualization, you'll need to decide how to handle the virtual machine (VM) storage back end. You can choose between using a VM file or a physical device as the storage back end. This article will provide an overview of the advantages and disadvantages of both methods.

For administrators used to managing VMware environments, the choice is obvious: in VMware, the default is to store VM disks as files on the Virtual Machine File System (VMFS). Administrators who work primarily with open source virtualization environments, which do not support VMFS, have a distinctly different perspective.

Even though open source hypervisors don't support VMFS, there are still some clear reasons why it might make sense to use a file system storage back end. The most important of these is that a VM using a file as its storage back end can easily be migrated by copying the file to its new location.
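As a minimal sketch of that copy-based migration, assuming the VM is shut down and its disk is a single image file, the following Python copies the file and compares checksums before the copy is trusted. The function names `sha256_of` and `migrate_image` are hypothetical helpers written for this illustration, not part of any hypervisor's tooling.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_image(source, destination):
    """Copy a VM disk image file and verify the copy (hypothetical helper).

    Assumes the VM is powered off so the image does not change mid-copy.
    """
    shutil.copyfile(source, destination)
    if sha256_of(source) != sha256_of(destination):
        raise IOError("copy of %s does not match the source" % source)
    return destination
```

A call might look like `migrate_image("/srv/images/vm1.img", "/mnt/newhost/vm1.img")`, where both paths are placeholders. The checksum pass reads each file a second time, which is slow for large images but catches a truncated or corrupted copy before the VM is started from it.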

If your virtualization infrastructure goes beyond a single server, though, you'll need to make sure that the file can be accessed by multiple nodes simultaneously, particularly if you want to be able to use live migration. Here, open source virtualization offers no single shared file system that clearly stands out.

There are three common file system approaches in open source virtualization environments. The VM's storage file can be created on a Network File System (NFS) share, a Global File System 2 (GFS2) volume or an Oracle Cluster File System 2 (OCFS2) volume.

If VM disk back-end files are stored on NFS, the NFS server becomes a single point of failure and needs to be set up for high availability. This complicates the setup, which is why NFS is not ideal.

The remaining two file systems -- GFS2 and OCFS2 -- allow multiple nodes to access the file system at the same time and to share file locking information. While both work well, setting up cluster communication can be challenging. OCFS2 is the most common format in non-Red Hat environments and is easier to set up, as it can run on its own built-in O2CB cluster layer and do without a complete cluster stack. Red Hat's GFS2 needs a complete cluster stack, and it only integrates well with Red Hat, CentOS and Fedora Linux distributions.

Due to the absence of a standardized shared file system, the disk-based storage back end is often more appealing for open source virtualization. Disk-based storage back ends are normally based on the Logical Volume Manager (LVM). LVM is available on all Linux distributions and can synchronize locking information between nodes in an open source virtualized environment. This synchronization is achieved by running two components: the Distributed Lock Manager (DLM) and the Cluster LVM daemon (clvmd). These components require access to a functional cluster, but they too are available on all Linux distributions.

A theoretical disadvantage of using disk-based VMs is that an LVM storage device doesn't copy as easily as a file. This doesn't have to be a real disadvantage, though. Linux administrators can use the "dd" command to copy block devices, which makes it relatively easy to move VMs among storage arrays.
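To make the block-device copy concrete, here is a rough Python sketch of what dd does at its core: read fixed-size blocks from one device and write them to another. It assumes the VM is stopped, the destination already exists and is at least as large as the source, and the caller has permission to open both devices. The function name `copy_block_device` is hypothetical.

```python
import os


def copy_block_device(source, destination, block_size=4 * 1024 * 1024):
    """Copy source to destination in fixed-size blocks, similar in spirit to dd.

    Returns the number of bytes copied. Assumes the destination already
    exists and is at least as large as the source, as a pre-created
    logical volume would be.
    """
    copied = 0
    # "r+b" opens the existing destination for writing without truncating it.
    with open(source, "rb") as src, open(destination, "r+b") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(block)
            copied += len(block)
        # Push buffered data to the device before reporting success.
        dst.flush()
        os.fsync(dst.fileno())
    return copied
```

On the command line the equivalent would be along the lines of `dd if=/dev/vg0/vm1 of=/dev/vg1/vm1 bs=4M`, where the volume group and volume names are placeholders. In both cases the target logical volume has to be created first, which is the main extra step compared with copying a file.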

Because Linux lacks a common standard for shared, file-backed VM disks, disk-based VMs remain common in open source virtualization. The advantage of this approach is that LVM and the components it needs to synchronize locking information are available on all Linux distributions, whereas the shared file systems needed for file-based storage of VM disks are not.
