KVM virtual disks: Files or raw devices?

Choosing a storage back-end for your KVM virtual machines is a matter of weighing simplicity against flexibility.

When you set up a KVM virtual machine, the default choice for the storage back-end is a virtual disk file. This is fine for many environments, but there are some advantages to using a raw storage device as the back-end instead. Let's take a look at the advantages and disadvantages of each approach.

Three reasons to use a virtual disk file

The first reason to use a virtual disk file as the storage back-end is that it's easy and convenient. With a virtual disk file, you don't really have to design anything with regard to storage. Just make sure that the host operating system has enough storage available and you're ready to go.
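
As a rough illustration of how little setup a disk file needs, the sketch below builds (but does not run) a qemu-img command line that creates one. The helper name qemu_img_create_cmd, the image path and the 20 GiB size are assumptions made for this example, not values from the article.

```python
import shlex


def qemu_img_create_cmd(path, size_gib, fmt="qcow2"):
    """Build (but do not run) a qemu-img command that creates a disk image.

    Hypothetical helper for illustration: `path` and `size_gib` are
    whatever fits your host.
    """
    if size_gib <= 0:
        raise ValueError("size_gib must be positive")
    # qemu-img create -f <format> <filename> <size>
    return ["qemu-img", "create", "-f", fmt, path, f"{size_gib}G"]


# Example path and size are assumptions; adjust them for your host.
cmd = qemu_img_create_cmd("/var/lib/libvirt/images/vm1.qcow2", 20)
print(shlex.join(cmd))
```

Printing the command instead of executing it lets you review it first; on a real host you could pass the list to subprocess.run.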

The second reason to use a KVM virtual disk file is that it is portable. Whatever you want to do with the virtual disk, you can always copy the file to another physical location. Being able to copy the file to another machine makes your configuration more versatile.

The third reason to use a virtual disk file is that it gives you all the optimization options that Linux file systems have to offer. If you're anticipating heavy I/O in VMs, for instance, you can choose a file system other than Ext4, the default in most cases, to improve performance.

Using virtual disk files sounds good, doesn't it? Now let's have a look at why you may want to use raw devices instead.

Three reasons to use a raw storage device

When creating KVM VMs, many admins choose logical volumes managed by the Logical Volume Manager (LVM) as the storage back-end. LVM adds some good features that you may like.

The first reason to use LVM is its flexibility. LVM was designed for easy resizing of storage volumes, which helps when a VM is about to run out of disk space. With LVM, you can simply increase the size of the underlying logical volume; once the partition and file system inside the guest have been grown to match, the VM can start using the extra space.
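
The host-side half of such a resize can be sketched as follows. This only builds an lvextend command line; the helper name lvextend_cmd and the volume group and logical volume names (vg_kvm, vm1_disk) are hypothetical, and growing the partition and file system inside the guest remains a separate step.

```python
import shlex


def lvextend_cmd(vg, lv, extra_gib):
    """Build (but do not run) an lvextend command that grows a logical volume.

    Hypothetical helper: `vg` and `lv` name the volume group and logical
    volume that back the VM's disk.
    """
    if extra_gib <= 0:
        raise ValueError("extra_gib must be positive")
    # The leading "+" means "add this much" rather than "set to this size".
    return ["lvextend", "--size", f"+{extra_gib}G", f"/dev/{vg}/{lv}"]


# vg_kvm and vm1_disk are made-up names for this example.
print(shlex.join(lvextend_cmd("vg_kvm", "vm1_disk", 10)))
```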

A second reason to use LVM is backups. To make a reliable backup, you need files that are closed at the moment the backup is taken. The LVM snapshot feature takes care of just that. Before making the backup, you take a snapshot of the logical volume that the VM is using. Next, you make the backup from the snapshot volume, which doesn't contain any open files at all.
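
The snapshot-then-backup sequence described above might look like the sketch below, which returns the three commands as a plan instead of running them. The helper name snapshot_backup_plan, the snapshot naming scheme, the use of dd as the copy tool and all device and file names are assumptions for illustration; any tool that can read the snapshot device would do.

```python
import shlex


def snapshot_backup_plan(vg, lv, snap_size_gib, backup_path):
    """Return the commands for a snapshot-based backup, without running them.

    Hypothetical helper: the "<lv>_snap" naming scheme and the use of dd
    are choices made for this sketch.
    """
    if snap_size_gib <= 0:
        raise ValueError("snap_size_gib must be positive")
    snap = f"{lv}_snap"
    origin = f"/dev/{vg}/{lv}"
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        # 1. Take a snapshot of the volume the VM is using. The size is the
        #    space reserved for changes made while the snapshot exists.
        ["lvcreate", "--snapshot", "--size", f"{snap_size_gib}G",
         "--name", snap, origin],
        # 2. Copy the snapshot volume, not the live volume, to the backup.
        ["dd", f"if={snap_dev}", f"of={backup_path}", "bs=4M"],
        # 3. Drop the snapshot once the copy is done.
        ["lvremove", "--force", snap_dev],
    ]


# All names and paths below are made up for this example.
for step in snapshot_backup_plan("vg_kvm", "vm1_disk", 5, "/backup/vm1_disk.img"):
    print(shlex.join(step))
```

Removing the snapshot afterwards matters in practice, because a snapshot that fills its reserved space becomes unusable.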

The third reason to use LVM is that it makes it easy to set up a high availability environment where VMs can smoothly fail over to other nodes, even with live migration. Just set up the LVM volume as a clustered logical volume and the high availability stack will handle it correctly in any migration or disaster recovery scenario.

So which approach is best in the end? The most important benefit of a file-based storage back-end is that it is easy to use. If ease of use is not your primary design goal and you'd rather have more flexibility, then use LVM logical volumes when creating KVM virtual disks.
