Understanding the Virtual Machine Disk file format

Files in the Virtual Machine Disk (VMDK) file format, a central part of VMware's virtual environment, can act as complete and independent virtual machines.

Traditional non-virtualized servers and desktops load and execute hundreds, even thousands, of individual files, such as operating-system kernel files, device drivers, application components and data files. Virtualization abstracts the software from the underlying hardware and places all of the constituent data for any given virtual machine (VM) into a single disk file. The Virtual Machine Disk (VMDK) file format is the disk-format specification used with VMware virtual machine files. In essence, a file with a .VMDK extension holds a complete and independent virtual machine that can run under VMware virtualization products or other platforms that support VMDK files, such as Sun xVM, QEMU or VirtualBox.

There are pros and cons to this 'single file' VMDK disk. The principal advantages are simplicity and convenience. A single VMDK file, for example, is easy to move between servers using the migration features in the virtualization platform. Similarly, a single file can easily be protected with snapshots or continuous data protection (CDP) technology. Virtual machine files, in fact, are often kept on a storage area network (SAN). There, additional resiliency practices, such as RAID within the SAN storage array and replication to off-site disaster recovery facilities, can further protect the VMs. With VM files maintained on a high-performance SAN, recreating VMDK files or restarting troubled VMs on other physical servers is a simple process. This can be crucial when a VMDK file is damaged or corrupted.

The biggest drawback to a 'single file' VMDK disk is the extra effort necessary for recovering lost data. Because the backup is of the whole disk file, it's generally not possible to restore only a part of the VM, such as a deleted Word document, directly. The entire VM has to be restored first, usually to a spare server instead of the actual production server, and the missing or corrupted file can then be retrieved from it. It's much quicker and easier, in practice, to restore a VM than it is to restore a traditional backup, but administrators must still develop a sound process for data recovery from VM files.

Since the VMDK format is a central part of VMware's virtual environment, it's critical for any third-party provisioning, management and backup tools to be fully interoperable with disks stored in the VMDK format. Third-party developers, and organizations developing their own custom applications for a VMware environment, can employ the VMware Virtual Disk Development Kit. The kit includes a C library and command-line utilities that allow developers to create and access VMDK files.
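To give a feel for what "interoperable with VMDK" means for a tool, here is a minimal Python sketch that reads the small text descriptor that accompanies, or is embedded in, a VMDK disk. It does not use the Virtual Disk Development Kit or any VMware API; the function names parse_descriptor and capacity_bytes are hypothetical, the sample descriptor values are made up for illustration, and the parser only handles the common line shapes (key=value settings, extent lines and ddb.* entries).

```python
import re

# Illustrative descriptor text; every value here is invented for the example.
SAMPLE_DESCRIPTOR = '''\
# Disk DescriptorFile
version=1
CID=fffffffe
parentCID=ffffffff
createType="monolithicSparse"

# Extent description
RW 4192256 SPARSE "example.vmdk"

# The Disk Data Base
ddb.adapterType = "ide"
ddb.virtualHWVersion = "4"
'''

SECTOR_BYTES = 512  # extent sizes in the descriptor are counted in 512-byte sectors

# Extent line: access mode, size in sectors, extent type, optional quoted file name.
EXTENT_RE = re.compile(r'^(RW|RDONLY|NOACCESS)\s+(\d+)\s+(\w+)(?:\s+"([^"]*)")?')


def parse_descriptor(text):
    """Split descriptor text into header settings, extent records and ddb entries."""
    header, extents, ddb = {}, [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = EXTENT_RE.match(line)
        if match:
            access, sectors, kind, filename = match.groups()
            extents.append({
                "access": access,
                "sectors": int(sectors),
                "type": kind,
                "file": filename,
            })
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a line shape this sketch understands
        key = key.strip()
        value = value.strip().strip('"')
        if key.startswith("ddb."):
            ddb[key[len("ddb."):]] = value
        else:
            header[key] = value
    return header, extents, ddb


def capacity_bytes(extents):
    """Total virtual disk size implied by the extent lines."""
    return sum(extent["sectors"] for extent in extents) * SECTOR_BYTES


header, extents, ddb = parse_descriptor(SAMPLE_DESCRIPTOR)
print(header["createType"], extents[0]["file"], capacity_bytes(extents))
```

A backup or provisioning tool would typically start from this kind of metadata to learn how the disk was created and which extent files hold the data. Production tools should rely on the Virtual Disk Development Kit rather than a hand-written parser like this one.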

The VMDK file format competes with the Microsoft Virtual Hard Disk (VHD) format used with the Virtual Server and Hyper-V hypervisors. This can become problematic when moving from VMware to Hyper-V. Since the disk formats are not directly compatible, existing virtual machines cannot simply be run under a different hypervisor. In many cases, virtual machines would need to be converted to physical servers first (virtual-to-physical, or V2P) to effectively remove virtualization. The new hypervisor would then create new virtual machines using the new disk format. Some third-party tools, such as VMDK2VHD, claim to convert VMDK files to VHD files, smoothing the transition between hypervisors for IT professionals.
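As one illustration of disk-format conversion, the Python sketch below wraps the qemu-img utility that ships with QEMU, which can read VMDK images and write VHD images (qemu-img calls the VHD format "vpc"). This is a swapped-in alternative to the VMDK2VHD tool named above, not a description of how that tool works. The function names and file names are hypothetical, and the sketch assumes qemu-img is installed and on the PATH.

```python
import shutil
import subprocess


def build_convert_command(source, target):
    """Return the qemu-img argument list for a VMDK-to-VHD conversion."""
    # "-f vmdk" names the input format; "-O vpc" selects VHD as the output format.
    return ["qemu-img", "convert", "-f", "vmdk", "-O", "vpc", str(source), str(target)]


def convert_vmdk_to_vhd(source, target):
    """Run the conversion, failing early if qemu-img is not available."""
    if shutil.which("qemu-img") is None:
        raise RuntimeError("qemu-img was not found on PATH")
    # check=True raises CalledProcessError if qemu-img exits with an error.
    subprocess.run(build_convert_command(source, target), check=True)


# Show the command that would be run, without touching any disk image.
print(" ".join(build_convert_command("guest.vmdk", "guest.vhd")))
```

Calling convert_vmdk_to_vhd("guest.vmdk", "guest.vhd") would perform the actual conversion. Converting the disk only changes the container format; the guest operating system may still need different drivers or settings before it boots cleanly under the new hypervisor.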
