
What to consider when designing virtual hard disk (VHD) storage

A key performance concept is VHD file placement. In this tip, Anil Desai looks at some scenarios and recommendations that can have a significant impact on performance.

Much of the power and flexibility of virtualization solutions comes from the features available for virtual hard disks. Unfortunately, because so many different configuration types are available, you can end up reducing overall performance if you're not careful.

A key concept is virtual hard disk (VHD) file placement. Let's look at some scenarios and recommendations that can have a significant impact on performance.

Note: For an introduction to working with Virtual Server's disk architecture, see Understanding Virtual Hard Disk Options.

VHD file placement
Most production-class servers will have multiple physical hard disks installed, often to improve performance and to provide redundancy. When allocating space for VHDs on the host's file system, the rule is simple: Reduce disk contention. Applying that rule well requires an understanding of how VHD files are used.

If each of your VMs has only one VHD, then you can simply spread them across the available physical spindles based on their expected workload. A common configuration is to use one VHD for the OS and to attach another for data storage.

If both VHDs will be busy, placing them on different physical volumes can prevent competition for resources. Other configurations can be significantly more complicated, but the general rule still applies: try to spread disk activity across physical spindles whenever possible.
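As a rough illustration of spreading disk activity, the sketch below assigns VHD files to host volumes so that the estimated load on each volume stays balanced. The file names, drive letters and load figures are hypothetical, and the load estimate is whatever number you trust for your workload (expected IOPS, for example). It is a simple greedy heuristic, not a feature of Virtual Server.

```python
import heapq

def place_vhds(vhd_loads, volumes):
    """Greedily assign each VHD to the volume with the lowest load so far.

    vhd_loads: dict of VHD file name -> estimated I/O load (hypothetical unit)
    volumes:   list of host volume names, ideally one per physical spindle
    Returns a dict of volume name -> list of VHD file names.
    """
    # Heap entries are (total load, volume index, volume name); the index
    # breaks ties in a predictable way.
    heap = [(0, i, vol) for i, vol in enumerate(volumes)]
    heapq.heapify(heap)
    placement = {vol: [] for vol in volumes}

    # Place the busiest VHDs first so they end up on different volumes.
    for name, load in sorted(vhd_loads.items(), key=lambda kv: kv[1], reverse=True):
        total, idx, vol = heapq.heappop(heap)
        placement[vol].append(name)
        heapq.heappush(heap, (total + load, idx, vol))
    return placement

# Hypothetical example: two VMs, each with an OS VHD and a data VHD.
vhd_loads = {
    "web-os.vhd": 150,
    "web-data.vhd": 400,
    "db-os.vhd": 200,
    "db-data.vhd": 600,
}
placement = place_vhds(vhd_loads, ["D:", "E:"])
for volume, files in placement.items():
    print(volume, files)
```

With these sample numbers, each VM's OS and data VHDs happen to land on different volumes. The heuristic balances load only, so if you need to guarantee that separation, check the result or add it as an explicit constraint.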

Managing undo and differencing disks
If you are using undo disks or differencing disks, you'll want to arrange them such that concurrent I/O is limited. Figure 1 shows an example in which differencing disks are spread across physical disks. In this configuration, the majority of disk read activity is occurring on the parent VHD file, whereas the differencing disk will experience the majority of write activity.

Of course, these are only generalizations; the size of the VHDs and the actual patterns of read and write activity can make a huge difference.

Figure 1: Arranging parent and child VHD files for performance

In some cases, using undo disks can improve performance (for example, when the undo disks and base VHDs are on separate physical spindles). In other cases, such as when you have a long chain of differencing disks, you can generate a tremendous amount of disk-related overhead.

For some read and write operations, Virtual Server might need to access multiple files to find the "latest" version of the data. This problem gets worse over time as chains grow and the child disks accumulate changes. Committing undo disks and merging differencing disks with their parent VHDs are important operations that can help restore overall performance.
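To see why long chains add overhead, consider the toy model below. It represents each disk as a dictionary of block number to contents and is only meant to show the lookup pattern; it is not the actual VHD file format or Virtual Server's implementation, and all names are hypothetical.

```python
def read_block(chain, block):
    """Find the latest version of a block.

    chain: list of dicts (block number -> data), newest differencing disk
           first and the base VHD last.
    Returns (data, files_checked).
    """
    for files_checked, disk in enumerate(chain, start=1):
        if block in disk:
            return disk[block], files_checked
    return None, len(chain)

def merge_chain(chain):
    """Collapse a chain into a single disk, keeping the newest data."""
    merged = {}
    for disk in reversed(chain):  # oldest first, so newer blocks overwrite
        merged.update(disk)
    return [merged]

base = {0: "boot", 1: "app-v1", 2: "data-v1"}
diff1 = {1: "app-v2"}    # first differencing disk changed block 1
diff2 = {2: "data-v3"}   # second differencing disk changed block 2
chain = [diff2, diff1, base]

print(read_block(chain, 2))               # found in the newest disk
print(read_block(chain, 0))               # must fall through to the base VHD
print(read_block(merge_chain(chain), 0))  # one file to check after merging
```

A block that was never modified has to be looked up in every file in the chain before it is found in the base VHD. After merging, every read touches a single file again.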

Fixed-size vs. dynamically expanding VHDs
The base type for VHDs you create can affect overall performance. Although dynamically expanding VHDs can make more efficient use of physical disk space on the host, they tend to get fragmented as they grow. Fixed-size VHDs perform better, because physical disk space is allocated and reserved when they're created.

The general rule is that if you can spare the disk space, you should go with fixed-size hard disks. Also, keep in mind that you can always convert between fixed-size and dynamically expanding VHDs if your needs change.

Host storage configuration
The ultimate disk-related performance limits for your VMs will be determined by your choice of host storage hardware.

One important decision (especially for lower-end servers) is the type of local storage connection. IDE-based hard disks will offer the poorest performance, whereas SATA, SCSI and Serial-Attached SCSI (SAS) will offer many improvements. The key to the faster technologies is that they can efficiently carry out multiple concurrent I/O operations (a common scenario when multiple VMs are cranking away on the same server).

When evaluating local storage solutions, keep a couple of key parameters in mind. The first is overall disk throughput (which reflects the total amount of data that can be passed over the connection in a given amount of time). The other important metric is the number of I/O operations per second that can be processed.

VM usage patterns often result in a large number of small I/O operations, so the I/O operations per second figure frequently matters more than raw throughput. Just as important is the number of physical hard disks that are available. The more physical disk spindles you have, the better your overall performance will be.
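The two metrics are related: throughput is roughly the number of I/O operations per second multiplied by the size of each operation. The short calculation below uses made-up numbers to show why a disk that is busy with small operations can be saturated while moving very little data.

```python
def throughput_mb_per_s(iops, io_size_kb):
    """Approximate throughput as IOPS multiplied by the I/O size."""
    return iops * io_size_kb / 1024

# Hypothetical figures: the same 200 operations per second,
# first with small 8 KB requests, then with large 512 KB requests.
small_io = throughput_mb_per_s(iops=200, io_size_kb=8)
large_io = throughput_mb_per_s(iops=200, io_size_kb=512)

print(f"8 KB requests:   {small_io:.2f} MB/s")
print(f"512 KB requests: {large_io:.2f} MB/s")
```

In this example the small-request workload moves under 2 MB per second even though the disk is servicing just as many operations, which is why a throughput rating alone says little about how a host will handle several VMs at once.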

Using RAID
Various implementations of RAID technology can make the job of placing VHD files easier. Table 1 provides a high-level overview of commonly used RAID levels and their pros and cons.

Using multiple physical spindles in each array can significantly improve performance. Since multiple disks are working together at the disk level, the importance of manually moving VHD files to independent disks is reduced. And, of course, you'll have the added benefit of fault tolerance.

Table 1: Comparing various RAID levels
RAID level | RAID description | Disk space cost | Read performance | Write performance
RAID 1 | Disk mirroring | 50% of total disk space | No change | No change
RAID 5 | Stripe set with parity | Equivalent to the size of one disk in the array | Increased | Decreased
RAID 0+1 (also known as RAID 10) | Mirrored stripe sets | 50% of total disk space | Increased | No change
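The disk space costs in Table 1 translate into usable capacity as sketched below. The function name and the disk counts in the example are hypothetical, and the calculation assumes all disks in the array are the same size.

```python
def usable_capacity_gb(level, disks, disk_size_gb):
    """Usable space for the RAID levels in Table 1 (equal-sized disks)."""
    if level == "1":
        if disks != 2:
            raise ValueError("this sketch treats RAID 1 as a two-disk mirror")
        return disk_size_gb                    # 50% of total disk space
    if level == "5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least three disks")
        return (disks - 1) * disk_size_gb      # one disk's worth goes to parity
    if level == "10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least four")
        return (disks // 2) * disk_size_gb     # 50% of total disk space
    raise ValueError(f"unsupported RAID level: {level}")

# Hypothetical host with 500 GB disks.
print(usable_capacity_gb("1", 2, 500))
print(usable_capacity_gb("5", 4, 500))
print(usable_capacity_gb("10", 4, 500))
```

With four disks, RAID 5 leaves more usable space than RAID 10, which is the trade-off against its lower write performance.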

Virtual IDE vs. SCSI controllers
Virtual Server offers two methods for connecting virtual hard disks to your VMs: IDE and SCSI. Note that these options are independent of the storage technology you're using on the host server.

The main benefit of IDE is compatibility. Pretty much every x86-compatible operating system supports the IDE standard. You can have up to four IDE connections per VM, and each can have a virtual hard disk or virtual CD/DVD-ROM device attached.

Although IDE-based connections work well for many simpler VMs, SCSI connections offer numerous benefits. First, VHDs attached to an IDE channel are limited to 127GB, whereas SCSI-attached VHDs can be up to two terabytes in size. The virtual SCSI controller can support up to a total of 28 attached VHDs (four SCSI adapters times seven available channels on each)!
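The limits described above can be captured in a small planning check like the one below. This is only a sketch for sanity-checking a planned layout; it does not call any Virtual Server API, the function and constant names are hypothetical, and it assumes "two terabytes" means 2,048 GB.

```python
IDE_MAX_DEVICES = 4        # four IDE connections per VM
IDE_MAX_VHD_GB = 127       # size limit for IDE-attached VHDs
SCSI_MAX_VHDS = 4 * 7      # four SCSI adapters times seven channels
SCSI_MAX_VHD_GB = 2 * 1024 # assumption: two terabytes taken as 2,048 GB

LIMITS = {
    "ide": (IDE_MAX_DEVICES, IDE_MAX_VHD_GB),
    "scsi": (SCSI_MAX_VHDS, SCSI_MAX_VHD_GB),
}

def check_vm_disks(disks):
    """Check a planned disk layout against the limits above.

    disks: list of (bus, size_gb) tuples, where bus is "ide" or "scsi".
    Returns a list of problem descriptions (empty if the layout fits).
    """
    problems = []
    counts = {bus: 0 for bus in LIMITS}
    for bus, size_gb in disks:
        if bus not in LIMITS:
            problems.append(f"unknown bus type: {bus}")
            continue
        counts[bus] += 1
        max_size = LIMITS[bus][1]
        if size_gb > max_size:
            problems.append(f"{size_gb} GB VHD exceeds the {max_size} GB {bus.upper()} limit")
    for bus, count in counts.items():
        max_count = LIMITS[bus][0]
        if count > max_count:
            problems.append(f"{count} {bus.upper()} disks exceed the limit of {max_count}")
    return problems

print(check_vm_disks([("ide", 40), ("scsi", 500)]))  # fits within the limits
print(check_vm_disks([("ide", 200)]))                # too large for IDE
```

Keep in mind that virtual CD/DVD-ROM devices also occupy IDE connections, so the number of IDE slots actually free for VHDs may be lower than four.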

Figure 2 provides an overview of the possible disk configurations.

Figure 2: Hard disk connection interface options for VHDs

If that isn't enough, you have one more advantage. SCSI-attached VHDs often perform better than IDE-attached VHDs, especially when the VM is generating a lot of concurrent I/O operations. Figure 3 shows an overview of the available hard disk connections for a VM.

Figure 3: Configuring a SCSI-attached VHD for a VM

One helpful feature is that, in general, the same VHD file can be attached to either IDE or SCSI controllers without making changes. A major exception to the rule is the boot hard disk, as BIOS and driver changes will likely be required to make that work.

Still, the rule for performance is pretty simple. Use SCSI-attached VHDs whenever you can and use IDE-attached VHDs whenever you must.

When you're trying to set up a new Virtual Server installation for success, designing and managing VHD storage options is a great first step. Disk I/O bottlenecks are a common cause of real-world performance limitations, but there are several ways to reduce them.

In the next article, I'll talk about maintaining VHDs to preserve performance over time.
