Providing and managing storage resources in any IT environment can be a Herculean task. When you're using local storage, you often run into limitations based on the number of hard disks that can physically be attached to a single computer. Multiply these requirements by dozens or hundreds of servers, and the problem quickly becomes unmanageable.
Fortunately, centralized, network-based storage offers a potential solution. In this article, we'll look at how you can use network-based storage options to improve the performance and manageability of virtual machines running on Microsoft Virtual Server.
Effects of network-based storage
Using network-based storage can have several effects on overall performance; some are good, some are (potentially) bad.
Let's start with the positive. Disk and network caching, common on many storage solutions, can help increase overall performance. With centralized storage, even relatively small solutions might include multiple gigabytes of high-speed memory cache, and any read you can serve from cache instead of a physical disk is a win from a performance standpoint.
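To see why cache hits matter, here's a minimal Python sketch of an LRU read cache (purely illustrative; real storage-array caches are far more sophisticated). A VM workload that keeps re-reading a small, hot set of blocks rarely touches the physical disk at all:

```python
from collections import OrderedDict

class ReadCache:
    """Toy LRU read cache: repeated reads of hot blocks never reach the disk."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()   # block number -> cached data
        self.hits = self.misses = 0

    def read(self, block, fetch_from_disk):
        if block in self.store:
            self.hits += 1
            self.store.move_to_end(block)     # mark as recently used
            return self.store[block]
        self.misses += 1
        data = fetch_from_disk(block)         # simulated physical disk I/O
        self.store[block] = data
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)    # evict least recently used
        return data

cache = ReadCache(capacity=100)
# A VM workload that re-reads the same 50 "hot" blocks ten times over:
for _ in range(10):
    for block in range(50):
        cache.read(block, lambda b: b"x" * 512)

print(cache.hits, cache.misses)  # prints "450 50"
```

Only the first pass over the hot set misses; the other 90 percent of reads are served from memory, which is exactly the effect a large array-side cache has on VM disk traffic.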
Additionally, when using centralized storage, you can take advantage of advanced backup and recovery features, such as snapshots and split-mirror features (the terminology and technology vary by vendor).
There are some downsides to network-based storage, however. First and foremost is latency: every disk request now involves a round trip across the network, and long delays could lead to VM crashes. Also, the added load on the network when multiple VMs compete for storage resources can require infrastructure upgrades.
Overall, the benefits can outweigh the risks and difficulties (as long as you plan and test properly). With this in mind, let's look at some technical approaches.
Sharing virtual hard disks (VHDs)
The fact that VHDs are actually files comes with an unexpected benefit. Multiple VMs can access the same VHD files concurrently, as long as the VHD files are read-only. This is a great option if you're already planning to use undo disks and/or differencing disks, because the base or parent VHDs will be read-only anyway.
Although you might increase contention and generate "hot spots" on the host file system when sharing files with many VMs, caching can offset these effects. Only performance testing can provide the real numbers, but if sharing meets your needs, you'll have the added benefit of minimizing physical disk space usage.
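The parent/child relationship behind differencing disks can be sketched in a few lines of Python (a toy model of the idea, not the actual VHD format): each VM's writes land in its own overlay, while reads of unmodified blocks fall through to the shared, read-only parent.

```python
class BaseDisk:
    """Read-only parent image, safely shareable by many VMs."""
    def __init__(self, blocks):
        self._blocks = blocks

    def read(self, n):
        return self._blocks[n]

class DifferencingDisk:
    """Per-VM child disk: writes go to a private overlay;
    reads fall through to the shared parent when unmodified."""
    def __init__(self, parent):
        self.parent = parent
        self.overlay = {}          # block number -> modified data

    def read(self, n):
        return self.overlay.get(n, self.parent.read(n))

    def write(self, n, data):
        self.overlay[n] = data     # the parent is never touched

parent = BaseDisk([b"base0", b"base1", b"base2"])
vm_a = DifferencingDisk(parent)
vm_b = DifferencingDisk(parent)

vm_a.write(1, b"vm_a1")
print(vm_a.read(1))   # prints b'vm_a1' - VM A sees its own change
print(vm_b.read(1))   # prints b'base1' - VM B still sees the shared parent
```

Because all writes stay in the children, the parent can remain read-only and be shared (or stored centrally) without conflict, which is what makes this pattern such a good fit for network-based storage.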
Using network-attached storage (NAS)
NAS devices provide access to files over a network connection. Standard Windows file shares are the most common example. NAS devices can support several different protocols, but in the Windows world, the CIFS standard is most common.
Microsoft's implementation (SMB) is the protocol that allows Windows users to access file shares. A simple approach involves configuring one or more virtual machines to access a virtual hard disk over the network using a UNC path instead of a local path. Figure 1 provides an example.
Figure 1: Accessing a VHD over the network
In order to implement this configuration, the Virtual Server service account must have access to the remote network location, and proper permissions must be set. Whenever a guest OS makes a disk I/O request, Virtual Server sends the request over the network to the VHD file located on the file share.
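As a small illustration, Python's pathlib can model the path change involved. The server name (`fileserver`), share name (`vhds`), and file name below are hypothetical; the point is simply that the VM configuration references a UNC path rather than a local drive letter:

```python
from pathlib import PureWindowsPath

# Hypothetical local path to a VHD on the Virtual Server host:
local_path = PureWindowsPath(r"D:\Virtual Machines\web01.vhd")

# The same VHD relocated to a (hypothetical) file share:
unc_path = PureWindowsPath(r"\\fileserver\vhds") / local_path.name

print(unc_path)        # prints \\fileserver\vhds\web01.vhd
print(unc_path.drive)  # prints \\fileserver\vhds  (for UNC paths, "drive" is server+share)
```

Remember that it's the Virtual Server service account, not the logged-on user, that must have read/write permissions on that share and the underlying NTFS folder.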
Using a storage area network (SAN)
SAN technology is based on low-latency, high-performance fibre channel networks. The idea is to centralize storage while providing the highest levels of disk compatibility and performance.
The major difference between SAN and NAS devices is that SANs use block-level I/O. This means that, to the host operating system, SAN-based storage is indistinguishable from local storage. You can perform operations such as formatting and defragmenting a SAN-attached volume. In contrast, with NAS-based access, you're limited to file-level operations.
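The distinction can be sketched in Python (a toy model, not a real storage stack): a block device exposes numbered sectors that the client is free to format however it likes, while a file server exposes only whole, named files.

```python
import io

SECTOR = 512

class BlockDevice:
    """SAN-style access: the client addresses raw sectors and can put
    any file system on top (format, defragment, and so on)."""
    def __init__(self, size_bytes):
        self._buf = io.BytesIO(b"\x00" * size_bytes)

    def read_sector(self, lba):
        self._buf.seek(lba * SECTOR)
        return self._buf.read(SECTOR)

    def write_sector(self, lba, data):
        assert len(data) == SECTOR
        self._buf.seek(lba * SECTOR)
        self._buf.write(data)

class FileServer:
    """NAS-style access: the client names whole files; the sector layout
    stays hidden behind the server's own file system."""
    def __init__(self):
        self._files = {}

    def read_file(self, name):
        return self._files[name]

    def write_file(self, name, data):
        self._files[name] = data

dev = BlockDevice(size_bytes=4 * SECTOR)
dev.write_sector(2, b"A" * SECTOR)   # any sector, any contents
print(dev.read_sector(2)[:4])        # prints b'AAAA'

nas = FileServer()
nas.write_file("web01.vhd", b"vhd contents")
print(nas.read_file("web01.vhd"))    # whole-file granularity only
```

This is why a SAN LUN looks like a local disk to Windows while a NAS share never can: only the block interface gives the host enough control to run its own file system.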
The major drawbacks related to SANs are cost (fibre channel host bus adapters and switch ports can be very expensive) and management. Generally, a pool of storage must be carved into smaller slices, each of which is dedicated to a server. This can often lead to wasted disk space (although many vendors have introduced methods for more dynamically managing allocation).
Figure 2 shows a high-level logical view of a typical SAN implementation.
Figure 3: Combining NAS and SAN devices to store VHD files
Using iSCSI
The iSCSI standard was designed to provide the storage characteristics of SCSI connections over an Ethernet network. iSCSI clients and servers (called initiators and targets, respectively) are readily available from many different vendors.
As with SAN technology, iSCSI provides for block-level disk access. The major benefit of iSCSI is that it can work over an organization's existing investment in copper-based Ethernet (which is dramatically cheaper than fibre channel solutions). Some benchmarks have shown that iSCSI can offer performance similar to fibre channel solutions.
On the initiator side, iSCSI can be implemented as a software-based solution or can take advantage of dedicated accelerator cards.
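To make the initiator/target idea concrete, here's a toy Python sketch that serves raw sector reads over plain TCP. This is not real iSCSI (no PDUs, login negotiation, or error handling), just an illustration of the core concept: block-level access carried over an ordinary Ethernet network.

```python
import socket
import struct
import threading

SECTOR = 512

def toy_target(listen_sock, disk):
    """Toy 'target': answers one raw sector-read request over TCP."""
    conn, _ = listen_sock.accept()
    with conn:
        lba = struct.unpack("!I", conn.recv(4))[0]   # 4-byte block number
        conn.sendall(disk[lba * SECTOR:(lba + 1) * SECTOR])

disk = bytearray(8 * SECTOR)                 # 4 KB "LUN" backing store
disk[3 * SECTOR:4 * SECTOR] = b"B" * SECTOR  # put known data in block 3

server = socket.socket()
server.bind(("127.0.0.1", 0))                # any free local port
server.listen(1)
threading.Thread(target=toy_target, args=(server, disk), daemon=True).start()

# Toy 'initiator': asks the target for block 3 over the network.
client = socket.create_connection(server.getsockname())
client.sendall(struct.pack("!I", 3))
data = b""
while len(data) < SECTOR:                    # TCP may deliver in pieces
    data += client.recv(SECTOR - len(data))
client.close()
print(data[:4])                              # prints b'BBBB'
```

Real initiators add authentication, command queuing, and (often) hardware offload, but the shape is the same: SCSI-style block requests riding on TCP/IP instead of a fibre channel fabric.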
Comparing network storage options
The bottom line for organizations that are trying to manage storage-hungry VMs is that several options are available for centralizing storage. One major caveat is that you should verify support policies with vendors. Unsupported configurations may work, but you'll be running without a safety net.
I can't overstate the importance of testing network-based storage configurations. Issues such as latency and protocol implementation nuances can lead to downtime and data loss. Overall, however, storing VHDs on network-based storage makes a lot of sense and can help reduce some major virtualization headaches.