Deploying virtualization in a production data center can provide an interesting mix of benefits and liabilities. By consolidating workloads onto fewer servers, physical management is simplified; but managing storage for virtual machines (VMs) isn't always easier. Sizing storage capacity and distributing workloads, for instance, can be as tricky with VMs as it is in the physical server world.
In this tip, I'll present storage management considerations for virtualized environments and offer some suggestions for streamlining processes.
Estimating storage capacity requirements
To understand your storage requirements as a whole, consider the following factors:
- the sum of the sizes of all "live" virtual disk files;
- predictions for expansion of virtual disk files;
- state-related disk files, such as those used for suspending VMs and maintaining point-in-time snapshots; and
- the space required for backups of VMs.
Accounting for all these considerations is a tall order. But hopefully the overall configuration is no more complicated than that of managing multiple physical machines.
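As a rough illustration, the four factors above can be added up per VM and then totaled. The sketch below is only an example: the `VmStorage` class, the VM names, the sizes, and the 20% headroom are all hypothetical values chosen for demonstration, not figures from any particular platform.

```python
from dataclasses import dataclass


@dataclass
class VmStorage:
    """Hypothetical per-VM storage figures, all in GB."""
    name: str
    live_disk_gb: float        # current size of the "live" virtual disk files
    expected_growth_gb: float  # predicted expansion of those files
    state_files_gb: float      # suspend files and point-in-time snapshots
    backup_gb: float           # space required for backups of the VM

    def total_gb(self) -> float:
        """Add up the four capacity factors for this VM."""
        return (self.live_disk_gb + self.expected_growth_gb
                + self.state_files_gb + self.backup_gb)


def estimate_capacity_gb(vms, headroom=0.2):
    """Sum the per-VM totals and add a safety margin.

    The default headroom of 20% is an arbitrary placeholder,
    not a recommendation.
    """
    raw_total = sum(vm.total_gb() for vm in vms)
    return raw_total * (1 + headroom)


# Made-up example inventory.
vms = [
    VmStorage("web01", live_disk_gb=40, expected_growth_gb=10,
              state_files_gb=8, backup_gb=40),
    VmStorage("db01", live_disk_gb=200, expected_growth_gb=100,
              state_files_gb=32, backup_gb=200),
]

print(f"Estimated capacity: {estimate_capacity_gb(vms):.0f} GB")
```

Keeping the factors as separate fields makes it easy to see which one dominates the estimate; in many environments the backup and snapshot figures rival the live disk files themselves.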
Assigning virtual workloads
One of the best ways to reduce disk contention and improve overall performance is to profile virtual workloads to determine their requirements. Performance statistics help determine the number, size, and type of I/O operations. The table below shows an example of how to assign workloads to storage arrays based on their performance requirements.
Table 1: Assigning workloads to storage arrays based on performance requirements
In the table, the VMs are assigned to separate storage arrays to minimize contention. By combining VMs with compatible storage requirements on the same server, administrators can better distribute load and increase scalability.
Selecting storage methods
As data center administrators deploy new VMs and plan for the storage required, they have several different options. The first is to use local server storage. Fault-tolerant disk arrays that are directly attached to a physical server can be easy to configure. For smaller virtualization deployments, this approach makes sense. But when capacity and performance requirements grow, adding more physical disks to each server can create management problems. Arrays, for example, are typically managed independently, which can lead to wasted disk space and require additional administrative effort.
That's where network-based storage comes in. By using centralized, network-based storage arrays, organizations can support many host servers using the same infrastructure. While support for technologies varies based on the virtualization platform, network-attached storage (NAS), iSCSI, and storage area network (SAN) storage options are the most common.
NAS devices use file-level I/O and are typically used as file servers. They can be used to store VM configuration and hard-disk files. But latency and competition for physical disk resources can be significant.
SAN and iSCSI storage solutions perform block-level I/O operations, providing raw access to storage resources. Through the use of redundant connections and multi-pathing, they can provide the highest levels of performance, the lowest latency and simplified management.
To determine the most appropriate option, data center managers should consider workload requirements for each host server and its associated guest operating systems. Details include the number and types of applications that will be run as well as their storage and performance requirements. Taken together, this information can help determine whether local or network-based storage is most appropriate.
Monitoring storage resources
CPU- and memory-related statistics are often monitored for all physical and virtual workloads. In addition to this information, disk-related performance should be measured. Statistics collected at the host server level provide an aggregate view of disk activity and show whether storage resources meet requirements. Guest-level monitoring can help administrators drill down into the details of which workloads generate the most activity. While the specific statistics that can be collected vary across operating systems, the kinds of information that should be monitored include the following:
- I/O operations per second (IOPS). This statistic refers to the number of disk-related transactions that occur each second. IOPS is often used as the first guideline for determining overall storage requirements.
- Storage I/O utilization. This statistic refers to the percentage of total I/O bandwidth that is being consumed at a given point in time. High levels of utilization can indicate the need to upgrade or move VMs.
- Paging operations. Memory-starved VMs can generate significant I/O traffic due to paging to disk. Adding or reconfiguring memory settings helps to improve performance.
- Disk queue length. This is the number of I/O operations that are pending. A consistently high number indicates that storage resources are creating a performance bottleneck.
- Storage allocation. Ideally, administrators can monitor the current amount of physical storage space that is in use for all virtual hard disks. The goal is to proactively rearrange or reconfigure VMs to avoid overallocation.
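The statistics above lend themselves to simple automated checks. The following is a minimal sketch, assuming samples have already been collected by some monitoring tool; the `DiskSample` class, the `LIMITS` thresholds, and the sample values are hypothetical and would need to be tuned for a real environment.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class DiskSample:
    """One hypothetical monitoring sample for a host or VM."""
    iops: float            # I/O operations per second (kept for sizing, not alerted on here)
    io_utilization: float  # fraction of I/O bandwidth in use, 0.0 to 1.0
    pages_per_sec: float   # paging operations per second
    queue_length: float    # pending I/O operations
    allocated_gb: float    # space allocated to virtual hard disks
    physical_gb: float     # physical space available to back them


# Placeholder thresholds for illustration only.
LIMITS = {"io_utilization": 0.80, "pages_per_sec": 100.0, "queue_length": 2.0}


def find_alerts(samples, limits=LIMITS):
    """Return a list of alert strings for a series of samples."""
    if not samples:
        return []
    alerts = []
    if mean(s.io_utilization for s in samples) > limits["io_utilization"]:
        alerts.append("high I/O utilization")
    if mean(s.pages_per_sec for s in samples) > limits["pages_per_sec"]:
        alerts.append("heavy paging")
    # "Consistently high" is interpreted here as: over the limit in every sample.
    if all(s.queue_length > limits["queue_length"] for s in samples):
        alerts.append("consistently long disk queue")
    # Compare allocation against physical space using the latest sample.
    if samples[-1].allocated_gb > samples[-1].physical_gb:
        alerts.append("storage overallocated")
    return alerts


# Made-up samples for one host.
samples = [
    DiskSample(iops=900, io_utilization=0.85, pages_per_sec=20,
               queue_length=4, allocated_gb=500, physical_gb=400),
    DiskSample(iops=1100, io_utilization=0.90, pages_per_sec=30,
               queue_length=5, allocated_gb=500, physical_gb=400),
]

for alert in find_alerts(samples):
    print(alert)
```

Averaging utilization and paging over several samples, and requiring the queue length to exceed its limit in every sample, is one way to avoid alerting on momentary spikes; other smoothing choices would work as well.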
VM disk-related statistics will change over time. Therefore, automated monitoring tools that can generate reports and alerts are an important component of any virtualization storage environment.
Storage for VMs and teamwork
Managing storage capacity and performance for virtualized servers should be high on the list of responsibilities for data center administrators. Virtual machines can easily be constrained by disk-related bottlenecks, causing slow response times or even downtime. By making smart VM placement decisions and monitoring storage resources, data center administrators can surmount many of these potential bottlenecks. Above all, it's important for data center administrators to work together with storage managers to ensure that business and technical goals remain aligned over time.
About the author: Anil Desai is an independent consultant based in Austin, Texas. He specializes in evaluating, implementing and managing solutions based on Microsoft technologies. He has worked extensively with Microsoft's Server products and the .NET development platform and has managed data center environments that support thousands of virtual machines. Desai holds Microsoft Certified Systems Engineer, Microsoft Certified Solution Developer and Microsoft Certified Database Administrator certifications and is a Microsoft Most Valuable Professional.