Storage for virtualization has changed data centers in many ways. Beyond the reclamation of space, organizations have reduced power consumption and centralized management.
By moving storage from a large number of underutilized systems to a centralized resource such as a storage area network (SAN), organizations can consolidate storage resources and manage them centrally. But these face-value benefits of storage for virtualization come with a cost: the rapid growth in storage consumption puts considerable strain on many aspects of a typical data center.
Many organizations that are new to virtualization are venturing into the shared storage realm for the first time. When a SAN absorbs a virtualization storage implementation, infrastructure administrators face a number of challenges.
Virtualization storage investment challenges
Moving all of the storage requirements from individual servers to a SAN is the best way to implement storage for virtualization, but the cost of doing so is significant. Virtualized servers are typically huge storage consumers in many SAN environments. SAN storage is expensive, and connectivity requirements can magnify the cost of this virtualization storage method. In particular, Fibre Channel is an expensive storage transport once the switching requirements and the host bus adapters (HBAs) on each server are counted.
Storage controllers that provide virtualization storage features also require an initial investment, but they can greatly reduce the overall cost of a virtualization storage installation. Taken together, these factors mean that a potentially large up-front investment may be needed to get started with virtualization storage correctly.
Storage for virtualization and backups
Large server consolidation projects made possible by virtualization have fundamentally shifted the data requirements from a large number of relatively disconnected servers to central storage resources. This creates a series of opportunities for data protection strategies. Although typical backup and restore practices can be used in virtualized infrastructures, there are opportunities to be more efficient with data protection.
In the simplest scenario, traditional agent backups can be passed over in favor of SAN-based solutions. Many current virtualization-friendly storage products allow the centralized data resource to have a number of protection options. One example is NetApp SnapVault, which provides a disk-based backup solution on a SAN.
This disk-based backup examines the aggregated block-level data on disk. In the case of virtualized servers, the blocks that are examined can span many virtualized servers. The SnapVault engine looks for differences at the block level and skips blocks that are unchanged since the last protection pass.
The biggest benefit of SAN-based protection for virtualization storage, beyond a simple and consolidated view into data protection, is very quick restore times. Restoring from disk is generally much faster than retrieving something from tape.
Many organizations are moving toward having the SAN storage controller fully manage data protection, but not every organization can support this approach. Bandwidth, off-site retention requirements and existing investments are obstacles to fully storage-managed data protection.
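The block-level comparison described here can be illustrated with a minimal Python sketch. It is not how SnapVault is implemented; the 4 KB block size, the use of SHA-256 digests and the function names are all assumptions chosen for illustration. The idea is only that a backup pass needs to copy the blocks whose contents changed since the previous pass.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size in bytes, for illustration only


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(previous: bytes, current: bytes,
                   block_size: int = BLOCK_SIZE) -> list:
    """Return the indexes of blocks in `current` that differ from `previous`.

    Blocks that exist only in `current` (the disk grew) count as changed.
    """
    old = block_hashes(previous, block_size)
    new = block_hashes(current, block_size)
    return [
        index for index, digest in enumerate(new)
        if index >= len(old) or old[index] != digest
    ]


# A four-block "disk" in which a single byte of the third block is modified.
baseline = bytes(BLOCK_SIZE * 4)
modified = bytearray(baseline)
modified[BLOCK_SIZE * 2] = 1

print(changed_blocks(baseline, bytes(modified)))  # → [2]
```

Only block 2 would need to be copied in this pass; the three unchanged blocks are skipped.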
How all this storage adds up
Again, SAN storage is expensive. But this class of storage for virtualization comes with features that can make it pay for itself. One key feature is data de-duplication, where the SAN storage controller looks for identical blocks of data across disks. Instead of writing the same blocks many times on many disks, the controller manages a single instance of each block.
Virtualization is the perfect match for de-duplication in that many servers are consolidated to the SAN and built from the same source. Consider a virtual machine (VM) template, which is how most virtualized environments deploy systems. When 10 VMs are created from one template, those 10 machines will not differ much from each other at the block level in many situations. This is especially true for the operating system part of the VMs, even as they become patched and updated over their lifecycle.
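A small Python sketch can make the template scenario concrete. It is a toy model rather than any vendor's de-duplication engine: the block size, the 100-block template, the single unique block per VM and the function name are assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size in bytes, for illustration only


def dedup_block_counts(disks):
    """Return (raw_blocks, unique_blocks) for a list of disks.

    Each disk is a list of byte blocks. Blocks with the same SHA-256
    digest are stored only once, as a de-duplicating controller would.
    """
    store = {}
    raw_blocks = 0
    for disk in disks:
        for block in disk:
            raw_blocks += 1
            store.setdefault(hashlib.sha256(block).hexdigest(), block)
    return raw_blocks, len(store)


# A hypothetical template of 100 distinct blocks.
template = [bytes([i]) * BLOCK_SIZE for i in range(100)]

# Ten VMs cloned from the template; each rewrites only its last block.
vms = []
for n in range(10):
    vm = list(template)
    vm[-1] = f"vm-{n}".encode().ljust(BLOCK_SIZE, b"\0")
    vms.append(vm)

raw, unique = dedup_block_counts(vms)
print(raw, unique)  # → 1000 109
```

In this toy case the ten disks hold 1,000 blocks in total, but only 109 distinct blocks need to be stored: 99 shared template blocks plus one unique block per VM. Real savings depend on how far the VMs drift from their template.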
The benefit of storage for virtualization here is that the storage requirement can be cut to a fraction of the raw amount. Some storage products that offer de-duplication even provide a de-duplication guarantee. NetApp is the current leader in the area of virtualization guarantees. Under its program, NetApp guarantees that virtualization installations will need 50% less storage than they would on competing storage.
De-duplication benefits are one of the critical points in planning what virtualization storage to install. They can be used to help build a cost model that determines which product fits each requirement.
With most SAN systems that have a front-end controller, the storage requirement has to reach a certain size before the initial investment is justified. When selecting a storage platform, it is important to determine how many terabytes of storage the environment requires. If the virtualization storage requirement is only 3 TB, for example, it doesn't make sense to invest in a large dual-controller SAN to get a de-duplication benefit that saves only a marginal amount of storage. If the virtualization storage requirement is 15 TB or more, then it makes sense to invest in a SAN that provides additional benefits.
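The sizing argument can be turned into a simple cost model. The Python sketch below is illustrative only: the price per terabyte and the controller premium are invented placeholder figures, not vendor pricing, and the 2:1 de-duplication ratio simply mirrors the 50% figure mentioned earlier. The function names are hypothetical.

```python
def dedup_savings(raw_tb, dedup_ratio, cost_per_tb):
    """Cost of the capacity that de-duplication avoids purchasing."""
    stored_tb = raw_tb / dedup_ratio
    return (raw_tb - stored_tb) * cost_per_tb


def controller_pays_off(raw_tb, dedup_ratio, cost_per_tb, controller_premium):
    """True when the avoided capacity cost covers the controller premium."""
    return dedup_savings(raw_tb, dedup_ratio, cost_per_tb) >= controller_premium


# Placeholder assumptions; substitute real quotes before drawing conclusions.
COST_PER_TB = 5_000          # hypothetical cost per usable terabyte
CONTROLLER_PREMIUM = 30_000  # hypothetical extra cost of the larger SAN
DEDUP_RATIO = 2.0            # 2:1, i.e. 50% less storage

for requirement_tb in (3, 15):
    worthwhile = controller_pays_off(
        requirement_tb, DEDUP_RATIO, COST_PER_TB, CONTROLLER_PREMIUM
    )
    print(requirement_tb, worthwhile)
# → 3 False
# → 15 True
```

With these placeholder numbers, a 3 TB requirement saves 7,500 against a 30,000 premium, while a 15 TB requirement saves 37,500 and covers it. The break-even point moves with every input, so the model is only as good as the quotes fed into it.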
About the author
Rick Vanover (firstname.lastname@example.org), vExpert, VCP, MCITP, MCTS, MCSA, is a virtualization expert in Columbus, Ohio. He is an IT veteran specializing in virtualization, server hardware, operating system support and technology management. Follow Rick on Twitter @RickVanover.
This was first published in January 2011