When people think of storage, they generally think of a commoditized technology resource. In actuality, storage is a business-critical technology that needs to cost-effectively provide the right kind of security, availability, and accessibility. Put simply, it's about managing the entire lifecycle of data. Storage, like other technologies, is becoming more complex -- not less -- and needs to be managed accordingly.
Organizational growth, an increasing number of applications requiring storage, the storage needs surrounding critical information within emails, and regulatory requirements around data storage within specific industries have all had a tremendous impact on storage needs. In response to these requirements, storage resources seem to have grown in an organic, piecemeal fashion as immediate needs dictated. As a result, most organizations have a heterogeneous mix of technologies in place, many of which aren't meeting current needs.
Just as server sprawl has threatened the efficiency of data centers, decentralized, unmanaged storage also drives up costs. Organizations are struggling to keep track of and support myriad systems; if managing storage is not a priority or becomes a lesser priority, then the business risks loss of data as well as potential performance, availability, accessibility and compliance issues.
Many organizations are looking to virtualization as a way to control costs, reduce complexity and create order within storage environments. But just as server virtualization isn't a strategy unto itself, storage virtualization isn't either; rather, it's one of several possible elements of a resource optimization strategy. Virtualized storage does have its place in the data center, but it requires careful consideration in light of factors beyond whether or not it's technically possible.
Planning considerations to keep in mind include:
Plan your strategies prior to planning virtualization.
What is the resource optimization strategy for your data center? Do you have a disaster recovery strategy? While virtualization holds the promise of centralized management and leveraging lower-cost storage, you must first understand where your data center is headed overall. Virtualized storage is simply one tactic to be leveraged in the context of achieving the goal of operational optimization. For example, how many data centers are you going to have? Which functions will each one serve and where will they be located? What existing storage resources do you have, and what future storage needs are you facing? How will you manage daily storage, data protection, archival and disaster recovery tasks?
Storage virtualization can be leveraged to facilitate these tasks and reduce some of the complexity of their management. Deciding when storage virtualization is a critical enabler can be difficult, so it is important to evaluate the current business issues related to storage. Storage virtualization aggregates storage systems (such as arrays) from multiple providers into a networked environment that can be managed as a single pool. Many business-critical applications require frequent snapshots and/or mirrors of the associated data. Storage virtualization can provide this capability at a lower cost by enabling more cost-effective storage to be used as the repository for regular snapshots or secondary data copies.
Classify your data.
While it sounds simple enough, this is where most organizations fail. Most business units will insist all of their data is "mission critical," so classifying data according to its criticality to the business can be difficult. Consider how much data you can afford to lose by assessing and quantifying risk tolerance. Classifying data can help you determine how to best leverage existing storage resources. For example, virtualized storage can allow you to use existing, lower-performing storage technologies for less critical backup. However, it's unlikely that this would be a good approach for mission-critical data. When classifying data, it's important to also look at the type of data. For example, current storage virtualization techniques do not provide significant benefit to structured data types such as databases.
Define the data lifecycle(s).
Sixty percent of data is rarely touched 90 days after it's created. Understanding when to back up, when to archive, and how accessible data needs to be enables you to plan future requirements for various types of storage. If the lifecycle of your data is well understood then placing it on the right type of storage with the right characteristics for performance, availability, capacity, etc. is an achievable goal.
Once the data lifecycle has been defined, developing a tiered storage strategy is a first step towards effective storage optimization. Storage virtualization can then be applied to help simplify the migration of data from one storage tier to another as it moves through its lifecycle. Hierarchical storage management (HSM) tools have been available for quite some time, but moving data between different physical storage systems in a way that stays transparent to the application can present a problem. By introducing virtualization, all the different storage systems and their associated volumes are centrally managed as logical volumes available to the server and HSM application.
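To make the link between lifecycle and tiering concrete, here is a minimal sketch of an age-based placement rule. The tier names and the 30/90/365-day boundaries are illustrative assumptions, not recommendations; a real policy would come from your own lifecycle definitions and would typically be enforced by an HSM or virtualization product rather than a script.

```python
from datetime import date, timedelta

# Hypothetical tier boundaries: maximum days since last access -> tier name.
# These numbers are placeholders for whatever your lifecycle policy defines.
TIER_RULES = [
    (30, "tier1-high-performance"),
    (90, "tier2-midrange"),
    (365, "tier3-low-cost-disk"),
]
ARCHIVE_TIER = "archive"


def choose_tier(last_access: date, today: date) -> str:
    """Return the storage tier for data last accessed on `last_access`."""
    age_days = (today - last_access).days
    for max_age_days, tier in TIER_RULES:
        if age_days <= max_age_days:
            return tier
    # Older than every boundary: candidate for archival.
    return ARCHIVE_TIER


if __name__ == "__main__":
    today = date(2024, 6, 1)
    for days_idle in (10, 60, 200, 400):
        print(days_idle, choose_tier(today - timedelta(days=days_idle), today))
```

Keeping the rules in a single ordered table means the policy can be reviewed and adjusted as the lifecycle is revisited, without touching the placement logic.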
Define service levels.
When defining storage tiers it is important to define service levels for data access, data security, data availability, performance (response times), data protection (RAID levels, backup and restore, archiving), etc., within your organization. These factors all influence the storage platform choice and cost of the overall storage solution. For example, a business application requiring 99.999% availability will require storage configured with equal or higher availability.
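The availability example can be checked with simple arithmetic. The sketch below assumes a simplified model in which the application depends serially on its storage and the two fail independently; real environments with redundancy or correlated failures behave differently. The function names are illustrative only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability: float) -> float:
    """Yearly downtime budget implied by an availability fraction (0..1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def serial_availability(*components: float) -> float:
    """Combined availability of components that must all be up at once.

    Assumes independent failures, so the availabilities multiply.
    """
    combined = 1.0
    for availability in components:
        combined *= availability
    return combined


if __name__ == "__main__":
    # Downtime budget for a "five nines" application.
    print(round(allowed_downtime_minutes(0.99999), 3), "minutes per year")
    # Five-nines servers on four-nines storage: what the application sees.
    print(serial_availability(0.99999, 0.9999))
```

Under this model, 99.999% availability leaves a budget of roughly 5.26 minutes of downtime per year, and the combined figure can never exceed that of the weakest component, which is why the storage tier must match or exceed the application's target.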
Knowing how quickly an application or server needs to be up and running after a failure is key to choosing the right storage backup and recovery strategy and associated technologies. Availability and performance required for different data sources can also change over time as the data cycles through its normal business lifecycle. This is why the data lifecycle for an application should be determined at development time, but then also be subject to regular reviews going forward. Implementing effective data lifecycle management requires process and cultural changes across technology teams; it's not just a one-time project.
How does the organization leverage all the various storage tiers to optimize their usefulness to the organization and to limit the impact of daily data management tasks? Storage virtualization can help to make the complexity of the various storage tiers more transparent to the end user. For example, an application requires a highly available disaster recovery strategy between multiple data centers. Storage virtualization can leverage data replication and/or mirroring technologies to allow copies of the data to be created at the relevant locations. Even though the primary copy of the data may reside on high-end performance storage, the various copies may not. Storage virtualization allows organizations to choose the storage platform that best meets data usage requirements.
Factor in security, compliance and regulatory considerations.
Like every new technology, virtualization (on both servers and storage) introduces new concerns for security and compliance. Virtualized environments require greater due diligence where security is concerned. Most security measures apply to physical properties of servers and their associated storage such as IP addresses, secured segments of your SAN, storage subsystems, etc. Since storage virtualization allows for easier data migration between storage devices in the same virtualized pool and replication between heterogeneous disk subsystems, secondary copies of business-critical data must have the same level of security and access as the primary copies. For example, a secondary copy of sensitive data replicated to a DR (disaster recovery) site will require the same rigorous access protection as the primary copy.
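One way to keep secondary copies honest is to audit them against the primary on a regular basis. The sketch below is a hypothetical illustration: the `VolumeCopy` record, its fields, and the two checks are assumptions made for the example, and an actual audit would read this information from your storage management tooling and cover whatever controls your policies require.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeCopy:
    """Hypothetical description of one copy of a data set."""
    name: str
    site: str
    encrypted: bool
    allowed_groups: frozenset


def policy_violations(primary: VolumeCopy, copies: list) -> list:
    """List the ways each secondary copy is less protected than the primary."""
    problems = []
    for copy in copies:
        # A copy of encrypted data must itself be encrypted.
        if primary.encrypted and not copy.encrypted:
            problems.append(f"{copy.name}: not encrypted")
        # A copy must not grant access to groups the primary does not.
        extra = copy.allowed_groups - primary.allowed_groups
        if extra:
            problems.append(f"{copy.name}: extra access for {sorted(extra)}")
    return problems


if __name__ == "__main__":
    primary = VolumeCopy("crm-primary", "hq", True, frozenset({"dba"}))
    dr_copy = VolumeCopy("crm-dr", "dr-site", False, frozenset({"dba", "ops"}))
    for problem in policy_violations(primary, [dr_copy]):
        print(problem)
```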
Regulatory requirements in certain industries such as health care or financial services mandate caution in both storage and server virtualized environments. Critical customer information may need to be isolated either through access on a network or through a SAN. This may impact and influence where and how virtualized storage can be used.
Leverage existing resources.
Don't shut the door on existing resources and alternative storage technologies. Storage virtualization is just one tool to be considered in conjunction with other new storage technologies and existing resources. Rarely is it cost-effective, appropriate, or necessary to wipe the slate clean and start over with a new fully virtualized storage environment. Instead, organizations may redeploy older storage, leveraging virtualization to create online backups, snapshots or replicated copies for more cost-effective, faster access to secondary copies of critical data. Tape virtualization technologies, for example, can provide faster access to backup data and easier migration and consolidation of data backups to longer term storage media.
Storage virtualization technologies, while still relatively new, hold great promise for enabling resource optimization strategies within data centers when used appropriately for the right type of data. Ultimately, though, storage virtualization is not appropriate for all types of data and systems, and its use needs to be carefully considered and planned. In the short term, companies should really only consider storage virtualization strategies to solve specific problems, but as the technologies and the tools to automate data migration and provisioning evolve, organizations can apply storage virtualization to optimize their IT processes.