In many production-level virtualization deployments, a lack of visibility and instrumentation often leads to overprovisioning and a reduction in performance and service levels. This lack of visibility across the virtual server and storage layers -- from "server to spindle" -- is more than theoretical. According to research from the Taneja Group, many large enterprises using production-level virtualization report such issues, which can undercut the cost savings of server consolidation.
Without real-time instrumentation, IT operations can become reactive, labor-intensive and sloppy. In addition, if there is no clear picture of the root causes of performance problems, administrators are less likely to make adjustments to the infrastructure or automate management operations.
Here in part two of this series on optimizing virtual infrastructure management, we explore how to address this lack of management visibility -- including the must-have capabilities of a virtualized management technology -- and some common stumbling blocks in achieving an optimized infrastructure.
Building a cross-domain performance monitoring system
Virtual infrastructure optimization, or VIO, is an emerging category of virtualization management technologies designed to build a correlated performance profile of a virtual infrastructure in order to optimize its performance. VIO technologies go beyond capacity planning, which relies on point-in-time snapshots and rule-of-thumb estimates to size specific tiers of a virtual infrastructure. While capacity-planning tools have rapidly emerged for virtual servers, none of these tools cross domains or incorporate runtime metrics.
Virtual infrastructure management technologies validate and continually verify capacity estimates by collecting, correlating and documenting runtime performance data from every virtualized tier in the application stack. A comprehensive VIO solution must include the following capabilities:
- Independence. You should choose a VIO tool that is independent of vendor bias in every virtualized application tier (server, storage, desktop, etc.) and remains a "neutral party" with respect to problem diagnosis.
- Depth. You should employ a technology that provides detailed metrics and offers multiple levels of depth to serve a wide range of needs. Also, it should provide real-time data management, metadata optimization and historical accuracy.
- Breadth. Your VIO technology should span multiple virtualization domains and provide composite data that clearly identifies performance dependencies. It should also integrate with existing systems monitoring and management tools, and offer a broad range of industry-standard communications interfaces.
- Impact. The technology you choose should be out-of-band, passive and as nondisruptive as possible. It should also offer deployment options that allow customers to incrementally add additional data collection modules as needed.
- Usability. The VIO technology should present actionable information in the form of a customizable dashboard run from a unified, configurable data store. In addition, it should support decision-making processes from teams whose functions span an entire virtual environment.
- Scalability. The technology should be scalable to support the largest enterprises, which may have tens of thousands of servers and tens of petabytes of storage.
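To make the "breadth" and "depth" capabilities above concrete, the sketch below joins per-tier runtime samples on a shared timestamp to build the kind of composite "server to spindle" record a VIO tool maintains. All metric names, values and the VM-to-LUN mapping here are hypothetical, invented for illustration only.

```python
# Hypothetical runtime samples from two virtualized tiers.
# In a real VIO deployment these would stream from collectors at each tier.
server_metrics = [
    {"ts": 1, "vm": "vm01", "cpu_pct": 85, "datastore": "lun-a"},
    {"ts": 1, "vm": "vm02", "cpu_pct": 30, "datastore": "lun-b"},
    {"ts": 2, "vm": "vm01", "cpu_pct": 90, "datastore": "lun-a"},
]
storage_metrics = [
    {"ts": 1, "lun": "lun-a", "latency_ms": 25.0},
    {"ts": 1, "lun": "lun-b", "latency_ms": 4.0},
    {"ts": 2, "lun": "lun-a", "latency_ms": 32.0},
]

def correlate(server_rows, storage_rows):
    """Join server and storage samples on (timestamp, LUN) to form a
    composite record for each VM sample."""
    by_key = {(r["ts"], r["lun"]): r for r in storage_rows}
    combined = []
    for s in server_rows:
        st = by_key.get((s["ts"], s["datastore"]))
        if st:
            combined.append({
                "ts": s["ts"], "vm": s["vm"],
                "cpu_pct": s["cpu_pct"],
                "lun": s["datastore"],
                "latency_ms": st["latency_ms"],
            })
    return combined

rows = correlate(server_metrics, storage_metrics)

# Flag samples where both host CPU and backing-LUN latency are high --
# a cross-domain pattern a single-tier monitoring tool would miss.
hot = [r for r in rows if r["cpu_pct"] > 80 and r["latency_ms"] > 20]
```

The point of the sketch is the join itself: once server-tier and storage-tier samples share a key, performance dependencies across domains become queryable rather than anecdotal.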
Virtual infrastructure optimization includes capabilities from related disciplines (existing and emerging). There's an element of capacity planning in VIO technologies because these technologies are ideal for developing a performance baseline prior to deployment. But while most capacity planning tools address a single virtualized tier, VIO solutions are cross-domain. VIO also encompasses application service management and performance management capabilities, which focus on optimization from the application, or end-user, perspective.
Common problems in optimizing virtual infrastructures
The Taneja Group has interviewed several virtualization administrators who have deployed VIO solutions to resolve complex server and storage area network (SAN) contention issues. Every interviewee had experienced SAN response times that exceeded both the original design goals and acceptable service levels as they deployed more and more production applications on virtual servers.
In most cases, storage and server administration teams were unable to agree on the root cause of the decline in performance, so emergency measures were implemented. Steps such as adding storage ports and taking nonproduction servers offline at peak load times failed to solve the problem. This left interviewees wrestling with the following set of questions:
- What is our optimal virtual host server-to-storage array ratio?
- How can we determine optimal storage traffic balance and overhead?
- What is the impact of storage on a new virtual server?
- How should virtualization impact our I/O path planning?
- How can we pinpoint vendor subsystem configuration issues faster?
After grappling with these questions, each interviewee deployed one of the leading VIO technologies. Initially, baseline data was collected and then augmented with ongoing runtime data collected during periods of high and low demand. By combining baseline and runtime data, customers could quickly validate root causes and discover additional configuration and architectural issues such as the following:
- Overloaded storage processors. Incorrect load balancing across storage ports.
- Unnecessary traffic. Noncritical file system management processes generated as much as 20% to 30% of traffic during peak loads.
- Storage port configuration issues. Queue-depth settings were suboptimal.
- Firmware mismatches. Storage firmware versions were incorrect for use with VMware and created incompatibilities between edge and core switch firmware versions.
- Host bus adapter (HBA) issues. Round-robin host bus adapter configurations caused abnormally high read latencies.
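The baseline-then-runtime workflow the interviewees followed can be sketched as a simple comparison: build a pre-deployment latency baseline per storage port, then flag runtime samples that drift beyond a tolerance. The port names, latency figures and 1.5x tolerance below are hypothetical, chosen only to illustrate the technique.

```python
def build_baseline(samples):
    """Average latency (ms) per storage port from pre-deployment samples,
    given as (port, latency) pairs."""
    totals = {}
    for port, latency in samples:
        cnt, total = totals.get(port, (0, 0.0))
        totals[port] = (cnt + 1, total + latency)
    return {port: total / cnt for port, (cnt, total) in totals.items()}

def find_drift(baseline, runtime_samples, tolerance=1.5):
    """Return (port, latency) pairs whose runtime latency exceeds
    baseline * tolerance -- the kind of drift that pointed interviewees
    at misconfigured ports and unbalanced storage processors."""
    flagged = []
    for port, latency in runtime_samples:
        if port in baseline and latency > baseline[port] * tolerance:
            flagged.append((port, latency))
    return flagged

# Baseline collected before production rollout; runtime samples collected
# during a peak-load window.
baseline = build_baseline([("port1", 5.0), ("port1", 7.0), ("port2", 4.0)])
drift = find_drift(baseline, [("port1", 6.5), ("port1", 15.0), ("port2", 4.5)])
```

Validating runtime data against a baseline in this way is what lets administrators distinguish a genuine configuration fault from ordinary load variation before making changes.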
None of these issues were easily detected without a cross-domain VIO technology in place. These user testimonials validate the capabilities of available optimization technologies. In the third and final tip in this series, we'll examine some of the leading virtualization performance management tool providers and technologies in this emerging category of products.
Dave Bartoletti is a senior analyst and consultant in the virtualization practice at Taneja Group. Bartoletti covers companies, trends and technologies in the server, storage, and network virtualization markets with a focus on management tools and strategies. Prior to joining Taneja Group, Bartoletti was the vice president of marketing at Enigmatec, a pioneering virtualization management vendor. Bartoletti has more than 20 years of technical, operational and marketing experience as an executive at Tibco, IBM and Fidelity Investments. He can be reached at firstname.lastname@example.org.