Automating virtual performance and availability management: Five target areas

Too many organizations use manual methods to improve virtual performance and availability. But there are five areas where automated tools can improve results: discovery, physical infrastructure monitoring, virtual infrastructure monitoring, operational service monitoring and connecting the pieces.

Andi Mann
Performance and availability management are key for virtualized systems, because they are critical drivers for enterprises that deploy virtualization. In a recent Enterprise Management Associates study of more than 600 organizations, for example, 62% of respondents cite reducing downtime and 60% cite improvements to business continuity as critical drivers for their decisions to deploy virtualization.

But improved performance and availability management requires automated tools to monitor your environment. In this tip, we explore five target areas where automated tools can achieve these results: discovery, physical infrastructure monitoring, virtual infrastructure monitoring, operational service monitoring and connecting the pieces.

Availability and performance goals not achieved
The EMA research mentioned above also shows that enterprises are not consistently meeting these performance and availability goals. Almost 10% of respondents failed to achieve their goals across all three critical drivers of reducing downtime, improving DR/BCP, and better meeting service-level agreements (SLAs).

This failure to achieve availability and performance goals stems partly from ineffective methods of monitoring and ensuring performance and availability. Many organizations try to monitor performance and availability manually, or with homegrown scripts, despite having too many moving parts for these approaches to be effective. Effective monitoring requires an in-depth understanding of the entire topology of an IT service (including servers, applications, databases, switches, storage and desktops), and the ability to conduct real-time and predictive measurements of the load and current response across every element of that topology. This is humanly impossible for all but the smallest environments and the simplest applications.

In addition, as with many other VSM disciplines, existing tools and processes designed to manage purely physical infrastructures are not going to work properly in a virtual environment. They are not going to understand that a virtual server does not own 100% of the physical CPU, memory and bandwidth. They will not be able to interpret the effect on response time of live migration (and may not even realize migration has occurred). They will not understand that the same IT service can be using one set of network resources (switches, routers, firewalls, etc.) one minute, and an entirely different set the next.
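To illustrate the first of these points, here is a minimal sketch of why a utilization figure read inside a guest can mislead a tool built for physical servers. The function name `host_cpu_share` and the numbers are hypothetical, and the calculation is a deliberate simplification: it assumes each virtual CPU can consume at most one physical core's worth of time and ignores contention, scheduling overhead and resource limits.

```python
def host_cpu_share(guest_utilization, vcpus, host_cores):
    """Estimate the fraction of host CPU capacity a guest is consuming.

    guest_utilization: utilization reported inside the guest (0.0 to 1.0)
    vcpus: number of virtual CPUs assigned to the guest
    host_cores: number of physical cores on the host

    Simplifying assumption: one vCPU uses at most one core's worth of time.
    """
    if not 0.0 <= guest_utilization <= 1.0:
        raise ValueError("guest_utilization must be between 0.0 and 1.0")
    if vcpus <= 0 or host_cores <= 0:
        raise ValueError("vcpus and host_cores must be positive")
    cores_used = guest_utilization * vcpus
    return cores_used / host_cores


# Hypothetical example: a 4-vCPU guest on a 16-core host reports 90% busy.
share = host_cpu_share(0.90, vcpus=4, host_cores=16)
print(f"Guest reports 90% busy, but uses about {share:.1%} of the host")
```

A tool that reads only the guest's figure would treat this server as nearly saturated, while the host may still have most of its capacity available, or, in the opposite case, may be overcommitted by other guests.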

This is why it is essential to adopt a performance and availability management discipline using automated tools designed to handle complex physical and virtual infrastructures.

Five target areas for automated tools
At EMA, we advise our clients to look at five key capability areas when implementing processes and technologies for ensuring performance and availability in a virtual infrastructure.

  • Discovery. Tools should automatically locate, identify and provide insight into the complete topology of each IT service, and should subsequently maintain an up-to-date record of all these components. Discovery should be able to detect physical servers, virtualization platforms, virtual hosts and guests, and the applications on top of them, as well as the relationships and connections between them. Ideally it will also provide (or integrate with) a 'single source of truth' for storing discovered systems, such as a federated configuration management database (CMDB).
  • Physical infrastructure monitoring. Virtual environments always run on top of some physical infrastructure (and EMA research shows most enterprises plan to retain a substantial non-virtual environment for the foreseeable future). Therefore, it remains important to monitor the availability and performance of the underlying physical system and components. This includes detail of granular resources, network performance, file I/O, system uptime, response times, etc.
  • Virtual infrastructure monitoring. This is of course a critical differentiator for VSM tools. They must monitor performance metrics (response times, resource utilization, I/O rates, etc.) of dynamic virtual systems in real time. They should be able to track applications and components as they migrate, and still maintain appropriate (service-specific) performance profiles. They should also be able to monitor multiple virtual environments simultaneously – across multiple platforms, technologies, vendors, hosts, subnets and even data centers.
  • Operational service monitoring. In order to understand the performance and availability of a complete end-to-end service, tools must be able to monitor the complete operating environment that delivers that service. This includes servers, applications, databases, middleware, networks, storage, client connections and more, with an in-depth understanding of the virtual platforms, the resulting patterns of dynamic resource utilization, and how changes to specific component metrics will affect overall service performance.
  • Connecting the pieces. All these capabilities must be connected into a single view that holistically integrates visualization, event correlation, detailed reporting and predictive alerting. This should connect all the diverse physical and virtual components that deliver an IT service, and measure not just from the inside out (measuring performance and availability of components within the data center), but also from the outside in (measuring response time, availability and end-user experience at the client).
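As a rough illustration of how discovery and "connecting the pieces" fit together, the sketch below models a discovered service topology as a dependency graph and rolls component health up to the service level. It is not drawn from any particular product: the `Component` class, the three-level health scale and all component names are hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical health scale, ordered from best to worst.
OK, DEGRADED, DOWN = 0, 1, 2


@dataclass
class Component:
    name: str
    kind: str  # e.g. "service", "guest", "host", "switch"
    health: int = OK
    depends_on: List["Component"] = field(default_factory=list)


def service_health(component, seen=None):
    """Return the worst health found anywhere in the dependency graph."""
    seen = set() if seen is None else seen
    if id(component) in seen:  # guard against cycles in discovered data
        return OK
    seen.add(id(component))
    worst = component.health
    for dep in component.depends_on:
        worst = max(worst, service_health(dep, seen))
    return worst


host_a = Component("host-a", "host")
host_b = Component("host-b", "host", health=DEGRADED)
guest = Component("vm-web-01", "guest", depends_on=[host_a])
service = Component("web-store", "service", depends_on=[guest])

before = service_health(service)  # guest runs on the healthy host
guest.depends_on = [host_b]       # simulate a live migration to host-b
after = service_health(service)   # same service, new physical dependency
print(before, after)
```

The point of the sketch is that the service's health changes after the migration even though nothing about the service or the guest was reconfigured, which is why the topology record has to be kept current automatically.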

With these five core capabilities, enterprises can discover and understand the complex, dynamic physical and virtual infrastructure. They can also see how all components interconnect to deliver IT services, detect potential conditions that affect performance, maintain compliance with service-level agreements, and even predict and prevent potential problems before they occur.

ABOUT THE AUTHOR: Andi Mann is a research director with the IT analyst firm Enterprise Management Associates (EMA). Andi has over 20 years of IT experience in both technical and management roles, working with enterprise systems and software on mainframes, midrange, servers, and desktops. Andi leads the EMA Systems Management research practice, with a personal focus on data center automation and virtualization. For more information, visit the EMA website.
