Physical CPU utilization is an important metric for several reasons: if a company's utilization is too low, it's not effectively utilizing hardware (and thus is paying for unnecessary servers); if its utilization is too high, it can't provide sufficient resources for its systems (and thus suffers from slower response times for applications that are starved for resources). Organizations should strive for consistently high physical CPU utilization that rarely, if ever, pushes 100% for extended periods, which also maintains some headroom for peaks in application demand.
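As a rough illustration of how this kind of metric is sampled in practice, the sketch below reads the aggregate CPU counters from Linux's /proc/stat over a short interval. The Linux-specific file layout, the sampling interval and the 90% warning threshold are assumptions for illustration, not part of the research findings:

```python
# Minimal sketch: sampling aggregate physical CPU utilization on Linux.
# Assumes the /proc/stat field layout documented in proc(5).
import time

def read_cpu_times():
    """Return (idle, total) jiffies from the aggregate 'cpu' line."""
    with open("/proc/stat") as f:
        fields = [int(x) for x in f.readline().split()[1:]]
    idle = fields[3] + fields[4]  # idle + iowait count as "not busy"
    return idle, sum(fields)

def cpu_utilization(interval=1.0):
    """Percentage of non-idle CPU time over the sampling interval."""
    idle1, total1 = read_cpu_times()
    time.sleep(interval)
    idle2, total2 = read_cpu_times()
    delta_total = max(total2 - total1, 1)  # guard against a zero delta
    busy = delta_total - (idle2 - idle1)
    return 100.0 * busy / delta_total

if __name__ == "__main__":
    util = cpu_utilization()
    print(f"CPU utilization: {util:.1f}%")
    if util > 90:  # illustrative threshold: sustained near-100% leaves no headroom
        print("Warning: little headroom left for application peaks")
```

A real deployment would sample continuously and alert on sustained, not momentary, highs, since brief spikes to 100% are normal.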
Similar to CPU utilization, physical memory utilization not only demonstrates how efficiently resources are distributed but also indicates performance degradation when resources are overcommitted. As with CPU utilization, a best practice for physical memory utilization is consistently high but rarely maximal resource use.
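For comparison with the CPU sketch, physical memory utilization can be computed the same way from Linux's /proc/meminfo. The use of MemAvailable (present in kernel 3.14 and later, with MemFree as a cruder fallback) is an assumption of this sketch; it avoids counting reclaimable file cache as "used" memory:

```python
# Minimal sketch: physical memory utilization on Linux from /proc/meminfo.

def memory_utilization():
    """Percentage of physical memory in use, net of reclaimable cache."""
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":", 1)
            info[key] = int(value.split()[0])  # values are reported in kB
    total = info["MemTotal"]
    # MemAvailable is the kernel's own estimate of memory free for new
    # workloads; fall back to MemFree on older kernels.
    available = info.get("MemAvailable", info["MemFree"])
    return 100.0 * (total - available) / total

if __name__ == "__main__":
    print(f"Memory utilization: {memory_utilization():.1f}%")
```

Against the survey figures above, a reading around 80% would put a host among the best performers, while sustained readings of 90% to 100% suggest overcommitment and likely swapping.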
While the average memory use was about 60%, best performers achieved memory utilization of 80% or higher – although 5% of respondents ran at an unhealthy 90% to 100% utilization and, as a result, likely suffered performance problems.

Network interface card (NIC) utilization
As measured by average utilization on a physical network interface card (NIC), network bandwidth utilization is a difficult but important metric that must be taken in context. Virtual network interfaces, for instance, can be assigned to specific physical NICs. Therefore, in a server with multiple NICs, 100% utilization on one NIC does not mean the server is overloaded. It may instead indicate the need to allocate specific VMs to different NICs within the physical host. Excessive use of bandwidth, however, can mean that the server lacks sufficient physical network resources, and it usually indicates a negative impact on system and application performance in the server.
Additionally, network interface utilization is particularly important for I/O-intensive systems and applications – such as databases, transaction servers or e-mail servers. High utilization can also indicate the need to upgrade to a higher-throughput interface (Gigabit Ethernet, for example), the need to migrate to another server with more dedicated bandwidth or the need to move to a physical server with direct access to storage (e.g., local or channel attached).
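To show how per-NIC utilization might be estimated, the sketch below samples byte counters from Linux's /proc/net/dev and divides throughput by the link speed reported in sysfs. The interface name, the sampling interval and the 1 Gbit/s fallback speed (for virtual interfaces that report no speed) are all illustrative assumptions:

```python
# Minimal sketch: estimating NIC bandwidth utilization on Linux.
import time

def read_bytes(iface):
    """Return (rx_bytes, tx_bytes) for the named interface."""
    with open("/proc/net/dev") as f:
        for line in f:
            if line.strip().startswith(iface + ":"):
                fields = line.split(":", 1)[1].split()
                return int(fields[0]), int(fields[8])  # rx_bytes, tx_bytes
    raise ValueError(f"interface {iface!r} not found")

def nic_utilization(iface="eth0", interval=1.0, fallback_mbps=1000):
    """Percentage of link bandwidth used (rx + tx) over the interval."""
    try:
        with open(f"/sys/class/net/{iface}/speed") as f:
            speed_mbps = int(f.read())
    except (OSError, ValueError):
        speed_mbps = -1
    if speed_mbps <= 0:  # virtual links often report no usable speed
        speed_mbps = fallback_mbps
    rx1, tx1 = read_bytes(iface)
    time.sleep(interval)
    rx2, tx2 = read_bytes(iface)
    bits = ((rx2 - rx1) + (tx2 - tx1)) * 8
    return 100.0 * bits / (speed_mbps * 1_000_000 * interval)
```

Note that, as the article cautions, a high reading on one NIC is only meaningful in context – on a multi-NIC host the same measurement should be taken per interface before concluding the server itself is saturated.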
Most enterprises, though, run between 30% and 40% utilization on average. The best performers in this category, however, utilize 70% to 90% of their available bandwidth. Nevertheless, it is important to understand that many other factors – including the physical CPU and memory available – may preclude organizations from running more saturated network interfaces. The so-called below-average performers in this category, therefore, may not be able to improve their utilization at all, and their numbers should not necessarily be read as poor performance.

Key VSM disciplines
Analysis of the software that best performers use revealed several management disciplines with a high correlation to improvements in this fine-grained resource utilization, including the following:
- Change and configuration management software. The best performers in physical resource utilization were more likely to use software in this discipline, such as Tripwire Enterprise or EMC Ionix Server Configuration Manager. These solutions are able to detect specific configurations down to the CPU, memory or NIC level, and record any changes to these resources – allowing workloads to use the best possible combination for optimal performance and utilization.
- Event management/console automation. The best performers were more likely to use software that detects, collects, correlates and responds to system and application events, such as HP Operations Manager or IBM Tivoli Enterprise Console. These tools can automatically detect, correlate, diagnose, pinpoint and even recover from performance and availability issues as they happen, thus ensuring that higher resource utilization does not damage performance, even with more systems relying on fewer physical components.
- Automated backup/restore. The best performers were more likely to use software that automates system and data backup/restore operations, such as Vizioncore vRanger Pro or Symantec Backup Exec. These tools help to ensure recoverability in case of resource failure and enable architects to spend more time on optimization, instead of babysitting backups. But when matched with the right hardware (like the NetApp V-Series), they can actually offload workload from the server resources to storage resources instead.
These virtual systems management disciplines are not the only tactics that improve granular server resource use. EMA research indicates that many other factors have a significant positive impact, such as the types of workloads deployed and the physical resources themselves. Finally, a company's hypervisor choice can make a difference, especially in large-scale deployments.
Andi Mann is vice president of research with the IT analyst firm Enterprise Management Associates (EMA). Mann has over 20 years of IT experience in both technical and management roles, working with enterprise systems and software on mainframes, midrange systems, servers and desktops. He leads the EMA Systems Management research practice, with a personal focus on data center automation and virtualization. For more information, visit EMA's website.
This was first published in October 2009