When a limited number of workloads on the same physical server experience intermittent or persistent performance problems, check for resource constraints first to confirm that the affected VMs have adequate CPU and memory resources available. Do this before attempting to migrate the troubled workloads: an inadequately provisioned workload will still be inadequately provisioned after it moves to another server, so migration won't necessarily reveal resource provisioning problems.
VMware ESXi provides integrated tools like esxtop that can report resource provisioning and help identify resources that may be excessively or unexpectedly overcommitted, possibly leading to network performance problems. Launch the VMware esxtop utility in interactive mode through the ESXi Shell and check the CPU load average entry located below the uptime figure. A load average of 1.00 means the CPUs are fully utilized, an average below 1.00 means the CPUs are less than fully utilized, and an average above 1.00 means the CPUs are overcommitted.
For example, a load average of 2.00 suggests that ESXi may need twice the available CPUs to accommodate the workloads. Combine this with the %READY figure (shown as %RDY in esxtop), which reports the percentage of time the VM was ready to run but could not be scheduled on a CPU; this should normally be under 5%. If you find overcommitted CPUs and unusually high %READY levels, make more CPU capacity available to the troubled workload or migrate the workload to another server where additional CPUs are available.
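The two CPU checks above can be combined into one small decision rule. The following is a minimal sketch in Python, not part of esxtop itself: the function name `assess_cpu`, its return labels, and the idea of typing the values in by hand are all assumptions made for illustration. The 1.00 load-average boundary and the 5% %READY guideline come from the text.

```python
def assess_cpu(load_average: float, ready_percent: float,
               ready_threshold: float = 5.0) -> str:
    """Classify CPU pressure from two values read off the esxtop CPU screen.

    load_average  -- the CPU load average shown below the uptime figure
    ready_percent -- the VM's %READY (%RDY) value
    The labels returned here are hypothetical, chosen for this sketch.
    """
    overcommitted = load_average > 1.0          # above 1.00 means overcommitted CPUs
    high_ready = ready_percent >= ready_threshold  # 5% guideline from the text

    if overcommitted and high_ready:
        return "cpu-contention"   # both signals agree: act on CPU capacity
    if overcommitted:
        return "overcommitted"    # host is oversubscribed, VM not yet starved
    if high_ready:
        return "high-ready"       # VM is waiting even though host has headroom
    return "ok"


if __name__ == "__main__":
    # Values from the example in the text: load average 2.00, with an
    # assumed %READY of 12% for the troubled VM.
    print(assess_cpu(2.00, 12.0))  # prints "cpu-contention"
```

Only the "cpu-contention" case corresponds to the situation described above, where both the host and the VM point to a CPU shortage.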
Also examine the "MEM overcommit avg" entry in the VMware esxtop output. This one is a little trickier because the entry is the ratio of requested memory to available memory, minus one: (Requested Mem / Available Mem) - 1. If the entry is zero, the requested memory equals the available memory and there is no memory overcommitment. If the entry is greater than zero, the requested memory is larger than the available memory, so there is memory overcommitment, which may degrade workload performance on the network.
For example, if the VMs want 2 GB but the host only has 1 GB, the MEM overcommit avg would read (2 / 1) - 1 = 1, indicating overcommitment. Memory overcommitment can be resolved by adding more physical memory to the server, recovering unused memory from other VMs with excess memory allocations, or migrating VMs to other servers, which frees memory for the remaining VMs.
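The overcommit arithmetic is easy to check by hand. As a minimal sketch, the Python function below reproduces the formula given above; the name `mem_overcommit` and its parameters are hypothetical and only illustrate the calculation, since esxtop reports the figure directly.

```python
def mem_overcommit(requested_gb: float, available_gb: float) -> float:
    """Return (requested / available) - 1, the formula described in the text.

    0 means requested memory equals available memory (no overcommitment);
    a value above 0 means the VMs want more memory than the host has.
    """
    if available_gb <= 0:
        raise ValueError("available memory must be positive")
    return requested_gb / available_gb - 1.0


if __name__ == "__main__":
    # The example from the text: VMs want 2 GB, the host has 1 GB.
    print(mem_overcommit(2, 1))  # prints 1.0
    # Requested equals available: no overcommitment.
    print(mem_overcommit(1, 1))  # prints 0.0
```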
Skilled virtualization administrators can also check for excess memory ballooning or swapping activity caused by improperly set memory limits. These are represented by the memory ballooning statistic (MCTLSZ) and the memory swap statistic (SWCUR) reported by the VMware esxtop tool. For example, if the system reports unusually high ballooning or swapping numbers while ample memory is still available, performance may be impaired prematurely. In this situation, it may be possible to reconfigure ballooning or swapping activity to occur at more appropriate memory levels.
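The ballooning and swapping check can be expressed as a simple rule of thumb. The Python sketch below is illustrative only: the function name, the 25% "ample memory" fraction, and the 0 MB reclaim threshold are assumptions, not VMware guidance, and should be tuned to what counts as "unusually high" in a given environment. The MCTLSZ and SWCUR values are assumed to be read from esxtop and passed in by hand.

```python
def premature_reclamation(mctlsz_mb: float, swcur_mb: float,
                          free_mb: float, total_mb: float,
                          ample_fraction: float = 0.25,
                          reclaim_threshold_mb: float = 0.0) -> bool:
    """Flag ballooning or swapping that occurs while memory is still ample.

    mctlsz_mb -- balloon size (MCTLSZ) reported for the VM
    swcur_mb  -- current swap usage (SWCUR) reported for the VM
    free_mb, total_mb -- host free and total memory
    ample_fraction and reclaim_threshold_mb are hypothetical tuning knobs.
    """
    if total_mb <= 0:
        raise ValueError("total memory must be positive")
    reclaiming = (mctlsz_mb > reclaim_threshold_mb
                  or swcur_mb > reclaim_threshold_mb)
    ample = free_mb / total_mb >= ample_fraction
    return reclaiming and ample


if __name__ == "__main__":
    # Ballooning 512 MB while half of a 64 GB host is free: worth a look.
    print(premature_reclamation(512, 0, 32768, 65536))  # prints True
    # Ballooning while the host is nearly full is expected behavior.
    print(premature_reclamation(512, 0, 1024, 65536))   # prints False
```

A True result only suggests that memory limits deserve a closer look; it does not by itself identify which setting is at fault.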