Virtual machines rely heavily on storage resources, so excess storage latency can degrade VM performance. Fortunately, virtualization administrators have a variety of tools and tactics available to diagnose possible storage problems and help restore ESXi performance.
An easy place to start is with a diagnostic tool such as esxtop or Iometer. In VMware ESXi, esxtop can report storage performance per host bus adapter (HBA), per LUN or per VM, depending on the selected view, and each view reports the average device response time per storage command -- the DAVG/cmd entry in esxtop output. Esxtop also reports other storage details, including the total number of commands per second, the average time that each command spends in the VMkernel (KAVG/cmd) and the average response time seen by the guest operating system (GAVG/cmd), which is the sum of DAVG and KAVG.
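The relationship between these counters can be sketched in a few lines of Python. This is only an illustration, not VMware tooling: the class and function names are hypothetical, and the warning thresholds (25 ms for DAVG, 2 ms for KAVG) are assumed rule-of-thumb values that should be tuned against a known-good baseline in each environment.

```python
from dataclasses import dataclass
from typing import List

# Assumed rule-of-thumb thresholds in milliseconds; adjust for your environment.
DAVG_WARN_MS = 25.0
KAVG_WARN_MS = 2.0


@dataclass
class LatencySample:
    """One esxtop-style latency reading, in milliseconds per command."""
    davg_ms: float  # time spent below the VMkernel (driver, HBA, fabric, array)
    kavg_ms: float  # time spent in the VMkernel

    @property
    def gavg_ms(self) -> float:
        # Latency seen by the guest is device time plus kernel time.
        return self.davg_ms + self.kavg_ms


def diagnose(sample: LatencySample) -> List[str]:
    """Return hints about where latency appears to accumulate."""
    hints = []
    if sample.davg_ms > DAVG_WARN_MS:
        hints.append("high DAVG: look at the array, fabric or HBA")
    if sample.kavg_ms > KAVG_WARN_MS:
        hints.append("high KAVG: look at queuing inside the host")
    return hints


sample = LatencySample(davg_ms=31.5, kavg_ms=0.5)
print(sample.gavg_ms)    # 32.0
print(diagnose(sample))  # ['high DAVG: look at the array, fabric or HBA']
```

Splitting GAVG into its two components matters because the remedy differs: high device latency points outside the host, while high kernel latency points to queuing within it.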
By comparison, a tool like Iometer can measure I/O throughput to the storage device, revealing possible problems with specific disks or LUNs. A third source of information is the log entries recorded by the operating systems in affected VMs. For example, a log entry may report that a particular SCSI port or other device timed out, which points to a storage performance problem.
When esxtop, Iometer, logs or other tools suggest performance issues with a storage resource -- especially when results are compared against other "normal" VMs -- administrators can experiment with possible fixes. For example, an administrator can migrate VMs to a different storage location, reduce the number of VMs using the same LUN, or perform storage upgrades to accommodate heavier VM utilization. In addition, administrators can check for configuration problems like SCSI reservation conflicts, network device configuration issues -- such as jumbo frame setting oversights -- or even hardware/firmware compatibility issues.
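Comparing a suspect VM against "normal" VMs can be as simple as checking each VM's guest latency against the group median. The sketch below assumes the GAVG readings have already been collected; the function name, the VM names, the sample values and the 2x factor are all hypothetical and chosen only for illustration.

```python
from statistics import median
from typing import Dict, List


def find_outliers(gavg_by_vm: Dict[str, float], factor: float = 2.0) -> List[str]:
    """Return VMs whose guest latency exceeds `factor` times the group median."""
    baseline = median(gavg_by_vm.values())
    return sorted(vm for vm, ms in gavg_by_vm.items() if ms > factor * baseline)


# Hypothetical readings in milliseconds per command.
readings = {"web01": 4.0, "web02": 5.0, "db01": 38.0, "app01": 6.0}
print(find_outliers(readings))  # ['db01']
```

The median is used rather than the mean so that a single badly affected VM does not drag the baseline upward and hide itself.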
Adjusting resource allocations, migrating VMs, shifting storage LUNs, updating host bus adapter firmware and taking most other corrective actions will require tangible changes in the production environment. This means troubleshooting demands careful documentation and change management for each step in the diagnostic and remediation process. Documentation not only keeps records of the environment up to date, but it also provides a rollback path if a change doesn't work or has unintended consequences for other workloads or systems.