Virtual machines rely heavily on storage resources, so excess storage latency can easily cause poor performance in a VM. Fortunately, virtualization administrators have a variety of tools and tactics available to diagnose possible storage problems and remedy the resulting ESXi performance issues.
An easy place to start is with a diagnostic tool such as esxtop or Iometer. A tool like esxtop in VMware ESXi can report on storage performance per host bus adapter (HBA), per LUN or per VM, depending on the desired view, and each view reports the average device response time per storage command, shown as the DAVG/cmd entry in the esxtop output. Esxtop can also report other storage performance details, including the total number of commands per second, the average time that each command spends in the VMkernel (KAVG/cmd) and the average response time seen by the guest operating system (GAVG/cmd), which is the sum of DAVG and KAVG.
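The relationship between those counters can be sketched in a few lines of Python. This is only an illustration and not part of esxtop: the `LatencySample` class, the `classify` function and both millisecond thresholds are assumptions made up for the example, so the limits should be replaced with values that match your own baseline.

```python
from dataclasses import dataclass

# Hypothetical thresholds in milliseconds; tune them to your own baseline.
DAVG_LIMIT_MS = 20.0
KAVG_LIMIT_MS = 2.0


@dataclass
class LatencySample:
    """One esxtop reading for an adapter, LUN or VM (ms per command)."""
    davg: float  # DAVG/cmd: average response time at the device
    kavg: float  # KAVG/cmd: average time spent in the VMkernel

    @property
    def gavg(self) -> float:
        # GAVG/cmd, the latency the guest sees, is the sum of DAVG and KAVG.
        return self.davg + self.kavg


def classify(sample: LatencySample) -> str:
    """Name the layer whose latency exceeds its (assumed) limit."""
    if sample.davg > DAVG_LIMIT_MS:
        return "device"    # look at the array, fabric or HBA
    if sample.kavg > KAVG_LIMIT_MS:
        return "vmkernel"  # look at queuing inside the hypervisor
    return "ok"


sample = LatencySample(davg=31.5, kavg=0.4)
print(f"GAVG/cmd = {sample.gavg:.1f} ms, suspect: {classify(sample)}")
```

Splitting the guest-visible latency into its two parts is what makes the numbers actionable: a high DAVG points outside the host, while a high KAVG points at the hypervisor itself.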
By comparison, a tool like Iometer can report I/O throughput to the storage device, revealing possible problems with specific disks or LUNs. A third source of information is the log entries gathered by the operating systems in afflicted VMs. For example, a log entry may report that a particular SCSI port or other device timed out, which would explain the performance problems seen in the VM.
When esxtop, Iometer, logs or other tools suggest performance issues with a storage resource -- especially when results are compared against other "normal" VMs -- administrators can experiment with possible fixes. For example, an administrator can migrate VMs to a different storage location, reduce the number of VMs using the same LUN, or perform storage upgrades to accommodate heavier VM utilization. In addition, administrators can check for configuration problems like SCSI reservation conflicts, network device configuration issues -- such as jumbo frame setting oversights -- or even hardware/firmware compatibility issues.
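Comparing a suspect VM against its "normal" peers can be as simple as checking each reading against the group median. The sketch below assumes the GAVG/cmd values have already been collected by hand or by script; the `find_outliers` helper, the VM names, the readings and the factor of 3 are all hypothetical.

```python
from statistics import median
from typing import Dict, List


def find_outliers(gavg_by_vm: Dict[str, float], factor: float = 3.0) -> List[str]:
    """Return VMs whose latency exceeds `factor` times the group median."""
    baseline = median(gavg_by_vm.values())
    return sorted(vm for vm, ms in gavg_by_vm.items() if ms > factor * baseline)


# Made-up GAVG/cmd readings in milliseconds, one per VM.
readings = {"web01": 4.2, "web02": 3.9, "db01": 41.0, "app01": 5.1}
print(find_outliers(readings))
```

The median is used rather than the mean so that a single badly behaved VM does not drag the baseline upward and hide itself.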
Adjusting resource allocations, migrating VMs, shifting storage LUNs, updating host bus adapter firmware and taking most other corrective actions will require tangible changes in the production environment. This means troubleshooting demands careful documentation and change management for each step in the diagnostic and remediation process. Documentation not only keeps the environment up to date, it also provides a rollback path in the event that the change doesn’t work or incurs unintended consequences by impairing other workload or system behaviors.