VMware ESXi includes the esxtop utility, which serves as an interactive command-line diagnostic. Among its many functions, esxtop can monitor storage performance by LUN, by host bus adapter or by VM. Let's take a look at some common esxtop commands to track ESXi storage performance.
Once you launch esxtop at the command line, press the u key for logical unit number (LUN) reporting, the d key for host bus adapter (HBA) reporting and the v key for virtual machine reporting. These view shortcuts are case-sensitive, so type them in lowercase. If you need to add or remove data fields from the display, press the f key, and then select the appropriate keys to toggle desired fields on or off. For example, esxtop's HBA view uses the B, C, D, E, H and J keys to toggle data fields; other modes use different keys. Use the s key to set the update frequency for the selected data fields. For example, pressing s and then entering 3 would update fields every three seconds.
Once esxtop is running and set up to monitor the desired parameters, you can evaluate the four principal data columns reported for ESXi storage performance. The CMDS/s column represents the total number of commands per second, including both IOPS and other SCSI commands, though this figure is primarily IOPS. The DAVG/cmd column is the average response time per command sent to the device, the KAVG/cmd column is the average amount of time that storage commands spend in the VMkernel and the GAVG/cmd column is the average response time as seen by the guest operating system, which is usually the sum of DAVG and KAVG. All of the response times are reported in milliseconds.
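To make the relationship between the three latency columns concrete, here is a minimal Python sketch. The StorageLatency class and its field names are hypothetical and used only for illustration; they are not part of esxtop. The sketch assumes the usual case described above, where GAVG is the sum of DAVG and KAVG.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageLatency:
    """One storage latency reading; all times are in milliseconds.

    Hypothetical helper for illustration only, not an esxtop API.
    """
    davg_ms: float  # DAVG/cmd: average device response time per command
    kavg_ms: float  # KAVG/cmd: average time a command spends in the VMkernel

    @property
    def gavg_ms(self) -> float:
        # GAVG/cmd is usually the sum of DAVG and KAVG.
        return self.davg_ms + self.kavg_ms


# Example values are made up, not measured data.
sample = StorageLatency(davg_ms=6.5, kavg_ms=0.5)
print(f"GAVG/cmd is roughly {sample.gavg_ms:.1f} ms")
```

Splitting the guest-visible latency this way shows at a glance whether most of the delay comes from the device (DAVG) or from time spent in the VMkernel (KAVG).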
Response times should typically not exceed 10 ms for prolonged periods. If you encounter unexpected latency, check the setup and configuration of the devices involved -- such as HBAs, LUNs, VMs, physical disks, network switches, network interface cards or Fibre Channel adapters -- for incorrect or overlooked settings. It might also help to run esxtop on similar systems that perform at an acceptable level and compare the configuration of their devices against the configuration of the troubled systems. This doesn't always reveal a direct cause, but it might offer clues for further investigation.
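If you record a series of response times, the "prolonged period" test can be sketched in a few lines of Python. The 10 ms threshold comes from the guideline above; the function name and the min_consecutive parameter (how many samples in a row count as "prolonged") are assumptions chosen for illustration, so adjust them to your own sampling interval.

```python
from typing import Iterable


def has_sustained_latency(samples_ms: Iterable[float],
                          threshold_ms: float = 10.0,
                          min_consecutive: int = 3) -> bool:
    """Return True if min_consecutive samples in a row exceed threshold_ms.

    Hypothetical helper: min_consecutive is an assumed definition of
    "prolonged", not a VMware-defined value.
    """
    streak = 0
    for value in samples_ms:
        # Extend the streak on a breach; any healthy sample resets it.
        streak = streak + 1 if value > threshold_ms else 0
        if streak >= min_consecutive:
            return True
    return False


# Made-up readings in milliseconds, one per update interval.
brief_spikes = [4.0, 12.0, 5.0, 11.0, 6.0]
sustained = [4.0, 12.0, 15.0, 11.0, 6.0]
print(has_sustained_latency(brief_spikes))
print(has_sustained_latency(sustained))
```

Requiring consecutive breaches keeps a single momentary spike from being treated as a problem, which matches the emphasis on prolonged latency rather than isolated peaks.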
Also, check logs, which may indicate additional details about ESXi storage errors or timeouts. For example, response times over 5,000 ms are typically logged as errors, along with abort and SCSI error messages. You can find the ESXi 5.x logs in /var/log/vmkernel.log.
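The two thresholds mentioned in this tip, roughly 10 ms for healthy operation and 5,000 ms for responses that are typically logged as errors, can be combined into a simple triage helper. This is only a sketch: the function name and the labels it returns are hypothetical, and it does not read or parse vmkernel.log.

```python
def classify_response_time(latency_ms: float) -> str:
    """Bucket a response time using the thresholds discussed in this tip.

    Hypothetical helper for illustration; the labels are assumptions.
    """
    if latency_ms > 5000.0:
        # Responses this slow are typically logged as errors.
        return "check logs"
    if latency_ms > 10.0:
        # Above the guideline; a concern if it persists.
        return "elevated"
    return "normal"


# Made-up values in milliseconds.
for value in (3.2, 42.0, 6100.0):
    print(value, classify_response_time(value))
```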