How do the RAID level and number of disks in a RAID group affect storage performance for a virtual machine?
Both disk latency and RAID level can impact VM performance when accessing storage.
Disk latency should be the first and most important consideration. A hard disk's platters spin at a fixed rate, so there is rotational latency while the target sector rotates underneath the read/write heads. Seek latency occurs when the read/write heads move between concentric tracks across the platter. Beyond these mechanical latencies, the interface (such as serial-attached SCSI) is much faster than the disk's internal read/write speed, so the cache usually fills during writing and empties during reading -- the mechanical systems simply cannot keep up with the interface and host server.
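To put rough numbers on the mechanical latencies described above, here is a minimal Python sketch. Average rotational latency is the time for half a revolution; the function names and the seek time passed in are illustrative assumptions, not figures for any specific drive.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in milliseconds (time for half a revolution)."""
    seconds_per_revolution = 60.0 / rpm
    return (seconds_per_revolution / 2.0) * 1000.0


def estimated_iops(rpm: float, avg_seek_ms: float) -> float:
    """Rough random-I/O ceiling for one disk: one request per seek plus half a turn.

    avg_seek_ms is an assumed value supplied by the caller; transfer time,
    caching and command queuing are ignored.
    """
    service_time_ms = avg_seek_ms + avg_rotational_latency_ms(rpm)
    return 1000.0 / service_time_ms
```

A 7,200 rpm disk works out to about 4.17 ms of average rotational latency, and a 15,000 rpm disk to 2 ms. With an assumed 3 ms average seek, the 15,000 rpm disk tops out at roughly 200 random I/O operations per second, which is why a single disk quickly becomes a bottleneck for a busy VM.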
The best way for storage architects to overcome disk latency is to group disks together into storage sets such as JBODs (just a bunch of disks) or RAID groups. When a group of disks is configured for RAID 0 (or a level that incorporates RAID 0, such as RAID 10), file data is striped across the group. So, rather than waiting for a single disk to read or write an entire file, the file is broken up across multiple disks; each disk is responsible for a smaller piece of the file and the disks work in parallel, which results in faster apparent storage performance.
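The striping idea can be sketched in a few lines of Python. This is a simplified round-robin layout, assuming one block per stripe unit; real controllers use configurable stripe sizes, and the function names here are hypothetical.

```python
from typing import List, Tuple


def locate_block(logical_block: int, num_disks: int) -> Tuple[int, int]:
    """Map a logical block to (disk index, stripe row) in a round-robin RAID 0 layout."""
    disk = logical_block % num_disks
    stripe_row = logical_block // num_disks
    return disk, stripe_row


def blocks_per_disk(total_blocks: int, num_disks: int) -> List[int]:
    """Count how many blocks of a file each disk must read or write."""
    counts = [0] * num_disks
    for block in range(total_blocks):
        disk, _ = locate_block(block, num_disks)
        counts[disk] += 1
    return counts
```

For an eight-block file on a four-disk group, `blocks_per_disk(8, 4)` assigns two blocks to each disk, so each disk handles a quarter of the work instead of one disk handling all of it.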
Disk groups also support disk redundancy, error correction and recovery. However, advanced RAID levels with parity (such as RAID 5) can introduce new latencies on writes because the parity must be recalculated and stored every time the data changes. Double parity (RAID 6 or a variant) can experience even more latency because two sets of parity must be calculated and written.
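The parity work behind RAID 5 can be illustrated with a short sketch. Single parity is commonly computed as a byte-wise XOR across the data blocks in a stripe, which also lets a lost block be rebuilt from the survivors. The function names are hypothetical and the blocks are assumed to be equal in length; the second parity in RAID 6 typically uses different math and is not shown.

```python
from functools import reduce
from operator import xor
from typing import List


def xor_parity(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks, as used for single parity."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks: List[bytes], parity: bytes) -> bytes:
    """Recover one lost data block by XORing the survivors with the parity block."""
    return xor_parity(list(surviving_blocks) + [parity])
```

Because the parity depends on every data block in the stripe, changing one block means the parity must be updated as well. That extra read and write activity is the source of the additional latency on parity RAID levels.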
It's impossible to quantify the exact effect of disk latency on VM performance. It really depends on the VM's use of storage and on the size and configuration of the RAID group. For example, a VM that receives only light user traffic and occasional snapshots might not be noticeably affected by disk latencies, while VMs that generate heavy user data, depend on active swap files and receive frequent snapshots will be more noticeably affected.
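Although the exact effect cannot be quantified in advance, a rough estimate can help compare configurations. The sketch below uses the commonly cited rule-of-thumb write penalties (back-end I/Os per small host write) for each RAID level. The per-disk IOPS figure and the read/write mix are assumptions the caller must supply, and the result ignores caching entirely.

```python
# Rule-of-thumb back-end I/Os generated by one small host write (assumed values).
WRITE_PENALTY = {"RAID0": 1, "RAID10": 2, "RAID5": 4, "RAID6": 6}


def effective_iops(iops_per_disk: float, num_disks: int,
                   read_fraction: float, level: str) -> float:
    """Estimate host-visible IOPS for a RAID group under a given read/write mix."""
    raw_iops = iops_per_disk * num_disks
    write_fraction = 1.0 - read_fraction
    # Each read costs one back-end I/O; each write costs WRITE_PENALTY[level].
    backend_ios_per_host_io = read_fraction + write_fraction * WRITE_PENALTY[level]
    return raw_iops / backend_ios_per_host_io
```

With eight disks at an assumed 200 IOPS each and a 70% read workload, the estimate is about 1,600 IOPS for RAID 0 but only about 840 for RAID 5. A write-heavy VM widens that gap, while a lightly used VM may never approach either limit.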
This was first published in September 2013