Make the most of your vCPU resources with these tips

Be sure to provision the right number of virtual CPUs to a VM, and be aware of processor hyper-threading. Also, use CPU affinity strategically, and scale out rather than up.

Virtualization has vastly improved the utilization of compute resources such as CPU, memory, storage and network assets. But the convenience and flexibility virtualization provides can also cause enormous resource waste. A VM can consume significant resources, and over-allocating resources won't necessarily boost performance -- further exacerbating resource waste and posing unnecessary costs to the business. Whether you're new to virtualization or simply revisiting established skills, it's good to take a fresh look at the way you deploy virtual resources in the data center, starting with virtual CPUs.

A hypervisor abstracts the underlying physical CPU cores of a server to render vCPUs. If the cores are hyper-threaded, each thread is rendered as a vCPU. For example, an eight-core CPU would yield eight vCPUs, while an eight-core hyper-threaded CPU would yield 16 vCPUs. An administrator can then allocate those vCPUs to VMs.
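The arithmetic above can be written as a short sketch. The function name `vcpu_count` and its parameters are illustrative assumptions, not part of any hypervisor API; the sketch only reproduces the cores-times-threads rule described in the paragraph.

```python
def vcpu_count(physical_cores: int, threads_per_core: int = 1) -> int:
    """Return how many vCPUs a host would expose under the simple rule
    'one vCPU per hardware thread' (hypothetical helper for illustration)."""
    if physical_cores < 1 or threads_per_core < 1:
        raise ValueError("core and thread counts must be positive")
    return physical_cores * threads_per_core


# Eight cores without hyper-threading, then eight cores with two threads each.
print(vcpu_count(8))     # 8
print(vcpu_count(8, 2))  # 16
```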

Today's servers use processors with powerful virtualization-enabling technologies. CPU virtualization support includes Intel VT-x and AMD-V; memory management unit (MMU) virtualization is provided by Intel Extended Page Tables and AMD Rapid Virtualization Indexing; and I/O MMU virtualization is handled by Intel VT-d and AMD-Vi. This means the underlying processors typically provide top performance for the hypervisor and its VMs. In effect, the overhead imposed to implement and maintain virtualization on modern processors is almost nonexistent. The real issue is provisioning the right number of vCPUs to a VM.

Avoid over-provisioning vCPUs

When it comes to vCPU provisioning, less is almost always more. It's certainly possible to allocate more vCPUs than the VM actually requires. But a workload only uses as much processor time as it needs, so additional vCPUs don't guarantee additional vCPU utilization or workload performance. Since vCPUs assigned to one VM can't be used by other VMs, the additional or unneeded vCPUs are essentially wasted -- negating the benefit of virtualization in the first place.

The best approach is to monitor CPU utilization for the VM and the physical host server. Tools like VMware's esxtop or resxtop can help administrators measure the processing load. Generally, a CPU utilization level of 80% should be considered full utilization. As the CPU utilization level approaches 90%, it might be worth adding processors or migrating workloads to other servers. At 100% and more, the processor resources are certainly overloaded, and workloads running on those processors will likely experience performance degradation -- especially latency-sensitive applications.
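As a rough illustration, the thresholds in the paragraph above can be turned into a small classifier. This is a minimal sketch: the function name and the label strings are assumptions made for the example, and the utilization figure is expected to come from whatever monitoring tool is in use.

```python
def classify_cpu_load(util_percent: float) -> str:
    """Map a CPU utilization percentage to the guidance in the text.

    Thresholds follow the article (80%, 90%, 100%); the labels themselves
    are hypothetical and only meant for illustration.
    """
    if util_percent < 0:
        raise ValueError("utilization cannot be negative")
    if util_percent >= 100:
        return "overloaded"
    if util_percent >= 90:
        return "add processors or migrate workloads"
    if util_percent >= 80:
        return "fully utilized"
    return "headroom available"


for sample in (45, 82, 93, 100):
    print(sample, "->", classify_cpu_load(sample))
```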

Be aware of processor hyper-threading

Hyper-threading is an important technology that allows a CPU core to handle two instruction threads rather than just one. It's a valuable tool in some computing environments, allowing other tasks to gain vital processing time. The problem with hyper-threading is that it doesn't provide the performance of a second core. Even though a hypervisor will recognize a hyper-threaded core as two vCPUs, those two vCPUs won't both run at full speed when they are busy at the same time. Since the two threads share key parts of the core's execution resources, a thread that runs alongside a busy sibling typically performs slower than it would with the core to itself.

Even though hypervisors like VMware ESXi encourage the use of hyper-threading, a server with hyper-threaded CPUs doesn't really provide twice the processing capacity. This means it could actually be better to select a server with twice the number of single-threaded cores -- or turn hyper-threading off in the BIOS. For example, a server with 16 single-threaded cores can offer better total performance than a server with eight hyper-threaded cores virtualized to 16 vCPUs.

If hyper-threading is enabled -- as hypervisor vendors typically recommend -- it's important to pay attention to the way vCPUs are allocated to VMs. Assigning a workload to vCPUs on the same hyper-threaded core -- such as vCPU 0 and vCPU 1 -- typically won't provide the same level of performance possible from assigning two vCPUs from two different physical cores, such as vCPU 0 and vCPU 2. Demanding workloads might not get adequate performance, leading to additional vCPU allocation and waste.
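The vCPU 0, vCPU 1 and vCPU 2 example can be checked with a few lines of code. This sketch assumes the numbering used in the paragraph above, where sibling threads of one core get consecutive IDs; real platforms may number logical processors differently, so the mapping here is an assumption rather than a rule. The helper names are hypothetical.

```python
def physical_core(vcpu: int, threads_per_core: int = 2) -> int:
    """Return the physical core index for a vCPU, ASSUMING sibling threads
    are numbered consecutively (0 and 1 on core 0, 2 and 3 on core 1, ...)."""
    if vcpu < 0 or threads_per_core < 1:
        raise ValueError("invalid vCPU index or thread count")
    return vcpu // threads_per_core


def share_core(vcpu_a: int, vcpu_b: int, threads_per_core: int = 2) -> bool:
    """True when both vCPUs map to the same hyper-threaded core."""
    return (physical_core(vcpu_a, threads_per_core)
            == physical_core(vcpu_b, threads_per_core))


print(share_core(0, 1))  # True: both threads of core 0
print(share_core(0, 2))  # False: cores 0 and 1
```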

Use CPU affinity strategically

CPU affinity -- and anti-affinity -- features in the hypervisor can influence how VMs are placed and access CPU cores. CPU affinity allows administrators to commit one or more cores to a particular VM, while anti-affinity can explicitly prevent a VM from using certain CPU cores -- and the corresponding vCPUs. Affinity rules can be important for demanding workloads where administrators want to guarantee some minimum processor availability for the workload.

However, affinity rules can become problematic for hyper-threaded systems. If affinity is used to allocate vCPUs from the same physical core, the action might result in inadequate VM performance. For example, if affinity is used to pin vCPU 0 and vCPU 1 to the same VM, those vCPUs are on the same physical -- hyper-threaded -- core, and the total performance is less than it would be if the administrator pinned vCPUs from different physical cores, such as vCPU 0 and vCPU 2.
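One way to reason about an affinity rule is to choose vCPUs so that each lands on a different physical core before any core is used twice. The sketch below does only that selection; it doesn't call any hypervisor API, the function name is hypothetical, and it again assumes sibling threads carry consecutive IDs as in the example above.

```python
def pick_vcpus_for_pinning(needed: int, physical_cores: int,
                           threads_per_core: int = 2) -> list[int]:
    """Choose vCPU IDs for an affinity rule, spreading across physical cores.

    ASSUMPTION: vCPU IDs are consecutive per core (core 0 -> 0, 1; core 1 -> 2, 3).
    """
    total = physical_cores * threads_per_core
    if needed < 1 or needed > total:
        raise ValueError("requested vCPU count does not fit on this host")
    # Take the first thread of every core, then the second thread of every
    # core, so sibling threads are used only after all cores are occupied.
    order = [core * threads_per_core + thread
             for thread in range(threads_per_core)
             for core in range(physical_cores)]
    return sorted(order[:needed])


print(pick_vcpus_for_pinning(2, physical_cores=8))  # [0, 2]
print(pick_vcpus_for_pinning(4, physical_cores=8))  # [0, 2, 4, 6]
```

With this ordering, a two-vCPU VM on an eight-core host gets vCPU 0 and vCPU 2 rather than the sibling pair 0 and 1.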

Scale out rather than up

Generally speaking, a larger number of smaller VMs will yield better performance than a smaller number of larger -- more resource-intensive -- VMs. Administrators should consider implementing workloads as load-balanced VM clusters instead of creating a large VM to do the same job. This approach allows administrators to boost workload performance by adding small VMs to the cluster as demand grows. It also builds workload resilience, since the loss of one VM removes only part of the cluster's capacity.
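To make the scale-out idea concrete, here is a minimal sketch of round-robin distribution across a cluster of small VMs. The VM names and the `assign_round_robin` helper are hypothetical; a production load balancer would also weigh health checks and current load, which this example ignores.

```python
from itertools import cycle


def assign_round_robin(requests: list[int], vms: list[str]) -> dict[str, list[int]]:
    """Distribute request IDs across VMs in turn (illustrative only)."""
    if not vms:
        raise ValueError("at least one VM is required")
    assignment: dict[str, list[int]] = {vm: [] for vm in vms}
    for request, vm in zip(requests, cycle(vms)):
        assignment[vm].append(request)
    return assignment


requests = list(range(6))
print(assign_round_robin(requests, ["vm-a", "vm-b"]))
print(assign_round_robin(requests, ["vm-a", "vm-b", "vm-c"]))
```

Adding a third VM drops each VM's share of the six requests from three to two, which is the effect scaling out aims for.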

There are cases where the reverse is true and fewer, larger VMs will behave better. This can sometimes occur due to factors like memory and cache usage. Administrators must make this sort of "scale out versus scale up" decision by testing and tuning application performance to determine the best deployment choices.
