Memory is critically important to virtual machine (VM) performance, but it’s not enough to simply add more physical memory to the server. How memory is configured and provisioned for each VM can profoundly affect performance and resilience, and careful provisioning helps prevent scenarios that waste memory and reduce consolidation opportunities. Let’s consider some memory provisioning and allocation issues that can help boost workload performance in the data center.
How NUMA affects virtual machine memory allocation
Although we often think of memory as a uniform pool available to every processor core, modern multi-processor servers employ a non-uniform memory access (NUMA) architecture. NUMA designs divide memory into per-processor pools. A processor (and all the cores on that processor) can access its local memory – the NUMA node – fastest, while accessing memory attached to another processor takes a little longer. NUMA architecture benefits VM performance as long as the memory used by a VM remains within a single NUMA node.
For example, consider a server with two eight-core processors and a total of 128 GB of memory. In a NUMA architecture, each processor would have 64 GB in its local physical bank, and that processor’s eight cores together with the 64 GB bank would form one NUMA node. (Some processors can be configured to expose more than one NUMA node per socket, which makes each node proportionally smaller.)
How does this affect VM performance? Since a processor’s cores can access the memory within their own NUMA node faster than memory in other nodes, a VM realizes its best performance when its memory size is less than (or equal to) the size of its NUMA node. In this example, don’t allocate more than 64 GB per VM. It is certainly possible to allocate more memory, but doing so essentially guarantees that the VM will be forced to access some portion of memory outside of its NUMA node – and suffer somewhat reduced performance during those access cycles.
NUMA has changed the way that memory is selected and installed on a data center server. Today, it’s not enough to simply add more RAM to an open bank. Additional memory must be matched and balanced between NUMA nodes so that each processor package on the motherboard has the same amount of memory. If additional memory is needed on our example server, the new memory modules must be balanced between the banks. So, if a 64 GB upgrade is installed, the upgrade must be split into 32 GB for each of the two processors (bringing each bank to 96 GB and the total server memory to 192 GB), which would increase the NUMA node size from 64 GB to 96 GB.
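The sizing arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and not part of any hypervisor API, and the number of NUMA nodes each processor exposes is left as an input because it depends on the platform.

```python
def memory_per_processor(total_gb: int, processors: int) -> int:
    """Memory local to each processor when modules are balanced evenly."""
    if total_gb % processors != 0:
        raise ValueError("memory does not divide evenly across processors")
    return total_gb // processors


def balanced_upgrade(upgrade_gb: int, processors: int) -> int:
    """Share of an upgrade that must be installed on each processor's bank."""
    if upgrade_gb % processors != 0:
        raise ValueError("upgrade cannot be balanced across processors")
    return upgrade_gb // processors


def numa_node_size(per_processor_gb: int, nodes_per_processor: int) -> float:
    """Size of one NUMA node; nodes_per_processor is platform-dependent."""
    return per_processor_gb / nodes_per_processor


def fits_in_node(vm_gb: float, node_gb: float) -> bool:
    """True when a VM's memory can stay inside a single NUMA node."""
    return vm_gb <= node_gb


if __name__ == "__main__":
    processors = 2
    before = memory_per_processor(128, processors)
    after = before + balanced_upgrade(64, processors)
    print(f"per-processor memory: {before} GB -> {after} GB")
    print(f"total server memory: {after * processors} GB")

    # Assumption: one NUMA node per processor. Use the node count that
    # your hypervisor or operating system actually reports.
    node_gb = numa_node_size(after, nodes_per_processor=1)
    for vm_gb in (32, 128):
        print(f"{vm_gb} GB VM fits in one node: {fits_in_node(vm_gb, node_gb)}")
```

Rejecting an upgrade that cannot be divided evenly mirrors the balancing rule above: an unbalanced upgrade would leave the processors with unequal local memory.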
How to best use dynamic memory allocation -- and when
Another memory configuration option is dynamic memory, which represents a clear advance in resource allocation by allowing VMs to use memory based on their prevailing workload. If the workload causes the VM to need more memory, dynamic memory allocation will try to provide it from an available pool of idle memory. If the workload isn’t using its full memory allocation, some of the idle space can be returned to the pool.
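The grow-and-return behavior can be modeled with a toy pool. The sketch below is a simplification under stated assumptions: the class and method names are hypothetical, and it ignores the minimum, startup and buffer settings that real hypervisors apply.

```python
class DynamicMemoryPool:
    """Toy model of a shared pool of idle memory (all sizes in GB)."""

    def __init__(self, idle_gb: float):
        self.idle_gb = idle_gb
        self.allocated = {}  # VM name -> memory currently assigned

    def rebalance(self, vm: str, demand_gb: float) -> float:
        """Move a VM's allocation toward its demand; return the new allocation."""
        current = self.allocated.get(vm, 0.0)
        if demand_gb > current:
            # Grow: grant only what the idle pool can actually supply.
            grant = min(demand_gb - current, self.idle_gb)
            self.idle_gb -= grant
            current += grant
        else:
            # Shrink: hand the unused portion back to the pool.
            self.idle_gb += current - demand_gb
            current = demand_gb
        self.allocated[vm] = current
        return current


if __name__ == "__main__":
    pool = DynamicMemoryPool(idle_gb=16.0)
    pool.rebalance("web", 4.0)
    pool.rebalance("db", 10.0)
    print(pool.rebalance("web", 8.0), pool.idle_gb)  # request exceeds the idle pool
    pool.rebalance("db", 6.0)                        # db releases memory
    print(pool.rebalance("web", 8.0), pool.idle_gb)  # request now succeeds
```

In this run the "web" VM is granted less than it asks for until "db" returns memory, which is the shortfall that an exhausted pool produces.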
Unfortunately, dynamic memory allocation only makes sense when you plan to maximize consolidation or overcommit memory, which effectively means assigning more memory to workloads than is physically available on the server. Consider the scenarios where dynamic memory allocation can cause problems.
As one example, workloads with static (constant) memory demands don’t need dynamic memory allocation because they will never return or take any significant amount of memory from the pool -- just provide a fixed allocation and you’re set.
At the other end of the spectrum, workloads that tend to take as much memory as possible can also be a problem for dynamic memory allocation because the workload will try to empty the available pool and never release memory for reallocation. This would reduce the potential consolidation for that underlying server. In extreme cases, dynamic memory allocation might use memory from across NUMA boundaries and reduce performance. As a result, these workload types are also best suited to fixed memory allocation or judiciously configured memory buffer settings.
Somewhere in the middle of these extremes, IT administrators must be wary of workloads with erratic or unpredictable memory demands, which might require more memory at a time when the available pool is low or empty. This leads to workload performance problems and workload balancing activities, which could cause unexpected changes to consolidation efforts, as well as excess network activity as workloads migrate across the data center.
Ultimately, the best use of dynamic memory is for workloads with modest and predictable fluctuations in memory demands. Savvy IT professionals will take the time to test the effects of dynamic memory on workloads and performance before deploying the VM to production servers.
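One way to apply these categories during testing is to sample a workload’s memory demand and sort it into one of the four groups. The sketch below is a rough heuristic only: the function name and all three thresholds (`flat`, `spike`, `pinned`) are illustrative assumptions that would need tuning against real measurements.

```python
from statistics import mean


def classify_memory_demand(samples_gb, max_gb, flat=0.05, spike=0.5, pinned=0.95):
    """Label a series of memory-demand samples (GB) for one workload.

    The thresholds are hypothetical defaults, expressed as fractions:
      pinned - share of max_gb that demand never drops below ("greedy")
      flat   - total swing relative to the mean that still counts as "static"
      spike  - sample-to-sample jump relative to the mean that is "erratic"
    """
    if len(samples_gb) < 2:
        raise ValueError("need at least two samples")
    avg = mean(samples_gb)
    swing = (max(samples_gb) - min(samples_gb)) / avg
    largest_step = max(
        abs(b - a) for a, b in zip(samples_gb, samples_gb[1:])
    ) / avg

    if min(samples_gb) >= pinned * max_gb:
        return "greedy"   # takes what it can and never releases it
    if swing <= flat:
        return "static"   # a fixed allocation is enough
    if largest_step >= spike:
        return "erratic"  # may need memory when the pool is empty
    return "modest"       # the best fit for dynamic memory


if __name__ == "__main__":
    workloads = {
        "file server": [4.0, 4.1, 4.0, 4.05],
        "cache": [7.8, 8.0, 7.9, 8.0],
        "batch": [2.0, 6.0, 2.5, 7.0],
        "web": [4.0, 4.5, 5.0, 4.5, 4.0],
    }
    for name, samples in workloads.items():
        print(name, "->", classify_memory_demand(samples, max_gb=8.0))
```

A classification like this is a starting point for testing, not a substitute for it.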
How to configure memory weight to improve VM performance
One problem with virtualization is that the push to maximize consolidation often leads to resource overcommitment. Dynamic memory allocation really isn’t necessary if a server has all the memory it needs for its local workloads. The only reason to worry about dynamically adding and taking memory from a common pool is that you expect to host VMs whose combined memory demands exceed the physical memory available on the system – memory is overcommitted (even if it’s just a little) in order to maximize the number of workloads.
However, even with the smartest and most aggressive dynamic memory allocation, there may be some instances when there just isn’t enough memory to go around. When this happens, the hypervisor must decide which workloads will actually receive the dynamic allocation and which workloads must go without. This is the role of “memory weight,” which prioritizes dynamic memory allocation for the workloads specified as more important than others. Less important workloads are left to absorb any performance penalty or migrate to other servers with more available resources.
Memory weight must be configured on a per-VM basis when dynamic memory allocation is established, but use caution when selecting weight assignments. Only important workloads need memory weight assignments. Be sure that the most critical workloads receive the highest weight, because assigning the same weight to every workload makes the technology useless.
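The effect of weights when the pool runs short can be illustrated with a strict-priority sketch. It is an assumption-laden simplification: the function name and the 0-100 weight scale are hypothetical, and real hypervisors use their own arbitration logic rather than this exact ordering.

```python
def allocate_by_weight(pool_gb, requests):
    """Grant memory from a limited pool, highest weight first.

    requests is a list of (vm_name, weight, wanted_gb) tuples.
    Returns a dict of vm_name -> granted GB.
    """
    grants = {}
    remaining = pool_gb
    # sorted() is stable, so VMs with equal weights keep their list order.
    for name, _weight, wanted_gb in sorted(
        requests, key=lambda r: r[1], reverse=True
    ):
        grant = min(wanted_gb, remaining)
        remaining -= grant
        grants[name] = grant
    return grants


if __name__ == "__main__":
    requests = [("batch", 10, 6), ("db", 90, 8), ("web", 50, 6)]
    print(allocate_by_weight(12, requests))

    # With identical weights, the outcome depends only on list order.
    print(allocate_by_weight(8, [("a", 50, 8), ("b", 50, 8)]))
```

The second call shows why identical weights defeat the purpose: with nothing to distinguish the VMs, the result is decided by an arbitrary ordering instead of workload importance.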
As server consolidation levels increase, a new suite of hypervisor features is evolving to improve the flexibility and availability of computing resources. NUMA architecture speeds access to local memory, but can cause performance penalties when a VM is larger than the underlying NUMA node. Dynamic memory allocation can adjust limited memory allocations in response to variations in workload demands. Memory weight assigns priorities to workloads receiving dynamic memory. These technologies are not foolproof, however. Ultimately, the best way to maintain NUMA boundaries and scale back reliance on features like dynamic memory allocation and memory weight is to carefully assess the memory requirements for all workloads and ensure that adequate server memory is available.