How hypervisors dynamically allocate memory to improve VM performance

Hypervisor vendors understand the critical importance of server memory, and major hypervisors have introduced a wide range of techniques designed to increase the effective amount of useful memory space on a virtual server. Hypervisor memory techniques can leverage virtualization to dynamically allocate memory and use resources more efficiently. Typically, these methods involve some form of automated provisioning, over-commitment or page swapping and increased attention to content sharing and compression. The first tip in this two-part series on memory management techniques offers an overview of techniques that dynamically allocate memory or reduce memory use.

Memory overcommit

Memory overcommit has been a prominent feature of VMware ESXi hypervisors. The goal of memory overcommit is to let a server run virtual machines (VMs) whose combined configured memory exceeds the physical memory installed on the server. For example, suppose a server has 2 GB of installed memory. Invoking memory overcommit can allow the memory configured for its VMs to total more than 2 GB. Without memory overcommit, the total memory assigned to all VMs would be limited to the server's physical memory space.

The idea behind memory overcommit is that VMs typically use less memory than is actually allocated to them. The hypervisor divides all of the server's virtualized memory into small "shares" and tracks which shares VMs are using and which are idle. The hypervisor can then allow VMs to use the idle shares (unused memory space) assigned to other VMs as needed. For example, one VM may experience a spike in memory use that exceeds its provisioned memory space. Memory overcommit can let the VM use idle shares to accommodate the spike and preserve VM performance, and the VM that was originally assigned those shares is unaffected because it was not using them. Overcommit behavior is automatic, so IT administrators don't need to handle these processes manually.

Memory overcommit doesn't use more memory than is physically available. Instead, it makes more efficient use of available memory by allocating it on demand, which puts otherwise idle memory space to work.
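The relationship between configured, active and physical memory can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how ESXi accounts for memory internally; the VM names, sizes and the `overcommit_report` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    configured_mb: int  # memory the VM was provisioned with
    active_mb: int      # memory the VM is actually using right now


def overcommit_report(physical_mb: int, vms: list[VM]) -> dict:
    """Compare configured and active VM memory against physical memory."""
    configured = sum(vm.configured_mb for vm in vms)
    active = sum(vm.active_mb for vm in vms)
    return {
        "configured_mb": configured,
        "active_mb": active,
        # A ratio above 1.0 means the host is overcommitted.
        "overcommit_ratio": configured / physical_mb,
        # Overcommit works only while active use fits in physical memory.
        "fits_in_physical": active <= physical_mb,
    }


# Hypothetical host with 2 GB of physical memory and three 1 GB VMs.
vms = [
    VM("web", configured_mb=1024, active_mb=400),
    VM("db", configured_mb=1024, active_mb=900),
    VM("batch", configured_mb=1024, active_mb=300),
]
report = overcommit_report(physical_mb=2048, vms=vms)
print(report)
```

In this example the VMs are configured with 3 GB in total on a 2 GB host, an overcommit ratio of 1.5, yet their combined active use of 1,600 MB still fits in physical memory.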

Dynamic memory

Microsoft's Hyper-V hypervisor first introduced dynamic memory in Windows Server 2008 R2 Service Pack 1 as a technique to reallocate memory that one VM isn't using to VMs that need more. It has the same overall effect as memory overcommit, but is implemented differently. Dynamic memory reclaims some portion of unused memory from VMs and redistributes the reclaimed memory to other VMs that need it.

Dynamic memory allocation can also be prioritized per VM, so that more important VMs receive memory first. However, this kind of behavior can leave lower-priority VMs short of memory, forcing them to rely on other memory techniques like disk-dependent paging or, in the worst case, to stop working entirely because they run out of memory. Regular memory monitoring is the best way to manage server memory use and forestall problems.

Finally, dynamic memory also allows administrators to configure the amount of memory allocated to a VM at start time, the maximum amount of memory that a VM can use and the amount of unused memory that should be left on existing VMs as a buffer (in case those VMs experience increased memory demand). These features give administrators more control over the way that system memory is reclaimed and reused, allowing more granular tuning of how densely VMs can be consolidated on the server.
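The interplay of startup memory, maximum memory, buffer and priority can be sketched as a toy allocator. This is a simplified model under stated assumptions, not Hyper-V's actual algorithm: the `VMConfig` fields, the `distribute` function and all numbers are hypothetical, and the buffer is modeled as a percentage of current demand.

```python
from dataclasses import dataclass


@dataclass
class VMConfig:
    name: str
    startup_mb: int  # memory assigned when the VM starts
    maximum_mb: int  # ceiling the VM may grow to
    buffer_pct: int  # extra headroom to keep above current demand
    priority: int    # higher value is served first
    demand_mb: int   # memory the guest currently needs


def target_mb(vm: VMConfig) -> int:
    """Demand plus buffer, clamped between startup and maximum memory."""
    wanted = vm.demand_mb + vm.demand_mb * vm.buffer_pct // 100
    return max(vm.startup_mb, min(vm.maximum_mb, wanted))


def distribute(physical_mb: int, vms: list[VMConfig]) -> dict[str, int]:
    """Give every VM its startup memory, then grow VMs in priority order."""
    assigned = {vm.name: vm.startup_mb for vm in vms}
    free = physical_mb - sum(assigned.values())
    if free < 0:
        raise ValueError("startup memory alone exceeds physical memory")
    for vm in sorted(vms, key=lambda v: v.priority, reverse=True):
        extra = min(free, target_mb(vm) - vm.startup_mb)
        assigned[vm.name] += extra
        free -= extra
    return assigned


# Hypothetical 4 GB host with three VMs of different importance.
vms = [
    VMConfig("db", 512, 3072, buffer_pct=20, priority=10, demand_mb=2000),
    VMConfig("web", 512, 2048, buffer_pct=20, priority=5, demand_mb=1000),
    VMConfig("test", 512, 1024, buffer_pct=20, priority=1, demand_mb=800),
]
assigned = distribute(physical_mb=4096, vms=vms)
print(assigned)
```

Here the high-priority "db" VM reaches its full target, "web" gets most of what it wants, and the low-priority "test" VM stays at its 512 MB startup memory even though it needs 800 MB. That is the starvation risk described above, and the reason regular monitoring matters.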

Memory compression

Data compression techniques have long been used to reduce file sizes by replacing long sets of repeating data with shorter tokens. During decompression, the tokens are processed through a mathematical algorithm to restore the original data. Although compression techniques impose too much processing overhead to be practical across all of main memory, memory compression has been implemented as a practical substitute for page swap files starting with VMware ESX 4.1.

In traditional page swapping, a page of server memory may be swapped to a disk swap file in order to free that memory space for other uses (such as use by other VMs). If the memory page is relatively unused (such as an unimportant or rarely accessed portion of workload instructions), it can be moved off to disk with little impact on the VM's performance and then swapped back from disk to memory when needed. The performance penalty is in the extra time it takes to read/write to the disk swap file.

The idea behind memory compression is to compress the memory page to be swapped, but store the compressed page in a cache set aside in memory (sometimes called a compression cache), rather than on disk. This frees memory because the compressed page takes far less space than the original, and it is faster than traditional swapping because moving data within memory is much quicker than disk access. When the page is needed, it is simply decompressed from the compression cache and returned to regular memory in uncompressed form.
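A compression cache can be illustrated with Python's standard `zlib` module. This is a minimal sketch, not ESXi's implementation: the `CompressionCache` class, the 4 KB page size and the rule that a page is cached only if it shrinks to half a page or less are assumptions made for the example.

```python
import os
import zlib

PAGE_SIZE = 4096
# Assumption for illustration: cache a page only if it compresses to
# half a page or less; otherwise it would go to the disk swap file.
MAX_COMPRESSED = PAGE_SIZE // 2


class CompressionCache:
    def __init__(self) -> None:
        self._pages: dict[int, bytes] = {}  # page number -> compressed data

    def swap_out(self, page_no: int, data: bytes) -> bool:
        """Compress a page into the cache; return False if it compresses poorly."""
        if len(data) != PAGE_SIZE:
            raise ValueError("expected exactly one page of data")
        packed = zlib.compress(data)
        if len(packed) > MAX_COMPRESSED:
            return False  # caller would fall back to disk swapping
        self._pages[page_no] = packed
        return True

    def swap_in(self, page_no: int) -> bytes:
        """Decompress a cached page and remove it from the cache."""
        return zlib.decompress(self._pages.pop(page_no))

    def bytes_used(self) -> int:
        return sum(len(p) for p in self._pages.values())


cache = CompressionCache()
text_page = (b"GET /index.html HTTP/1.1\r\n" * 200)[:PAGE_SIZE]  # repetitive
random_page = os.urandom(PAGE_SIZE)                              # incompressible

cached_text = cache.swap_out(1, text_page)
cached_random = cache.swap_out(2, random_page)
cache_bytes = cache.bytes_used()
restored = cache.swap_in(1)
print(cached_text, cached_random, cache_bytes)
```

The repetitive page is held in the cache at a fraction of its original size and comes back unchanged, while the random page is rejected because it does not compress. In that case the hypervisor would fall back to traditional disk swapping.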

Transparent page sharing

When a server runs multiple VMs, there is a possibility that some content may be duplicated between VMs. For example, the VMs on a particular server may be running the same operating system, applications, or data sets. In traditional virtualization, a hypervisor works by simply allocating memory space to each VM, without considering duplication of any content. But just as data deduplication technology has extended storage capacity by eliminating redundant storage data, hypervisors like VMware ESXi can improve memory utilization by reclaiming space used by redundant content and allowing multiple VMs to share the common memory content.

Transparent page sharing (TPS) can support substantially greater memory oversubscription when the VMs on a server hold a lot of identical content. For example, consider 10 VMs each using the same 12 MB of memory space for Windows Server 2012 files. That would be a total of 120 MB of duplicated memory space, which TPS could potentially reduce to just one 12 MB instance, freeing the remaining 108 MB that would otherwise be committed.

TPS identifies duplicate content through the use of hash values calculated for each memory page and held in a hash table. If the hash values for specific pages match, the hypervisor compares the pages at a byte level -- if they are identical, one page is kept and the duplicate page is reclaimed. Because VMs don't see this remapping and guest operating systems cannot affect it, a VM cannot directly read or alter another VM's private data through a shared page. If a VM tries to change a shared page, the hypervisor gives that VM its own private copy of the page and applies the change there -- effectively creating a new memory page and preventing unwanted changes to other VMs using the shared data.
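The hash lookup, byte-level comparison and copy-on-write steps can be sketched as a toy page store. This is a simplified model, not VMware's implementation: the `SharedPageStore` class is hypothetical, SHA-1 stands in for whatever hash the hypervisor uses, and deduplication happens at write time here rather than through a background scan.

```python
import hashlib

PAGE_SIZE = 4096


class SharedPageStore:
    """Toy model: guest pages map to frames, and identical pages share one frame."""

    def __init__(self) -> None:
        self.frames: dict[int, bytes] = {}             # frame id -> content
        self.refcount: dict[int, int] = {}             # frame id -> mappings
        self.by_hash: dict[bytes, list[int]] = {}      # hash -> candidate frames
        self.mapping: dict[tuple[str, int], int] = {}  # (vm, page no) -> frame id
        self._next_frame = 0

    def _find_or_create(self, data: bytes) -> int:
        digest = hashlib.sha1(data).digest()
        for frame in self.by_hash.get(digest, []):
            # A hash match is only a hint; confirm with a full byte comparison.
            if self.frames[frame] == data:
                return frame
        frame = self._next_frame
        self._next_frame += 1
        self.frames[frame] = data
        self.refcount[frame] = 0
        self.by_hash.setdefault(digest, []).append(frame)
        return frame

    def _release(self, frame: int) -> None:
        self.refcount[frame] -= 1
        if self.refcount[frame] == 0:  # nobody maps this frame any more
            digest = hashlib.sha1(self.frames[frame]).digest()
            self.by_hash[digest].remove(frame)
            del self.frames[frame], self.refcount[frame]

    def write(self, vm: str, page_no: int, data: bytes) -> None:
        """Map a VM page to a frame; a changed shared page gets its own frame."""
        key = (vm, page_no)
        new = self._find_or_create(data)
        self.refcount[new] += 1
        old = self.mapping.get(key)
        self.mapping[key] = new
        if old is not None:
            self._release(old)

    def read(self, vm: str, page_no: int) -> bytes:
        return self.frames[self.mapping[(vm, page_no)]]


store = SharedPageStore()
os_pages = [bytes([i]) * PAGE_SIZE for i in range(3)]  # stand-in for common OS files
for n in range(10):
    for page_no, data in enumerate(os_pages):
        store.write(f"vm{n}", page_no, data)
frames_shared = len(store.frames)

# vm0 modifies a shared page: it gets a private copy, other VMs are untouched.
store.write("vm0", 0, b"\xff" * PAGE_SIZE)
frames_after_write = len(store.frames)
print(len(store.mapping), frames_shared, frames_after_write)
```

Thirty guest pages end up backed by only three frames. When one VM writes to a shared page, a fourth frame appears for its private copy and the other nine VMs continue to read the original content.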
