With multiple virtual machines (VMs) squeezed onto a single physical server, you need virtual memory management to control how they share resources.
Fortunately, the hypervisor uses numerous virtual memory management techniques to prevent shortages. These include transparent page sharing, ballooning and memory compression.
Transparent page sharing
Memory is divided into addressable areas called pages. Because pages are small, two different pages often contain exactly the same data -- for example, when several VMs run the same operating system. Every duplicate page wastes physical memory that another VM could use.
To consolidate memory, hypervisors use a process called transparent page sharing, which hashes memory pages to find any with identical contents. When two pages are identical, they get merged into a single page that’s referenced by two pointers. This process works as long as the page’s contents don’t change. When a write does occur, the memory manager copies the contents to a new page and adjusts one pointer to enable the write to complete.
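The hash, merge and copy-on-write steps described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular hypervisor implements page sharing: the `SharedMemory` class, its method names and the 4 KB page size are all hypothetical, and SHA-256 stands in for whatever hash a real memory manager uses.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size in bytes


class SharedMemory:
    """Toy model of content-based page sharing with copy-on-write."""

    def __init__(self):
        self.frames = {}    # frame id -> page contents (bytes)
        self.refcount = {}  # frame id -> number of pointers to that frame
        self.by_hash = {}   # content hash -> frame id
        self.pointers = {}  # (vm name, page number) -> frame id
        self._next_frame = 0

    def _new_frame(self, data):
        frame = self._next_frame
        self._next_frame += 1
        self.frames[frame] = data
        self.refcount[frame] = 0
        return frame

    def store(self, vm, page, data):
        """Back a VM page, reusing an existing frame if the contents match."""
        digest = hashlib.sha256(data).hexdigest()
        frame = self.by_hash.get(digest)
        # A hash match is only a hint, so compare the bytes before merging.
        if frame is None or self.frames[frame] != data:
            frame = self._new_frame(data)
            self.by_hash[digest] = frame
        self.refcount[frame] += 1
        self.pointers[(vm, page)] = frame

    def write(self, vm, page, offset, payload):
        """Write into a page, copying it first if it is shared."""
        frame = self.pointers[(vm, page)]
        if self.refcount[frame] > 1:
            # Shared page: copy the contents and repoint only this VM.
            self.refcount[frame] -= 1
            frame = self._new_frame(self.frames[frame])
            self.refcount[frame] = 1
            self.pointers[(vm, page)] = frame
        else:
            # Sole owner: the old hash entry is about to become stale.
            old = hashlib.sha256(self.frames[frame]).hexdigest()
            if self.by_hash.get(old) == frame:
                del self.by_hash[old]
        data = bytearray(self.frames[frame])
        data[offset:offset + len(payload)] = payload
        self.frames[frame] = bytes(data)


mem = SharedMemory()
zero_page = bytes(PAGE_SIZE)
mem.store("vm1", 0, zero_page)
mem.store("vm2", 0, zero_page)
shared_before = mem.pointers[("vm1", 0)] == mem.pointers[("vm2", 0)]
mem.write("vm2", 0, 0, b"hello")  # triggers the copy
```

After the two `store` calls, both VMs point at one frame. The `write` gives `vm2` its own copy and leaves the page that `vm1` sees untouched.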
Separation of VM and host physical memory
One of today’s primary virtual memory management techniques involves creating a contiguous addressable memory space for each VM. The VM can use that address space for its memory needs, and the used portions are then mapped to physical memory on the host. (This process is similar to how a system creates address space for applications.)
Because of address translation, the addresses a VM perceives as its physical memory differ from the actual addresses in host physical memory. This abstraction of VM and host memory enables multiple VMs to operate simultaneously on the same host.
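As a rough illustration of that mapping, the sketch below gives each VM its own address space and assigns a host frame only when a page is first used. The `Host` and `VM` classes, the frame allocator and the 4 KB page size are assumptions made for the example, not part of any real hypervisor.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class Host:
    """Owns the pool of host physical frames."""

    def __init__(self, total_frames):
        self.free_frames = list(range(total_frames))

    def allocate_frame(self):
        if not self.free_frames:
            raise MemoryError("host is out of physical memory")
        return self.free_frames.pop(0)


class VM:
    """Sees a contiguous address space starting at zero."""

    def __init__(self, name, host, size_pages):
        self.name = name
        self.host = host
        self.size_pages = size_pages
        self.page_map = {}  # VM physical page -> host physical frame

    def translate(self, vm_addr):
        """Map a VM physical address to a host physical address."""
        page, offset = divmod(vm_addr, PAGE_SIZE)
        if not 0 <= page < self.size_pages:
            raise ValueError("address outside the VM's memory space")
        if page not in self.page_map:
            # Only the used portions get backed by host memory.
            self.page_map[page] = self.host.allocate_frame()
        return self.page_map[page] * PAGE_SIZE + offset


host = Host(total_frames=8)
vm_a = VM("a", host, size_pages=1024)
vm_b = VM("b", host, size_pages=1024)
addr_a = vm_a.translate(0x1010)  # lands in host frame 0
addr_b = vm_b.translate(0x1010)  # same VM address, host frame 1
```

Both VMs use the same address, 0x1010, yet the host keeps their data in different frames. Each VM advertises 1,024 pages while the host holds only eight frames, which works as long as most pages are never touched.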
Hardware-assisted memory synchronization
But creating multiple address spaces only gets you so far. The more resources you devote to maintaining those address spaces, the fewer are available for VMs. That’s why newer processors use hardware to offload some of the virtual memory management work. They do this by supporting two layers of page tables instead of just one. The first layer maps a VM’s virtual memory to its physical memory, and the second maps VM physical memory to host physical memory.
Keeping these two layers synchronized is what makes virtual memory sharing efficient. Doing that synchronization in hardware rather than software reduces the resources required to support the mapping, which improves overall performance.
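The two-layer lookup amounts to composing two translations. The sketch below models only that composition. The table contents are made up, each table is a flat dictionary rather than the multi-level structure real processors walk, and the function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def translate(addr, table):
    """Apply one layer of page-table translation to an address."""
    page, offset = divmod(addr, PAGE_SIZE)
    return table[page] * PAGE_SIZE + offset


def nested_translate(vm_virtual, vm_table, host_table):
    """Layer 1: VM virtual -> VM physical. Layer 2: VM physical -> host physical."""
    vm_physical = translate(vm_virtual, vm_table)
    return translate(vm_physical, host_table)


# Illustrative tables only.
vm_table = {0: 5, 1: 2}      # VM virtual page -> VM physical page
host_table = {2: 40, 5: 17}  # VM physical page -> host physical frame

host_addr = nested_translate(0x0123, vm_table, host_table)
print(hex(host_addr))  # 0x11123
```

The page offset (0x123) passes through both layers unchanged; only the page number is remapped, first to VM physical page 5 and then to host frame 17.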
All these layers complicate virtual memory management, making it tremendously difficult for a hypervisor to see a particular VM’s memory contents. Some hypervisors can’t see how much memory a VM requires or how much is going unused -- and therefore won’t know if one VM is consuming too much memory.
Memory ballooning
One virtual memory management technique for freeing unused memory is ballooning. A balloon driver installed in each VM can transfer the memory shortage from the host (where the shortage exists) to the VM (where it often doesn’t). When host memory is running low, the hypervisor instructs the balloon driver to inflate. By inflating, the balloon locks a set of unused memory pages in the VM.
Because the VM’s memory is mapped to host physical memory, the hypervisor can reassign the underlying physical memory to some other VM once the balloon has locked it. This virtual memory management process transfers the shortage because any locked memory remains unusable by the VM until the balloon driver deflates and releases the lock.
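The inflate and deflate cycle can be modeled with simple page counters. The sketch below is an assumption-laden toy: `GuestVM`, `reclaim` and the page counts are invented for illustration, and a real balloon driver allocates and pins actual guest pages rather than adjusting a number.

```python
class GuestVM:
    """Tracks how a VM's pages are split between use, balloon and free."""

    def __init__(self, name, total_pages, used_pages):
        self.name = name
        self.total_pages = total_pages
        self.used_pages = used_pages
        self.balloon_pages = 0  # pages currently locked by the balloon

    @property
    def free_pages(self):
        return self.total_pages - self.used_pages - self.balloon_pages

    def inflate(self, wanted):
        """Lock up to `wanted` unused pages and report how many were locked."""
        taken = min(wanted, self.free_pages)
        self.balloon_pages += taken
        return taken

    def deflate(self, count):
        """Release up to `count` locked pages back to the VM."""
        released = min(count, self.balloon_pages)
        self.balloon_pages -= released
        return released


def reclaim(vms, needed):
    """Ask each VM's balloon to inflate until `needed` pages are reclaimed."""
    reclaimed = 0
    for vm in vms:
        if reclaimed >= needed:
            break
        reclaimed += vm.inflate(needed - reclaimed)
    return reclaimed


busy = GuestVM("busy", total_pages=1000, used_pages=950)
idle = GuestVM("idle", total_pages=1000, used_pages=200)
reclaimed = reclaim([busy, idle], needed=300)  # 50 from busy, 250 from idle
```

The busy VM can give up only its 50 unused pages, so most of the reclaimed memory comes from the idle VM. Until `deflate` is called, neither VM can use the pages its balloon holds.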
Hypervisor swapping
In cases of severe memory shortage, hypervisors can take the sledgehammer approach to virtual memory management: swapping VM physical memory (and the associated host physical memory) directly to disk.
This process frees up memory, but swapping significantly reduces performance. During swapping, hypervisors use various methods to select which page to swap -- including random selection. If you have an extreme memory shortage on both the VM and host, the same memory page may be swapped to disk both within the guest and on the host. This double swapping is unavoidable with some hypervisors, and it can severely degrade performance.
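A minimal sketch of swapping with random victim selection follows. The `SwappingHost` class is hypothetical, the "disk" is just a second dictionary, and page IDs are plain integers, so the example shows the bookkeeping rather than the performance cost.

```python
import random


class SwappingHost:
    """Toy host that swaps randomly chosen pages to disk when RAM is full."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity  # number of pages that fit in RAM
        self.ram = {}             # page id -> contents
        self.disk = {}            # swapped-out pages
        self.rng = random.Random(seed)
        self.swap_outs = 0
        self.swap_ins = 0

    def _make_room(self):
        while len(self.ram) >= self.capacity:
            # Random selection: no attempt to guess which page is least useful.
            victim = self.rng.choice(sorted(self.ram))
            self.disk[victim] = self.ram.pop(victim)
            self.swap_outs += 1

    def write(self, page, data):
        self.disk.pop(page, None)  # any swapped-out copy is now stale
        if page not in self.ram:
            self._make_room()
        self.ram[page] = data

    def read(self, page):
        if page not in self.ram:
            data = self.disk.pop(page)  # the slow path: fetch from disk
            self._make_room()
            self.ram[page] = data
            self.swap_ins += 1
        return self.ram[page]


host = SwappingHost(capacity=2, seed=0)
for p in range(4):
    host.write(p, f"page-{p}")
resident_after_writes = len(host.ram)
swapped_after_writes = len(host.disk)
values = [host.read(p) for p in range(4)]
```

With room for two pages and four in play, every access to a swapped-out page forces another page out. Each of those round trips would hit the disk on a real host.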
Memory compression
Swapping physical memory to disk is a lose-lose situation. Even the best disk read-and-write times are far slower than what’s possible with RAM. Memory compression is one of the virtual memory management techniques that can help you avoid swapping.
With memory compression, a memory page that might get swapped to disk is instead compressed. If that compression is good enough -- say, a two-to-one ratio, which halves the size of the page -- the hypervisor can elect to retain the compressed page in physical memory. This process may eliminate the need to swap to disk, preserving performance. But some pages simply won’t compress very well, and you may be forced to swap to disk anyway.
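The compress-or-swap decision can be sketched with Python's standard `zlib` module. The two-to-one threshold comes from the example above. The function names, the 4 KB page size and the choice of zlib are assumptions for illustration, since hypervisors use their own compression schemes.

```python
import os
import zlib

PAGE_SIZE = 4096  # assumed page size in bytes
MIN_RATIO = 2.0   # keep the page in memory only if it at least halves


def reclaim_page(page):
    """Return ("compressed", blob) if the page shrinks enough, else ("swap", page)."""
    blob = zlib.compress(page)
    if len(page) / len(blob) >= MIN_RATIO:
        return "compressed", blob
    return "swap", page


def restore(kind, payload):
    """Recover the original page contents from either outcome."""
    return zlib.decompress(payload) if kind == "compressed" else payload


text_page = (b"virtual memory " * 300)[:PAGE_SIZE]  # highly repetitive
random_page = os.urandom(PAGE_SIZE)                 # effectively incompressible

kind_text, payload_text = reclaim_page(text_page)
kind_random, payload_random = reclaim_page(random_page)
```

The repetitive page compresses far beyond two to one and stays in memory. The random page does not shrink, so it falls back to swapping.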
In the end, making sure that you’ve always got enough physical memory to support the needs of colocated VMs is a better idea than any of these virtual memory management techniques. Most of these techniques only come into play when memory is in short supply.