VMware Memory Management: Revisited

Over the years I have seen many scenarios and been involved in many discussions regarding memory management in VMware. The one thing I love about this technology is the open architecture and the flexibility VMware gives you to “fine tune” your implementation to suit your needs, since one design definitely does not fit all. However, one area that seems to come up again and again is memory management and the dynamic adjustments that occur based on current workloads through the use of various resource management policies.

I thought this topic could be explored a bit further since there are some misconceptions that are floating around out there.

Terminology References:

  • Physical Host Memory – What ESX “sees” as available.
  • Physical Guest Memory – What each virtual machine “sees” as available. This is backed by physical memory in the ESX node, and the hypervisor maintains the mapping from each VM’s guest physical memory to host physical memory.
  • Virtual Guest Memory – The continuous virtual address space that the guest OS presents to the applications running within a virtual machine.

I like to equate some technical terminology to tangible items or puns for two reasons: 1) it’s fun, and 2) it provides a good reference and helps with memory recall (yet another pun).

Memory Usage in vSphere:

  • First and foremost: “Don’t write checks that your environment can’t cash” – In other words, do not allocate more memory to your virtual machines than is physically installed in the node.
    • Overallocation can lead to serious performance issues, because the ESX hypervisor then needs to reclaim the VMs’ active memory through ballooning or hypervisor swapping – both are great for short-term resource constraints but shouldn’t be counted on to increase your consolidation ratio.
    • Despite popular claims about the performance impact of the balloon driver and page sharing, do not disable them. Processors now have more than enough power to perform these functions with minimal to no impact on the node.
      • This is similar to the iSCSI offload discussion I have had many times in the last year regarding TOE cards: if you are running into a performance issue here, chances are your consolidation ratio is simply too high for the node.
  • “The illusion of allocation” – The ESXi hypervisor is a penny pincher when it comes to physical memory allocation. It tells the application/OS that the memory is there for it, but doesn’t actually back it with host physical memory until it is needed (kind of like thin provisioning in the storage world). Case in point: if a VM’s guest physical memory is fully backed by the host, ESXi does not need to reserve any more node memory for that virtual machine.
An equation to live by: VM’s host memory usage <= VM’s guest memory size + VM’s overhead memory. (A quick sanity-check sketch of this rule, and of the overcommit rule above, follows.)
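
Below is a minimal Python sketch of both rules, using entirely hypothetical VM and host numbers. The helpers vm_usage_ok and overcommit_ratio are made up for illustration and are not part of any VMware tooling.

```python
def vm_usage_ok(host_usage_mb, guest_mem_mb, overhead_mb):
    """Per-VM rule: host memory usage <= guest memory size + overhead memory."""
    return host_usage_mb <= guest_mem_mb + overhead_mb


def overcommit_ratio(vm_guest_mem_mb, host_phys_mem_mb):
    """Host rule: total configured guest memory vs. physical memory in the node."""
    return sum(vm_guest_mem_mb) / host_phys_mem_mb


# Hypothetical 96 GB node running four VMs configured with 32 GB each.
vms_mb = [32_768, 32_768, 32_768, 32_768]
host_mb = 98_304

ratio = overcommit_ratio(vms_mb, host_mb)
print(f"Configured-to-physical ratio: {ratio:.2f}")
if ratio > 1.0:
    print("Writing checks the environment can't cash: expect ballooning or swapping under load.")

# Hypothetical per-VM figures: 30 GB in use, 32 GB configured, ~200 MB overhead.
print("VM within its bound:", vm_usage_ok(30_720, 32_768, 200))
```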

A quick review of how memory is shoveled out by the hypervisor.

The Contiguous Addressable Memory Space:

The ESXi hypervisor sets aside a contiguous addressable memory space for each virtual machine when it is up and running. This space has the same properties as the virtual address space that the guest OS presents to the applications that use it, which allows the hypervisor to run multiple virtual machines on a node while keeping each one protected and isolated from the others. Out of this model, we have three distinct levels of memory that VMware classifies as 1) GUEST VIRTUAL MEMORY, 2) GUEST PHYSICAL MEMORY, and 3) HOST PHYSICAL MEMORY.

Follow the mapping…..

GuestOS Page Tables (guest virtual to guest physical) –> PMAP (guest physical to host physical) –> Shadow Page Tables (guest virtual to host physical)
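
To make that chain concrete, here is a tiny Python sketch with made-up page numbers showing how the shadow page table is simply the composition of the first two mappings, so a guest virtual page can be resolved straight to a host physical page:

```python
# Toy model of the translation chain; page numbers are made up for illustration.
guest_page_table = {0x1000: 0x2000, 0x1001: 0x2001}  # guest virtual -> guest physical
pmap = {0x2000: 0x7F00, 0x2001: 0x7F01}              # guest physical -> host physical

# The shadow page table is the composition of the two mappings, so translation
# can go from guest virtual straight to host physical in one lookup.
shadow_page_table = {gv: pmap[gp] for gv, gp in guest_page_table.items()}

print({hex(gv): hex(hp) for gv, hp in shadow_page_table.items()})
# {'0x1000': '0x7f00', '0x1001': '0x7f01'}
```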

3rd Generation Processors with hardware support:

Starting with the Intel Xeon 5500 series and the comparable AMD Opteron series processors (roughly the 2009 timeframe), hardware support for the virtualization of memory is built in through two distinct layers of nested page tables. One layer stores the guest virtual to guest physical mappings and the other stores the guest physical to host physical mappings. The main goal of this hardware assist is to let the processor itself provide the mapping from the logical pages to the physical pages, without the hypervisor having to maintain shadow page tables.
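
As a rough sketch only (again with made-up page numbers, and nothing like how the silicon actually stores its tables), the difference from the shadow-table approach above is that the translation walks both layers at lookup time instead of consulting a precomputed composite:

```python
# Toy model of hardware-assisted (nested) paging; page numbers are made up.
guest_page_table = {0x1000: 0x2000}    # layer 1: guest virtual -> guest physical
nested_page_table = {0x2000: 0x7F00}   # layer 2: guest physical -> host physical


def translate(guest_virtual):
    """Walk both layers at lookup time, as the hardware assist does."""
    guest_physical = guest_page_table[guest_virtual]
    return nested_page_table[guest_physical]


print(hex(translate(0x1000)))  # 0x7f00
```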
