How do KVM/QEMU and the guest OS handle page faults?

Question

For example, I have a host OS (say, Ubuntu) with KVM enabled. I start a virtual machine with QEMU to run a guest OS (say, CentOS). It is said that to the host OS, this VM is just a process. So from the host's point of view, it handles page faults as usual (e.g., allocating page frames as needed and swapping pages based on the active/inactive lists if necessary).

Here is the question and my understanding. Since the guest OS is still a full-fledged OS, I assume it still has all its mechanisms for handling virtual memory. It sees some virtualized physical memory provided by QEMU. By virtualized physical memory I mean that the guest OS doesn't know it is in a VM and still works as it would on a real physical machine, but what it has is in fact an abstraction provided by QEMU. So even if a page frame has been allocated to it, if that frame is not in the guest's page table, the guest OS will still take a page fault and then map some page to the frame. What's worse, there may be a double page fault, where the guest first allocates some page frames on a page fault, which in turn triggers a page fault in the host OS.

However, I have also heard of something called shallow (or shadow) page tables, which seem to optimize away this unnecessary double page fault and double page table issue. I also looked at some other kernel implementations, specifically unikernels, e.g., OSv, IncludeOS, etc. I didn't find anything related to page fault and page table mechanisms. I did see some symbols like page_fault_handler, but nothing as large as what I saw in the Linux kernel code. It seems memory management is not a big deal in these unikernel implementations. So I assume QEMU/KVM and some of Intel's virtualization technologies have already handled that.

Any ideas on this topic? Good references, papers, or other resources on this problem, or any hints, would be very helpful.

prl · Accepted Answer · 2020-03-15T20:20:31.387

There are two ways for QEMU/KVM to support guest physical memory: EPT and shadow page tables. (EPT is an Intel-defined mechanism. Other processors support something similar, which I won't talk about here.)

EPT stands for Extended Page Tables. It is a second level of paging supported by the CPU in addition to the regular processor page tables. While running in a VM, the regular page tables are used to translate Guest Virtual Addresses into Guest Physical Addresses, while the EPT tables are used to translate Guest Physical Addresses into Host Physical Addresses. This double-level translation is performed for every memory access within the guest. (The processor TLBs hide most of the cost.) EPT tables are managed by the VMM while the regular page tables are managed by the guest. If a page is not present in the guest page tables, it causes a page fault within the guest, exactly as you have described. If a page is present in the guest page tables but not present in the EPT, it causes an EPT violation VM exit, so the VMM can handle the missing page.
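The two-stage lookup described above can be sketched as a toy model. This is only an illustration, not how KVM or the hardware is implemented: the names translate, GuestPageFault and EptViolation are made up for this example, each table is flattened into a single dictionary keyed by page number, and the 4 KiB page size is an assumption.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages for this toy model


class GuestPageFault(Exception):
    """Hypothetical: the fault is delivered to the guest OS, no VM exit."""


class EptViolation(Exception):
    """Hypothetical: the access causes a VM exit so the VMM can handle it."""


def translate(gva, guest_page_table, ept):
    """Translate a guest virtual address into a host physical address."""
    page, offset = divmod(gva, PAGE_SIZE)

    # Stage 1: guest page tables (managed by the guest), GVA -> GPA.
    if page not in guest_page_table:
        raise GuestPageFault(hex(gva))
    gpa_page = guest_page_table[page]

    # Stage 2: EPT (managed by the VMM), GPA -> HPA.
    if gpa_page not in ept:
        raise EptViolation(hex(gpa_page * PAGE_SIZE + offset))
    return ept[gpa_page] * PAGE_SIZE + offset


guest_page_table = {0x10: 0x2}  # guest virtual page -> guest physical page
ept = {0x2: 0x7F}               # guest physical page -> host physical page

print(hex(translate(0x10 * PAGE_SIZE + 0x123, guest_page_table, ept)))  # 0x7f123
```

An address whose page is missing from guest_page_table raises GuestPageFault (the guest handles it), while a page that the guest has mapped but that is missing from ept raises EptViolation (the VMM handles it). These are the two cases distinguished above.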

Shadow page tables are used when EPT is not available. Shadow page tables are a copy of the guest page tables which incorporate both the GVA to GPA and GPA to HPA mappings within a single set of page tables. When a page fault occurs, it always causes a VM exit. The VMM checks whether the missing page is mapped in the guest page tables. If it is not, then the VMM injects the page fault into the guest for it to handle. If the page is mapped in the guest page tables, then the VMM handles the fault as it would for an EPT violation. Efficient management of shadow page tables across multiple processes within the guest can be very complex.
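The fault-handling decision described above can be sketched as a toy model. Everything here is hypothetical and greatly simplified: ShadowMmu, InjectedPageFault and the flat dictionaries are invented for illustration, and the sketch does not model how a real VMM keeps the shadow tables in sync when the guest modifies its own page tables.

```python
class InjectedPageFault(Exception):
    """Hypothetical: the VMM reflects the fault back into the guest."""


class ShadowMmu:
    """Toy VMM-side model of shadow paging, keyed by page number."""

    def __init__(self, guest_page_table, gpa_to_hpa, first_free_host_page=0x100):
        self.guest_page_table = guest_page_table  # GVA page -> GPA page (guest-owned)
        self.gpa_to_hpa = gpa_to_hpa              # GPA page -> HPA page (VMM-owned)
        self.shadow = {}                          # GVA page -> HPA page (what the CPU walks)
        self.next_free_host_page = first_free_host_page
        self.vm_exits = 0

    def access(self, gva_page):
        """Return the host physical page backing gva_page."""
        if gva_page in self.shadow:
            return self.shadow[gva_page]          # mapped: no fault, no VM exit

        self.vm_exits += 1                        # every page fault exits to the VMM
        if gva_page not in self.guest_page_table:
            # The guest has no mapping, so the guest must handle the fault.
            raise InjectedPageFault(hex(gva_page))

        gpa_page = self.guest_page_table[gva_page]
        if gpa_page not in self.gpa_to_hpa:
            # The VMM backs the guest physical page, as for an EPT violation.
            self.gpa_to_hpa[gpa_page] = self.next_free_host_page
            self.next_free_host_page += 1

        # Combine GVA -> GPA and GPA -> HPA into a single shadow entry.
        self.shadow[gva_page] = self.gpa_to_hpa[gpa_page]
        return self.shadow[gva_page]


mmu = ShadowMmu({0x10: 0x2}, {0x2: 0x7F})
print(hex(mmu.access(0x10)), mmu.vm_exits)  # first touch faults: 0x7f 1
print(hex(mmu.access(0x10)), mmu.vm_exits)  # shadow entry now present: 0x7f 1
```

In this model, touching a page the guest has not mapped (for example mmu.access(0x11)) still counts a VM exit before the fault is injected into the guest. With EPT, that exit would not be needed.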

EPT is both simpler to implement and has far better performance for most workloads, because page faults are delivered directly to the guest OS, which is generally where they need to be handled. The use of shadow page tables requires a VM exit for every page fault. However, shadow page tables may have better performance for a few specific workloads that cause very few page faults, since a TLB miss then walks a single set of page tables rather than two.

The answer to this question has more description of how shadow page tables work: https://stackoverflow.com/q/14176904 — prl, Mar 15 '20 at 20:16
