d095d43e78
In xen_set_pte() if batching is unavailable (because the caller is in an interrupt context such as handling a page fault) it would fall back to using native_set_pte() and trapping and emulating the PTE write. On 32-bit guests this requires two traps for each PTE write (one for each dword of the PTE). Instead, do one mmu_update hypercall directly. During construction of the initial page tables, continue to use native_set_pte() because most of the PTEs being set are in writable and unpinned pages (see phys_pmd_init() in arch/x86/mm/init_64.c) and using a hypercall for this is very expensive. This significantly improves page fault performance in 32-bit PV guests. lmbench3 test Before After Improvement ---------------------------------------------- lat_pagefault 3.18 us 2.32 us 27% lat_proc fork 356 us 313.3 us 11% Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> |
||
---|---|---|
.. | ||
apic.c | ||
debugfs.c | ||
debugfs.h | ||
enlighten.c | ||
grant-table.c | ||
irq.c | ||
Kconfig | ||
Makefile | ||
mmu.c | ||
mmu.h | ||
multicalls.c | ||
multicalls.h | ||
p2m.c | ||
pci-swiotlb-xen.c | ||
platform-pci-unplug.c | ||
setup.c | ||
smp.c | ||
smp.h | ||
spinlock.c | ||
suspend.c | ||
time.c | ||
trace.c | ||
vdso.h | ||
vga.c | ||
xen-asm.h | ||
xen-asm.S | ||
xen-asm_32.S | ||
xen-asm_64.S | ||
xen-head.S | ||
xen-ops.h |