Drivers: hv: mshv_vtl: fix GUP into VTL0 device mappings#141
Pull request overview
This PR updates the mshv_vtl_low mmap fault paths so VTL0 ZONE_DEVICE mappings become GUP-pinnable again after the removal of the pte_devmap fast-path in 6.15. It does so by switching huge faults to folio-aware inserters and by making the 4K fault path insert a refcounted page once the pgmap exists, while providing an early pre-pgmap pte_special fallback and later zapping those stale PTEs.
Changes:
- Add a pgmap-backed PFN→`struct page` resolver and use `vmf_insert_page_mkwrite()` (4K) / `vmf_insert_folio_pmd()` / `vmf_insert_folio_pud()` (huge) so faults install folio-backed entries suitable for GUP.
- Capture the `/dev/mshv_vtl_low` `address_space` on first open and invalidate early-fault `pte_special` mappings after pgmap registration.
- Tighten VMA flags for the mapping (`VM_MIXEDMAP` + `VM_DONTEXPAND`) to support the mixed fallback and keep the mapping size pinned.
Adding the bug link for future reference.
Fixed KPA issues reported by Hardik offline.
=== [1/2] Drivers: hv: mshv_vtl: use folio-aware inserters for huge VTL0 mappings ===
: Added a comment, this is not practical with current design of OpenVMM:
=== [2/2] Drivers: hv: mshv_vtl: fix GUP into VTL0 mappings on the 4K fault path ===
Since v6.15 (aed877c, d3f7922), GUP no longer takes a pgmap reference for ZONE_DEVICE pages and walks huge entries through the unified folio path. With vmf_insert_pfn_{pmd,pud}() the mapping holds no folio reference, so a zap racing with pin_user_pages_fast() can briefly drop the folio refcount to 0 and trigger a WARN in try_grab_folio() with the I/O failing as -ENOMEM.

Switch the PMD/PUD fault paths to vmf_insert_folio_{pmd,pud}(), mirroring drivers/dax/device.c. Each map takes folio_get(); the matching folio_put() in zap keeps the refcount above 0.

Gate the huge inserters on pfn_valid() + ZONE_DEVICE + MEMORY_DEVICE_GENERIC via mshv_vtl_low_resolve_page(); fall back to VM_FAULT_FALLBACK when the folio order does not match PMD_ORDER/PUD_ORDER or the PFN is not yet pgmap-backed, so the core can retry at smaller order.

Add VM_DONTEXPAND to the VMA to block mremap() growth past the pgmap.

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
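The gating rule in this commit message can be modelled in user space. The sketch below is not the driver code: `struct model_page`, `resolve_page()` and `huge_fault_decision()` are hypothetical stand-ins for the kernel objects named above, and the order constants assume 4K base pages with 2M PMDs and 1G PUDs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these are kernel types. */
enum model_result { MODEL_INSERT_FOLIO, MODEL_FALLBACK };

#define MODEL_PMD_ORDER 9   /* assumption: 4K pages, 2M PMD */
#define MODEL_PUD_ORDER 18  /* assumption: 4K pages, 1G PUD */

struct model_page {
    bool pfn_valid;           /* stands in for the pfn_valid() check */
    bool zone_device;         /* page lives in ZONE_DEVICE */
    bool generic_pgmap;       /* pgmap type is MEMORY_DEVICE_GENERIC */
    unsigned int folio_order; /* order of the folio backing the PFN */
};

/* Models the resolver: return the page only if every gate passes. */
static const struct model_page *resolve_page(const struct model_page *p)
{
    if (!p || !p->pfn_valid || !p->zone_device || !p->generic_pgmap)
        return NULL;
    return p;
}

/*
 * Models the huge fault path: use the folio inserter only when the PFN
 * is pgmap-backed and the folio order equals the fault order. Anything
 * else falls back so the core can retry at a smaller order.
 */
static enum model_result huge_fault_decision(const struct model_page *p,
                                             unsigned int fault_order)
{
    const struct model_page *page = resolve_page(p);

    if (!page)
        return MODEL_FALLBACK;
    if (page->folio_order != fault_order)
        return MODEL_FALLBACK;
    return MODEL_INSERT_FOLIO;
}
```

In this model a PMD-order folio faulted at PUD order falls back rather than being split or partially mapped, which matches the "retry at smaller order" behaviour the commit message describes.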
Extend the folio-aware fault path to the 4K case so GUP into /dev/mshv_vtl_low works after MSHV_ADD_VTL0_MEMORY has registered the range. With the previous vmf_insert_mixed() path the PTE was always pte_special, vm_normal_page() returned NULL during pin_user_pages*(), follow_pfn_pte() returned -EEXIST, and io_uring O_DIRECT surfaced it as "disk io error: io error: File exists (os error 17)" on the first DMA into a freshly-registered VTL0 chunk.

The 4K path now resolves the PFN via mshv_vtl_low_resolve_page(): when backed by an mshv_vtl pgmap the PTE is installed with vmf_insert_page_mkwrite(), giving GUP a normal pinnable page; otherwise it falls back to vmf_insert_mixed() so early CPU accesses (e.g. the VTL2 guest-memory self test reading GPA 0 before any add_vtl0_mem ioctl) still succeed instead of SIGBUSing.

Such fallback PTEs would persist across registration and break later GUP. Capture the cdev's address_space on first open and, on successful MSHV_ADD_VTL0_MEMORY, invalidate the file-offset range via unmap_mapping_range() for both the encrypted (pfn) and decrypted (pfn | DECRYPTED_MASK) aliases that mshv_vtl_low_mmap() exposes. The next access re-faults into the folio path and GUP works.

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
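The lifecycle of a 4K PTE described in this commit message can be traced with a small user-space state model. It is only a sketch: `struct model_mapping` and the `model_*()` functions are hypothetical names, and the GUP result is reduced to "0 for a normal page, -EEXIST for a special PTE" as the message reports.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space model of one 4K PTE in the mapping. */
enum model_pte { MODEL_PTE_NONE, MODEL_PTE_SPECIAL, MODEL_PTE_NORMAL };

struct model_mapping {
    bool pgmap_registered; /* true once the range has been registered */
    enum model_pte pte;
};

/* Fault handler: only runs when no PTE is present. */
static void model_fault(struct model_mapping *m)
{
    if (m->pte != MODEL_PTE_NONE)
        return;
    /* pgmap-backed: normal refcounted page; otherwise special fallback */
    m->pte = m->pgmap_registered ? MODEL_PTE_NORMAL : MODEL_PTE_SPECIAL;
}

/* Registration; zap_stale models invalidating early-fault PTEs. */
static void model_register(struct model_mapping *m, bool zap_stale)
{
    m->pgmap_registered = true;
    if (zap_stale)
        m->pte = MODEL_PTE_NONE;
}

/* GUP: faults the page in if needed, then refuses special PTEs. */
static int model_gup(struct model_mapping *m)
{
    model_fault(m);
    return m->pte == MODEL_PTE_NORMAL ? 0 : -EEXIST;
}
```

An early CPU access followed by `model_register(&m, false)` leaves the special PTE in place, so `model_gup()` keeps returning -EEXIST; with `zap_stale` set, the next access refaults and the pin succeeds.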
Upgrade kernel used in OpenVMM to 6.18.0.6 release tag.

This adds a fix for try_grab_folio warning in VTL2 kernel and associated Hyper-V GuestBVT test failure.

Kernel PRs:
microsoft/OHCL-Linux-Kernel#141
microsoft/OHCL-Linux-Kernel#144

Bug: https://microsoft.visualstudio.com/OS/_workitems/edit/62100614

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
Upgrade kernel used in OpenHCL to 6.18.0.6 release tag.

This adds a fix for try_grab_folio warning in VTL2 kernel and associated Hyper-V GuestBVT test failure.

Kernel PRs:
microsoft/OHCL-Linux-Kernel#141
microsoft/OHCL-Linux-Kernel#144

Bug: https://microsoft.visualstudio.com/OS/_workitems/edit/62100614

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
Co-authored-by: Naman Jain <namjain@linux.microsoft.com>
Restores GUP (get_user_pages) into VTL0 memory mappings, broken by the 6.15 ZONE_DEVICE / pte_devmap removal (aed877c, d3f7922). After that refactor, GUP only walks PTEs/PMDs/PUDs that point to a real folio with a held reference; the legacy pte_devmap fast-path is gone. mshv_vtl_low was still installing devmap PTEs via vmf_insert_pfn_*, so userspace pins on /dev/mshv_vtl_low mappings silently failed.
Two commits:
- use folio-aware inserters for huge VTL0 mappings: switches the PMD/PUD fault paths to vmf_insert_folio_pmd / vmf_insert_folio_pud, resolving the pfn to its struct page / pgmap folio and verifying the folio order matches the fault order.
- fix GUP into VTL0 mappings on the 4K fault path: adds a folio-aware 4K path using vmf_insert_page_mkwrite once the pgmap is live, with a pte_special fallback (via vmf_insert_mixed) for early faults before devm_memremap_pages has run. Captures the chardev address_space on first open (cmpxchg) and calls unmap_mapping_range for both the encrypted and DECRYPTED_MASK-aliased pfns after pgmap registration so any stale special PTEs are dropped and refaulted as folio-backed. VM_MIXEDMAP | VM_DONTEXPAND are set on the VMA.
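The capture-once and two-alias invalidation steps in the second commit can be illustrated in user space with C11 atomics. This is a sketch under assumptions, not the driver code: `capture_on_open()` and `alias_ranges()` are hypothetical helpers, `atomic_compare_exchange_strong()` stands in for the kernel's cmpxchg, the page shift is assumed to be 12, and the decrypted mask is passed in as a parameter because its real value is not given here.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4K pages */

/* Hypothetical stand-in for the kernel's struct address_space. */
struct model_address_space { int id; };

/* Zero-initialized, so it starts out as NULL. */
static _Atomic(struct model_address_space *) captured_mapping;

/*
 * Store the mapping only if nothing has been captured yet. Later opens
 * leave the stored pointer unchanged. Returns the captured pointer.
 */
static struct model_address_space *
capture_on_open(struct model_address_space *mapping)
{
    struct model_address_space *expected = NULL;

    atomic_compare_exchange_strong(&captured_mapping, &expected, mapping);
    return atomic_load(&captured_mapping);
}

struct model_range { uint64_t start; uint64_t len; };

/*
 * Compute the two file-offset ranges to invalidate after registration:
 * out[0] is the encrypted alias (pfn), out[1] the decrypted alias
 * (pfn | decrypted_mask). Both cover the same number of bytes.
 */
static void alias_ranges(uint64_t start_pfn, uint64_t nr_pages,
                         uint64_t decrypted_mask, struct model_range out[2])
{
    out[0].start = start_pfn << MODEL_PAGE_SHIFT;
    out[0].len = nr_pages << MODEL_PAGE_SHIFT;
    out[1].start = (start_pfn | decrypted_mask) << MODEL_PAGE_SHIFT;
    out[1].len = out[0].len;
}
```

In the driver, each computed range would be handed to unmap_mapping_range() on the captured address_space. The model only shows why two calls are needed: the same physical pages are reachable at two different file offsets.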