[Hypervisor Part 2] Hijacking Hyper-V's VM-Exit Handler from Inside the Guest

Part 1 established the fundamentals: Hyper-V runs below the Windows kernel, CPUID causes a VM-exit (unconditionally on Intel, and through an intercept that Hyper-V enables on AMD), and the hypervisor processes every exit through a single handler function. This post covers the implementation: how EPTraitor installs itself into Hyper-V’s exit path and exposes a working hypercall ABI to user mode.

Finding the VM-exit handler

Hyper-V’s VM-exit handler is not exported, and no public symbols cover the hypervisor core. Finding it means working backward from known architectural invariants instead of looking up a name.

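The VMCS contains a field called HOST_RIP: the instruction pointer the CPU loads when a VM-exit fires. On Intel, VMREAD with field encoding 0x6C16 reads this field. Since the VMCS is per-VP and accessible from the VMM, you can read HOST_RIP during boot (from the UEFI application context, before Hyper-V has set up full memory protections) to get the exact address. Keep in mind that VMREAD faults outside VMX operation and fails without a current VMCS, so the read has to run at a point where Hyper-V’s VMCS is loaded on the executing processor.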

c
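// Read HOST_RIP from the current VMCS (MSVC intrinsic from <intrin.h>)
// Intel SDM: VMCS field encoding 0x6C16 = HOST_RIP
uint64_t host_rip = 0;
unsigned char status = __vmx_vmread(0x6C16, &host_rip);
// status == 0 means success; nonzero means VMREAD failed (e.g. no current VMCS)
// On success, host_rip is the address the CPU enters on every VM-exit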

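On AMD (SVM) there is no HOST_RIP equivalent. After a #VMEXIT the processor restores host state from the host save area (the page named by the VM_HSAVE_PA MSR) and resumes at the instruction that follows VMRUN, so the exit path begins right after Hyper-V’s VMRUN instruction (opcode 0F 01 D8) in hvax64.exe. Locate it by scanning for that #VMEXIT dispatch pattern in Hyper-V’s code at known RVAs from the load base. The VMCB fields event_inj and nRIP do not help here: event_inj queues an event for injection into the guest, and nRIP holds the guest’s next instruction pointer.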

The function signature we’re looking for processes every exit and receives the trap frame:

c
// Hyper-V's original exit handler — simplified signature
// (actual ABI varies; x64 __fastcall, first arg is a per-VP state block)
typedef uint64_t (*vmexit_handler_t)(
    uint64_t a1, uint64_t a2,
    uint64_t a3, uint64_t a4);

Installing the detour

EPTraitor loads as a UEFI application. At that point, Hyper-V has been loaded by the Windows boot loader (hvloader.efi → hvix64.exe / hvax64.exe) but the guest OS is not yet running. We can read and write Hyper-V’s memory freely.

We install a standard 14-byte absolute jump at the head of the exit handler:

c
// 14-byte absolute jump: FF 25 00 00 00 00 [8-byte absolute addr]
static void write_absolute_jmp(void *target, uint64_t dest)
{
    uint8_t stub[14] = {
        0xFF, 0x25, 0x00, 0x00, 0x00, 0x00,  // JMP QWORD PTR [RIP+0]
        0,0,0,0,0,0,0,0                        // 8-byte destination
    };
    memcpy(stub + 6, &dest, sizeof dest);  // avoids an unaligned, type-punned store
    memcpy(target, stub, 14);
}

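// During UEFI init, before any VM-exit has fired from the Windows guest.
// build_trampoline() is a hypothetical helper: it copies the instructions the
// patch will overwrite into executable memory and appends a jump back to the
// first intact instruction after them.
original_vmexit_handler = (vmexit_handler_t)build_trampoline((void *)host_rip, 14);
write_absolute_jmp((void *)host_rip, (uint64_t)&vmexit_handler_detour);

original_vmexit_handler is saved and called at the end of vmexit_handler_detour for any exit we don’t handle ourselves, so Hyper-V continues to function normally for everything we don’t intercept. It cannot simply be host_rip: the first 14 bytes there now hold our jump, so calling that address would re-enter the detour forever. It has to point at a trampoline that holds the displaced instructions (relocated if any of them are RIP-relative) followed by a jump back into the untouched remainder of the handler.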

The detour function

Every VM-exit from every virtual processor now passes through our code first:

c
// Our VM-exit handler detour
uint64_t vmexit_handler_detour(
    uint64_t a1, uint64_t a2,
    uint64_t a3, uint64_t a4)
{
    // First-exit initialisation (runs once, sets up heap/SLAT/logging)
    process_first_vmexit();

    uint32_t exit_reason = arch::get_vmexit_reason();
    //   arch::get_vmexit_reason() does: VMREAD 0x4402 (VM_EXIT_REASON) on Intel
    //                                   read vmcb->control.exit_code on AMD

    if (arch::is_cpuid_exit(exit_reason)) {
        // exit_reason == 10 on Intel (EXIT_REASON_CPUID)
        // exit_reason == 0x72 on AMD  (SVM_EXIT_CPUID)
        struct trap_frame_t *tf = arch::get_trap_frame();
        if (validate_hypercall_keys(tf->rcx)) {
            return dispatch_hypercall(tf);
        }
        // Not our hypercall — let Hyper-V emulate CPUID normally
    }
    else if (arch::is_ept_violation(exit_reason)) {
        // exit_reason == 48 on Intel (EXIT_REASON_EPT_VIOLATION)
        return handle_slat_violation();
    }
    else if (arch::is_nmi_exit(exit_reason)) {
        // NMI rendezvous for cross-VP coordination
        return handle_nmi();
    }

    // Everything else: fall through to original Hyper-V handler
    return original_vmexit_handler(a1, a2, a3, a4);
}

arch::get_trap_frame() returns the guest’s general-purpose registers. On Intel these are not in the VMCS: the guest-state area holds only RIP, RSP, RFLAGS, and the control, segment, and descriptor-table registers, so the hypervisor has to save the rest in software on each exit. The layout therefore depends on the hypervisor, not the Intel spec. Hyper-V saves them in a well-known per-VP structure we locate during first-exit init.
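One detail worth making explicit in the exit-reason check: the value that VMREAD 0x4402 returns is a 32-bit field, and only bits 15:0 are the basic exit reason. Bit 31 flags a VM-entry failure. Below is a minimal sketch of that decoding. The helper names are hypothetical and not taken from EPTraitor, and only the Intel encodings are shown.

```cpp
#include <cstdint>

// Basic exit reasons from the Intel SDM.
constexpr uint16_t kExitExceptionOrNmi = 0;
constexpr uint16_t kExitCpuid          = 10;
constexpr uint16_t kExitEptViolation   = 48;

// Bits 15:0 of the exit-reason field carry the basic exit reason.
constexpr uint16_t basic_exit_reason(uint32_t raw) {
    return static_cast<uint16_t>(raw & 0xFFFF);
}

// Bit 31 is set when the "exit" is really a failed VM entry.
constexpr bool is_vmentry_failure(uint32_t raw) {
    return (raw >> 31) != 0;
}

// Hypothetical stand-in for arch::is_cpuid_exit() on Intel.
constexpr bool is_cpuid_exit(uint32_t raw) {
    return !is_vmentry_failure(raw) && basic_exit_reason(raw) == kExitCpuid;
}

// Hypothetical stand-in for arch::is_ept_violation() on Intel.
constexpr bool is_ept_violation(uint32_t raw) {
    return !is_vmentry_failure(raw) && basic_exit_reason(raw) == kExitEptViolation;
}
```

Comparing the raw 32-bit value against 10 or 48 works while the upper bits are clear, but masking first keeps the check correct if any flag bit is ever set.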

The hypercall ABI

Communication from the guest to the hypervisor uses CPUID with a packed 64-bit value in RCX:

cpp
// From shared/hypercall/hypercall_def.h
union hypercall_info_t {
    uint64_t value;
    struct {
        uint64_t         primary_key        : 16;  // must be 0x4E47
        hypercall_type_t call_type          : 8;   // which operation (was 4 bits — see below)
        uint64_t         secondary_key      : 7;   // must be 0x7F
        uint64_t         call_reserved_data : 33;  // call-specific payload
    };
};

constexpr uint64_t hypercall_primary_key   = 0x4E47;
constexpr uint64_t hypercall_secondary_key = 0x7F;

The guest packs RCX with the sentinel keys and call type, puts arguments in RDX/R8/R9, and fires CPUID. The hypervisor validates both keys before dispatching:

c
static bool validate_hypercall_keys(uint64_t rcx)
{
    hypercall_info_t info = { .value = rcx };
    return info.primary_key   == hypercall_primary_key &&
           info.secondary_key == hypercall_secondary_key;
}

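A guest that executes CPUID for any other reason (hardware enumeration, kernel feature detection) passes through to Hyper-V’s normal emulation. The dual-key check makes false positives vanishingly unlikely: 23 bits of RCX must match exactly, and ordinary callers put a small subleaf index in ECX, not a value whose low 16 bits are 0x4E47.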
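Bit-field layout is implementation-defined, so it can help to see the same packing written with explicit shifts and masks. The sketch below assumes the compiler allocates bit-fields starting from the least significant bit, which is what MSVC, GCC, and Clang do on x86-64. The function names are hypothetical; only the field widths and key values come from hypercall_def.h.

```cpp
#include <cstdint>

// Layout assumed from hypercall_info_t:
//   bits  0-15  primary key   (0x4E47)
//   bits 16-23  call type
//   bits 24-30  secondary key (0x7F)
//   bits 31-63  call-specific payload (33 bits)
constexpr uint64_t kPrimaryKey   = 0x4E47;
constexpr uint64_t kSecondaryKey = 0x7F;

// Build the RCX value for a hypercall.
constexpr uint64_t pack_hypercall(uint8_t call_type, uint64_t payload) {
    return kPrimaryKey
         | (static_cast<uint64_t>(call_type) << 16)
         | (kSecondaryKey << 24)
         | ((payload & 0x1FFFFFFFFull) << 31);   // keep 33 payload bits
}

// Same test as validate_hypercall_keys(), without bit-fields.
constexpr bool has_hypercall_keys(uint64_t rcx) {
    return (rcx & 0xFFFF) == kPrimaryKey &&
           ((rcx >> 24) & 0x7F) == kSecondaryKey;
}

// Extract the 8-bit call type.
constexpr uint8_t call_type_of(uint64_t rcx) {
    return static_cast<uint8_t>((rcx >> 16) & 0xFF);
}
```

With these definitions, pack_hypercall(0, 0) is 0x7F004E47, while the subleaf values a normal CPUID caller leaves in ECX (0, 1, 2, and so on) fail the key check and fall through to Hyper-V.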

The raw hypercall thunk in assembly (x64, called from C as launch_raw_hypercall(info, rdx, r8, r9)):

nasm
; Thunk: puts hypercall_info_t value into RCX, fires CPUID,
; returns RAX (hypervisor writes the return value into the saved guest RAX)
launch_raw_hypercall PROC
    ; args: rcx=hypercall_info_t, rdx=arg1, r8=arg2, r9=arg3
    ; CPUID writes EAX, EBX, ECX, EDX; RBX is callee-saved in the x64 ABI
    push    rbx
    mov     rax, rcx          ; CPUID leaf = low 32 bits of info, used only if
                              ; the call falls through to Hyper-V's emulation
    ; rcx already holds hypercall_info_t; rdx, r8, r9 stay in place and reach
    ; the hypervisor through its saved guest register state
    cpuid                     ; VM-exit; hypervisor runs; we resume here
    ; RAX = return value (stored to guest RAX by the hypervisor before resume)
    pop     rbx
    ret
launch_raw_hypercall ENDP

The hypervisor, on receiving a CPUID exit, reads the guest registers from the trap frame:

c
// Inside dispatch_hypercall(tf) in the hypervisor. General-purpose registers
// are not VMCS fields, so they come from the saved trap frame.
uint64_t rcx_val = tf->rcx;
uint64_t rdx_val = tf->rdx;
uint64_t r8_val  = tf->r8;
uint64_t r9_val  = tf->r9;

hypercall_info_t info = { .value = rcx_val };
// dispatch on info.call_type ...

// Store the return value in the saved guest RAX; it is reloaded before VMRESUME
tf->rax = return_value;
arch::advance_guest_rip();  // skip past the CPUID instruction

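advance_guest_rip() adds the instruction length to the guest RIP in the VMCS so the guest doesn’t re-execute the CPUID on resume. On Intel this is three steps: VMREAD VM_EXIT_INSTRUCTION_LEN (0x440C), VMREAD GUEST_RIP (0x681E), then VMWRITE GUEST_RIP with the sum. On AMD with NRIP save support, the VMCB’s nRIP field already holds the next instruction address and can be copied into the saved RIP.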
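The Intel sequence is small enough to sketch against a stand-in for the VMCS. Everything here other than the two field encodings is an assumption made for illustration: fake_vmcs replaces the real VMREAD/VMWRITE intrinsics so the logic can be checked outside a hypervisor, and the template parameter is any type with matching read and write members.

```cpp
#include <cstdint>

// VMCS field encodings from the Intel SDM.
constexpr uint32_t kVmExitInstructionLen = 0x440C;
constexpr uint32_t kGuestRip             = 0x681E;

// In-memory stand-in for the two VMCS fields this sketch touches.
struct fake_vmcs {
    uint64_t guest_rip;
    uint64_t instruction_len;

    constexpr uint64_t read(uint32_t field) const {
        return field == kGuestRip             ? guest_rip
             : field == kVmExitInstructionLen ? instruction_len
             : 0;
    }
    constexpr void write(uint32_t field, uint64_t value) {
        if (field == kGuestRip) guest_rip = value;
    }
};

// Read the exit instruction length, add it to guest RIP, write RIP back.
template <class Vmcs>
constexpr void advance_guest_rip(Vmcs& vmcs) {
    const uint64_t rip = vmcs.read(kGuestRip);
    const uint64_t len = vmcs.read(kVmExitInstructionLen);
    vmcs.write(kGuestRip, rip + len);
}

// Convenience wrapper used to check the arithmetic at compile time.
constexpr uint64_t rip_after_exit(uint64_t rip, uint64_t len) {
    fake_vmcs vmcs{rip, len};
    advance_guest_rip(vmcs);
    return vmcs.guest_rip;
}
```

CPUID encodes as two bytes (0F A2), so a guest that faulted at 0x1000 resumes at 0x1002.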

The bitfield bug that burned me

The original call_type field was 4 bits. That encodes values 0–15. The hypercall_type_t enum started small and was fine until I added more operation types:

cpp
enum class hypercall_type_t : uint64_t {
    guest_physical_memory_operation,   // 0  ← fine
    guest_virtual_memory_operation,    // 1  ← fine
    translate_guest_virtual_address,   // 2  ← fine
    read_guest_cr3,                    // 3  ← fine
    add_slat_code_hook,                // 4
    remove_slat_code_hook,             // 5
    hide_guest_physical_page,          // 6
    log_current_state,                 // 7
    flush_logs,                        // 8
    get_heap_free_page_count,          // 9
    read_process_cr3,                  // 10
    get_cached_kernel_cr3,             // 11
    // ... added later:
    configure_sysinfo_sanitizer,       // 16 ← 0x10 → truncates to 0 in 4-bit field
    sanitize_process_info,             // 17 ← 0x11 → truncates to 1 in 4-bit field
};

configure_sysinfo_sanitizer has enum value 16 (0b10000). Packed into a 4-bit field, the high bit is silently dropped — it becomes 0. Every call to configure the sanitizer was silently routing to guest_physical_memory_operation. The hypervisor returned 0 (success), the CLI printed “failed”, and I spent an embarrassing amount of time staring at logging output that showed no errors.

The fix: widen call_type to 8 bits. With 8 bits you get 0–255, which is more than enough for any reasonable hypercall table.

cpp
union hypercall_info_t {
    uint64_t value;
    struct {
        uint64_t primary_key        : 16;
        hypercall_type_t call_type  : 8;   // was 4 — widened after the truncation bug
        uint64_t secondary_key      : 7;
        uint64_t call_reserved_data : 33;
    };
};

Both sides (guest user-mode client and hypervisor) must use the same header. If they’re out of sync, the call_type values will be misinterpreted silently. This is the kind of bug that’s trivial to prevent (build both from the same shared header) and painful to diagnose when you forget.
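A compile-time guard in the shared header would have turned this bug into a build error. The sketch below uses a cut-down stand-in enum, not the real hypercall_type_t, and assumes you maintain a last_value alias that points at the highest enumerator. The helper name is hypothetical.

```cpp
#include <cstdint>

// Cut-down stand-in for hypercall_type_t.
enum class call_type_t : uint64_t {
    guest_physical_memory_operation = 0,
    guest_virtual_memory_operation  = 1,
    configure_sysinfo_sanitizer     = 16,
    sanitize_process_info           = 17,
    last_value = sanitize_process_info,   // keep this on the highest enumerator
};

constexpr unsigned kCallTypeBits = 8;     // width of the call_type bit-field

// What an N-bit field actually stores: the low N bits of the value.
constexpr uint64_t stored_in_field(call_type_t v, unsigned bits) {
    return static_cast<uint64_t>(v) & ((1ull << bits) - 1);
}

// The build breaks here if an enumerator no longer fits the field.
static_assert(stored_in_field(call_type_t::last_value, kCallTypeBits) ==
                  static_cast<uint64_t>(call_type_t::last_value),
              "call_type bit-field is too narrow for the hypercall enum");
```

Setting kCallTypeBits back to 4 makes the assertion fire, because stored_in_field(call_type_t::sanitize_process_info, 4) is 1, not 17.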

Cross-process memory reads via CR3

With the hypercall ABI working, cross-process memory reads are straightforward. The guest provides a source virtual address and the source process’s CR3 (page directory base register). The hypervisor translates the VA to a guest-physical address using the provided CR3, then copies from the physical page into the destination buffer:

cpp
// Guest-side call: read `size` bytes from `src_va` in the process with `src_cr3`
// into `dst` (our local buffer)
uint64_t read_guest_virtual_memory(
    void *dst, uint64_t src_va, uint64_t src_cr3, uint64_t size)
{
    virt_memory_op_hypercall_info_t info = {};
    info.call_type          = hypercall_type_t::guest_virtual_memory_operation;
    info.memory_operation   = memory_operation_t::read_operation;
    info.address_of_page_directory = src_cr3 >> 12;  // store as PFN

    hypercall_info_t h;
    h.value = info.value;
    uint64_t dst_va = (uint64_t)dst;
    return launch_raw_hypercall(h, dst_va, src_va, size);
}

The hypervisor receives this, walks the target’s guest page tables using the provided CR3 (via SLAT to read the physical pages that the page tables occupy), finds the host physical address, and copies the data. No handle opened on the target process. No ReadProcessMemory. No kernel callbacks fire. The operation is entirely invisible to the guest OS.
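The page-table walk itself is the standard x86-64 4-level translation. Here is a minimal sketch of it, assuming 4-level paging (no LA57) and ignoring permission bits. The read_phys callable stands in for whatever the hypervisor uses to read 8 bytes at a guest-physical address (which in practice goes through SLAT first), and fake_memory exists only so the walk can be exercised without a guest. All names are hypothetical.

```cpp
#include <cstdint>

constexpr uint64_t kPresent   = 1ull << 0;             // entry is valid
constexpr uint64_t kLargePage = 1ull << 7;             // PS bit: 1 GB / 2 MB leaf
constexpr uint64_t kAddrMask  = 0x000FFFFFFFFFF000ull; // physical address bits 51:12

struct translation {
    bool     ok;    // false if any level was not present
    uint64_t gpa;   // guest-physical address when ok
};

// Walk PML4 -> PDPT -> PD -> PT for `va`, starting from `cr3`.
template <class ReadPhys>
constexpr translation translate_va(const ReadPhys& read_phys,
                                   uint64_t cr3, uint64_t va) {
    uint64_t table = cr3 & kAddrMask;                  // drop PCID / flag bits
    for (int shift = 39; shift >= 12; shift -= 9) {
        const uint64_t index = (va >> shift) & 0x1FF;  // 9 index bits per level
        const uint64_t entry = read_phys(table + index * 8);
        if ((entry & kPresent) == 0) return {false, 0};

        const bool leaf = shift == 12 ||
            ((shift == 30 || shift == 21) && (entry & kLargePage) != 0);
        if (leaf) {
            const uint64_t page_mask = (1ull << shift) - 1;
            return {true, (entry & kAddrMask & ~page_mask) | (va & page_mask)};
        }
        table = entry & kAddrMask;                     // descend one level
    }
    return {false, 0};                                 // not reached
}

// Tiny stand-in for guest-physical memory: four (address, value) cells.
struct fake_memory {
    struct cell { uint64_t gpa; uint64_t value; };
    cell cells[4];

    constexpr uint64_t operator()(uint64_t gpa) const {
        for (int i = 0; i < 4; ++i)
            if (cells[i].gpa == gpa) return cells[i].value;
        return 0;                                      // unmapped: not present
    }
};
```

Masking CR3 before the walk matters because its low 12 bits can carry a PCID, which is also why the guest-side call stores the value as a PFN.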

Resolving a process CR3 from user mode

The missing piece: you need the target process’s CR3. From user mode, with no kernel driver. The solution is a chain of hypervisor calls that bootstraps from known starting points:

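Step 1: Get the kernel’s CR3. read_guest_cr3 is a trivial hypercall: the hypervisor executes VMREAD on GUEST_CR3 (encoding 0x6802) and returns the result. That value is whatever CR3 the calling VP had loaded at the time of the exit, which is the caller’s own address space. It works as a kernel CR3 only as long as that address space maps kernel memory, which a user-mode CR3 does not when KVA shadowing is active.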

cpp
uint64_t kcr3 = hypercall_comms::read_guest_cr3();

Step 2: Resolve PsInitialSystemProcess (the SYSTEM EPROCESS pointer). File-map ntoskrnl.exe from disk, walk its export table to get the RVA of PsInitialSystemProcess, add it to the live kernel base (obtained via NtQuerySystemInformation(SystemModuleInformation) with debug privilege). This gives you the address of a pointer in the kernel that holds the SYSTEM EPROCESS.

cpp
// File-map ntoskrnl.exe, find PsInitialSystemProcess export RVA
uint32_t rva = get_export_rva_from_image(nt_path, "PsInitialSystemProcess");
uint64_t sym_addr = kernel_base + rva;
// Read the pointer at that address using the kernel CR3
uint64_t system_eproc = read_k64(sym_addr, kcr3);

Step 3: Auto-discover EPROCESS offsets. Hard-coding ActiveProcessLinks, UniqueProcessId, and DirectoryTableBase offsets breaks across Windows updates. Instead, scan for them by structural invariant:

cpp
// Find ActiveProcessLinks offset by looking for a valid doubly-linked list node
// in the SYSTEM EPROCESS: a candidate APL has Blink->Flink == candidate
for (uint64_t apl = 0x100; apl <= 0x400; apl += 8) {
    uint64_t node  = system_eproc + apl;
    uint64_t flink = read_k64(node + 0, kcr3);
    uint64_t blink = read_k64(node + 8, kcr3);
    if (is_kernel_va(flink) && is_kernel_va(blink) &&
        read_k64(blink, kcr3) == node) {
        found_apl = apl; break;
    }
}

// Find UniqueProcessId by scanning for a field that equals 4 (SYSTEM PID)
// at the expected offset range, different from the next EPROCESS
uint64_t next_eproc = read_k64(system_eproc + found_apl, kcr3) - found_apl;
for (uint64_t up = 0x150; up <= 0x300; up += 8) {
    uint32_t pid0 = (uint32_t)read_k64(system_eproc + up, kcr3);
    uint32_t pid1 = (uint32_t)read_k64(next_eproc + up, kcr3);
    if (pid0 == 4 && pid1 != 4 && pid1 != 0) { found_up = up; break; }
}

// Find DirectoryTableBase by scanning for a value that looks like a CR3
// (non-zero, 4KB-aligned, in the right physical range)
for (uint64_t dtb = 0x20; dtb <= 0x90; dtb += 8) {
    uint64_t v = read_k64(system_eproc + dtb, kcr3);
    if ((v & 0xFFF) == 0 && v != 0 && v < 0x100000000ULL) {
        found_dtb = dtb; break;
    }
}

Step 4: Walk ActiveProcessLinks until you find the target PID, then read its DirectoryTableBase:

cpp
uint64_t start = system_eproc + found_apl;   // SYSTEM's own list node
uint64_t node  = read_k64(start, kcr3);      // first Flink
while (node != start) {
    uint64_t eproc = node - found_apl;
    uint32_t pid   = (uint32_t)read_k64(eproc + found_up, kcr3);
    if (pid == target_pid) {
        uint64_t cr3 = read_k64(eproc + found_dtb, kcr3);
        return cr3;
    }
    node = read_k64(node, kcr3);             // follow Flink
}
return 0;  // walked the whole list without finding target_pid

The auto-discovery survives Windows updates that shift EPROCESS layout, which happens more often than you’d like. The structural invariants — a valid doubly-linked list entry, the known SYSTEM PID of 4 — are stable across versions.

With target_cr3 in hand, you can call read_guest_virtual_memory for any address in the target process. No handle, no driver, no ETW trace.

What comes next

Part 3 covers EPT shadow page hooks — using the same SLAT engine to make kernel function hooks invisible to the guest’s own page tables. The hook exists only at the physical address translation layer; a read of the hooked function returns original bytes. We use this to intercept NtQuerySystemInformation and sanitise the process list in memory, transparently and without touching a single byte of guest kernel code.