Jailhouse Root Cell Architecture: Execution Mode and Address Space Analysis

Overview

This document clarifies the fundamental architecture of Jailhouse's root cell, addressing common misconceptions about its execution mode and relationship with the hypervisor. The key insight is that the root cell does NOT run in VMX root mode and does NOT share virtual address space with the hypervisor.

Root Cell Execution Mode and Address Space Relationship

Key Answer: Root Cell does NOT run in VMX root mode

The detailed explanation follows.

1. Root Cell runs in VMX non-root mode (guest mode)

// From driver/main.c - enter_hypervisor() function
static void enter_hypervisor(void *info)
{
    struct jailhouse_header *header = info;
    unsigned int cpu = smp_processor_id();
    int (*entry)(unsigned int);
    int err;

    entry = header->entry + (unsigned long) hypervisor_mem;

    if (cpu < header->max_cpus)
        /* either returns 0 or the same error code across all CPUs */
        err = entry(cpu);  // This transitions Linux to VMX non-root mode
    else
        err = -EINVAL;

    // If entry() returned 0, Linux is now running in VMX non-root mode
    // and the hypervisor owns VMX root mode.
    // (Rest of the function, including error propagation, omitted.)
}

2. Address Space Separation

A. Root Cell has its own virtual address space

// Root cell does NOT share virtual address space with hypervisor
// Each has independent page tables:

// Hypervisor address space:
// - Uses hv_paging_structs 
// - Maps hypervisor code/data
// - Has access to all physical memory

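// Root cell address space:
// - Linux keeps using its own page tables for virtual-to-physical translation
// - A second translation stage owned by the hypervisor (EPT on Intel x86)
//   decides which physical pages the root cell can actually reach
// - In that second stage, the hypervisor's memory region is backed by empty pages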
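
The separation described above can be pictured as two independent translation stages: the guest's own page tables turn a guest-virtual address into a guest-physical one, and a hypervisor-owned second stage (EPT on Intel) turns that into a host-physical address. The following is a minimal sketch of that idea, assuming single-level tables with four entries. Every name in it (`struct table`, `walk`, `translate`) is hypothetical and not taken from the Jailhouse sources.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  ((1ULL << PAGE_SHIFT) - 1)
#define NUM_PAGES  4
#define INVALID    UINT64_MAX

/* Hypothetical single-level table: index = source page, value = target page. */
struct table {
    uint64_t entry[NUM_PAGES];
};

/* Translate one stage; INVALID means the page is out of range or unmapped. */
static uint64_t walk(const struct table *t, uint64_t addr)
{
    uint64_t page = addr >> PAGE_SHIFT;

    assert(t);
    if (page >= NUM_PAGES || t->entry[page] == INVALID)
        return INVALID;
    return (t->entry[page] << PAGE_SHIFT) | (addr & PAGE_MASK);
}

/*
 * Stage 1 (guest page table) is owned by the guest kernel.
 * Stage 2 (EPT) is owned by the hypervisor and is invisible to the guest.
 */
static uint64_t translate(const struct table *guest_pt,
                          const struct table *ept, uint64_t gva)
{
    uint64_t gpa = walk(guest_pt, gva);

    if (gpa == INVALID)
        return INVALID;     /* would be an ordinary guest page fault */
    return walk(ept, gpa);  /* INVALID here would be an EPT violation */
}
```

In this model the guest can edit `guest_pt` freely, but whatever guest-physical address it produces still has to pass through `ept`, which only the hypervisor can change.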

B. Hypervisor memory protection from root cell

// From hypervisor/setup.c - init_early()
// Back the region of the hypervisor core and per-CPU page with empty
// pages for Linux. This allows to fault-in the hypervisor region into
// Linux' page table before shutdown without triggering violations.

hv_page.virt_start = hyp_phys_start;
hv_page.size = PAGE_SIZE;
hv_page.flags = JAILHOUSE_MEM_READ;
while (hv_page.virt_start < hyp_phys_end) {
    if (virtual_console &&
        hv_page.virt_start == paging_hvirt2phys(&console))
        hv_page.phys_start = paging_hvirt2phys(&console);
    else
        hv_page.phys_start = paging_hvirt2phys(empty_page);  // Empty pages!
    error = arch_map_memory_region(&root_cell, &hv_page);
    hv_page.virt_start += PAGE_SIZE;
}
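
To make the effect of this loop easier to follow, here is a simplified, self-contained model of it. It assumes that `virt_start` is the address as seen by the root cell and `phys_start` is the backing host page, and it records the mappings in a plain array instead of calling `arch_map_memory_region`. The type and function names (`struct mapping`, `back_hypervisor_region`, `MEM_READ`) are hypothetical stand-ins, not Jailhouse APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL
#define MEM_READ  0x1u  /* stand-in for JAILHOUSE_MEM_READ */

/* Hypothetical record of one root-cell mapping. */
struct mapping {
    uint64_t guest_phys;  /* address the root cell uses */
    uint64_t host_phys;   /* page that actually backs it */
    unsigned int flags;
};

/*
 * Write one read-only mapping per page of [hyp_start, hyp_end) into `out`.
 * Every page is backed by the empty page, except the console page, which is
 * passed through when `virtual_console` is set. Returns the number of
 * mappings written (at most `max`).
 */
static size_t back_hypervisor_region(struct mapping *out, size_t max,
                                     uint64_t hyp_start, uint64_t hyp_end,
                                     uint64_t empty_page_phys,
                                     uint64_t console_phys,
                                     bool virtual_console)
{
    size_t n = 0;

    assert(hyp_start % PAGE_SIZE == 0 && hyp_end % PAGE_SIZE == 0);

    for (uint64_t addr = hyp_start; addr < hyp_end && n < max;
         addr += PAGE_SIZE) {
        out[n].guest_phys = addr;
        out[n].host_phys = (virtual_console && addr == console_phys)
                               ? console_phys
                               : empty_page_phys;
        out[n].flags = MEM_READ;  /* no write or execute permission */
        n++;
    }
    return n;
}
```

Because every entry carries only the read flag, the model also shows why the root cell can read the region without faulting but cannot modify it.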

3. Why Hypercalls are Required

A. Privilege Level Separation

// Root cell (Linux) runs in VMX non-root mode
// - Cannot directly access hypervisor code/data
// - Cannot execute privileged VMX instructions
// - Needs hypercalls to request hypervisor services

// Hypervisor runs in VMX root mode
// - Has full system privileges
// - Controls EPT and VMCS
// - Handles VM exits from all cells

B. Hypercall Mechanism

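// Simplified view of jailhouse_call on x86 (defined in the
// jailhouse_hypercall.h header; the real code selects the instruction
// in inline assembly).
// Root cell uses hypercalls just like any other cell: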

#ifdef CONFIG_X86
static inline long jailhouse_call(unsigned long code)
{
    if (jailhouse_use_vmcall)
        return jailhouse_call_vmcall(code);  // VMX systems
    else
        return jailhouse_call_vmmcall(code); // SVM systems
}
#endif

// These instructions cause VM exit to hypervisor:
// VMCALL (Intel) or VMMCALL (AMD)
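
On the hypervisor side, the VM exit caused by these instructions ends in a dispatcher that looks at the hypercall code and at the calling cell. The sketch below models that step under the assumption that cell management calls are accepted only from the root cell while informational calls are open to every cell. The code numbers, `struct cell`, and `handle_hypercall` are hypothetical illustrations and do not reproduce the Jailhouse ABI.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical hypercall codes, for illustration only. */
enum hc_code {
    HC_CELL_CREATE,
    HC_CELL_DESTROY,
    HC_GET_INFO,
    HC_CODE_COUNT
};

/* Hypothetical per-cell state: only the root/non-root distinction matters here. */
struct cell {
    bool is_root;
};

static bool is_management_call(enum hc_code code)
{
    return code == HC_CELL_CREATE || code == HC_CELL_DESTROY;
}

/*
 * Models the hypervisor-side handler that runs in root mode after a guest
 * executed VMCALL/VMMCALL. Returns 0 on success or a negative errno value.
 */
static long handle_hypercall(const struct cell *caller, int code)
{
    assert(caller);

    if (code < 0 || code >= HC_CODE_COUNT)
        return -ENOSYS;  /* unknown hypercall */
    if (is_management_call((enum hc_code)code) && !caller->is_root)
        return -EPERM;   /* only the root cell may manage cells */
    return 0;            /* actual work omitted in this sketch */
}
```

The entry path is the same for every cell. What differs is the policy the handler applies once it knows which cell issued the call.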

4. The Transition Process

A. Before Jailhouse activation:

Linux runs in:
- Ring 0 (kernel mode)
- No VMX operation yet (VMXON has not been executed, so the root/non-root distinction does not apply)
- Direct hardware access
- Full system control

B. During Jailhouse activation:

// From hypervisor/setup.c - entry() function
int entry(unsigned int cpu_id, struct per_cpu *cpu_data)
{
    // ... initialization ...

    // point of no return
    arch_cpu_activate_vmm();  // Activates VMX, puts Linux in non-root mode
}

C. After Jailhouse activation:

Linux (root cell) runs in:
- Ring 0 (still kernel mode)
- VMX non-root mode (guest mode)
- Controlled hardware access via EPT
- Hypervisor mediated system control

Hypervisor runs in:
- VMX root mode
- Full hardware control
- Handles all VM exits
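
The before/during/after states can be summarized as a small state machine. The sketch below models only the mode changes on one CPU. The enum and function names are hypothetical, and the events stand for VMXON, VM entry (VMLAUNCH or VMRESUME), and VM exit. Transitions that real hardware would reject are modeled as leaving the mode unchanged.

```c
#include <assert.h>

enum cpu_mode {
    MODE_VMX_OFF,      /* before activation: no VMX operation */
    MODE_VMX_ROOT,     /* hypervisor code */
    MODE_VMX_NON_ROOT  /* any cell, including the root cell */
};

enum vmx_event {
    EV_VMXON,     /* enter VMX operation */
    EV_VM_ENTRY,  /* VMLAUNCH or VMRESUME */
    EV_VM_EXIT    /* e.g. VMCALL or an EPT violation */
};

/* Returns the mode after `ev`; invalid transitions keep the current mode. */
static enum cpu_mode next_mode(enum cpu_mode mode, enum vmx_event ev)
{
    switch (ev) {
    case EV_VMXON:
        return mode == MODE_VMX_OFF ? MODE_VMX_ROOT : mode;
    case EV_VM_ENTRY:
        return mode == MODE_VMX_ROOT ? MODE_VMX_NON_ROOT : mode;
    case EV_VM_EXIT:
        return mode == MODE_VMX_NON_ROOT ? MODE_VMX_ROOT : mode;
    }
    return mode;
}

/*
 * Activation as seen from Linux: the CPU enters VMX operation (hypervisor
 * setup runs in root mode), then a VM entry resumes Linux as the root cell.
 */
static enum cpu_mode activate(enum cpu_mode mode)
{
    assert(mode == MODE_VMX_OFF);
    mode = next_mode(mode, EV_VMXON);
    return next_mode(mode, EV_VM_ENTRY);
}
```

Calling `activate(MODE_VMX_OFF)` yields `MODE_VMX_NON_ROOT`, and a later `EV_VM_EXIT` (for example a hypercall) moves the CPU to `MODE_VMX_ROOT` until the next VM entry.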

5. Memory Access Control

A. EPT Controls Root Cell Memory Access

// Root cell's memory access is controlled by EPT
// Even though it's the "root" cell, it's still a guest

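// Example: Root cell cannot access hypervisor memory
// - EPT backs the hypervisor's memory region with read-only empty pages
//   (the virtual console page is the exception when that feature is enabled)
// - Reads of hypervisor memory return zeros; writes trigger an EPT violation
// - This provides strong isolation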

B. Hypervisor Memory Layout

Physical Memory Layout:
┌─────────────────────┐ 
│   Hypervisor Code   │ ← Only accessible in VMX root mode
├─────────────────────┤
│   Root Cell Memory  │ ← Accessible to root cell via EPT
├─────────────────────┤
│   Other Cell Memory │ ← Not accessible to root cell
└─────────────────────┘

6. Why This Design?

A. Security Benefits

- Root cell cannot compromise hypervisor
- Strong isolation between hypervisor and all cells
- Prevents privilege escalation attacks

B. Consistency

- All cells (including root) run in non-root mode under the same isolation mechanisms
- Same hypercall mechanism for all cells (cell management calls are reserved for the root cell)
- Simplified hypervisor design

C. Reliability

- Hypervisor protected from root cell bugs
- Hypervisor and non-root cells keep running even if the root cell crashes
- Clean separation of concerns

Detailed Architecture Diagram

graph TB
    subgraph "VMX Root Mode"
        HV[Jailhouse Hypervisor]
        HV_CODE[Hypervisor Code]
        HV_DATA[Hypervisor Data]
        HV_EPT[EPT Management]
    end

    subgraph "VMX Non-Root Mode"
        subgraph "Root Cell"
            RC_KERNEL[Linux Kernel]
            RC_DRIVER[Jailhouse Driver]
            RC_USER[User Space]
        end

        subgraph "Non-Root Cell"
            NC_OS[Guest OS]
            NC_APP[Applications]
        end
    end

    subgraph "Physical Memory"
        PM_HV[Hypervisor Memory]
        PM_RC[Root Cell Memory]
        PM_NC[Non-Root Cell Memory]
    end

    subgraph "Address Spaces"
        AS_HV[Hypervisor Virtual Address Space]
        AS_RC[Root Cell Virtual Address Space]
        AS_NC[Non-Root Cell Virtual Address Space]
    end

    HV --> HV_CODE
    HV --> HV_DATA
    HV --> HV_EPT

    RC_DRIVER -.->|VMCALL/VMMCALL| HV
    NC_OS -.->|VM Exit| HV

    HV_EPT --> PM_HV
    HV_EPT --> PM_RC
    HV_EPT --> PM_NC

    AS_HV -.->|Direct Mapping| PM_HV
    AS_RC -.->|EPT Controlled| PM_RC
    AS_NC -.->|EPT Controlled| PM_NC

    style HV fill:#ffcdd2
    style RC_KERNEL fill:#c8e6c9
    style NC_OS fill:#fff3e0
    style PM_HV fill:#f3e5f5

Key Misconceptions Clarified

Misconception 1: "Root cell runs in VMX root mode"

Reality: Root cell runs in VMX non-root mode, just like any other cell. Only the hypervisor runs in VMX root mode.

Misconception 2: "Root cell shares address space with hypervisor"

Reality: Root cell has its own virtual address space, completely separate from the hypervisor. Hypervisor memory is either unmapped or mapped to empty pages in the root cell's address space.

Misconception 3: "Root cell has direct hardware access"

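Reality: After activation the root cell no longer has unrestricted hardware access. It can only reach the memory and devices the hypervisor has mapped for it through EPT, and intercepted operations go through VM exit handling, just like for non-root cells.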

Misconception 4: "Hypercalls are only for non-root cells"

Reality: Root cell must use hypercalls to communicate with the hypervisor because it cannot directly access hypervisor code or data.

Comparison with Other Hypervisors

Traditional Type-1 Hypervisor (e.g., Xen)

Dom0 (privileged domain):
- Has special privileges
- Can access hypervisor interfaces
- Manages other domains
- As a paravirtualized (PV) guest, has the hypervisor mapped into part of its own address space