Jailhouse Root Cell Architecture: Execution Mode and Address Space Analysis
Overview
This document clarifies the fundamental architecture of Jailhouse's root cell, addressing common misconceptions about its execution mode and relationship with the hypervisor. The key insight is that the root cell does NOT run in VMX root mode and does NOT share virtual address space with the hypervisor.
Root Cell Execution Mode and Address Space Relationship
Key Answer: Root Cell does NOT run in VMX root mode
The detailed explanation follows.
1. Root Cell runs in VMX non-root mode (guest mode)
// From driver/main.c - enter_hypervisor() function (excerpt)
static void enter_hypervisor(void *info)
{
    struct jailhouse_header *header = info;
    unsigned int cpu = smp_processor_id();
    int (*entry)(unsigned int);
    int err;

    entry = header->entry + (unsigned long) hypervisor_mem;

    if (cpu < header->max_cpus)
        /* either returns 0 or the same error code across all CPUs */
        err = entry(cpu); // This transitions Linux to VMX non-root mode
    else
        err = -EINVAL;

    // Once entry() returns 0, Linux is running in VMX non-root mode
    // and the hypervisor is in VMX root mode.
    // ... (handling of err omitted in this excerpt)
}
2. Address Space Separation
A. Root Cell has its own virtual address space
// Root cell does NOT share virtual address space with hypervisor
// Each has independent page tables:
// Hypervisor address space:
// - Uses hv_paging_structs
// - Maps hypervisor code/data
// - Has access to all physical memory
// Root cell address space:
// - Uses Linux's own page tables (guest-virtual to guest-physical)
// - Guest-physical accesses are then controlled by EPT (on x86), owned by the hypervisor
// - In that second stage, hypervisor memory regions are unmapped or backed by empty pages
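The separation above can be pictured as two independent translation stages: the guest's own page tables map guest-virtual to guest-physical addresses, and a hypervisor-owned second stage (EPT on Intel) maps guest-physical to host-physical addresses. The following is a toy user-space model, not Jailhouse code. The flat table layout, the 4 KiB page size, and every `toy_` name are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_MASK  ((1ull << TOY_PAGE_SHIFT) - 1)
#define TOY_INVALID    UINT64_MAX

/* One page-granular mapping: input page frame -> output page frame. */
struct toy_mapping {
    uint64_t in_pfn;
    uint64_t out_pfn;
};

/* A flat table standing in for a multi-level page-table walk. */
struct toy_table {
    const struct toy_mapping *entries;
    size_t count;
};

/* Translate one address through one table; TOY_INVALID models a fault. */
static uint64_t toy_lookup(const struct toy_table *t, uint64_t addr)
{
    uint64_t pfn = addr >> TOY_PAGE_SHIFT;

    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (t->entries[i].in_pfn == pfn)
            return (t->entries[i].out_pfn << TOY_PAGE_SHIFT) |
                   (addr & TOY_PAGE_MASK);
    }
    return TOY_INVALID;
}

/*
 * Stage 1: guest-virtual -> guest-physical, using the guest's own table.
 * Stage 2: guest-physical -> host-physical, using the hypervisor's table.
 */
static uint64_t toy_translate(const struct toy_table *guest_pt,
                              const struct toy_table *ept, uint64_t gva)
{
    uint64_t gpa = toy_lookup(guest_pt, gva);

    if (gpa == TOY_INVALID)
        return TOY_INVALID;      /* models a guest page fault */
    return toy_lookup(ept, gpa); /* TOY_INVALID models an EPT violation */
}
```

In this model the guest fully controls `guest_pt`, yet it can only reach host pages that the second table lists. That is the sense in which the root cell keeps its own address space while the hypervisor still decides what it can touch.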
B. Hypervisor memory protection from root cell
// From hypervisor/setup.c - init_early() (excerpt)
// Back the region of the hypervisor core and per-CPU page with empty
// pages for Linux. This allows to fault-in the hypervisor region into
// Linux' page table before shutdown without triggering violations.
hv_page.virt_start = hyp_phys_start;
hv_page.size = PAGE_SIZE;
hv_page.flags = JAILHOUSE_MEM_READ; // read-only for the root cell
while (hv_page.virt_start < hyp_phys_end) {
    if (virtual_console &&
        hv_page.virt_start == paging_hvirt2phys(&console))
        // The console page is the one exception: Linux may read it
        hv_page.phys_start = paging_hvirt2phys(&console);
    else
        hv_page.phys_start = paging_hvirt2phys(empty_page); // Empty pages!
    error = arch_map_memory_region(&root_cell, &hv_page);
    if (error)
        return error; // do not continue with a partially mapped region
    hv_page.virt_start += PAGE_SIZE;
}
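To make the effect of this loop concrete, here is a small user-space model that replays it over plain integers. It is only a sketch. `toy_back_hypervisor_region` and `struct toy_hv_mapping` are hypothetical names, and the real code installs the mappings through `arch_map_memory_region()` rather than filling an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

/* Hypothetical stand-in for one root-cell mapping of a hypervisor page. */
struct toy_hv_mapping {
    uint64_t guest_phys;   /* address the root cell uses */
    uint64_t backing_phys; /* page that actually answers the access */
};

/*
 * Fill 'out' with one mapping per page in [hyp_start, hyp_end).
 * Every page is backed by 'empty_page', except the console page, which
 * maps to itself when 'virtual_console' is non-zero. Returns the number
 * of mappings written, or -1 if 'out' has fewer than that many slots.
 */
static int toy_back_hypervisor_region(uint64_t hyp_start, uint64_t hyp_end,
                                      uint64_t console_page,
                                      uint64_t empty_page, int virtual_console,
                                      struct toy_hv_mapping *out, size_t cap)
{
    size_t n = 0;

    assert(hyp_start % TOY_PAGE_SIZE == 0 && hyp_end % TOY_PAGE_SIZE == 0);

    for (uint64_t addr = hyp_start; addr < hyp_end; addr += TOY_PAGE_SIZE) {
        if (n == cap)
            return -1;
        out[n].guest_phys = addr;
        if (virtual_console && addr == console_page)
            out[n].backing_phys = console_page;
        else
            out[n].backing_phys = empty_page;
        n++;
    }
    return (int)n;
}
```

For a three-page region with the console on the second page, the model yields three mappings, and only the second one points at real hypervisor content. With `virtual_console` set to 0, all three point at the empty page.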
3. Why Hypercalls are Required
A. Privilege Level Separation
// Root cell (Linux) runs in VMX non-root mode
// - Cannot directly access hypervisor code/data
// - Cannot execute privileged VMX instructions
// - Needs hypercalls to request hypervisor services
// Hypervisor runs in VMX root mode
// - Has full system privileges
// - Controls EPT and VMCS
// - Handles VM exits from all cells
B. Hypercall Mechanism
// Simplified view of jailhouse_call() as used by driver/main.c.
// The two helper names below are illustrative, not the actual symbols.
// Root cell uses hypercalls just like any other cell:
#ifdef CONFIG_X86
static inline long jailhouse_call(unsigned long code)
{
    if (jailhouse_use_vmcall)
        return jailhouse_call_vmcall(code);  // Intel VMX systems
    else
        return jailhouse_call_vmmcall(code); // AMD SVM systems
}
#endif
// Both instructions cause a VM exit to the hypervisor:
// VMCALL (Intel) or VMMCALL (AMD)
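The hypervisor side of this mechanism can be sketched as a dispatcher that runs after the VM exit: the guest supplies only a code, and the handler decides what to do with it. The model below is a hedged illustration rather than Jailhouse's handler. The hypercall codes, `struct toy_cell`, and `toy_handle_hypercall` are invented for this sketch, and the policy that only the root cell may issue the management call is stated here as an assumption of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hypercall codes for this sketch (not Jailhouse's values). */
enum toy_hypercall {
    TOY_HC_GET_INFO    = 0, /* harmless query, allowed from any cell */
    TOY_HC_CELL_CREATE = 1, /* management request */
};

/* Minimal caller context the toy hypervisor sees on a VM exit. */
struct toy_cell {
    bool is_root;
    int id;
};

/*
 * Models the hypervisor side of a hypercall. The guest never calls this
 * function directly: VMCALL/VMMCALL traps, and the exit handler
 * dispatches on 'code' on the guest's behalf.
 */
static long toy_handle_hypercall(const struct toy_cell *caller,
                                 unsigned long code)
{
    assert(caller != NULL);

    switch (code) {
    case TOY_HC_GET_INFO:
        return caller->id;
    case TOY_HC_CELL_CREATE:
        /* Assumed policy: only the root cell may manage cells. */
        return caller->is_root ? 0 : -EPERM;
    default:
        return -ENOSYS; /* unknown request */
    }
}
```

The point of the sketch is that the caller's identity and the request code are the only inputs. The guest cannot jump into hypervisor code, so every request is validated at this single entry point.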
4. The Transition Process
A. Before Jailhouse activation:
Linux runs in:
- Ring 0 (kernel mode)
- No VMX operation at all (VMX is off until Jailhouse enables it)
- Direct hardware access
- Full system control
B. During Jailhouse activation:
// From hypervisor/setup.c - entry() function
int entry(unsigned int cpu_id, struct per_cpu *cpu_data)
{
    // ... initialization ...

    // point of no return
    arch_cpu_activate_vmm(); // Activates VMX, puts Linux in non-root mode
}
C. After Jailhouse activation:
Linux (root cell) runs in:
- Ring 0 (still kernel mode)
- VMX non-root mode (guest mode)
- Controlled hardware access via EPT
- Hypervisor mediated system control
Hypervisor runs in:
- VMX root mode
- Full hardware control
- Handles all VM exits
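The three phases can be summarized as a small state machine per CPU. The sketch below is a toy model under stated assumptions: the state names and `toy_` functions are hypothetical, and it assumes no other hypervisor is using VMX before activation, so the CPU starts outside VMX operation.

```c
#include <assert.h>
#include <stdbool.h>

/* CPU virtualization state as seen in this toy model. */
enum toy_vmx_state {
    TOY_VMX_OFF,      /* before activation: no VMX operation at all */
    TOY_VMX_ROOT,     /* hypervisor code executing */
    TOY_VMX_NON_ROOT, /* guest code (root cell or any other cell) executing */
};

struct toy_cpu {
    enum toy_vmx_state state;
    bool jailhouse_active;
};

/* Models activation: Linux calls in and resumes afterwards as a guest. */
static void toy_activate(struct toy_cpu *cpu)
{
    assert(cpu->state == TOY_VMX_OFF);
    cpu->state = TOY_VMX_ROOT;     /* hypervisor enables VMX on this CPU */
    cpu->jailhouse_active = true;
    cpu->state = TOY_VMX_NON_ROOT; /* VM entry: Linux continues as root cell */
}

/* Models a hypercall or any other VM exit: control moves to root mode. */
static enum toy_vmx_state toy_vm_exit(struct toy_cpu *cpu)
{
    assert(cpu->state == TOY_VMX_NON_ROOT);
    cpu->state = TOY_VMX_ROOT;
    return cpu->state;
}

/* Models the VM entry that resumes the guest after the exit is handled. */
static void toy_vm_entry(struct toy_cpu *cpu)
{
    assert(cpu->state == TOY_VMX_ROOT);
    cpu->state = TOY_VMX_NON_ROOT;
}
```

After `toy_activate()`, the only way back into `TOY_VMX_ROOT` is through `toy_vm_exit()`, which mirrors the claim that Linux never executes in root mode once Jailhouse is active.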
5. Memory Access Control
A. EPT Controls Root Cell Memory Access
// Root cell's memory access is controlled by EPT
// Even though it's the "root" cell, it's still a guest
// Example: Root cell cannot access hypervisor memory contents
// - EPT maps hypervisor memory regions to read-only empty pages
// - Reads of hypervisor memory return zeros; writes trigger an EPT violation
// - This keeps hypervisor code and data out of the root cell's reach
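A permission check of this kind can be modelled in a few lines. The sketch below is illustrative only: the flag values, `struct toy_ept_entry`, and the `toy_` functions are assumptions, and real EPT checks are performed by the CPU during address translation rather than by software.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MEM_READ  0x1u
#define TOY_MEM_WRITE 0x2u

enum toy_access_result {
    TOY_ACCESS_OK,
    TOY_ACCESS_VIOLATION, /* would surface as a VM exit to the hypervisor */
};

/* One second-stage entry: the rights the guest holds on this page. */
struct toy_ept_entry {
    uint32_t flags;
    const uint8_t *backing; /* page contents the guest actually sees */
};

/* Check a guest access of type 'wanted' against the entry's flags. */
static enum toy_access_result toy_check_access(const struct toy_ept_entry *e,
                                               uint32_t wanted)
{
    assert(e != NULL);
    return (e->flags & wanted) == wanted ? TOY_ACCESS_OK
                                         : TOY_ACCESS_VIOLATION;
}

/* Read one byte through the entry; requires read permission. */
static uint8_t toy_read_byte(const struct toy_ept_entry *e, uint32_t offset)
{
    assert(toy_check_access(e, TOY_MEM_READ) == TOY_ACCESS_OK);
    return e->backing[offset];
}
```

With an entry that grants only `TOY_MEM_READ` and is backed by a zero-filled page, a read succeeds and returns 0, while a write check reports a violation. This matches the read-only empty-page behaviour described above.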
B. Hypervisor Memory Layout
Physical Memory Layout:
┌─────────────────────┐
│ Hypervisor Code     │ ← Only accessible in VMX root mode
├─────────────────────┤
│ Root Cell Memory    │ ← Accessible to root cell via EPT
├─────────────────────┤
│ Other Cell Memory   │ ← Not accessible to root cell
└─────────────────────┘
6. Why This Design?
A. Security Benefits
- Root cell cannot compromise hypervisor
- Strong isolation between hypervisor and all cells
- Prevents privilege escalation attacks
B. Consistency
- All cells (including root) run as guests in VMX non-root mode
- Same hypercall mechanism for all cells (cell management calls are reserved for the root cell)
- Simplified hypervisor design
C. Reliability
- Hypervisor protected from root cell bugs
- Hypervisor and non-root cells can keep running even if the root cell crashes
- Clean separation of concerns
Detailed Architecture Diagram
graph TB
subgraph "VMX Root Mode"
HV[Jailhouse Hypervisor]
HV_CODE[Hypervisor Code]
HV_DATA[Hypervisor Data]
HV_EPT[EPT Management]
end
subgraph "VMX Non-Root Mode"
subgraph "Root Cell"
RC_KERNEL[Linux Kernel]
RC_DRIVER[Jailhouse Driver]
RC_USER[User Space]
end
subgraph "Non-Root Cell"
NC_OS[Guest OS]
NC_APP[Applications]
end
end
subgraph "Physical Memory"
PM_HV[Hypervisor Memory]
PM_RC[Root Cell Memory]
PM_NC[Non-Root Cell Memory]
end
subgraph "Address Spaces"
AS_HV[Hypervisor Virtual Address Space]
AS_RC[Root Cell Virtual Address Space]
AS_NC[Non-Root Cell Virtual Address Space]
end
HV --> HV_CODE
HV --> HV_DATA
HV --> HV_EPT
RC_DRIVER -.->|VMCALL/VMMCALL| HV
NC_OS -.->|VM Exit| HV
HV_EPT --> PM_HV
HV_EPT --> PM_RC
HV_EPT --> PM_NC
AS_HV -.->|Direct Mapping| PM_HV
AS_RC -.->|EPT Controlled| PM_RC
AS_NC -.->|EPT Controlled| PM_NC
style HV fill:#ffcdd2
style RC_KERNEL fill:#c8e6c9
style NC_OS fill:#fff3e0
style PM_HV fill:#f3e5f5
Key Misconceptions Clarified
Misconception 1: "Root cell runs in VMX root mode"
Reality: Root cell runs in VMX non-root mode, just like any other cell. Only the hypervisor runs in VMX root mode.
Misconception 2: "Root cell shares address space with hypervisor"
Reality: Root cell has its own virtual address space, completely separate from the hypervisor. Hypervisor memory is either unmapped or mapped to empty pages in the root cell's address space.
Misconception 3: "Root cell has direct hardware access"
Reality: Root cell's hardware access is mediated by the hypervisor through EPT and VM exit handling, just like non-root cells.
Misconception 4: "Hypercalls are only for non-root cells"
Reality: Root cell must use hypercalls to communicate with the hypervisor because it cannot directly access hypervisor code or data.
Comparison with Other Hypervisors
Traditional Type-1 Hypervisor (e.g., Xen)
Dom0 (privileged domain):
- Has special privileges
- Can access hypervisor interfaces
- Manages other domains
- Often shares some address space with hypervisor
Jailhouse Root Cell
Root Cell:
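- Runs as a guest, with no hypervisor-level execution privileges
- Must use hypercalls like any other cell (cell management calls are reserved for it)
- Cannot directly access hypervisor code or data
- Completely separate address space
- Subject to the same EPT and VM-exit controls as other cells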
Security Implications
Strong Isolation Benefits
- Hypervisor Protection: Root cell cannot corrupt hypervisor state
- Attack Surface Reduction: The root cell reaches the hypervisor only through hypercalls and trapped accesses, never through direct function calls
- Privilege Separation: Clear boundary between hypervisor and all cells
- Fault Isolation: Root cell bugs cannot crash the hypervisor
Consistency Benefits
- Uniform Treatment: All cells use the same hypercall interface
- Simplified Design: The root cell goes through the same VM-exit handling paths as other cells
- Easier Verification: Consistent security model across all cells
Summary
The root cell in Jailhouse:
- Does NOT run in VMX root mode - it runs in VMX non-root (guest) mode
- Does NOT share virtual address space with the hypervisor
- Must use hypercalls because it cannot directly access hypervisor code
- Is controlled by EPT just like any other cell
- Is isolated from the hypervisor by the same mechanisms that isolate non-root cells
This architecture treats the root cell as a privileged but still controlled guest rather than giving it hypervisor-level access. The design choice prioritizes security and consistency over performance, which aligns with Jailhouse's focus on safety-critical and real-time applications.