Bluo Blog

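In-Depth Analysis of arXiv cs.OS Papers, February 2026

A systematic analysis of 16 frontier operating-systems papers, covering AI agent systems, memory management, real-time scheduling, quantum computing, and other active areas.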

Total papers: 16 · Average authors per paper: 3.3

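Research Area Popularity

[Chart: paper counts by research area. Areas shown: AI agent systems, real-time scheduling, memory management, state evolution, quantum computing, security hardening, distributed systems.]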

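🤖 AI Agent Systems and Resource Management (5 papers)

OS-level support for AI agents is the most active research area this month, covering core problems such as resource isolation, scheduling optimization, and state management.

AgentCgroup: the first systematic analysis of the OS-level resource dynamics of AI agents in sandboxed containers; proposes resource controls differentiated by tool call
Fork, Explore, Commit: introduces the branch context abstraction, giving agent exploration copy-on-write state isolation and atomic commit semantics
Right to History: proposes the sovereignty kernel concept, providing a tamper-evident, verifiable record of AI agent execution
ThunderAgent: an end-to-end, program-aware agentic inference system that optimizes the KV cache and tool orchestration
CMV: Contextual Memory Virtualisation, which borrows ideas from virtual memory to manage LLM state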

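⏱️ Real-Time Scheduling and Security (3 papers)

Scheduling optimization and security protection for real-time systems continue to draw attention, and GPU scheduling is an emerging hotspot.

TempoNet: a reinforcement-learning real-time scheduler built on a Transformer architecture and an Urgency Tokenizer
GPU DAG scheduling: a method for analyzing GPU task dependencies and parallel scheduling in safety-critical applications
Timing-attack mitigation: defends against timing side-channel attacks while preserving real-time guarantees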

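💾 Memory Management and Tiering (2 papers)

Datacenter memory expansion and NUMA optimization remain core topics in operating-systems research.

Equilibria: fair scheduling for multi-tenant CXL memory tiering, addressing a central pain point in production environments
NUMA migration: a fine-grained user-space NUMA migration mechanism that improves the performance of queries spanning NUMA regions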

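🔬 Deterministic State Evolution (2 papers)

These papers explore deterministic semantic state systems from a theoretical angle and propose a new paradigm for AI inference architectures.

BLGC: the theory of Bounded Local Generator Classes, which proves that incremental update cost is independent of system dimension
Compute ICE-AGE: a production-grade C++ implementation that validates the feasibility of a deterministic semantic state substrate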

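⚛️ Quantum Operating Systems (1 paper)

As quantum computing enters the cloud era, resource multiplexing becomes a key challenge.

HALO: the first quantum operating system to support fine-grained resource sharing, addressing low resource utilization on quantum cloud platforms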

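🔐 Security Hardening (1 paper)

Strengthens unikernel security by adding a conventional mitigation that unikernels often omit.

OSv ASLR: introduces efficient address randomization to the OSv unikernel, making memory layout less predictable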

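🌐 Distributed Systems Theory (1 paper)

Examines the fundamental problems of cloud synchronization semantics from a philosophical angle.

iCloud analysis: exposes the "Category Mistake" of cloud synchronization and points out the fundamental limits of Forward-In-Time-Only (FITO) assumptions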

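🏋️ LLM Training Architecture (1 paper)

Breaks from the GPU-centric paradigm to explore a memory-centric training architecture.

Horizon-LM: a RAM-centric LLM training architecture that decouples model scale from GPU memory capacity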

Predicted Future Trends

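🚀 AI Agent Operating Systems Will Become an Independent Research Area

As AI agents are deployed at scale in the cloud, traditional container isolation no longer fits their distinctive resource demand patterns. OS abstractions and scheduling policies designed specifically for agent workloads are likely to emerge, including agent-aware resource allocation, QoS guarantees that account for tool calls, and system-level support for cooperation between agents.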

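🧠 Deterministic State Management Will Challenge Probabilistic AI Architectures

BLGC and Compute ICE-AGE propose a radical alternative paradigm: replacing probabilistic semantic reconstruction with deterministic graph evolution. The work is still at an early exploratory stage, but the approach may offer a new route to addressing LLM "hallucination".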

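⚡ CXL Memory Tiering Will Reshape Datacenter Architecture

CXL is changing the memory hierarchy of the datacenter. The multi-tenant fair scheduling proposed by Equilibria is only a beginning; more sophisticated memory placement policies, hot-page migration algorithms, and cross-server memory pooling will be needed.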

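🔒 Real-Time System Security Will Converge with AI Techniques

TempoNet shows the potential of reinforcement learning in real-time scheduling, while the timing-attack mitigation work focuses on balancing security against determinism. These two lines are likely to meet, producing AI-driven, adaptive, secure real-time systems.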

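⚛️ Quantum Operating Systems Will Move from Research to Practice

HALO demonstrates that fine-grained multiplexing of quantum processors is feasible. As quantum hardware matures, the quantum operating system will become the key software layer connecting quantum hardware to the applications above it.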

Authors and Collaboration Network

Prolific Authors (Multiple Papers)

Author	Papers	Research area
R. Jay Martin / Raymond Jay Martin	2	Deterministic state evolution
Yusheng Zheng	2	AI agent systems

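Large Research Teams (5 or More Authors)

Paper	Team size	Affiliation hints
Equilibria (CXL memory)	10	Industry systems research
TempoNet (real-time scheduling)	10	Academic collaboration
ThunderAgent	10	Cross-institution collaboration
GPU DAG scheduling	7	Academic research
AgentCgroup	6	Industry research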

Author Collaboration Graph

graph TD
    subgraph AI_Agent["AI Agent Systems"]
        YZ1[Yusheng Zheng]
        JF[Jiakun Fan]
        QF[Quanzhi Fu]
        CW[Cong Wang]
        JZ[Jing Zhang]
        CS[Cosmo Santoni]
        HK[Hao Kang]
        
        YZ1 --> JF
        YZ1 --> QF
        YZ1 --> CW
    end
    
    subgraph Memory["Memory Management"]
        KZ[Kaiyang Zhao]
        NG[Neha Gholkar]
        HM[Hasan Maruf]
        FS[Felix Schuhknecht]
        NR[Nick Rassau]
        
        KZ --> NG
        KZ --> HM
        FS --> NR
    end
    
    subgraph State_Evo["State Evolution"]
        RJM[R. Jay Martin]
    end
    
    subgraph Quantum["Quantum Computing"]
        JZY[John Zhuoyang Ye]
        JW[Jiyuan Wang]
        YQ[Yifan Qiao]
        JP[Jens Palsberg]
        
        JZY --> JW
        JZY --> YQ
        JZY --> JP
    end
    
    subgraph Realtime["Real-Time Systems"]
        RF[Rong Fu]
        YM[Yibo Meng]
        YZ2[Yuanhai Zhang]
        
        RF --> YM
    end
    
    style AI_Agent fill:#3b82f6,stroke:#1d4ed8,color:#fff
    style Memory fill:#8b5cf6,stroke:#6d28d9,color:#fff
    style State_Evo fill:#10b981,stroke:#059669,color:#fff
    style Quantum fill:#f59e0b,stroke:#d97706,color:#fff
    style Realtime fill:#ef4444,stroke:#dc2626,color:#fff

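Analysis of Core Research Institutions

🏢 Industry Research Teams

Equilibria team: a large 10-person team working on CXL memory tiering. Judging from the author list, it may come from a large cloud provider or a storage-systems company.

AgentCgroup team: the involvement of Andi Quinn and others hints at a research institution connected to agent tooling or cloud platforms.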

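🎓 Academic Research Teams

TempoNet team: the involvement of Simon James Fong and others points to academic institutions, possibly including the University of Macau.

HALO team: Jens Palsberg is a well-known programming-languages researcher; the team may be from UCLA.

GPU scheduling team: Kai Huang and colleagues, possibly from a university in China.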

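👥 Independent Researchers

R. Jay Martin: two related papers (BLGC and Compute ICE-AGE), focused on the theory of deterministic state evolution.

Cosmo Santoni: sole author of the CMV paper, which proposes Contextual Memory Virtualisation.

Jing Zhang: sole author of the Right to History paper, which proposes the sovereignty kernel concept.

Paul Borrill: sole author of the iCloud analysis, which examines cloud synchronization from a distributed-systems perspective.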

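🔗 Cross-Disciplinary Collaboration

ThunderAgent team: a 10-person team spanning systems, AI, and architecture research, including well-known researchers such as Beidi Chen and Tushar Krishna. It reflects the interdisciplinary nature of AI systems research.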

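Key Technical Innovations

🎯 System Abstractions

Branch context: a new OS abstraction that gives agent exploration copy-on-write state isolation
Sovereignty kernel: a security abstraction that provides a tamper-evident record of AI agent execution
Contextual Memory Virtualisation: manages LLM context by borrowing the page-replacement idea from virtual memory
Quantum time-division multiplexing: fine-grained sharing of a quantum processor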

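⚙️ Scheduling Algorithms

End-to-end-aware scheduling: ThunderAgent's global KV cache and workflow awareness
Reinforcement-learning scheduler: TempoNet's combination of a Transformer with DQN
GPU DAG scheduling: a real-time analysis method that handles dependencies between GPU kernels
Fair CXL tiering: Equilibria's multi-tenant memory placement policy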

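🔒 Security Mechanisms

Timing-attack mitigation: hides task execution patterns while preserving real-time behavior
Unikernel ASLR: brings address randomization to a single-address-space system
Tamper-evident execution history: a kernel-based, verifiable record of agent behavior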

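📐 Theoretical Frameworks

BLGC theory: proves that incremental update complexity is decoupled from system size
Category Mistake analysis: reveals the fundamental limits of distributed synchronization from a philosophical angle
Agent resource dynamics model: the first systematic characterization of the OS-level resource behavior of agent tool calls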

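Methodological Innovations

📊 AI-Driven System Optimization

TempoNet brings reinforcement learning to real-time scheduling. It uses an Urgency Tokenizer to discretize temporal slack and sparse attention to handle unordered task sets. This reflects a trend toward deep integration of AI and systems research.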
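To make "discretizing temporal slack" concrete, here is a minimal sketch of a slack bucketing function. It is not TempoNet's tokenizer: the uniform bin width, the number of bins, the reserved token 0 for tasks that are already late, and the name `urgency_token` are all assumptions made for illustration. In a learned scheduler the returned integer would typically index an embedding table.

```python
def urgency_token(deadline: float, now: float, remaining: float,
                  bin_width: float = 1.0, num_bins: int = 8) -> int:
    """Map a task's slack to a small integer token (hypothetical scheme).

    slack = time left until the deadline minus the work still to do.
    Token 0 is reserved for negative slack (the deadline cannot be met).
    Tokens 1..num_bins-1 cover non-negative slack in uniform bins, with
    the last token absorbing everything beyond the final bin edge.
    """
    slack = deadline - now - remaining
    if slack < 0:
        return 0
    return min(int(slack // bin_width) + 1, num_bins - 1)


def tokenize_task_set(tasks, now: float):
    """Tokenize an unordered collection of (deadline, remaining) pairs."""
    return [urgency_token(d, now, r) for d, r in tasks]
```

Clipping large slack into one token keeps the vocabulary small, which is one plausible reason a discretized representation can be easier to learn from than raw time values.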

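🔬 Systematic Empirical Research

AgentCgroup measures 144 SWE-rebench tasks systematically and uncovers regularities in how AI agents use resources inside containers. This data-driven approach is becoming a new paradigm in systems research.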

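🧪 Theory First, Implementation as Validation

Compute ICE-AGE follows a "mathematical specification first, engineering implementation second" path, validating BLGC theory in a production-grade C++ implementation. This rigorous approach is worth emulating.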

Detailed Paper List

Equilibria: Fair Multi-Tenant CXL Memory Tiering At Scale

Kaiyang Zhao, Neha Gholkar, Hasan Maruf, Abhishek Dhanotia, Johannes Weiner, Gregory Price, Ning Sun, Bhavya Dwivedi, Stuart Clark, Dimitrios Skarlatos

A CXL-based approach to datacenter memory expansion that addresses fair scheduling of memory tiering in multi-tenant environments.
Tags: CXL, memory management, multi-tenancy
Memory dominates datacenter system cost and power. Memory expansion via Compute Express Link (CXL) is an effective way to provide additional memory at lower cost and power, but its effective use requires software-level tiering for hyperscaler workloads. Existing tiering solutions, including current Linux support, face fundamental limitations in production deployments. First, they lack multi-tenancy support, failing to handle stacked homogeneous or heterogeneous workloads...
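The abstract above is truncated before it describes Equilibria's mechanism, so the following is only a generic illustration of what "fair" sharing of a limited fast tier between tenants can mean. It sketches classic max-min fair allocation (water-filling), which is an assumption chosen for illustration and not Equilibria's algorithm; the function name and the tenant demands are hypothetical.

```python
def max_min_fair(capacity: float, demands: dict) -> dict:
    """Split `capacity` (e.g. GiB of fast-tier memory) among tenants.

    Classic max-min fairness: tenants asking for less than an equal
    share get their full demand, and the leftover is split equally
    among the remaining tenants.
    """
    alloc = {tenant: 0.0 for tenant in demands}
    remaining = float(capacity)
    # Serve the smallest demands first so leftovers flow to larger ones.
    active = sorted(demands, key=demands.get)
    while active:
        share = remaining / len(active)
        tenant = active[0]
        if demands[tenant] <= share:
            alloc[tenant] = float(demands[tenant])
            remaining -= demands[tenant]
            active.pop(0)
        else:
            # Every remaining tenant wants more than the equal share.
            for t in active:
                alloc[t] = share
            break
    return alloc
```

With 12 units of capacity and demands of 2, 5, and 10, the two smaller tenants are fully satisfied and the largest receives the remaining 5.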
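
Taking the Leap: Efficient and Reliable Fine-Grained NUMA Migration in User-space

Felix Schuhknecht, Nick Rassau

A user-space NUMA memory migration mechanism that optimizes the performance of parallel queries spanning NUMA regions.
Tags: NUMA, memory migration, user space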
Modern multi-socket architectures offer a single virtual address space, but physically divide main-memory across multiple regions, where each region is attached to a CPU and its cores. While this simplifies the usage, developers must be aware of non-uniform memory access (NUMA), where an access by a thread running on a core-local NUMA region is significantly cheaper than an access from a core-remote region...
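
AgentCgroup: Understanding and Controlling OS Resources of AI Agents

Yusheng Zheng, Jiakun Fan, Quanzhi Fu, Yiwei Yang, Wei Zhang, Andi Quinn

The first systematic analysis of the OS-level resource dynamics of AI agents in sandboxed containers, based on an empirical study of 144 tasks.
Tags: AI agents, resource control, cgroup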
AI agents are increasingly deployed in multi-tenant cloud environments, where they execute diverse tool calls within sandboxed containers, each call with distinct resource demands and rapid fluctuations. We present a systematic characterization of OS-level resource dynamics in sandboxed AI coding agents, analyzing 144 software engineering tasks from the SWE-rebench benchmark across two LLM models...
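As a rough illustration of resource control differentiated by tool call, the sketch below writes per-tool limits into Linux cgroup v2 interface files (`memory.max` and `cpu.max`). It is not AgentCgroup's implementation: the tool names, the limit values, and the function `apply_tool_limits` are hypothetical, and the cgroup directory is assumed to exist already and to be writable by the caller.

```python
from pathlib import Path

# Hypothetical per-tool limits. The numbers are illustrative, not measured.
# cpu.max uses the cgroup v2 format "<quota_us> <period_us>".
TOOL_LIMITS = {
    "pytest": {"memory.max": str(2 * 1024**3), "cpu.max": "200000 100000"},
    "grep": {"memory.max": str(256 * 1024**2), "cpu.max": "50000 100000"},
}
DEFAULT_LIMITS = {"memory.max": str(512 * 1024**2), "cpu.max": "100000 100000"}


def apply_tool_limits(cgroup_dir: Path, tool: str) -> dict:
    """Write the limits for `tool` into the cgroup at `cgroup_dir`.

    On a real system `cgroup_dir` would be something like
    /sys/fs/cgroup/<agent-sandbox>, which is an assumed path here.
    Returns the limits that were applied.
    """
    limits = TOOL_LIMITS.get(tool, DEFAULT_LIMITS)
    for filename, value in limits.items():
        (cgroup_dir / filename).write_text(value + "\n")
    return limits
```

An agent runtime could call `apply_tool_limits` just before launching each tool, so that a short `grep` and a heavy test run do not share one static container limit.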
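
Contextual Memory Virtualisation: DAG-Based State Management for LLM Agents

Cosmo Santoni

Borrows ideas from virtual memory and manages the understanding an LLM accumulates as version-controlled state.
Tags: LLM, virtual memory, state management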
As large language models engage in extended reasoning tasks, they accumulate significant state -- architectural mappings, trade-off decisions, codebase conventions -- within the context window. This understanding is lost when sessions reach context limits and undergo lossy compaction. We propose Contextual Memory Virtualisation (CMV), a system that treats accumulated LLM understanding as version-controlled state...
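One way to picture "version-controlled state" organized as a DAG is a store of immutable snapshots, each pointing at its parents. The sketch below is an assumption-laden toy and not the CMV system: the class `StateDag`, its methods, and the idea of representing accumulated understanding as a flat key-value dictionary are all hypothetical.

```python
import hashlib
import json


class StateDag:
    """Toy content-addressed DAG of state snapshots (hypothetical design)."""

    def __init__(self):
        self.nodes = {}

    def commit(self, facts: dict, parents=()) -> str:
        """Store a snapshot holding `facts` on top of `parents`; return its id."""
        for parent in parents:
            if parent not in self.nodes:
                raise KeyError(f"unknown parent {parent}")
        payload = json.dumps({"parents": sorted(parents), "facts": facts},
                             sort_keys=True)
        node_id = hashlib.sha256(payload.encode()).hexdigest()[:12]
        self.nodes[node_id] = {"parents": tuple(parents), "facts": dict(facts)}
        return node_id

    def materialize(self, node_id: str) -> dict:
        """Rebuild the full state at `node_id` by merging its ancestors.

        Ancestors are applied first, so facts recorded later override
        earlier ones. With several parents, later parents win.
        """
        node = self.nodes[node_id]
        state = {}
        for parent in node["parents"]:
            state.update(self.materialize(parent))
        state.update(node["facts"])
        return state
```

A new session could start from `materialize(node_id)` of any earlier snapshot rather than from a lossy summary, which is the intuition behind treating context as versioned state.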
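
HALO: A Fine-Grained Resource Sharing Quantum Operating System

John Zhuoyang Ye, Jiyuan Wang, Yifan Qiao, Jens Palsberg

The first operating system to support fine-grained resource sharing of a quantum processor, addressing resource utilization on quantum cloud platforms.
Tags: quantum computing, operating systems, resource sharing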
As quantum computing enters the cloud era, thousands of users must share access to a small number of quantum processors. Users need to wait minutes to days to start their jobs, which only takes a few seconds for execution. Current quantum cloud platforms employ a fair-share scheduler, as there is no way to multiplex a quantum computer among multiple programs at the same time...
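
Fork, Explore, Commit: OS Primitives for Agentic Exploration

Cong Wang, Yusheng Zheng

Introduces the branch context abstraction for agent exploration, providing copy-on-write state isolation and atomic commit semantics.
Tags: AI agents, process isolation, atomic operations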
AI agents increasingly perform agentic exploration: pursuing multiple solution paths in parallel and committing only the successful one. Because each exploration path may modify files and spawn processes, agents require isolated environments with atomic commit and rollback semantics for both filesystem state and process state. We introduce the branch context, a new OS abstraction...
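The fork, explore, commit pattern described above can be mimicked in user space with an overlay on top of a shared snapshot. This is only a toy model of the semantics, not the paper's OS abstraction: `Store`, `BranchContext`, and the in-memory "filesystem" dictionary are hypothetical, and process state is not modeled at all.

```python
class Store:
    """Holds the committed state as a mapping of path to contents."""

    def __init__(self, files: dict):
        self.files = dict(files)


class BranchContext:
    """Copy-on-write view of a Store (hypothetical user-space analogue)."""

    def __init__(self, store: Store):
        self._store = store
        # Commits rebind store.files instead of mutating it, so keeping a
        # reference to the current dict acts as a cheap snapshot.
        self._snapshot = store.files
        self._writes = {}
        self._deleted = set()

    def read(self, path: str):
        if path in self._deleted:
            raise KeyError(path)
        if path in self._writes:
            return self._writes[path]
        return self._snapshot[path]

    def write(self, path: str, data):
        self._deleted.discard(path)
        self._writes[path] = data

    def delete(self, path: str):
        self._writes.pop(path, None)
        self._deleted.add(path)

    def commit(self):
        """Publish this branch's view with a single reference swap."""
        merged = {k: v for k, v in self._snapshot.items()
                  if k not in self._deleted}
        merged.update(self._writes)
        self._store.files = merged

    def abort(self):
        """Discard every change made in this branch."""
        self._writes.clear()
        self._deleted.clear()
```

An agent would create one branch per candidate solution, call `commit()` on the branch that succeeds, and call `abort()` on the rest. This toy assumes exactly one branch commits; it does not detect conflicts between concurrent commits.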
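
Right to History: A Sovereignty Kernel for Verifiable AI Agent Execution

Jing Zhang

Proposes the sovereignty kernel concept, which provides a tamper-evident, verifiable history of AI agent execution.
Tags: AI agents, security, auditing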
AI agents increasingly act on behalf of humans, yet no existing system provides a tamper-evident, independently verifiable record of what they did. As regulations such as the EU AI Act begin mandating automatic logging for high-risk AI systems, this gap carries concrete consequences -- especially for agents running on personal hardware, where no centralized provider controls the log...
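A common building block for tamper-evident records is a hash chain, in which each entry commits to the hash of the entry before it. The sketch below shows that general technique under stated assumptions; it is not the paper's sovereignty kernel, and the entry layout, `append_entry`, and `verify` are hypothetical names.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(prev: str, action: str) -> str:
    body = json.dumps({"prev": prev, "action": action}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def append_entry(log: list, action: str) -> None:
    """Append `action`, chaining it to the previous entry's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev, "action": action, "hash": _digest(prev, action)})


def verify(log: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["action"]):
            return False
        prev = entry["hash"]
    return True
```

A chain like this only detects tampering if the latest hash is also kept somewhere an attacker cannot rewrite, since an attacker who controls the whole log can recompute every hash.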
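
Bounded Local Generator Classes for Deterministic State Evolution

R. Jay Martin

Formally defines Bounded Local Generator Classes and proves that incremental update cost is independent of system dimension.
Tags: formal methods, state evolution, deterministic systems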
We formalize a constructive subclass of locality-preserving deterministic operators acting on graph-indexed state systems. We define the class of Bounded Local Generator Classes (BLGC), consisting of finite-range generators operating on bounded state spaces under deterministic composition. Within this class, incremental update cost is independent of total system dimension...
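The claim that update cost is independent of total system dimension can be illustrated with a graph whose updates touch only a bounded neighborhood. The sketch below is an informal illustration, not the paper's formal construction: the ring topology, the update rule, the 256-value state space, and the names `ring` and `local_update` are assumptions.

```python
def ring(n: int) -> dict:
    """Adjacency of an n-node ring, where every node has exactly 2 neighbors."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def local_update(state: list, neighbors: dict, node: int, radius: int = 1) -> int:
    """Apply a deterministic update to nodes within `radius` hops of `node`.

    Returns how many nodes were touched. With bounded degree and a fixed
    radius this count does not depend on len(state).
    """
    touched = {node}
    frontier = {node}
    for _ in range(radius):
        frontier = {n for f in frontier for n in neighbors[f]} - touched
        touched |= frontier
    for n in touched:
        state[n] = (state[n] + 1) % 256  # bounded state space
    return len(touched)
```

On a ring, a radius-1 update touches 3 nodes whether the graph has ten nodes or ten thousand, which is the intuition behind a cost bound that depends on locality rather than on global size.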

ThunderAgent: A Simple, Fast and Program-Aware Agentic Inference System

Hao Kang, Ziyang Li, Xinyu Yang, Weili Xu, Yinfang Chen, Junxiong Wang, Beidi Chen, Tushar Krishna, Chenfeng Xu, Simran Arora

An end-to-end, program-aware agentic inference system that optimizes KV cache management and tool orchestration.
Tags: AI agents, inference systems, KV cache
Large language models(LLMs) are now used to power complex multi-turn agentic workflows. Existing systems run agentic inference by loosely assembling isolated components: an LLM inference engine (e.g., vLLM) and a tool orchestrator (e.g., Kubernetes). Although agentic workflows involve multiple LLM and tool requests, these systems schedule and allocate resources separately on a per-request basis...

Exploiting Dependency and Parallelism: Real-Time Scheduling and Analysis for GPU Tasks

Yuanhai Zhang, Songyang He, Ruizhe Gou, Mingyue Cui, Boyang Li, Shuai Zhao, Kai Huang

A DAG scheduling and analysis method for GPU tasks in safety-critical applications.
Tags: GPU scheduling, real-time systems, DAG
With the rapid advancement of Artificial Intelligence, the Graphics Processing Unit (GPU) has become increasingly essential across a growing number of safety-critical application domains. Applying a GPU is indispensable for parallel computing; however, the complex data dependencies and resource contention across kernels within a GPU task may unpredictably delay its execution time...
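When a GPU task is modeled as a DAG of kernels, two standard quantities in real-time analysis are the critical path length and the total work. The sketch below computes both, plus the classic Graham-style response-time bound for `m` identical execution units. It is a textbook illustration and not the analysis proposed in the paper, and it ignores the resource contention between kernels that the abstract highlights. The kernel names and timings are hypothetical.

```python
from graphlib import TopologicalSorter


def dag_bounds(wcet: dict, preds: dict, m: int):
    """Return (critical_path, total_work, graham_bound) for a kernel DAG.

    wcet:  worst-case execution time of each kernel
    preds: for each kernel, the kernels that must finish before it starts
    m:     number of identical parallel execution units (an assumption)
    """
    graph = {k: set(preds.get(k, ())) for k in wcet}
    finish = {}
    # static_order() yields every kernel after all of its predecessors.
    for k in TopologicalSorter(graph).static_order():
        finish[k] = wcet[k] + max((finish[p] for p in graph[k]), default=0)
    length = max(finish.values())
    work = sum(wcet.values())
    # Graham-style bound for work-conserving scheduling on m units.
    return length, work, length + (work - length) / m
```

For a diamond-shaped task where kernel `a` (2 time units) feeds `b` (3) and `c` (1), which both feed `d` (4), the critical path is 9, the total work is 10, and the bound on two units is 9.5.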
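
Horizon-LM: A RAM-Centric Architecture for LLM Training

Zhengqing Yuan, Lichao Sun, Yanfang Ye

A RAM-centric LLM training architecture that decouples model scale from GPU memory capacity.
Tags: LLM training, memory architecture, distributed systems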
The rapid growth of large language models (LLMs) has outpaced the evolution of single-GPU hardware, making model scale increasingly constrained by memory capacity rather than computation. While modern training systems extend GPU memory through distributed parallelism and offloading across CPU and storage tiers, they fundamentally retain a GPU-centric execution paradigm...
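
The Compute ICE-AGE: Invariant Compute Envelope under Addressable Graph Evolution

Raymond Jay Martin

A production-grade implementation of BLGC theory that validates the feasibility of a deterministic semantic state substrate.
Tags: deterministic computing, graph engine, AI architecture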
This paper presents empirical results from a production-grade C++ implementation of a deterministic semantic state substrate derived from prior formal work on Bounded Local Generator Classes (Martin, 2026). The system was mathematically specified prior to implementation and realized as a CPU-resident graph engine operating under bounded local state evolution...

Hardening the OSv Unikernel with Efficient Address Randomization

Alex Wollman, John Hastings

Adds ASLR-style hardening to the OSv unikernel, making memory layout less predictable.
Tags: unikernel, ASLR, security
Unikernels are single-purpose library operating systems that run the kernel and application in one address space, but often omit security mitigations such as address space layout randomization (ASLR). In OSv, boot, program loading, and thread creation select largely deterministic addresses, leading to near-identical layouts across instances and more repeatable exploitation...
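The core idea of address randomization is to pick a load address unpredictably instead of deterministically. The sketch below shows that idea in isolation: it chooses a page-aligned base for an image inside a region using a cryptographic random source. It is not OSv code, and the function name, the 4 KiB page size, and the region parameters are assumptions for illustration.

```python
import secrets

PAGE = 4096  # assumed page size in bytes


def random_base(region_start: int, region_size: int, image_size: int) -> int:
    """Pick a random page-aligned base so the image fits inside the region.

    region_start is assumed to be page-aligned already.
    """
    slots = (region_size - image_size) // PAGE + 1
    if slots <= 0:
        raise ValueError("image does not fit in region")
    # secrets.randbelow draws from the OS CSPRNG, so the offset is not
    # predictable from earlier boots.
    return region_start + secrets.randbelow(slots) * PAGE
```

The number of available slots determines how many bits of entropy an attacker has to guess, so a larger region or a smaller image makes the layout harder to predict.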

TempoNet: Slack-Quantized Transformer-Guided Reinforcement Scheduler

Rong Fu, Yibo Meng, Guangzhen Yao, Jiaxuan Lu, Zeyu Zhang, Zhaolu Kang, Ziming Guo, Jia Yee Tan, Xiaojing Du, Simon James Fong

A reinforcement-learning real-time scheduler that uses a Transformer and an Urgency Tokenizer.
Tags: reinforcement learning, real-time scheduling, Transformer
Real-time schedulers must reason about tight deadlines under strict compute budgets. We present TempoNet, a reinforcement learning scheduler that pairs a permutation-invariant Transformer with a deep Q-approximation. An Urgency Tokenizer discretizes temporal slack into learnable embeddings, stabilizing value learning and capturing deadline proximity...
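
Mitigating Timing-Based Attacks in Real-Time Cyber-Physical Systems

Arkaprava Sain, Sunandan Adhikary, Soumyajit Dey

Defends against timing side-channel attacks while preserving real-time guarantees and accounting for closed-loop control behavior.
Tags: real-time systems, security, side-channel attacks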
Real-time cyber-physical systems depend on deterministic task execution to guarantee safety and correctness. Unfortunately, this determinism can unintentionally expose timing information that enables adversaries to infer task execution patterns and carry out timing-based attacks targeting safety-critical control tasks...
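
Why iCloud Fails: The Category Mistake of Cloud Synchronization

Paul Borrill

Analyzes the "Category Mistake" of cloud synchronization from a philosophical angle and exposes the fundamental limits of FITO assumptions.
Tags: distributed systems, cloud synchronization, theory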
iCloud Drive presents a filesystem interface but implements cloud synchronization semantics that diverge from POSIX in fundamental ways. This divergence is not an implementation bug; it is a Category Mistake -- the same one that pervades distributed computing wherever Forward-In-Time-Only (FITO) assumptions are embedded into protocol design...

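Technical Keyword Distribution

AI Agent × 5, real-time scheduling × 3, memory management × 2, CXL, NUMA, KV cache, GPU scheduling, quantum computing, Unikernel, ASLR, reinforcement learning, Transformer, DAG, state evolution, deterministic systems, distributed synchronization, LLM training, side-channel attacks, multi-tenancy, Copy-on-Write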
