arXiv cs.OS Paper Analysis Report: 2026-03-01 to 2026-03-31
📊 Data Statistics Overview
📈 Basic Statistics
- Total papers: 24
- Category analyzed: cs.OS
- Time range: 2026-03-01 to 2026-03-31
- Unique authors: 81
👥 Top 10 Most Prolific Authors
- Jianshu She (1 paper)
- Yeasir Rayhan (1 paper)
- Walid G. Aref (1 paper)
- Zhenyuan Yang (1 paper)
- Wenxin Zheng (1 paper)
- Mingyu Li (1 paper)
- Hao Ke (1 paper)
- José Luis Conradi Hoffmann (1 paper)
- Antônio Augusto Fröhlich (1 paper)
- Tony Mason (1 paper)
🔍 Top 10 Keywords
- memory (18 occurrences)
- data (13 occurrences)
- scheduling (9 occurrences)
- large (8 occurrences)
- kernel (8 occurrences)
- operating (7 occurrences)
- resource (5 occurrences)
- cache (5 occurrences)
- hardware (5 occurrences)
- execution (5 occurrences)
🤖 AI Deep-Dive Analysis
arXiv cs.OS Research Analysis
A Report on Papers Published from 2026-03-01 to 2026-03-31
1. Introduction
This report provides a detailed analysis of 24 research papers published in the Computer Science - Operating Systems (cs.OS) category on arXiv between March 1, 2026, and March 31, 2026. The goal is to identify emerging research hotspots, understand the author collaboration landscape, and summarize key technological innovations. The analysis reveals a strong convergence between operating systems research and AI/LLM systems, with significant attention to memory management, GPU scheduling, and security in emerging computing paradigms.
2. Research Hotspots and Trends
The papers analyzed cluster around several key areas. We identified seven major research hotspots, which highlight the current priorities and future directions of operating systems research.
LLM Agent Systems & OS Abstractions
3 Papers
Core Focus: Bridging the gap between LLM agent frameworks and traditional operating system concepts. Researchers are applying decades of OS wisdom—resource management, process isolation, scheduling—to the new domain of AI agents.
Key Innovations: "AgentRM" proposes an OS-inspired resource manager addressing scheduling failures and memory leaks in LLM agent systems. "Quine" realizes LLM agents as native POSIX processes, mapping identity to PID, state to memory, and communication to standard streams. "Pichay" introduces demand paging for LLM context windows, treating context as L1 cache with proper memory hierarchy.
Future Trend: LLM agents will increasingly be treated as "processes" with proper OS-level resource management. Expect more work on agent-level scheduling, memory management, and inter-agent communication protocols that leverage existing OS primitives rather than reinventing them at the application layer.
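The demand-paging idea attributed to "Pichay" above can be illustrated with a minimal sketch. All names here (ContextPager, access) are hypothetical, not from the paper: the active context window acts as a fixed-capacity "L1" cache of chunks, evicting least-recently-used chunks to a larger "L2" store and faulting them back in on access.

```python
from collections import OrderedDict

class ContextPager:
    """Toy demand pager: 'L1' is the model's context window, 'L2' holds evicted chunks."""

    def __init__(self, l1_capacity):
        self.l1_capacity = l1_capacity
        self.l1 = OrderedDict()  # chunk_id -> text, maintained in LRU order
        self.l2 = {}             # backing store for evicted chunks
        self.faults = 0          # count of reloads from L2

    def access(self, chunk_id, text=None):
        if chunk_id in self.l1:
            self.l1.move_to_end(chunk_id)        # refresh LRU position
        else:
            if chunk_id in self.l2:              # page fault: bring chunk back from L2
                text = self.l2.pop(chunk_id)
                self.faults += 1
            self.l1[chunk_id] = text
            if len(self.l1) > self.l1_capacity:  # evict the coldest chunk to L2
                victim, victim_text = self.l1.popitem(last=False)
                self.l2[victim] = victim_text
        return self.l1[chunk_id]

pager = ContextPager(l1_capacity=2)
pager.access("sys", "system prompt")
pager.access("t1", "turn 1")
pager.access("t2", "turn 2")   # evicts "sys" to L2
pager.access("sys")            # page fault: "sys" is reloaded, "t1" is evicted
print(pager.faults)            # 1
```

The point of the sketch is only the structural analogy: context chunks behave like pages, and the window size is a cache capacity rather than a hard limit.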
GPU Sharing & Scheduling
3 Papers
Core Focus: Maximizing GPU utilization while maintaining performance predictability and semantic correctness. As AI workloads dominate data centers, GPU resource management has become a critical systems problem.
Key Innovations: "DetShare" guarantees semantic and performance determinism in flexible GPU sharing without invasive kernel modifications. "LMetric" proposes a surprisingly simple multiplication-based metric for LLM request scheduling that achieves both KV cache availability and workload balance. "NCCLbpf" embeds eBPF runtime into NCCL for verified, composable policy execution in GPU collective communication.
Future Trend: GPU scheduling will evolve from coarse-grained time slicing to fine-grained, semantically-aware resource allocation. The success of eBPF in kernel extensibility is being replicated in GPU communication frameworks, enabling safer, more flexible policy customization.
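The description of "LMetric" above suggests a single product combining KV-cache availability with load balance. The formula and names below are an illustrative guess, not the paper's actual metric:

```python
def lmetric_score(free_kv_blocks, total_kv_blocks, running_requests):
    """Hypothetical multiplicative score: higher means a better placement target.
    One factor measures KV-cache headroom, the other penalizes loaded workers."""
    kv_availability = free_kv_blocks / total_kv_blocks  # fraction of KV cache free
    balance = 1.0 / (1 + running_requests)              # inverse of current load
    return kv_availability * balance

# Route an incoming LLM request to the worker with the highest score.
workers = {
    "gpu0": lmetric_score(free_kv_blocks=800, total_kv_blocks=1000, running_requests=4),
    "gpu1": lmetric_score(free_kv_blocks=300, total_kv_blocks=1000, running_requests=1),
    "gpu2": lmetric_score(free_kv_blocks=950, total_kv_blocks=1000, running_requests=9),
}
best = max(workers, key=workers.get)
print(best)  # gpu0: ample cache and moderate load beats either extreme
```

A multiplicative form has the appealing property that a worker scores well only when both factors are healthy; a near-zero value on either axis vetoes the candidate.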
Memory Hierarchy & Management
4 Papers
Core Focus: Addressing the memory bottleneck across diverse contexts—from tiered memory architectures to mobile device app launches. Memory remains the primary performance and energy bottleneck in modern systems.
Key Innovations: "Virtual-Memory Assisted Buffer Management" exploits virtual memory for tiered memory database systems. "Mitigating the Memory Bottleneck" uses ML-driven microarchitectural techniques for prefetching and cache management. "AppFlow" provides memory scheduling for cold launch of GB-scale apps on mobile and vehicle systems. The "Missing Memory Hierarchy" paper (Pichay) fundamentally reconceptualizes LLM context windows as a cache hierarchy.
Future Trend: Memory management is becoming increasingly application-aware and ML-guided. The traditional boundary between OS memory management and application-level caching is blurring, with systems that understand workload semantics to make better allocation decisions.
System Security & Trusted Execution
3 Papers
Core Focus: Building secure systems without relying on a trusted software TCB. Hardware-assisted security and capability architectures are enabling new paradigms for isolation and attestation.
Key Innovations: "Trust Nothing" proposes an RTOS security architecture without run-time software TCB using token capabilities. "FlexServe" uses ARM TrustZone with flexible resource isolation for secure LLM serving on mobile devices. "Mica" decouples confidentiality from trust in confidential computing, enabling attestable workflows from distrustful TEE components.
Future Trend: The zero-trust principle is extending deeper into system architecture. Capability-based security and hardware-enforced isolation will become standard building blocks, with systems designed to minimize the trusted computing base to hardware alone.
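The token-capability idea behind "Trust Nothing" can be sketched in miniature, under assumptions of my own: an unforgeable token encodes exactly one right over one resource, and validity is checked against a hardware-held secret rather than any run-time software authority. The key, resource names, and helpers below are all hypothetical.

```python
import hashlib
import hmac

HW_SECRET = b"device-root-key"  # stands in for a key sealed in hardware

def mint_capability(resource: str, operation: str) -> bytes:
    """Issue an unforgeable token granting one operation on one resource."""
    msg = f"{resource}:{operation}".encode()
    return hmac.new(HW_SECRET, msg, hashlib.sha256).digest()

def check_capability(token: bytes, resource: str, operation: str) -> bool:
    """Validate a token by recomputing its MAC; no software identity check needed."""
    expected = mint_capability(resource, operation)
    return hmac.compare_digest(token, expected)

cap = mint_capability("uart0", "write")
print(check_capability(cap, "uart0", "write"))  # True: token grants this exact right
print(check_capability(cap, "uart0", "read"))   # False: a different operation
```

The security argument rests entirely on the secrecy of the hardware key: possession of a valid token is the authorization, so no trusted software component must be consulted at run time.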
Embedded & Real-Time Operating Systems
3 Papers
Core Focus: From research to production deployment, embedded OSes are maturing and facing real-world reliability and security challenges.
Key Innovations: "Tock" documents the remarkable journey from a research OS to securing 10 million computers as root of trust hardware. "Experimental Analysis of FreeRTOS Dependability" introduces KRONOS, a fault injection framework for assessing RTOS resilience against radiation-induced faults. "Ensuring Data Freshness" proposes task-based scheduling for multi-rate sensor fusion in safety-critical autonomous systems.
Future Trend: Embedded OS research is transitioning from novel architectures to rigorous validation, security hardening, and deployment at scale. The success of Tock demonstrates that research OSes can achieve significant real-world impact when designed with security as a first principle.
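The kind of fault-injection campaign KRONOS performs can be illustrated with a toy single-event-upset model (this sketch is my own simplification, not the framework's interface): flip one random bit in a memory image and observe whether the system's state diverges from a golden copy.

```python
import random

def inject_bit_flip(memory: bytearray, rng: random.Random) -> int:
    """Flip one random bit to emulate a radiation-induced single-event upset.
    Returns the affected byte offset."""
    offset = rng.randrange(len(memory))
    bit = rng.randrange(8)
    memory[offset] ^= 1 << bit
    return offset

# Tiny campaign: inject one fault per run and compare against the golden image.
rng = random.Random(42)
golden = bytearray(64)  # pristine reference memory image
corrupted = 0
for _ in range(100):
    mem = bytearray(golden)
    inject_bit_flip(mem, rng)
    if mem != golden:
        corrupted += 1
print(corrupted)  # 100: every single-bit flip diverges from the golden image
```

A real campaign would run the RTOS workload after each injection and classify outcomes (silent corruption, crash, hang, masked fault) rather than diffing raw memory.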
Storage & Caching Systems
4 Papers
Core Focus: Understanding and optimizing storage and caching behavior under diverse workloads, from vector search to programmable caching engines.
Key Innovations: "GateANN" achieves I/O-efficient filtered vector search on SSDs by decoupling graph traversal from vector retrieval. "Idiosyncrasies of Programmable Caching Engines" empirically studies CacheLib behavior under dynamic multi-tenant workloads. "2DIO" creates cache-accurate microbenchmarks with tunable complex cache behaviors. "Wayfinder" automates OS specialization through Bayesian optimization over configuration spaces.
Future Trend: Storage systems are becoming increasingly workload-aware and self-tuning. The trend toward programmable infrastructure (caching engines, vector indexes) demands better understanding of their behavior under realistic conditions and tools for automated optimization.
Network & Distributed Systems
2 Papers
Core Focus: Addressing fundamental challenges in distributed name resolution and transport protocol semantics.
Key Innovations: "Structured Gossip DNS" exploits DHT finger tables for partition-resilient name resolution in MANETs and edge computing, reducing message complexity from O(n) to O(n/log n). "CATS" introduces a conductor-driven asymmetric transport scheme that provides TCP with semantic awareness to prioritize critical content.
Future Trend: Network protocols are evolving from blind data transport to semantic-aware communication. For edge and mobile environments, partition-resilient algorithms that work without active coordination will be essential for reliability.
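The DHT finger tables that "Structured Gossip DNS" builds on follow the standard Chord construction; how the paper exploits them for passive stabilization is not reproduced here. A minimal sketch of the table itself:

```python
def finger_table(node_id: int, m: int) -> list[int]:
    """Chord-style finger table: finger[i] targets (n + 2^i) mod 2^m,
    giving each node m = O(log N) pointers spread across the identifier ring."""
    return [(node_id + (1 << i)) % (1 << m) for i in range(m)]

print(finger_table(node_id=8, m=4))  # [9, 10, 12, 0] on a 16-slot ring
```

Because finger targets double in distance, each node's handful of pointers covers the whole ring, which is what makes gossip over this structure cheaper than flooding all n nodes.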
3. Key Technical Innovations
Across the diverse research topics, several specific, named technologies and frameworks stand out as significant contributions.
- AgentRM
- An OS-inspired resource manager for LLM agent systems that addresses scheduling failures (blocking, zombie processes) and memory management challenges identified through empirical analysis of 40,000 GitHub issues from six major agent frameworks.
- CATS (Conductor-driven Asymmetric Transport Scheme)
- A framework that provides TCP with semantic awareness to prioritize critical content, challenging the traditional blind FIFO conveyor belt model of transport protocols.
- DetShare
- A GPU sharing system that guarantees both semantic determinism (behavioral equivalence) and performance determinism without requiring invasive kernel modifications.
- FlexServe
- A secure LLM serving system for mobile devices that uses ARM TrustZone with flexible resource isolation, addressing the significant overhead of traditional TrustZone protection for LLM inference.
- GateANN
- An I/O-efficient SSD-based graph ANNS system supporting filtered vector search on unmodified graph indexes by decoupling graph traversal from vector retrieval.
- KRONOS
- A software-based, non-intrusive fault injection framework for assessing FreeRTOS dependability against radiation-induced transient and permanent faults.
- LMetric
- A surprisingly simple multiplication-based metric for LLM request scheduling that achieves both KV cache availability and workload balance, demonstrating that "simple is better" for this complex problem.
- Mica
- A confidential computing architecture that decouples confidentiality from trust, enabling attestable and trusted workflows composed from independently developed, potentially distrustful TEE components.
- NCCLbpf
- A verified, high-performance extension framework embedding userspace eBPF runtime into NCCL, bringing kernel-level extensibility safety guarantees to GPU collective communication.
- Pichay
- A demand paging system for LLM context windows that treats context as L1 cache, introducing proper memory hierarchy (L2, virtual memory, paging) to reduce structural waste identified across 857 production sessions.
- Quine
- A runtime architecture that realizes LLM agents as native POSIX processes, with explicit mapping: identity is PID, interface is standard streams, state is memory, and environment is inherited.
- Structured Gossip DNS
- A partition-resilient DNS for MANETs and edge computing that exploits DHT finger tables for passive stabilization, reducing message complexity from O(n) to O(n/log n).
- Tock
- A research-originated OS that has successfully deployed to secure 10 million computers as root of trust hardware, demonstrating the path from academic research to production-scale security systems.
- Wayfinder
- An automated OS specialization system using Bayesian optimization over configuration spaces, addressing the challenge of finding optimal configurations among millions of options.
4. Conclusion
The research landscape of operating systems in March 2026 reveals a profound convergence with AI systems. The most striking trend is the application of decades of OS wisdom—resource management, process abstraction, memory hierarchy, scheduling—to LLM agent systems and GPU infrastructure. Papers like AgentRM, Quine, and Pichay demonstrate that the AI community's reinvention of OS primitives at the application layer can benefit from explicit mapping to native OS concepts.
Memory management remains the central challenge, appearing across multiple contexts: tiered memory architectures, mobile app launches, and fundamentally reconceptualized as a hierarchy for LLM context windows. The "Missing Memory Hierarchy" paper is particularly noteworthy for its paradigm-shifting view of LLM context as L1 cache.
Security research continues to push toward zero-trust architectures, with systems designed to minimize the trusted computing base to hardware alone. The journey of Tock from research OS to securing 10 million computers exemplifies the real-world impact possible when security is designed as a first principle.
The innovations presented in this collection demonstrate an adaptive field actively solving the critical systems-level challenges posed by AI workloads, edge computing, and security in an increasingly hostile threat landscape.
5. Full Paper Index
| Title | Authors | Link |
|---|---|---|
| AgentRM: An OS-Inspired Resource Manager for LLM Agent Systems | Jianshu She | 2603.13110v1 |
| Virtual-Memory Assisted Buffer Management In Tiered Memory | Yeasir Rayhan, Walid G. Aref | 2603.03271v1 |
| Guaranteeing Semantic and Performance Determinism in Flexible GPU Sharing | Zhenyuan Yang, Wenxin Zheng, Mingyu Li | 2603.15042v2 |
| Quine: Realizing LLM Agents as Native POSIX Processes | Hao Ke | 2603.18030v1 |
| Ensuring Data Freshness in Multi-Rate Task Chains Scheduling | José Luis Conradi Hoffmann, Antônio Augusto Fröhlich | 2603.09738v1 |
| The Missing Memory Hierarchy: Demand Paging for LLM Context Windows | Tony Mason | 2603.09023v1 |
| Trust Nothing: RTOS Security without Run-Time Software TCB | Eric Ackermann, Sven Bugiel | 2603.08400v1 |
| Improved Leakage Abuse Attacks in Searchable Symmetric Encryption with eBPF Monitoring | Chinecherem Dimobi | 2603.07030v1 |
| GateANN: I/O-Efficient Filtered Vector Search on SSDs | Nakyung Lee, Soobin Cho, Jiwoong Park, Gyuyeong Kim | 2603.21466v2 |
| Sharing is caring: Attestable and Trusted Workflows out of Distrustful Components | Amir Al Sadi, Sina Abdollahi, Adrien Ghosn, Hamed Haddadi, Marios Kogias | 2603.03403v2 |
| Idiosyncrasies of Programmable Caching Engines | José Peixoto, Alexis Gonzalez, Janki Bhimani, Raju Rangaswami, Cláudia Brito, João Paulo, Ricardo Macedo | 2603.14357v1 |
| 2DIO: A Cache-Accurate Storage Microbenchmark | Yirong Wang, Isaac Khor, Peter Desnoyers | 2603.19971v1 |
| Tock: From Research to Securing 10 Million Computers | Leon Schuermann, Brad Campbell, Branden Ghena, Philip Levis, Amit Levy, Pat Pannuto | 2603.22585v1 |
| Wayfinder: Automated Operating System Specialization | Alexander Jung, Cezar Crăciunoiu, Nikolaos Karaolidis, Hugo Lefeuvre, Daniel Oñoro Rubio, Felipe Huici, Charalampos Rotsos, Pierre Olivier | 2603.23425v1 |
| LMetric: Simple is Better - Multiplication May Be All You Need for LLM Request Scheduling | Dingyan Zhang, Jinbo Han, Kaixi Zhang, Xingda Wei, Sijie Shen, Chenguang Fang, Wenyuan Yu, Jingren Zhou, Rong Chen | 2603.15202v2 |
| Structured Gossip: A Partition-Resilient DNS for Internet-Scale Dynamic Networks | Priyanka Sinha, Dilys Thomas | 2603.07750v1 |
| A Case for CATS: A Conductor-driven Asymmetric Transport Scheme for Semantic Prioritization | Syed Muhammad Aqdas Rizvi | 2603.13945v1 |
| FlexServe: A Fast and Secure LLM Serving System for Mobile Devices with Flexible Resource Isolation | Yinpeng Wu, Yitong Chen, Lixiang Wang, Jinyu Gu, Zhichao Hua, Yubin Xia | 2603.09046v1 |
| Experimental Analysis of FreeRTOS Dependability through Targeted Fault Injection Campaigns | Luca Mannella, Stefano Di Carlo, Alessandro Savino | 2603.25666v1 |
| Brain-inspired AI for Edge Intelligence: a systematic review | Yingchao Cheng, Meijia Wang, Zhifeng Hao, Rajkumar Buyya | 2603.26722v1 |
| Mitigating the Memory Bottleneck with Machine Learning-Driven and Data-Aware Microarchitectural Techniques | Rahul Bera | 2603.07683v1 |
| AppFlow: Memory Scheduling for Cold Launch of Large Apps on Mobile and Vehicle Systems | Xiaochen Li, Sicong Liu, Bin Guo, Yu Ouyang, Fengmin Wu, Yuan Xu, Zhiwen Yu | 2603.17259v1 |
| Machine Learning (ML) library in Linux kernel | Viacheslav Dubeyko | 2603.02145v1 |
| NCCLbpf: Verified, Composable Policy Execution for GPU Collective Communication | Yusheng Zheng | 2603.11438v1 |