arXiv cs.AI Weekly Digest (20260322)

Latest research progress in artificial intelligence

📚 728 papers in total 📅 Date range: March 16, 2026 to March 22, 2026 🏷️ Category: cs.AI

📊 Research Topic Popularity Analysis

🤖 LLM Agents: 193 papers

🧠 Reasoning: 191 papers

🛡️ Safety & Alignment: 147 papers

⚡ Efficiency: 146 papers

🎮 Reinforcement Learning: 144 papers

👁️ Vision-Language: 138 papers

📚 RAG & Memory: 118 papers

📁 Other: 108 papers

🦾 Robotics & Embodied: 61 papers

🏥 Healthcare: 52 papers

💻 Code: 43 papers

📝 NLP: 41 papers

Each of these research directions continued to develop this week and produced multiple high-quality papers. A paper can carry several topic tags, so the per-topic counts sum to more than the 728-paper total.

🔥 Highlighted Papers This Week

TrinityGuard: A Unified Framework for Safeguarding Multi-Agent Systems

Kai Wang, Biaojie Zeng, Zeming Wei, Chang Jin, Hefeng Zhou et al.

With the rapid development of LLM-based multi-agent systems (MAS), their significant safety and security concerns have emerged, which introduce novel risks going beyond single agents or LLMs. Despite attempts to address these issues, the existing literature lacks a cohesive safeguarding system specialized for MAS risks. In this work, we introduce TrinityGuard, a comprehensive safety evaluation and monitoring framework for LLM-based MAS, grounded in the OWASP standards. Specifically, TrinityGuard...

LLM Agents, Safety & Alignment

Harm or Humor: A Multimodal, Multilingual Benchmark for Overt and Covert Harmful Humor

Ahmed Sharshar, Hosam Elgendy, Saad El Dine Ahmed, Yasser Rohaim, Yuxia Wang

Dark humor often relies on subtle cultural nuances and implicit cues that require contextual reasoning to interpret, posing safety challenges that current static benchmarks fail to capture. To address this, we introduce a novel multimodal, multilingual benchmark for detecting and understanding harmful and offensive humor. Our manually curated dataset comprises 3,000 texts and 6,000 images in English and Arabic, alongside 1,200 videos that span English, Arabic, and language-independent (universal...

Reasoning, Safety & Alignment, Vision-Language

UniSAFE: A Comprehensive Benchmark for Safety Evaluation of Unified Multimodal Models

Segyu Lee, Boryeong Cho, Hojung Jung, Seokhyun An, Juhyeong Kim et al.

Unified Multimodal Models (UMMs) offer powerful cross-modality capabilities but introduce new safety risks not observed in single-task models. Despite their emergence, existing safety benchmarks remain fragmented across tasks and modalities, limiting the comprehensive evaluation of complex system-level vulnerabilities. To address this gap, we introduce UniSAFE, the first comprehensive benchmark for system-level safety evaluation of UMMs across 7 I/O modality combinations, spanning conventional t...

Safety & Alignment, RAG & Memory, Vision-Language

Directional Embedding Smoothing for Robust Vision Language Models

Ye Wang, Jing Liu, Toshiaki Koike-Akino

The safety and reliability of vision-language models (VLMs) are a crucial part of deploying trustworthy agentic AI systems. However, VLMs remain vulnerable to jailbreaking attacks that undermine their safety alignment to yield harmful outputs. In this work, we extend the Randomized Embedding Smoothing and Token Aggregation (RESTA) defense to VLMs and evaluate its performance against the JailBreakV-28K benchmark of multi-modal jailbreaking attacks. We find that RESTA is effective in reducing atta...

LLM Agents, Safety & Alignment, Vision-Language

MultihopSpatial: Multi-hop Compositional Spatial Reasoning Benchmark for Vision-Language Model

Youngwan Lee, Soojin Jang, Yoorhim Cho, Seunghwan Lee, Yong-Ju Lee et al.

Spatial reasoning is foundational for Vision-Language Models (VLMs), particularly when deployed as Vision-Language-Action (VLA) agents in physical environments. However, existing benchmarks predominantly focus on elementary, single-hop relations, neglecting the multi-hop compositional reasoning and precise visual grounding essential for real-world scenarios. To address this, we introduce MultihopSpatial, offering three key contributions: (1) A comprehensive benchmark designed for multi-hop and c...

LLM Agents, Reasoning, Vision-Language

SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering?

Tingxu Han, Yi Zhang, Wei Song, Chunrong Fang, Zhenyu Chen et al.

Agent skills, structured procedural knowledge packages injected at inference time, are increasingly used to augment LLM agents on software engineering tasks. However, their real utility in end-to-end development settings remains unclear. We present SWE-Skills-Bench, the first requirement-driven benchmark that isolates the marginal utility of agent skills in real-world software engineering (SWE). It pairs 49 public SWE skills with authentic GitHub repositories pinned at fixed commits and requirem...

LLM Agents, Reasoning, Efficiency

👥 Author Collaboration Network

Node size indicates the number of papers; edges indicate co-authorship.

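💡 Technical Innovation Trends

🤖 Evolution of Agent Architectures

Multi-agent collaboration, toolchain integration, and long-term memory mechanisms have become research hotspots, pushing AI agents toward more complex task scenarios.

🧠 Advances in Reasoning

Chain-of-thought optimization, stronger multi-step reasoning, and logical deduction continue to improve, and large models have become markedly better at solving complex problems.

🛡️ Safety and Alignment Research

Research on model safety, adversarial defense, and value alignment continues to deepen, and keeping AI systems reliable and controllable has become a key issue.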
