Bluo Blog

arXiv cs.AI Weekly Report

Total: 609 papers
Date: 2026-03-30 to 2026-04-05
Category: cs.AI

Research Hot Topics

📁 Other (163 papers)

Other research directions

The Ultimate Tutorial for AI-driven Scale Development in Generative Ps
Lara Russell-Lasalandra, Hudson Golino, Luis Eduardo Garrido et al.
Crossing the NL/PL Divide: Information Flow Analysis Across the NL/PL
Zihao Xu, Xiao Cheng, Ruijie Meng et al.
A First Step Towards Even More Sparse Encodings of Probability Distrib
Florian Andreas Marwitz, Tanya Braun, Ralf Möller
KEditVis: A Visual Analytics System for Knowledge Editing of Large Lan
Zhenning Chen, Hanbei Zhan, Yanwei Huang et al.
Baby Scale: Investigating Models Trained on Individual Children's Lang
Steven Y. Feng, Alvin W. M. Tan, Michael C. Frank

🤖 LLM Agents (155 papers)

Agent systems, multi-agent collaboration, autonomous decision making

CirrusBench: Evaluating LLM-based Agents Beyond Correctness in Real-Wo
Yi Yu, Guangquan Hu, Chenghuang Shen et al.
FeDMRA: Federated Incremental Learning with Dynamic Memory Replay Allo
Tiantian Wang, Xiang Xiang, Simon S. Du
COvolve: Adversarial Co-Evolution of Large-Language-Model-Generated Po
Alkis Sygkounas, Rishi Hazra, Andreas Persson et al.
Deep Research of Deep Research: From Transformer to Agent, From AI to
Yipeng Yu
CoE: Collaborative Entropy for Uncertainty Quantification in Agentic M
Kangkang Sun, Jun Wu, Jianhua Li et al.

📚 RAG Memory (84 papers)

Retrieval augmentation, dynamic indexing

Mapping data literacy trajectories in K-12 education
Robert Whyte, Manni Cheung, Katharine Childs et al.
Real-Time Band-Grouped Vocal Denoising Using Sigmoid-Driven Ideal Rati
Daniel Williams
MacTok: Robust Continuous Tokenization for Image Generation
Hengyu Zeng, Xin Gao, Guanghao Li et al.
Quantifying Cross-Modal Interactions in Multimodal Glioma Survival Pre
Iain Swift, JingHua Ye, Ruairi O'Reilly
To Memorize or to Retrieve: Scaling Laws for RAG-Considerate Pretraini
Karan Singh, Michael Yu, Varun Gangal et al.

🧠 Reasoning (75 papers)

Chain-of-thought, multi-step reasoning

Beyond the Answer: Decoding the Behavior of LLMs as Scientific Reasone
Rohan Pandey, Eric Ye, Michael Li
Hallucination-aware intermediate representation edit in large vision-l
Wei Suo, Hanzu Zhang, Lijun Zhang et al.
SLOW: Strategic Logical-inference Open Workspace for Cognitive Adaptat
Yuang Wei, Ruijia Li, Bo Jiang
SARL: Label-Free Reinforcement Learning by Rewarding Reasoning Topolog
Yifan Wang, Bolian Li, David Cho et al.
CARV: A Diagnostic Benchmark for Compositional Analogical Reasoning in
Yongkang Du, Xiaohan Zou, Minhao Cheng et al.

🛡 Safety Alignment (68 papers)

Model safety, privacy, adversarial defense

Why Aggregate Accuracy is Inadequate for Evaluating Fairness in Law En
Khalid Adnan Alsayed
CIPHER: Counterfeit Image Pattern High-level Examination via Represent
Kyeonghun Kim, Youngung Han, Seoyoung Ju et al.
FedFG: Privacy-Preserving and Robust Federated Learning via Flow-Match
Ruiyang Wang, Rong Pan, Zhengan Yao
Adversarial Attacks on Multimodal Large Language Models: A Comprehensi
Bhavuk Jain, Sercan Ö. Arık, Hardeo K. Thakur

🔋 Efficiency (36 papers)

Quantization, pruning, acceleration

Dynamic Lookahead Distance via Reinforcement Learning-Based Pure Pursu
Mohamed Elgouhary, Amr S. El-Wakeel
Optimizing Donor Outreach for Blood Collection Sessions: A Scalable De
André Carneiro, Pedro T. Monteiro, Rui Henriques
End-to-End Image Compression with Segmentation Guided Dual Coding for
Raül Pérez-Gonzalo, Andreas Espersen, Søren Forchhammer et al.
Flow-based Policy With Distributional Reinforcement Learning in Trajec
Ruijie Hao, Longfei Zhang, Yang Dai et al.
Prompt-Guided Prefiltering for VLM Image Compression
Bardia Azizian, Ivan V. Bajic

Featured Papers

Why Aggregate Accuracy is Inadequate for Evaluating Fairness in Law Enforcement Facial Recognition Systems

Khalid Adnan Alsayed

Facial recognition systems are increasingly deployed in law enforcement and security contexts, where algorithmic decisions can carry significant societal consequences. Despite high reported accuracy, growing evidence demonstrates that such systems often exhibit uneven performance across demographic ...

The Ultimate Tutorial for AI-driven Scale Development in Generative Psychometrics: Releasing AIGENIE from its Bottle

Lara Russell-Lasalandra, Hudson Golino, Luis Eduardo Garrido, Alexander P. Christensen

Psychological scale development has traditionally required extensive expert involvement, iterative revision, and large-scale pilot testing before psychometric evaluation can begin. The `AIGENIE` R package implements the AI-GENIE framework (Automatic Item Generation with Network-Integrated Evaluation...

Dynamic Lookahead Distance via Reinforcement Learning-Based Pure Pursuit for Autonomous Racing

Mohamed Elgouhary, Amr S. El-Wakeel

Pure Pursuit (PP) is a widely used path-tracking algorithm in autonomous vehicles due to its simplicity and real-time performance. However, its effectiveness is sensitive to the choice of lookahead distance: shorter values improve cornering but can cause instability on straights, while longer values...
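The lookahead trade-off described in this abstract can be illustrated with the textbook Pure Pursuit steering law for a kinematic bicycle model. The sketch below is a generic illustration, not the paper's method: the function name, the wheelbase value, and the sample lookahead distances are all assumptions chosen for the example, and the reinforcement-learning component is not modeled.

```python
import math


def pure_pursuit_steering(alpha: float, lookahead: float, wheelbase: float) -> float:
    """Textbook Pure Pursuit steering angle in radians (kinematic bicycle model).

    alpha     -- angle between the vehicle heading and the line to the target point (rad)
    lookahead -- distance to the target point on the path (m), must be > 0
    wheelbase -- distance between front and rear axles (m); value below is hypothetical
    """
    if lookahead <= 0:
        raise ValueError("lookahead must be positive")
    # Arc curvature through the target point is 2*sin(alpha)/lookahead;
    # steering angle is atan(wheelbase * curvature).
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)


if __name__ == "__main__":
    WHEELBASE = 0.33  # assumed value for a small-scale race car
    ALPHA = 0.3       # same heading error in both cases
    for ld in (0.5, 2.0):  # short vs. long lookahead, illustrative values
        delta = pure_pursuit_steering(ALPHA, ld, WHEELBASE)
        print(f"lookahead={ld:.1f} m -> steering={math.degrees(delta):.1f} deg")
```

For the same heading error, the shorter lookahead produces a larger steering command, which is consistent with the abstract's point that short values help in corners but can make the vehicle twitchy on straights.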

CirrusBench: Evaluating LLM-based Agents Beyond Correctness in Real-World Cloud Service Environments

Yi Yu, Guangquan Hu, Chenghuang Shen, Xingyan Liu, Jing Gu et al.

The increasing agentic capabilities of Large Language Models (LLMs) have enabled their deployment in real-world applications, such as cloud services, where customer-assistant interactions exhibit high technical complexity and long-horizon dependencies, making robustness and resolution efficiency cri...

FeDMRA: Federated Incremental Learning with Dynamic Memory Replay Allocation

Tiantian Wang, Xiang Xiang, Simon S. Du

In federated healthcare systems, Federated Class-Incremental Learning (FCIL) has emerged as a key paradigm, enabling continuous adaptive model learning among distributed clients while safeguarding data privacy. However, in practical applications, data across agent nodes within the distributed framew...

COvolve: Adversarial Co-Evolution of Large-Language-Model-Generated Policies and Environments via Two-Player Zero-Sum Game

Alkis Sygkounas, Rishi Hazra, Andreas Persson, Pedro Zuidberg Dos Martires, Amy Loutfi

A central challenge in building continually improving agents is that training environments are typically static or manually constructed. This restricts continual learning and generalization beyond the training distribution. We address this with COvolve, a co-evolutionary framework that leverages lar...

Deep Research of Deep Research: From Transformer to Agent, From AI to AI for Science

Yipeng Yu

With the advancement of large language models (LLMs) in their knowledge base and reasoning capabilities, their interactive modalities have evolved from pure text to multimodality and further to agentic tool use. Consequently, their applications have broadened from question answering to AI assistants...

CoE: Collaborative Entropy for Uncertainty Quantification in Agentic Multi-LLM Systems

Kangkang Sun, Jun Wu, Jianhua Li, Minyi Guo, Xiuzhen Che et al.

Uncertainty estimation in multi-LLM systems remains largely single-model-centric: existing methods quantify uncertainty within each model but do not adequately capture semantic disagreement across models. To address this gap, we propose Collaborative Entropy (CoE), a unified information-theoretic me...

Crossing the NL/PL Divide: Information Flow Analysis Across the NL/PL Boundary in LLM-Integrated Code

Zihao Xu, Xiao Cheng, Ruijie Meng, Yuekang Li

LLM API calls are becoming a ubiquitous program construct, yet they create a boundary that no existing program analysis can cross: runtime values enter a natural-language prompt, undergo opaque processing inside the LLM, and re-emerge as code, SQL, JSON, or text that the program consumes. Every anal...

Mapping data literacy trajectories in K-12 education

Robert Whyte, Manni Cheung, Katharine Childs, Jane Waite, Sue Sentance

Data literacy skills are fundamental in computer science education. However, understanding how data-driven systems work represents a paradigm shift from traditional rule-based programming. We conducted a systematic literature review of 84 studies to understand K-12 learners' engagement with data acr...

Top Authors

Kyeonghun Kim (5 papers)
Nam-Joon Kim (5 papers)
Hyuk-Jae Lee (5 papers)
Payal Fofadiya (5 papers)
Sunil Tiwari (5 papers)
Lei Wang (5 papers)
Youngung Han (4 papers)
Steven Y. Feng (4 papers)

Trends

Agent collaboration and autonomous decision making
Deeper exploration of safety and alignment
Reasoning capability breakthroughs
RAG architecture innovations
New RL paradigms (GRPO, policy distillation)
