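arXiv cs.AI Weekly Report
Latest Research Progress in Artificial Intelligence
📊 Research Area Popularity Analysis
This research area continues to develop and has produced several high-quality papers.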
- SWE-QA-Pro: A Representative Benchmark and Scalable Training Recipe for Repository-Level Code Understanding
- Proactive Rejection and Grounded Execution: A Dual-Stage Intent Analysis Paradigm for Safe and Efficient AIoT Smart Homes
- Symphony: A Cognitively-Inspired Multi-Agent System for Long-Video Understanding
- Graph-Native Cognitive Memory for AI Agents: Formal Belief Revision Semantics for Versioned Memory Architectures
- AI Scientist via Synthetic Task Scaling
This research area continues to develop and has produced several high-quality papers.
- Sample-Efficient Adaptation of Drug-Response Models to Patient Tumors under Strong Biological Domain Shift
- ASDA: Automated Skill Distillation and Adaptation for Financial Reasoning
- NeSy-Route: A Neuro-Symbolic Benchmark for Constrained Route Planning in Remote Sensing
- Can LLMs Model Incorrect Student Reasoning? A Case Study on Distractor Generation
- Empowering Chemical Structures with Biological Insights for Scalable Phenotypic Virtual Screening
This research area continues to develop and has produced several high-quality papers.
- Empowering Chemical Structures with Biological Insights for Scalable Phenotypic Virtual Screening
- AnoleVLA: Lightweight Vision-Language-Action Model with Deep State Space Models for Mobile Manipulation
- Detecting the Machine: A Comprehensive Benchmark of AI-Generated Text Detectors Across Architectures, Domains, and Adversarial Conditions
- Security Assessment and Mitigation Strategies for Large Language Models: A Comprehensive Defensive Framework
- Do Understanding and Generation Fight? A Diagnostic Study of DPO for Unified Multimodal Models
This research area continues to develop and has produced several high-quality papers.
- Sample-Efficient Adaptation of Drug-Response Models to Patient Tumors under Strong Biological Domain Shift
- ASDA: Automated Skill Distillation and Adaptation for Financial Reasoning
- Frequency Matters: Fast Model-Agnostic Data Curation for Pruning and Quantization
- Proactive Rejection and Grounded Execution: A Dual-Stage Intent Analysis Paradigm for Safe and Efficient AIoT Smart Homes
- Empowering Chemical Structures with Biological Insights for Scalable Phenotypic Virtual Screening
This research area continues to develop and has produced several high-quality papers.
- HIPO: Instruction Hierarchy via Constrained Reinforcement Learning
- How Log-Barrier Helps Exploration in Policy Optimization
- From Digital Twins to World Models: Opportunities, Challenges, and Applications for Mobile Edge General Intelligence
- CLAG: Adaptive Memory Organization via Agent-Driven Clustering for Small Language Model Agents
- Resilience Meets Autonomy: Governing Embodied AI in Critical Infrastructure
This research area continues to develop and has produced several high-quality papers.
- NeSy-Route: A Neuro-Symbolic Benchmark for Constrained Route Planning in Remote Sensing
- AnoleVLA: Lightweight Vision-Language-Action Model with Deep State Space Models for Mobile Manipulation
- Symphony: A Cognitively-Inspired Multi-Agent System for Long-Video Understanding
- Do Understanding and Generation Fight? A Diagnostic Study of DPO for Unified Multimodal Models
- Multimodal Connectome Fusion via Cross-Attention for Autism Spectrum Disorder Classification Using Graph Learning
This research area continues to develop and has produced several high-quality papers.
- The Phasor Transformer: Resolving Attention Bottlenecks on the Unit Circle
- Symphony: A Cognitively-Inspired Multi-Agent System for Long-Video Understanding
- Graph-Native Cognitive Memory for AI Agents: Formal Belief Revision Semantics for Versioned Memory Architectures
- OPERA: Online Data Pruning for Efficient Retrieval Model Adaptation
- Token Coherence: Adapting MESI Cache Protocols to Minimize Synchronization Overhead in Multi-Agent LLM Systems
This research area continues to develop and has produced several high-quality papers.
- Functorial Neural Architectures from Higher Inductive Types
- Interact3D: Compositional 3D Generation of Interactive Objects
- POaaS: Minimal-Edit Prompt Optimization as a Service to Lift Accuracy and Cut Hallucinations on On-Device sLLMs
- FederatedFactory: Generative One-Shot Learning for Extremely Non-IID Distributed Scenarios
- Prompt Engineering for Scale Development in Generative Psychometrics
This research area continues to develop and has produced several high-quality papers.
- Proactive Rejection and Grounded Execution: A Dual-Stage Intent Analysis Paradigm for Safe and Efficient AIoT Smart Homes
- AnoleVLA: Lightweight Vision-Language-Action Model with Deep State Space Models for Mobile Manipulation
- Security Assessment and Mitigation Strategies for Large Language Models: A Comprehensive Defensive Framework
- SCAN: Sparse Circuit Anchor Interpretable Neuron for Lifelong Knowledge Editing
- Resilience Meets Autonomy: Governing Embodied AI in Critical Infrastructure
This research area continues to develop and has produced several high-quality papers.
- Sample-Efficient Adaptation of Drug-Response Models to Patient Tumors under Strong Biological Domain Shift
- Empowering Chemical Structures with Biological Insights for Scalable Phenotypic Virtual Screening
- Pathology-Aware Multi-View Contrastive Learning for Patient-Independent ECG Reconstruction
- Security Assessment and Mitigation Strategies for Large Language Models: A Comprehensive Defensive Framework
- Federated Learning for Privacy-Preserving Medical AI
This research area continues to develop and has produced several high-quality papers.
- SWE-QA-Pro: A Representative Benchmark and Scalable Training Recipe for Repository-Level Code Understanding
- Empowering Chemical Structures with Biological Insights for Scalable Phenotypic Virtual Screening
- Intent Formalization: A Grand Challenge for Reliable Coding in the Age of AI Agents
- SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering?
- Exploring different approaches to customize language models for domain-specific text-to-code generation
This research area continues to develop and has produced several high-quality papers.
- AnoleVLA: Lightweight Vision-Language-Action Model with Deep State Space Models for Mobile Manipulation
- Intent Formalization: A Grand Challenge for Reliable Coding in the Age of AI Agents
- Exploring different approaches to customize language models for domain-specific text-to-code generation
- Prompt Programming for Cultural Bias and Alignment of Large Language Models
- MVHOI: Bridge Multi-view Condition to Complex Human-Object Interaction Video Reenactment via 3D Foundation Model
🔥 Highlighted Papers This Week
TrinityGuard: A Unified Framework for Safeguarding Multi-Agent Systems
Harm or Humor: A Multimodal, Multilingual Benchmark for Overt and Covert Harmful Humor
UniSAFE: A Comprehensive Benchmark for Safety Evaluation of Unified Multimodal Models
Directional Embedding Smoothing for Robust Vision Language Models
MultihopSpatial: Multi-hop Compositional Spatial Reasoning Benchmark for Vision-Language Model
SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering?
👥 Author Collaboration Network
Node size indicates the number of papers; edges indicate collaborations.
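💡 Technical Innovation Trends
🤖 Evolution of Agent Architectures
Multi-agent collaboration, toolchain integration, and long-term memory mechanisms have become research hotspots, pushing AI agents toward more complex task scenarios.
🧠 Improved Reasoning Capabilities
Chain-of-thought optimization, enhanced multi-step reasoning, and logical deduction continue to improve, significantly strengthening large models' ability to solve complex problems.
🛡️ Safety and Alignment Research
Research on model safety, adversarial defense, and value alignment continues to deepen; ensuring that AI systems are reliable and controllable has become a key issue.
⚡ Efficiency Optimization Breakthroughs
Quantization and compression, inference acceleration, and lightweight deployment techniques keep advancing, driving the deployment of large models on edge devices.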