🔴 High Significance

Model Releases

🔴 AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration - score 85 | Sources: huggingface

Language agents have shown strong promise for task automation. Realizing this promise for increasingly complex, long-horizon tasks has driven the rise of a sub-agent-as-tools paradigm for multi-turn task solving. However, existing designs still lack a dynamic abstraction view of sub-agents, thereby

🔴 No Global Plan in Chain-of-Thought: Uncover the Latent Planning Horizon of LLMs - score 75 | Sources: huggingface

This work stems from prior complementary observations on the dynamics of Chain-of-Thought (CoT): Large Language Models (LLMs) have been shown to plan subsequent reasoning latently before CoT emerges, thereby diminishing the significance of explicit CoT; whereas CoT remains critical for tasks requirin

Infrastructure & Compute

🔴 CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding - score 95 | Sources: huggingface

Large Language Models (LLMs) have achieved remarkable success in source code understanding, yet as software systems grow in scale, computational efficiency has become a critical bottleneck. Currently, these models rely on a text-based paradigm that treats source code as a linear sequence of tokens,

🟡 Notable

Developer Tools

🟡 MARS: Modular Agent with Reflective Search for Automated AI Research - score 60 | Sources: huggingface

Automating AI research differs from general software engineering due to computationally expensive evaluation (e.g., model training) and opaque performance attribution. Current LLM-based agents struggle here, often generating monolithic scripts that ignore execution costs and causal factors. We intro

🟡 3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation - score 60 | Sources: huggingface

Existing methods for human motion control in video generation typically rely on either 2D poses or explicit 3D parametric models (e.g., SMPL) as control signals. However, 2D poses rigidly bind motion to the driving viewpoint, precluding novel-view synthesis. Explicit 3D models, though structurally i

🟡 Unlocking the Codex harness: how we built the App Server - score 50 | Sources: lab_blog/OpenAI

Learn how to embed the Codex agent using the Codex App Server, a bidirectional JSON-RPC API powering streaming progress, tool use, approvals, and diffs.

🟡 daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently - score 45 | Sources: huggingface

While Large Language Models (LLMs) excel at short-term tasks, scaling them to long-horizon agentic workflows remains challenging. The core bottleneck lies in the scarcity of training data that captures authentic long-dependency structures and cross-stage evolutionary dynamics--existing synthesis met

🟢 Incremental

Model Releases

🟢 SWE-World: Building Software Engineering Agents in Docker-Free Environments - score 15 | Sources: huggingface

Recent advances in large language models (LLMs) have enabled software engineering agents to tackle complex code modification tasks. Most existing approaches rely on execution feedback from containerized environments, which require dependency-complete setup and physical execution of programs and test

🟢 SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training - score 5 | Sources: huggingface

In this technical report, we present SWE-Master, an open-source and fully reproducible post-training framework for building effective software engineering agents. SWE-Master systematically explores the complete agent development pipeline, including teacher-trajectory synthesis and data curation, lon

Developer Tools

🟢 Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks - score 35 | Sources: huggingface

World models have emerged as a critical frontier in AI research, aiming to enhance large models by infusing them with physical dynamics and world knowledge. The core objective is to enable agents to understand, predict, and interact with complex environments. However, the current research landscape rema

🟢 Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis - score 25 | Sources: huggingface

Distribution matching distillation (DMD) aligns a multi-step generator with its few-step counterpart to enable high-quality generation under low inference cost. However, DMD tends to suffer from mode collapse, as its reverse-KL formulation inherently encourages mode-seeking behavior, for which exist

📄 New Papers

Title | Category | Score | Link
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding | infrastructure | 101 | Open
AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration | model_release | 94 | Open
No Global Plan in Chain-of-Thought: Uncover the Latent Planning Horizon of LLMs | model_release | 76 | Open
MARS: Modular Agent with Reflective Search for Automated AI Research | developer_tool | 72 | Open
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation | developer_tool | 72 | Open
Temporal Pair Consistency for Variance-Reduced Flow Matching | cs.AI | 0 | Open
A computational account of dreaming: learning and memory consolidation | cs.AI | 0 | Open
Interfaze: The Future of AI is built on Task-Specific Small Models | cs.AI | 0 | Open
DMS2F-HAD: A Dual-branch Mamba-based Spatial-Spectral Fusion Network for Hyperspectral Anomaly Detection | cs.AI | 0 | Open
Tinker Tales: Supporting Child-AI Collaboration through Co-Creative Storytelling with Educational Scaffolding | cs.AI | 0 | Open
ActMem: Bridging the Gap Between Memory Retrieval and Reasoning in LLM Agents | cs.AI | 0 | Open
Toward Effective Multimodal Graph Foundation Model: A Divide-and-Conquer Based Approach | cs.AI | 0 | Open
A logical re-conception of neural networks: Hamiltonian bitwise part-whole architecture | cs.AI | 0 | Open
Scalable Explainability-as-a-Service (XaaS) for Edge AI Systems | cs.AI | 0 | Open
From Lemmas to Dependencies: What Signals Drive Light Verbs Classification? | cs.AI | 0 | Open

🏢 Lab Blog Posts