🔴 High Significance
Model Releases
🔴 AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration - score 85
Sources: huggingface
Language agents have shown strong promise for task automation. Realizing this promise for increasingly complex, long-horizon tasks has driven the rise of a sub-agent-as-tools paradigm for multi-turn task solving. However, existing designs still lack a dynamic abstraction view of sub-agents, thereby
🔴 No Global Plan in Chain-of-Thought: Uncover the Latent Planning Horizon of LLMs - score 75
Sources: huggingface
This work stems from prior complementary observations on the dynamics of Chain-of-Thought (CoT): Large Language Models (LLMs) have been shown to latently plan subsequent reasoning before CoT emerges, thereby diminishing the significance of explicit CoT; whereas CoT remains critical for tasks requirin
Infrastructure & Compute
🔴 CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding - score 95
Sources: huggingface
Large Language Models (LLMs) have achieved remarkable success in source code understanding, yet as software systems grow in scale, computational efficiency has become a critical bottleneck. Currently, these models rely on a text-based paradigm that treats source code as a linear sequence of tokens,
🟡 Notable
Developer Tools
🟡 MARS: Modular Agent with Reflective Search for Automated AI Research - score 60
Sources: huggingface
Automating AI research differs from general software engineering due to computationally expensive evaluation (e.g., model training) and opaque performance attribution. Current LLM-based agents struggle here, often generating monolithic scripts that ignore execution costs and causal factors. We intro
🟡 3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation - score 60
Sources: huggingface
Existing methods for human motion control in video generation typically rely on either 2D poses or explicit 3D parametric models (e.g., SMPL) as control signals. However, 2D poses rigidly bind motion to the driving viewpoint, precluding novel-view synthesis. Explicit 3D models, though structurally i
🟡 Unlocking the Codex harness: how we built the App Server - score 50
Sources: lab_blog/OpenAI
Learn how to embed the Codex agent using the Codex App Server, a bidirectional JSON-RPC API powering streaming progress, tool use, approvals, and diffs.
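For readers unfamiliar with the wire format, the following is a minimal sketch of how bidirectional JSON-RPC 2.0 messages are shaped. It is not the Codex App Server API: the method names (`demo/startTurn`, `demo/progress`, `demo/requestApproval`) and all parameter fields are hypothetical, chosen only to illustrate that either peer can send requests and that progress can be streamed as notifications.

```python
import json
from itertools import count

# Assumption: one id counter per sender. In a real bidirectional session
# each peer would keep its own counter for the requests it originates.
_next_id = count(1)


def request(method, params):
    """JSON-RPC 2.0 request: carries an id, so the peer must reply."""
    return {"jsonrpc": "2.0", "id": next(_next_id), "method": method, "params": params}


def notification(method, params):
    """JSON-RPC 2.0 notification: no id, so no reply is expected."""
    return {"jsonrpc": "2.0", "method": method, "params": params}


def response(req, result):
    """Successful reply: echoes the id of the request it answers."""
    return {"jsonrpc": "2.0", "id": req["id"], "result": result}


def classify(msg):
    """Tell the three message kinds apart from their fields alone."""
    if "method" in msg:
        return "request" if "id" in msg else "notification"
    return "response"


if __name__ == "__main__":
    # Hypothetical exchange; none of these method names come from the source.
    start = request("demo/startTurn", {"prompt": "fix the failing test"})
    progress = notification("demo/progress", {"text": "running tests"})
    ask = request("demo/requestApproval", {"command": "pytest"})  # server -> client
    answer = response(ask, {"approved": True})                    # client -> server
    for msg in (start, progress, ask, answer):
        print(classify(msg), json.dumps(msg))
```

"Bidirectional" here means the approval prompt is itself a request sent from the server back to the client, while streamed progress needs no reply and therefore fits the notification shape.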
🟡 daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently - score 45
Sources: huggingface
While Large Language Models (LLMs) excel at short-term tasks, scaling them to long-horizon agentic workflows remains challenging. The core bottleneck lies in the scarcity of training data that captures authentic long-dependency structures and cross-stage evolutionary dynamics--existing synthesis met
🟢 Incremental
Model Releases
🟢 SWE-World: Building Software Engineering Agents in Docker-Free Environments - score 15
Sources: huggingface
Recent advances in large language models (LLMs) have enabled software engineering agents to tackle complex code modification tasks. Most existing approaches rely on execution feedback from containerized environments, which require dependency-complete setup and physical execution of programs and test
🟢 SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training - score 5
Sources: huggingface
In this technical report, we present SWE-Master, an open-source and fully reproducible post-training framework for building effective software engineering agents. SWE-Master systematically explores the complete agent development pipeline, including teacher-trajectory synthesis and data curation, lon
Developer Tools
🟢 Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks - score 35
Sources: huggingface
World models have emerged as a critical frontier in AI research, aiming to enhance large models by infusing them with physical dynamics and world knowledge. The core objective is to enable agents to understand, predict, and interact with complex environments. However, the current research landscape remai
🟢 Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis - score 25
Sources: huggingface
Distribution matching distillation (DMD) aligns a multi-step generator with its few-step counterpart to enable high-quality generation under low inference cost. However, DMD tends to suffer from mode collapse, as its reverse-KL formulation inherently encourages mode-seeking behavior, for which exist
New Papers
| Title | Category | Score | Link |
|---|---|---|---|
| CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding | infrastructure | 101 | Open |
| AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration | model_release | 94 | Open |
| No Global Plan in Chain-of-Thought: Uncover the Latent Planning Horizon of LLMs | model_release | 76 | Open |
| MARS: Modular Agent with Reflective Search for Automated AI Research | developer_tool | 72 | Open |
| 3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation | developer_tool | 72 | Open |
| Temporal Pair Consistency for Variance-Reduced Flow Matching | cs.AI | 0 | Open |
| A computational account of dreaming: learning and memory consolidation | cs.AI | 0 | Open |
| Interfaze: The Future of AI is built on Task-Specific Small Models | cs.AI | 0 | Open |
| DMS2F-HAD: A Dual-branch Mamba-based Spatial-Spectral Fusion Network for Hyperspectral Anomaly Detection | cs.AI | 0 | Open |
| Tinker Tales: Supporting Child-AI Collaboration through Co-Creative Storytelling with Educational Scaffolding | cs.AI | 0 | Open |
| ActMem: Bridging the Gap Between Memory Retrieval and Reasoning in LLM Agents | cs.AI | 0 | Open |
| Toward Effective Multimodal Graph Foundation Model: A Divide-and-Conquer Based Approach | cs.AI | 0 | Open |
| A logical re-conception of neural networks: Hamiltonian bitwise part-whole architecture | cs.AI | 0 | Open |
| Scalable Explainability-as-a-Service (XaaS) for Edge AI Systems | cs.AI | 0 | Open |
| From Lemmas to Dependencies: What Signals Drive Light Verbs Classification? | cs.AI | 0 | Open |