AW · AI Watchtower

🔴 High Significance

Model Releases

🔴 AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration — score 85 Sources: huggingface

Language agents have shown strong promise for task automation. Realizing this promise for increasingly complex, long-horizon tasks has driven the rise of a sub-agent-as-tools paradigm for multi-turn task solving. However, existing designs still lack a dynamic abstraction view of sub-agents, thereby

🔴 No Global Plan in Chain-of-Thought: Uncover the Latent Planning Horizon of LLMs — score 75 Sources: huggingface

This work stems from prior complementary observations on the dynamics of Chain-of-Thought (CoT): Large Language Models (LLMs) have been shown to latently plan subsequent reasoning prior to CoT emergence, thereby diminishing the significance of explicit CoT, whereas CoT remains critical for tasks requirin

Infrastructure & Compute

🔴 CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding — score 95 Sources: huggingface

Large Language Models (LLMs) have achieved remarkable success in source code understanding, yet as software systems grow in scale, computational efficiency has become a critical bottleneck. Currently, these models rely on a text-based paradigm that treats source code as a linear sequence of tokens,

🟡 Notable

Developer Tools

🟡 MARS: Modular Agent with Reflective Search for Automated AI Research — score 60 Sources: huggingface

Automating AI research differs from general software engineering due to computationally expensive evaluation (e.g., model training) and opaque performance attribution. Current LLM-based agents struggle here, often generating monolithic scripts that ignore execution costs and causal factors. We intro

🟡 3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation — score 60 Sources: huggingface

Existing methods for human motion control in video generation typically rely on either 2D poses or explicit 3D parametric models (e.g., SMPL) as control signals. However, 2D poses rigidly bind motion to the driving viewpoint, precluding novel-view synthesis. Explicit 3D models, though structurally i

🟡 Unlocking the Codex harness: how we built the App Server — score 50 Sources: lab_blog/OpenAI

Learn how to embed the Codex agent using the Codex App Server, a bidirectional JSON-RPC API powering streaming progress, tool use, approvals, and diffs.
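To make "bidirectional JSON-RPC" concrete, here is a minimal sketch of the JSON-RPC 2.0 message shapes such an API would exchange. The envelope fields (`jsonrpc`, `id`, `method`, `params`, `result`) come from the JSON-RPC 2.0 specification. The method names (`task/start`, `task/progress`, `approval/request`) and the helper functions are hypothetical illustrations, not the Codex App Server's actual API.

```python
import json
from itertools import count

# One shared counter for brevity; in practice each peer tracks its own ids.
_next_id = count(1)


def make_request(method, params):
    """A request carries an id, so the receiving peer must answer it."""
    return {"jsonrpc": "2.0", "id": next(_next_id), "method": method, "params": params}


def make_notification(method, params):
    """A notification has no id and expects no reply (useful for streamed progress)."""
    return {"jsonrpc": "2.0", "method": method, "params": params}


def make_result(request_id, result):
    """A successful response echoes the id of the request it answers."""
    return {"jsonrpc": "2.0", "id": request_id, "result": result}


def classify(message):
    """Tell the three JSON-RPC 2.0 message kinds apart by their fields."""
    if "method" in message:
        return "request" if "id" in message else "notification"
    if "result" in message or "error" in message:
        return "response"
    return "invalid"


# Hypothetical exchange: either side may originate a request.
start = make_request("task/start", {"prompt": "fix the failing test"})      # client -> server
progress = make_notification("task/progress", {"text": "running tests"})    # server -> client
approval = make_request("approval/request", {"command": "rm -r build/"})    # server -> client
reply = make_result(approval["id"], {"approved": False})                    # client -> server

for message in (start, progress, approval, reply):
    print(classify(message), json.dumps(message))
```

The point of the sketch is the third message: because the server can send its own request and wait for a response, an approval step fits the same protocol as the client's original call.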

🟡 daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently — score 45 Sources: huggingface

While Large Language Models (LLMs) excel at short-term tasks, scaling them to long-horizon agentic workflows remains challenging. The core bottleneck lies in the scarcity of training data that captures authentic long-dependency structures and cross-stage evolutionary dynamics--existing synthesis met

🟢 Incremental

Model Releases

🟢 SWE-World: Building Software Engineering Agents in Docker-Free Environments — score 15 Sources: huggingface

Recent advances in large language models (LLMs) have enabled software engineering agents to tackle complex code modification tasks. Most existing approaches rely on execution feedback from containerized environments, which require dependency-complete setup and physical execution of programs and test

🟢 SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training — score 5 Sources: huggingface

In this technical report, we present SWE-Master, an open-source and fully reproducible post-training framework for building effective software engineering agents. SWE-Master systematically explores the complete agent development pipeline, including teacher-trajectory synthesis and data curation, lon

Developer Tools

🟢 Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks — score 35 Sources: huggingface

World models have emerged as a critical frontier in AI research, aiming to enhance large models by infusing them with physical dynamics and world knowledge. The core objective is to enable agents to understand, predict, and interact with complex environments. However, the current research landscape rema

🟢 Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis — score 25 Sources: huggingface

Distribution matching distillation (DMD) aligns a multi-step generator with its few-step counterpart to enable high-quality generation under low inference cost. However, DMD tends to suffer from mode collapse, as its reverse-KL formulation inherently encourages mode-seeking behavior, for which exist
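For readers unfamiliar with why a reverse-KL objective is mode-seeking, the standard definition is written out below. The symbols $p_{\text{gen}}$ (the distilled generator's distribution) and $p_{\text{teacher}}$ (the distribution it is matched to) are notation assumed here for illustration and may differ from the paper's.

```latex
\[
D_{\mathrm{KL}}\left(p_{\text{gen}} \,\|\, p_{\text{teacher}}\right)
= \mathbb{E}_{x \sim p_{\text{gen}}}
\left[ \log p_{\text{gen}}(x) - \log p_{\text{teacher}}(x) \right]
\]
```

The expectation is taken over generator samples only. A teacher mode that the generator never samples therefore contributes nothing to the loss, while placing mass where $p_{\text{teacher}}(x)$ is near zero is penalized heavily. Concentrating on a subset of modes is a low-cost solution under this objective, which is the mode-seeking behavior the abstract refers to.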

📄 New Papers

Title	Category	Score	Link
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding	infrastructure	101	Open
AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration	model_release	94	Open
No Global Plan in Chain-of-Thought: Uncover the Latent Planning Horizon of LLMs	model_release	76	Open
MARS: Modular Agent with Reflective Search for Automated AI Research	developer_tool	72	Open
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation	developer_tool	72	Open
Temporal Pair Consistency for Variance-Reduced Flow Matching	cs.AI	0	Open
A computational account of dreaming: learning and memory consolidation	cs.AI	0	Open
Interfaze: The Future of AI is built on Task-Specific Small Models	cs.AI	0	Open
DMS2F-HAD: A Dual-branch Mamba-based Spatial-Spectral Fusion Network for Hyperspectral Anomaly Detection	cs.AI	0	Open
Tinker Tales: Supporting Child-AI Collaboration through Co-Creative Storytelling with Educational Scaffolding	cs.AI	0	Open
ActMem: Bridging the Gap Between Memory Retrieval and Reasoning in LLM Agents	cs.AI	0	Open
Toward Effective Multimodal Graph Foundation Model: A Divide-and-Conquer Based Approach	cs.AI	0	Open
A logical re-conception of neural networks: Hamiltonian bitwise part-whole architecture	cs.AI	0	Open
Scalable Explainability-as-a-Service (XaaS) for Edge AI Systems	cs.AI	0	Open
From Lemmas to Dependencies: What Signals Drive Light Verbs Classification?	cs.AI	0	Open

🏢 Lab Blog Posts

OpenAI: Unlocking the Codex harness: how we built the App Server

AI Watchtower Briefing — 2026-02-04
