AI Watchtower

🔴 High Significance

Developer Tools

🔴 SLA2: Sparse-Linear Attention with Learnable Routing and QAT — score 95 Sources: huggingface

Sparse-Linear Attention (SLA) combines sparse and linear attention to accelerate diffusion models and has shown strong performance in video generation. However, (i) SLA relies on a heuristic split that assigns computations to the sparse or linear branch based on attention-weight magnitude, which can

🔴 AutoWebWorld: Synthesizing Infinite Verifiable Web Environments via Finite State Machines — score 85 Sources: huggingface

The performance of autonomous Web GUI agents heavily relies on the quality and quantity of their training data. However, a fundamental bottleneck persists: collecting interaction trajectories from real-world websites is expensive and difficult to verify. The underlying state transitions are hidden,

🔴 RynnBrain: Open Embodied Foundation Models — score 75 Sources: huggingface

Despite rapid progress in multimodal foundation models, the embodied intelligence community still lacks a unified, physically grounded foundation model that integrates perception, reasoning, and planning within real-world spatial-temporal dynamics. We introduce RynnBrain, an open-source spatiotemporal f

🟡 Notable

Model Releases

🟡 Gemini 3.1 Pro: A smarter model for your most complex tasks — score 50 Sources: lab_blog/DeepMind

3.1 Pro is designed for tasks where a simple answer isn’t enough.

Developer Tools

🟡 CADEvolve: Creating Realistic CAD via Program Evolution — score 65 Sources: huggingface

Computer-Aided Design (CAD) delivers rapid, editable modeling for engineering and manufacturing. Recent AI progress now makes full automation feasible for various CAD tasks. However, progress is bottlenecked by data: public corpora mostly contain sketch-extrude sequences, lack complex operations, mu

🟡 Multi-agent cooperation through in-context co-player inference — score 45 Sources: huggingface

Achieving cooperation among self-interested agents remains a fundamental challenge in multi-agent reinforcement learning. Recent work showed that mutual cooperation can be induced between "learning-aware" agents that account for and shape the learning dynamics of their co-players. However, existing

Infrastructure & Compute

🟡 Learning Humanoid End-Effector Control for Open-Vocabulary Visual Loco-Manipulation — score 55 Sources: huggingface

Visual loco-manipulation of arbitrary objects in the wild with humanoid robots requires accurate end-effector (EE) control and a generalizable understanding of the scene via visual inputs (e.g., RGB-D images). Existing approaches are based on real-world imitation learning and exhibit limited general

Business & Funding

🟡 Advancing independent research on AI alignment — score 50 Sources: lab_blog/OpenAI

OpenAI commits $7.5M to The Alignment Project to fund independent AI alignment research, strengthening global efforts to address AGI safety and security risks.

🟢 Incremental

Model Releases

🟢 MAEB: Massive Audio Embedding Benchmark — score 35 Sources: huggingface

We introduce the Massive Audio Embedding Benchmark (MAEB), a large-scale benchmark covering 30 tasks across speech, music, environmental sounds, and cross-modal audio-text reasoning in 100+ languages. We evaluate 50+ models and find that no single model dominates across all tasks: contrastive audio-

🟢 Empty Shelves or Lost Keys? Recall Is the Bottleneck for Parametric Factuality — score 25 Sources: huggingface

Standard factuality evaluations of LLMs treat all errors alike, obscuring whether failures arise from missing knowledge (empty shelves) or from limited access to encoded facts (lost keys). We propose a behavioral framework that profiles factual knowledge at the level of facts rather than questions,

Developer Tools

🟢 World Action Models are Zero-shot Policies — score 15 Sources: huggingface

State-of-the-art Vision-Language-Action (VLA) models excel at semantic generalization but struggle to generalize to unseen physical motions in novel environments. We introduce DreamZero, a World Action Model (WAM) built upon a pretrained video diffusion backbone. Unlike VLAs, WAMs learn physical dyn

🟢 Towards a Science of AI Agent Reliability — score 5 Sources: huggingface

AI agents are increasingly deployed to execute important tasks. While rising accuracy scores on standard benchmarks suggest rapid progress, many agents continue to fail in practice. This discrepancy highlights a fundamental limitation of current evaluations: compressing agent behavior into a s

📄 New Papers

Title	Category	Score	Link
SLA2: Sparse-Linear Attention with Learnable Routing and QAT	developer_tool	63	Open
AutoWebWorld: Synthesizing Infinite Verifiable Web Environments via Finite State Machines	developer_tool	53	Open
RynnBrain: Open Embodied Foundation Models	developer_tool	49	Open
CADEvolve: Creating Realistic CAD via Program Evolution	developer_tool	32	Open
Learning Humanoid End-Effector Control for Open-Vocabulary Visual Loco-Manipulation	infrastructure	29	Open
A Unified Framework for Locality in Scalable MARL	cs.AI	0	Open
Early-Warning Signals of Grokking via Loss-Landscape Geometry	cs.AI	0	Open
DDiT: Dynamic Patch Scheduling for Efficient Diffusion Transformers	cs.AI	0	Open
HQFS: Hybrid Quantum Classical Financial Security with VQC Forecasting, QUBO Annealing, and Audit-Ready Post-Quantum Signing	cs.AI	0	Open
Fundamental Limits of Black-Box Safety Evaluation: Information-Theoretic and Computational Barriers from Latent Context Conditioning	cs.AI	0	Open
Conv-FinRe: A Conversational and Longitudinal Benchmark for Utility-Grounded Financial Recommendation	cs.AI	0	Open
Exploring LLMs for User Story Extraction from Mockups	cs.AI	0	Open
Sonar-TS: Search-Then-Verify Natural Language Querying for Time Series Databases	cs.AI	0	Open
Persona2Web: Benchmarking Personalized Web Agents for Contextual Reasoning with User History	cs.AI	0	Open
OPRIDE: Offline Preference-based Reinforcement Learning via In-Dataset Exploration	cs.AI	0	Open

🏢 Lab Blog Posts

OpenAI: Advancing independent research on AI alignment
DeepMind: Gemini 3.1 Pro: A smarter model for your most complex tasks

AI Watchtower Briefing — 2026-02-19
