AI Watchtower

🔴 High Significance

Model Releases

🔴 MARS: Enabling Autoregressive Models Multi-Token Generation — score 75 Sources: huggingface

Autoregressive (AR) language models generate text one token at a time, even when consecutive tokens are highly predictable given earlier context. We introduce MARS (Mask AutoRegreSsion), a lightweight fine-tuning method that teaches an instruction-tuned AR model to predict multiple tokens per forwar

Developer Tools

🔴 Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning — score 95 Sources: huggingface

Humans paint images incrementally: they plan a global layout, sketch a coarse draft, inspect, and refine details, and most importantly, each step is grounded in the evolving visual states. However, can unified multimodal models trained on text-image interleaved datasets also imagine the chain of int

🔴 RAGEN-2: Reasoning Collapse in Agentic RL — score 85 Sources: huggingface

RL training of multi-turn LLM agents is inherently unstable, and reasoning quality directly determines task performance. Entropy is widely used to track reasoning stability. However, entropy only measures diversity within the same input, and cannot tell whether reasoning actually responds to differe
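As a rough illustration of the excerpt's point that entropy only measures diversity within the same input, the sketch below computes Shannon entropy over sampled responses for two prompts. It is not taken from the RAGEN-2 paper: the prompt names, the response labels, and the `shannon_entropy` helper are hypothetical, chosen only to show that two prompts can score the same entropy while receiving identical response distributions.

```python
import math
from collections import Counter


def shannon_entropy(samples):
    """Shannon entropy (in bits) of the empirical distribution over samples."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical sampled reasoning traces for two different prompts.
# The labels stand in for full reasoning strings; they are illustrative only.
responses_by_prompt = {
    "prompt_a": ["plan-1", "plan-2", "plan-1", "plan-2"],
    "prompt_b": ["plan-1", "plan-2", "plan-1", "plan-2"],
}

# Entropy is computed per prompt, so it only sees diversity within one input.
entropies = {p: shannon_entropy(r) for p, r in responses_by_prompt.items()}

# The response distributions are identical across prompts, yet each prompt
# still shows non-zero entropy, so entropy alone does not reveal this.
same_distribution = (
    Counter(responses_by_prompt["prompt_a"])
    == Counter(responses_by_prompt["prompt_b"])
)

print(entropies, same_distribution)
```

In this toy setup both prompts look equally "diverse" by entropy even though the sampled reasoning does not vary with the prompt at all, which is the kind of blind spot the excerpt describes.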

🟡 Notable

Model Releases

🟡 CyberAgent moves faster with ChatGPT Enterprise and Codex — score 50 Sources: lab_blog/OpenAI

CyberAgent uses ChatGPT Enterprise and Codex to securely scale AI adoption, improve quality, and accelerate decisions across advertising, media, and gaming.

Developer Tools

🟡 INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling — score 65 Sources: huggingface

Building world models with spatial consistency and real-time interactivity remains a fundamental challenge in computer vision. Current video generation paradigms often struggle with a lack of spatial persistence and insufficient visual realism, making it difficult to support seamless navigation in c

🟡 FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling — score 55 Sources: huggingface

Reinforcement-Learning-based post-training has recently emerged as a promising paradigm for aligning text-to-image diffusion models with human preferences. In recent studies, increasing the rollout group size yields pronounced performance improvements, indicating substantial room for further alignme

🟡 Combee: Scaling Prompt Learning for Self-Improving Language Model Agents — score 40 Sources: huggingface

Recent advances in prompt learning allow large language model agents to acquire task-relevant knowledge from inference-time context without parameter changes. For example, existing methods (like ACE or GEPA) can learn system prompts to improve accuracy based on previous agent runs. However, these me

🟡 SEVerA: Verified Synthesis of Self-Evolving Agents — score 40 Sources: huggingface

Recent advances have shown the effectiveness of self-evolving LLM agents on tasks such as program repair and scientific discovery. In this paradigm, a planner LLM synthesizes an agent program that invokes parametric models, including LLMs, which are then tuned per task to improve performance. Howeve

Other Signals

🟡 OpenAI Full Fan Mode Contest: Terms & Conditions — score 50 Sources: lab_blog/OpenAI

Explore the official terms and conditions for the OpenAI Full Fan Mode Contest, including eligibility, entry steps, judging criteria, and prize details. Learn how to participate, submit your entry on Instagram, and win IPL match tickets.

🟢 Incremental

Model Releases

🟢 Qualixar OS: A Universal Operating System for AI Agent Orchestration — score 15 Sources: huggingface

We present Qualixar OS, the first application-layer operating system for universal AI agent orchestration. Unlike kernel-level approaches (AIOS) or single-framework tools (AutoGen, CrewAI), Qualixar OS provides a complete runtime for heterogeneous multi-agent systems spanning 10 LLM providers, 8+ ag

Developer Tools

🟢 Neural Computers — score 25 Sources: huggingface

We propose a new frontier: Neural Computers (NCs) -- an emerging machine form that unifies computation, memory, and I/O in a learned runtime state. Unlike conventional computers, which execute explicit programs, agents, which act over external execution environments, and world models, which learn en

🟢 TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders — score 5 Sources: huggingface

We propose TC-AE, a ViT-based architecture for deep compression autoencoders. Existing methods commonly increase the channel number of latent representations to maintain reconstruction quality under high compression ratios. However, this strategy often leads to latent representation collapse, which

📄 New Papers

Title	Category	Score	Link
Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning	developer_tool	75	Open
RAGEN-2: Reasoning Collapse in Agentic RL	developer_tool	70	Open
MARS: Enabling Autoregressive Models Multi-Token Generation	model_release	41	Open
INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling	developer_tool	38	Open
FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling	developer_tool	35	Open
An Imperfect Verifier is Good Enough: Learning with Noisy Rewards	cs.AI	0	Open
From Debate to Decision: Conformal Social Choice for Safe Multi-Agent Deliberation	cs.AI	0	Open
Reinforcement Learning with LLM-Guided Action Spaces for Synthesizable Lead Optimization	cs.AI	0	Open
Multi-Agent Orchestration for High-Throughput Materials Screening on a Leadership-Class System	cs.AI	0	Open
Joint Task Offloading, Inference Optimization and UAV Trajectory Planning for Generative AI Empowered Intelligent Transportation Digital Twin	cs.AI	0	Open
AITH: A Post-Quantum Continuous Delegation Protocol for Human-AI Trust Establishment	cs.AI	0	Open
IatroBench: Pre-Registered Evidence of Iatrogenic Harm from AI Safety Measures	cs.AI	0	Open
Detecting HIV-Related Stigma in Clinical Narratives Using Large Language Models	cs.AI	0	Open
Towards Knowledgeable Deep Research: Framework and Benchmark	cs.AI	0	Open
Squeeze Evolve: Unified Multi-Model Orchestration for Verifier-Free Evolution	cs.AI	0	Open

🏢 Lab Blog Posts

OpenAI: CyberAgent moves faster with ChatGPT Enterprise and Codex
OpenAI: OpenAI Full Fan Mode Contest: Terms & Conditions

AI Watchtower Briefing — 2026-04-09
