🔴 High Significance

Model Releases

🔴 MARS: Enabling Autoregressive Models Multi-Token Generation - score 75 | Sources: huggingface

Autoregressive (AR) language models generate text one token at a time, even when consecutive tokens are highly predictable given earlier context. We introduce MARS (Mask AutoRegreSsion), a lightweight fine-tuning method that teaches an instruction-tuned AR model to predict multiple tokens per forwar

Developer Tools

🔴 Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning - score 95 | Sources: huggingface

Humans paint images incrementally: they plan a global layout, sketch a coarse draft, inspect, and refine details, and most importantly, each step is grounded in the evolving visual states. However, can unified multimodal models trained on text-image interleaved datasets also imagine the chain of int

🔴 RAGEN-2: Reasoning Collapse in Agentic RL - score 85 | Sources: huggingface

RL training of multi-turn LLM agents is inherently unstable, and reasoning quality directly determines task performance. Entropy is widely used to track reasoning stability. However, entropy only measures diversity within the same input, and cannot tell whether reasoning actually responds to differe
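The distinction drawn in this teaser, between diversity within one input and responsiveness across inputs, can be illustrated with a small toy calculation. This is only a sketch under stated assumptions, not the paper's metric: it abstracts reasoning traces into discrete labels (hypothetical `r1`..`r4`) and compares per-input entropy H(Y|X) with mutual information I(X;Y) = H(Y) - H(Y|X), a standard information-theoretic quantity chosen here for illustration.

```python
import math
from collections import Counter


def entropy(samples):
    """Shannon entropy (bits) of the empirical distribution of samples."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(responses_by_input):
    """H(Y|X): per-input entropy, weighted by each input's sample count."""
    total = sum(len(r) for r in responses_by_input.values())
    return sum(len(r) / total * entropy(r) for r in responses_by_input.values())


def mutual_information(responses_by_input):
    """I(X;Y) = H(Y) - H(Y|X): how much responses depend on the input."""
    pooled = [y for r in responses_by_input.values() for y in r]
    return entropy(pooled) - conditional_entropy(responses_by_input)


# Hypothetical toy data: the same mix of traces regardless of the task.
input_agnostic = {
    "task_a": ["r1", "r2", "r1", "r2"],
    "task_b": ["r1", "r2", "r1", "r2"],
}
# Hypothetical toy data: equally diverse per task, but task-specific traces.
input_aware = {
    "task_a": ["r1", "r2", "r1", "r2"],
    "task_b": ["r3", "r4", "r3", "r4"],
}

for name, data in [("input_agnostic", input_agnostic), ("input_aware", input_aware)]:
    print(name, conditional_entropy(data), mutual_information(data))
```

Both toy policies have the same per-input entropy, so an entropy monitor alone would not separate them; only the cross-input quantity differs.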

🟡 Notable

Model Releases

🟡 CyberAgent moves faster with ChatGPT Enterprise and Codex - score 50 | Sources: lab_blog/OpenAI

CyberAgent uses ChatGPT Enterprise and Codex to securely scale AI adoption, improve quality, and accelerate decisions across advertising, media, and gaming.

Developer Tools

🟡 INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling - score 65 | Sources: huggingface

Building world models with spatial consistency and real-time interactivity remains a fundamental challenge in computer vision. Current video generation paradigms often struggle with a lack of spatial persistence and insufficient visual realism, making it difficult to support seamless navigation in c

🟡 FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling - score 55 | Sources: huggingface

Reinforcement-Learning-based post-training has recently emerged as a promising paradigm for aligning text-to-image diffusion models with human preferences. In recent studies, increasing the rollout group size yields pronounced performance improvements, indicating substantial room for further alignme

🟡 Combee: Scaling Prompt Learning for Self-Improving Language Model Agents - score 40 | Sources: huggingface

Recent advances in prompt learning allow large language model agents to acquire task-relevant knowledge from inference-time context without parameter changes. For example, existing methods (like ACE or GEPA) can learn system prompts to improve accuracy based on previous agent runs. However, these me

🟡 SEVerA: Verified Synthesis of Self-Evolving Agents - score 40 | Sources: huggingface

Recent advances have shown the effectiveness of self-evolving LLM agents on tasks such as program repair and scientific discovery. In this paradigm, a planner LLM synthesizes an agent program that invokes parametric models, including LLMs, which are then tuned per task to improve performance. Howeve

Other Signals

🟡 OpenAI Full Fan Mode Contest: Terms & Conditions - score 50 | Sources: lab_blog/OpenAI

Explore the official terms and conditions for the OpenAI Full Fan Mode Contest, including eligibility, entry steps, judging criteria, and prize details. Learn how to participate, submit your entry on Instagram, and win IPL match tickets.

🟢 Incremental

Model Releases

🟢 Qualixar OS: A Universal Operating System for AI Agent Orchestration - score 15 | Sources: huggingface

We present Qualixar OS, the first application-layer operating system for universal AI agent orchestration. Unlike kernel-level approaches (AIOS) or single-framework tools (AutoGen, CrewAI), Qualixar OS provides a complete runtime for heterogeneous multi-agent systems spanning 10 LLM providers, 8+ ag

Developer Tools

🟢 Neural Computers - score 25 | Sources: huggingface

We propose a new frontier: Neural Computers (NCs) -- an emerging machine form that unifies computation, memory, and I/O in a learned runtime state. Unlike conventional computers, which execute explicit programs, agents, which act over external execution environments, and world models, which learn en

🟢 TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders - score 5 | Sources: huggingface

We propose TC-AE, a ViT-based architecture for deep compression autoencoders. Existing methods commonly increase the channel number of latent representations to maintain reconstruction quality under high compression ratios. However, this strategy often leads to latent representation collapse, which

📄 New Papers

Title | Category | Score
Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning | developer_tool | 75
RAGEN-2: Reasoning Collapse in Agentic RL | developer_tool | 70
MARS: Enabling Autoregressive Models Multi-Token Generation | model_release | 41
INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling | developer_tool | 38
FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling | developer_tool | 35
An Imperfect Verifier is Good Enough: Learning with Noisy Rewards | cs.AI | 0
From Debate to Decision: Conformal Social Choice for Safe Multi-Agent Deliberation | cs.AI | 0
Reinforcement Learning with LLM-Guided Action Spaces for Synthesizable Lead Optimization | cs.AI | 0
Multi-Agent Orchestration for High-Throughput Materials Screening on a Leadership-Class System | cs.AI | 0
Joint Task Offloading, Inference Optimization and UAV Trajectory Planning for Generative AI Empowered Intelligent Transportation Digital Twin | cs.AI | 0
AITH: A Post-Quantum Continuous Delegation Protocol for Human-AI Trust Establishment | cs.AI | 0
IatroBench: Pre-Registered Evidence of Iatrogenic Harm from AI Safety Measures | cs.AI | 0
Detecting HIV-Related Stigma in Clinical Narratives Using Large Language Models | cs.AI | 0
Towards Knowledgeable Deep Research: Framework and Benchmark | cs.AI | 0
Squeeze Evolve: Unified Multi-Model Orchestration for Verifier-Free Evolution | cs.AI | 0

๐Ÿข Lab Blog Posts