🔴 High Significance
Model Releases
🔴 MARS: Enabling Autoregressive Models Multi-Token Generation - score 75
Sources: huggingface
Autoregressive (AR) language models generate text one token at a time, even when consecutive tokens are highly predictable given earlier context. We introduce MARS (Mask AutoRegreSsion), a lightweight fine-tuning method that teaches an instruction-tuned AR model to predict multiple tokens per forwar
Developer Tools
🔴 Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning - score 95
Sources: huggingface
Humans paint images incrementally: they plan a global layout, sketch a coarse draft, inspect, and refine details, and most importantly, each step is grounded in the evolving visual states. However, can unified multimodal models trained on text-image interleaved datasets also imagine the chain of int
🔴 RAGEN-2: Reasoning Collapse in Agentic RL - score 85
Sources: huggingface
RL training of multi-turn LLM agents is inherently unstable, and reasoning quality directly determines task performance. Entropy is widely used to track reasoning stability. However, entropy only measures diversity within the same input, and cannot tell whether reasoning actually responds to differe
🟡 Notable
Model Releases
🟡 CyberAgent moves faster with ChatGPT Enterprise and Codex - score 50
Sources: lab_blog/OpenAI
CyberAgent uses ChatGPT Enterprise and Codex to securely scale AI adoption, improve quality, and accelerate decisions across advertising, media, and gaming.
Developer Tools
🟡 INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling - score 65
Sources: huggingface
Building world models with spatial consistency and real-time interactivity remains a fundamental challenge in computer vision. Current video generation paradigms often struggle with a lack of spatial persistence and insufficient visual realism, making it difficult to support seamless navigation in c
🟡 FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling - score 55
Sources: huggingface
Reinforcement-Learning-based post-training has recently emerged as a promising paradigm for aligning text-to-image diffusion models with human preferences. In recent studies, increasing the rollout group size yields pronounced performance improvements, indicating substantial room for further alignme
🟡 Combee: Scaling Prompt Learning for Self-Improving Language Model Agents - score 40
Sources: huggingface
Recent advances in prompt learning allow large language model agents to acquire task-relevant knowledge from inference-time context without parameter changes. For example, existing methods (like ACE or GEPA) can learn system prompts to improve accuracy based on previous agent runs. However, these me
🟡 SEVerA: Verified Synthesis of Self-Evolving Agents - score 40
Sources: huggingface
Recent advances have shown the effectiveness of self-evolving LLM agents on tasks such as program repair and scientific discovery. In this paradigm, a planner LLM synthesizes an agent program that invokes parametric models, including LLMs, which are then tuned per task to improve performance. Howeve
Other Signals
🟡 OpenAI Full Fan Mode Contest: Terms & Conditions - score 50
Sources: lab_blog/OpenAI
Explore the official terms and conditions for the OpenAI Full Fan Mode Contest, including eligibility, entry steps, judging criteria, and prize details. Learn how to participate, submit your entry on Instagram, and win IPL match tickets.
🟢 Incremental
Model Releases
🟢 Qualixar OS: A Universal Operating System for AI Agent Orchestration - score 15
Sources: huggingface
We present Qualixar OS, the first application-layer operating system for universal AI agent orchestration. Unlike kernel-level approaches (AIOS) or single-framework tools (AutoGen, CrewAI), Qualixar OS provides a complete runtime for heterogeneous multi-agent systems spanning 10 LLM providers, 8+ ag
Developer Tools
🟢 Neural Computers - score 25
Sources: huggingface
We propose a new frontier: Neural Computers (NCs) -- an emerging machine form that unifies computation, memory, and I/O in a learned runtime state. Unlike conventional computers, which execute explicit programs, agents, which act over external execution environments, and world models, which learn en
🟢 TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders - score 5
Sources: huggingface
We propose TC-AE, a ViT-based architecture for deep compression autoencoders. Existing methods commonly increase the channel number of latent representations to maintain reconstruction quality under high compression ratios. However, this strategy often leads to latent representation collapse, which
New Papers
| Title | Category | Score | Link |
|---|---|---|---|
| Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning | developer_tool | 75 | Open |
| RAGEN-2: Reasoning Collapse in Agentic RL | developer_tool | 70 | Open |
| MARS: Enabling Autoregressive Models Multi-Token Generation | model_release | 41 | Open |
| INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling | developer_tool | 38 | Open |
| FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling | developer_tool | 35 | Open |
| An Imperfect Verifier is Good Enough: Learning with Noisy Rewards | cs.AI | 0 | Open |
| From Debate to Decision: Conformal Social Choice for Safe Multi-Agent Deliberation | cs.AI | 0 | Open |
| Reinforcement Learning with LLM-Guided Action Spaces for Synthesizable Lead Optimization | cs.AI | 0 | Open |
| Multi-Agent Orchestration for High-Throughput Materials Screening on a Leadership-Class System | cs.AI | 0 | Open |
| Joint Task Offloading, Inference Optimization and UAV Trajectory Planning for Generative AI Empowered Intelligent Transportation Digital Twin | cs.AI | 0 | Open |
| AITH: A Post-Quantum Continuous Delegation Protocol for Human-AI Trust Establishment | cs.AI | 0 | Open |
| IatroBench: Pre-Registered Evidence of Iatrogenic Harm from AI Safety Measures | cs.AI | 0 | Open |
| Detecting HIV-Related Stigma in Clinical Narratives Using Large Language Models | cs.AI | 0 | Open |
| Towards Knowledgeable Deep Research: Framework and Benchmark | cs.AI | 0 | Open |
| Squeeze Evolve: Unified Multi-Model Orchestration for Verifier-Free Evolution | cs.AI | 0 | Open |