🔴 High Significance

Model Releases

🔴 Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text | score 85 | Sources: huggingface

Reinforcement Learning with Verifiable Rewards (RLVR) has become a cornerstone for unlocking complex reasoning in Large Language Models (LLMs). Yet, scaling up RL is bottlenecked by limited existing verifiable data, where improvements increasingly saturate over prolonged training. To overcome this, ...

🔴 ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas | score 70 | Sources: huggingface

Large language models (LLMs) are increasingly used as tool-augmented agents for multi-step decision making, yet training robust tool-using agents remains challenging. Existing methods still require manual intervention, depend on non-verifiable simulated environments, rely exclusively on either super...

Developer Tools

🔴 PaperBanana: Automating Academic Illustration for AI Scientists | score 95 | Sources: huggingface

Despite rapid advances in autonomous AI scientists powered by language models, generating publication-ready illustrations remains a labor-intensive bottleneck in the research workflow. To lift this burden, we introduce PaperBanana, an agentic framework for automated generation of publication-ready a...

🔴 Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation | score 70 | Sources: huggingface

The NVFP4 lower-precision format, supported in hardware by NVIDIA Blackwell GPUs, promises to allow, for the first time, end-to-end fully-quantized pre-training of massive models such as LLMs. Yet, existing quantized training methods still sacrifice some of the representation capacity of this format ...
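As a rough illustration of what "unbiased" means for a quantizer, the sketch below uses plain stochastic rounding on a uniform grid. This is an assumption made for illustration only: it is not the Quartet II estimator, and the uniform grid is a stand-in rather than the NVFP4 format. The function names `stochastic_round` and `mean_of_rounded` are hypothetical.

```python
import math
import random


def stochastic_round(x: float, step: float, rng: random.Random) -> float:
    """Round x to a multiple of `step` (illustrative uniform grid, not NVFP4).

    The value is rounded up with probability equal to its fractional
    position between the two neighbouring grid points, so the expected
    value of the result equals x.
    """
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower  # in [0, 1)
    round_up = rng.random() < frac
    return (lower + (1 if round_up else 0)) * step


def mean_of_rounded(x: float, step: float, n: int, seed: int = 0) -> float:
    """Average n independent stochastic roundings of x."""
    rng = random.Random(seed)
    return sum(stochastic_round(x, step, rng) for _ in range(n)) / n


if __name__ == "__main__":
    x, step = 0.3, 1.0
    # Round-to-nearest always yields 0.0 here, a systematic error of -0.3.
    print("nearest:", round(x / step) * step)
    # The stochastic average should land close to 0.3.
    print("stochastic mean:", mean_of_rounded(x, step, 20_000))
```

Each individual rounding is noisier than round-to-nearest, but the errors average out over many samples, which is the property gradient estimators in low-precision training generally try to keep.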

🟡 Notable

Model Releases

🟡 THINKSAFE: Self-Generated Safety Alignment for Reasoning Models | score 55 | Sources: huggingface

Large reasoning models (LRMs) achieve remarkable performance by leveraging reinforcement learning (RL) on reasoning tasks to generate long chain-of-thought (CoT) reasoning. However, this over-optimization often prioritizes compliance, making models vulnerable to harmful prompts. To mitigate this saf...

🟡 Introducing the Codex app | score 50 | Sources: lab_blog/OpenAI

Introducing the Codex app for macOS, a command center for AI coding and software development with multiple agents, parallel workflows, and long-running tasks.

Developer Tools

🟡 Snowflake and OpenAI partner to bring frontier intelligence to enterprise data | score 50 | Sources: lab_blog/OpenAI

OpenAI and Snowflake partner in a $200M agreement to bring frontier intelligence into enterprise data, enabling AI agents and insights directly in Snowflake.

🟡 ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought | score 40 | Sources: huggingface

While Chain-of-Thought (CoT) significantly enhances the performance of Large Language Models (LLMs), explicit reasoning chains introduce substantial computational redundancy. Recent latent reasoning methods attempt to mitigate this by compressing reasoning processes into latent space, but often suff...

🟡 TTCS: Test-Time Curriculum Synthesis for Self-Evolving | score 40 | Sources: huggingface

Test-Time Training offers a promising way to improve the reasoning ability of large language models (LLMs) by adapting the model using only the test questions. However, existing methods struggle with difficult reasoning problems for two reasons: raw test questions are often too difficult to yield hi...

🟢 Incremental

Developer Tools

🟢 Causal World Modeling for Robot Control | score 25 | Sources: huggingface

This work highlights that video world modeling, alongside vision-language pre-training, establishes a fresh and independent foundation for robot learning. Intuitively, video world models provide the ability to imagine the near future by understanding the causality between actions and visual dynamics ...

🟢 Do Reasoning Models Enhance Embedding Models? | score 15 | Sources: huggingface

State-of-the-art embedding models are increasingly derived from decoder-only Large Language Model (LLM) backbones adapted via contrastive learning. Given the emergence of reasoning models trained via Reinforcement Learning with Verifiable Rewards (RLVR), a natural question arises: do enhanced reason...

🟢 MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning | score 5 | Sources: huggingface

Long-horizon agentic reasoning necessitates effectively compressing growing interaction histories into a limited context window. Most existing memory systems serialize history as text, where token-level cost is uniform and scales linearly with length, often spending scarce budget on low-value detail...

📄 New Papers

Title | Category | Score | Link
PaperBanana: Automating Academic Illustration for AI Scientists | developer_tool | 240 | Open
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text | model_release | 117 | Open
Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation | developer_tool | 65 | Open
ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas | model_release | 65 | Open
THINKSAFE: Self-Generated Safety Alignment for Reasoning Models | model_release | 42 | Open
OpInf-LLM: Parametric PDE Solving with LLMs via Operator Inference | cs.AI | 0 | Open
Draw2Learn: A Human-AI Collaborative Tool for Drawing-Based Science Learning | cs.AI | 0 | Open
Governance at the Edge of Architecture: Regulating NeuroAI and Neuromorphic Systems | cs.AI | 0 | Open
Harnessing Flexible Spatial and Temporal Data Center Workloads for Grid Regulation Services | cs.AI | 0 | Open
MarkCleaner: High-Fidelity Watermark Removal via Imperceptible Micro-Geometric Perturbation | cs.AI | 0 | Open
White-Box Neural Ensemble for Vehicular Plasticity: Quantifying the Efficiency Cost of Symbolic Auditability in Adaptive NMPC | cs.AI | 0 | Open
Qrita: High-performance Top-k and Top-p Algorithm for GPUs using Pivot-based Truncation and Selection | cs.AI | 0 | Open
You Need an Encoder for Native Position-Independent Caching | cs.AI | 0 | Open
A Relative-Budget Theory for Reinforcement Learning with Verifiable Rewards in Large Language Model Reasoning | cs.AI | 0 | Open
Toward a Machine Bertin: Why Visualization Needs Design Principles for Machine Cognition | cs.AI | 0 | Open

🏢 Lab Blog Posts