🔴 High Significance

Model Releases

🔴 OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration - score 95. Sources: huggingface

As high-quality public text approaches exhaustion, a phenomenon known as the Data Wall, pre-training is shifting from more tokens to better tokens. However, existing methods either rely on heuristic static filters that ignore training dynamics, or use dynamic yet optimizer-agnostic criteria based on

🔴 Code2World: A GUI World Model via Renderable Code Generation - score 85. Sources: huggingface

Autonomous GUI agents interact with environments by perceiving interfaces and executing actions. As a virtual sandbox, the GUI World model empowers agents with human-like foresight by enabling action-conditioned prediction. However, existing text- and pixel-based approaches struggle to simultaneousl

Developer Tools

🔴 UI-Venus-1.5 Technical Report - score 75. Sources: huggingface

GUI agents have emerged as a powerful paradigm for automating interactions in digital environments, yet achieving both broad generality and consistently strong task performance remains challenging. In this report, we present UI-Venus-1.5, a unified, end-to-end GUI Agent designed for robust real-world

🟡 Notable

Model Releases

🟡 Chain of Mindset: Reasoning with Adaptive Cognitive Modes - score 55. Sources: huggingface

Human problem-solving is never the repetition of a single mindset, by which we mean a distinct mode of cognitive processing. When tackling a specific task, we do not rely on a single mindset; instead, we integrate multiple mindsets within the single solution process. However, existing LLM reasoning

🟡 P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads - score 45. Sources: huggingface

The transition from symbolic manipulation to science-grade reasoning represents a pivotal frontier for Large Language Models (LLMs), with physics serving as the critical test anchor for binding abstract logic to physical reality. Physics demands that a model maintain physical consistency with the la

Developer Tools

🟡 SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning - score 65. Sources: huggingface

Large Language Model (LLM) agents have shown stunning results in complex tasks, yet they often operate in isolation, failing to learn from past experiences. Existing memory-based methods primarily store raw trajectories, which are often redundant and noise-heavy. This prevents agents from extracting

🟡 Harness engineering: leveraging Codex in an agent-first world - score 50. Sources: lab_blog/OpenAI

By Ryan Lopopolo, Member of the Technical Staff

🟢 Incremental

Developer Tools

🟢 Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning - score 35. Sources: huggingface

Recent advances in large language models (LLMs) have empowered autonomous agents to perform complex tasks that require multi-turn interactions with tools and environments. However, scaling such agent training is limited by the lack of diverse and reliable environments. In this paper, we propose Agent

🟢 Prism: Spectral-Aware Block-Sparse Attention - score 25. Sources: huggingface

Block-sparse attention is promising for accelerating long-context LLM pre-filling, yet identifying relevant blocks efficiently remains a bottleneck. Existing methods typically employ coarse-grained attention as a proxy for block importance estimation, but often resort to expensive token-level search

🟢 Agent Banana: High-Fidelity Image Editing with Agentic Thinking and Tooling - score 15. Sources: huggingface

We study instruction-based image editing under professional workflows and identify three persistent challenges: (i) editors often over-edit, modifying content beyond the user's intent; (ii) existing models are largely single-turn, while multi-turn edits can alter object faithfulness; and (iii) evalu

🟢 DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents - score 5. Sources: huggingface

Recently, Diffusion Large Language Models (dLLMs) have demonstrated unique efficiency advantages, enabled by their inherently parallel decoding mechanism and flexible generation paradigm. Meanwhile, despite the rapid advancement of Search Agents, their practical deployment is constrained by a fundam

📄 New Papers

Title | Category | Score | Link
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration | model_release | 355 | Open
Code2World: A GUI World Model via Renderable Code Generation | model_release | 205 | Open
UI-Venus-1.5 Technical Report | developer_tool | 161 | Open
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning | developer_tool | 78 | Open
Chain of Mindset: Reasoning with Adaptive Cognitive Modes | model_release | 77 | Open
The Alignment Bottleneck in Decomposition-Based Claim Verification | cs.AI | 0 | Open
Capture Timing-Attention of Events in Clinical Time Series | cs.AI | 0 | Open
Making Databases Faster with LLM Evolutionary Sampling | cs.AI | 0 | Open
Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs | cs.AI | 0 | Open
Affordances Enable Partial World Modeling with LLMs | cs.AI | 0 | Open
Modular Multi-Task Learning for Chemical Reaction Prediction | cs.AI | 0 | Open
AI-rithmetic | cs.AI | 0 | Open
Equivariant Evidential Deep Learning for Interatomic Potentials | cs.AI | 0 | Open
AIvilization v0: Toward Large-Scale Artificial Social Simulation with a Unified Agent Architecture and Adaptive Agent Profiles | cs.AI | 0 | Open
Breaking the Curse of Repulsion: Optimistic Distributionally Robust Policy Optimization for Off-Policy Generative Recommendation | cs.AI | 0 | Open

๐Ÿข Lab Blog Posts