🔴 High Significance
Model Releases
🔴 OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration - score 95
Sources: huggingface
As high-quality public text approaches exhaustion, a phenomenon known as the Data Wall, pre-training is shifting from more tokens to better tokens. However, existing methods either rely on heuristic static filters that ignore training dynamics, or use dynamic yet optimizer-agnostic criteria based on
🔴 Code2World: A GUI World Model via Renderable Code Generation - score 85
Sources: huggingface
Autonomous GUI agents interact with environments by perceiving interfaces and executing actions. As a virtual sandbox, the GUI World model empowers agents with human-like foresight by enabling action-conditioned prediction. However, existing text- and pixel-based approaches struggle to simultaneousl
Developer Tools
🔴 UI-Venus-1.5 Technical Report - score 75
Sources: huggingface
GUI agents have emerged as a powerful paradigm for automating interactions in digital environments, yet achieving both broad generality and consistently strong task performance remains challenging. In this report, we present UI-Venus-1.5, a unified, end-to-end GUI Agent designed for robust real-world
🟡 Notable
Model Releases
🟡 Chain of Mindset: Reasoning with Adaptive Cognitive Modes - score 55
Sources: huggingface
Human problem-solving is never the repetition of a single mindset, by which we mean a distinct mode of cognitive processing. When tackling a specific task, we do not rely on a single mindset; instead, we integrate multiple mindsets within the single solution process. However, existing LLM reasoning
🟡 P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads - score 45
Sources: huggingface
The transition from symbolic manipulation to science-grade reasoning represents a pivotal frontier for Large Language Models (LLMs), with physics serving as the critical test anchor for binding abstract logic to physical reality. Physics demands that a model maintain physical consistency with the la
Developer Tools
🟡 SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning - score 65
Sources: huggingface
Large Language Model (LLM) agents have shown stunning results in complex tasks, yet they often operate in isolation, failing to learn from past experiences. Existing memory-based methods primarily store raw trajectories, which are often redundant and noise-heavy. This prevents agents from extracting
🟡 Harness engineering: leveraging Codex in an agent-first world - score 50
Sources: lab_blog/OpenAI
By Ryan Lopopolo, Member of the Technical Staff
🟢 Incremental
Developer Tools
🟢 Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning - score 35
Sources: huggingface
Recent advances in large language models (LLMs) have empowered autonomous agents to perform complex tasks that require multi-turn interactions with tools and environments. However, scaling such agent training is limited by the lack of diverse and reliable environments. In this paper, we propose Agent
🟢 Prism: Spectral-Aware Block-Sparse Attention - score 25
Sources: huggingface
Block-sparse attention is promising for accelerating long-context LLM pre-filling, yet identifying relevant blocks efficiently remains a bottleneck. Existing methods typically employ coarse-grained attention as a proxy for block importance estimation, but often resort to expensive token-level search
🟢 Agent Banana: High-Fidelity Image Editing with Agentic Thinking and Tooling - score 15
Sources: huggingface
We study instruction-based image editing under professional workflows and identify three persistent challenges: (i) editors often over-edit, modifying content beyond the user's intent; (ii) existing models are largely single-turn, while multi-turn edits can alter object faithfulness; and (iii) evalu
🟢 DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents - score 5
Sources: huggingface
Recently, Diffusion Large Language Models (dLLMs) have demonstrated unique efficiency advantages, enabled by their inherently parallel decoding mechanism and flexible generation paradigm. Meanwhile, despite the rapid advancement of Search Agents, their practical deployment is constrained by a fundam
New Papers
| Title | Category | Score | Link |
|---|---|---|---|
| OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration | model_release | 355 | Open |
| Code2World: A GUI World Model via Renderable Code Generation | model_release | 205 | Open |
| UI-Venus-1.5 Technical Report | developer_tool | 161 | Open |
| SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning | developer_tool | 78 | Open |
| Chain of Mindset: Reasoning with Adaptive Cognitive Modes | model_release | 77 | Open |
| The Alignment Bottleneck in Decomposition-Based Claim Verification | cs.AI | 0 | Open |
| Capture Timing-Attention of Events in Clinical Time Series | cs.AI | 0 | Open |
| Making Databases Faster with LLM Evolutionary Sampling | cs.AI | 0 | Open |
| Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs | cs.AI | 0 | Open |
| Affordances Enable Partial World Modeling with LLMs | cs.AI | 0 | Open |
| Modular Multi-Task Learning for Chemical Reaction Prediction | cs.AI | 0 | Open |
| AI-rithmetic | cs.AI | 0 | Open |
| Equivariant Evidential Deep Learning for Interatomic Potentials | cs.AI | 0 | Open |
| AIvilization v0: Toward Large-Scale Artificial Social Simulation with a Unified Agent Architecture and Adaptive Agent Profiles | cs.AI | 0 | Open |
| Breaking the Curse of Repulsion: Optimistic Distributionally Robust Policy Optimization for Off-Policy Generative Recommendation | cs.AI | 0 | Open |