🔴 High Significance
Developer Tools
🔴 Unified Latents (UL): How to train your latents (score 95)
Sources: huggingface
We present Unified Latents (UL), a framework for learning latent representations that are jointly regularized by a diffusion prior and decoded by a diffusion model. By linking the encoder's output noise to the prior's minimum noise level, we obtain a simple training objective that provides a tight u
🔴 Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents (score 85)
Sources: huggingface
The paper introduces GUI-Owl-1.5, the latest native GUI agent model that features instruct/thinking variants in multiple sizes (2B/4B/8B/32B/235B) and supports a range of platforms (desktop, mobile, browser, and more) to enable cloud-edge collaboration and real-time interaction. GUI-Owl-1.5 achieves
🔴 SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning (score 75)
Sources: huggingface
Many training-free sparse attention methods are effective for accelerating diffusion models. Recently, several works suggest that making sparse attention trainable can further increase sparsity while preserving generation quality. We study three key questions: (1) when do the two common masking rule
🟡 Notable
Model Releases
🟡 Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report v1.5 (score 65)
Sources: huggingface
To understand and identify the unprecedented risks posed by rapidly advancing artificial intelligence (AI) models, Frontier AI Risk Management Framework in Practice presents a comprehensive assessment of their frontier risks. As Large Language Models (LLMs) general capabilities rapidly evolve and th
Developer Tools
🟡 Computer-Using World Model (score 50)
Sources: huggingface
Agents operating in complex software environments benefit from reasoning about the consequences of their actions, as even a single incorrect user interface (UI) operation can derail long, artifact-preserving workflows. This challenge is particularly acute for computer-using scenarios, where real exe
Infrastructure & Compute
🟡 Arcee Trinity Large Technical Report (score 50)
Sources: huggingface
We present the technical report for Arcee Trinity Large, a sparse Mixture-of-Experts model with 400B total parameters and 13B activated per token. Additionally, we report on Trinity Nano and Trinity Mini, with Trinity Nano having 6B total parameters with 1B activated per token, Trinity Mini having 2
Other Signals
🟡 Our First Proof submissions (score 50)
Sources: lab_blog/OpenAI
We share our AI model's proof attempts for the First Proof math challenge, testing research-grade reasoning on expert-level problems.
🟢 Incremental
Developer Tools
🟢 Discovering Multiagent Learning Algorithms with Large Language Models (score 35)
Sources: huggingface
Much of the advancement of Multi-Agent Reinforcement Learning (MARL) in imperfect-information games has historically depended on manual iterative refinement of baselines. While foundational families like Counterfactual Regret Minimization (CFR) and Policy Space Response Oracles (PSRO) rest on solid
🟢 Calibrate-Then-Act: Cost-Aware Exploration in LLM Agents (score 25)
Sources: huggingface
LLMs are increasingly being used for complex problems which are not necessarily resolved in a single response, but require interacting with an environment to acquire information. In these scenarios, LLMs must reason about inherent cost-uncertainty tradeoffs in when to stop exploring and commit to an
🟢 "What Are You Doing?": Effects of Intermediate Feedback from Agentic LLM In-Car Assistants During Multi-Step Processing (score 15)
Sources: huggingface
Agentic AI assistants that autonomously perform multi-step tasks raise open questions for user experience: how should such systems communicate progress and reasoning during extended operations, especially in attention-critical contexts such as driving? We investigate feedback timing and verbosity fr
🟢 DDiT: Dynamic Patch Scheduling for Efficient Diffusion Transformers (score 5)
Sources: huggingface
Diffusion Transformers (DiTs) have achieved state-of-the-art performance in image and video generation, but their success comes at the cost of heavy computation. This inefficiency is largely due to the fixed tokenization process, which uses constant-sized patches throughout the entire denoising phas
New Papers
| Title | Category | Score | Link |
|---|---|---|---|
| Unified Latents (UL): How to train your latents | developer_tool | 64 | Open |
| Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents | developer_tool | 54 | Open |
| SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning | developer_tool | 50 | Open |
| Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report v1.5 | model_release | 33 | Open |
| Arcee Trinity Large Technical Report | infrastructure | 22 | Open |
| Computer-Using World Model | developer_tool | 22 | Open |
| Games That Teach, Chats That Convince: Comparing Interactive and Static Formats for Persuasive Learning | cs.AI | 0 | Open |
| Improving Neural Topic Modeling with Semantically-Grounded Soft Label Distributions | cs.AI | 0 | Open |
| Alignment in Time: Peak-Aware Orchestration for Long-Horizon Agentic Systems | cs.AI | 0 | Open |
| Condition-Gated Reasoning for Context-Dependent Biomedical Question Answering | cs.AI | 0 | Open |
| From Lossy to Verified: A Provenance-Aware Tiered Memory for Agents | cs.AI | 0 | Open |
| MIRA: Memory-Integrated Reinforcement Learning Agent with Limited LLM Guidance | cs.AI | 0 | Open |
| Memory-Based Advantage Shaping for LLM-Guided Reinforcement Learning | cs.AI | 0 | Open |
| Causal Neighbourhood Learning for Invariant Graph Representations | cs.AI | 0 | Open |
| Optimizing Graph Causal Classification Models: Estimating Causal Effects and Addressing Confounders | cs.AI | 0 | Open |
Lab Blog Posts
- OpenAI: Our First Proof submissions