🔴 High Significance
Developer Tools
🔴 ERNIE 5.0 Technical Report - score 95
Sources: huggingface
In this report, we introduce ERNIE 5.0, a natively autoregressive foundation model designed for unified multimodal understanding and generation across text, image, video, and audio. All modalities are trained from scratch under a unified next-group-of-tokens prediction objective, based on an ultra-s
🔴 FASA: Frequency-aware Sparse Attention - score 85
Sources: huggingface
The deployment of Large Language Models (LLMs) faces a critical bottleneck when handling lengthy inputs: the prohibitive memory footprint of the Key Value (KV) cache. To address this bottleneck, the token pruning paradigm leverages attention sparsity to selectively retain a small, critical subset of
🔴 WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning - score 75
Sources: huggingface
Recent advancements in Large Language Models (LLMs) have largely focused on depth scaling, where a single agent solves long-horizon problems with multi-turn reasoning and tool use. However, as tasks grow broader, the key bottleneck shifts from individual competence to organizational capability. In t
🟡 Notable
Model Releases
🟡 Training Data Efficiency in Multimodal Process Reward Models - score 65
Sources: huggingface
Multimodal Process Reward Models (MPRMs) are central to step-level supervision for visual reasoning in MLLMs. Training MPRMs typically requires large-scale Monte Carlo (MC)-annotated corpora, incurring substantial training cost. This paper studies the data efficiency for MPRM training. Our preliminar
🟡 OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models - score 55
Sources: huggingface
Omni-modal Large Language Models (Omni-LLMs) have demonstrated strong capabilities in audio-video understanding tasks. However, their reliance on long multimodal token sequences leads to substantial computational overhead. Despite this challenge, token compression methods designed for Omni-LLMs rema
🟡 GPT-5 lowers the cost of cell-free protein synthesis - score 50
Sources: lab_blog/OpenAI
An autonomous lab combining OpenAI's GPT-5 with Ginkgo Bioworks' cloud automation cut cell-free protein synthesis costs by 40% through closed-loop experimentation.
🟡 Introducing Trusted Access for Cyber - score 50
Sources: lab_blog/OpenAI
OpenAI introduces Trusted Access for Cyber, a trust-based framework that expands access to frontier cyber capabilities while strengthening safeguards against misuse.
🟡 Introducing OpenAI Frontier - score 50
Sources: lab_blog/OpenAI
OpenAI Frontier is an enterprise platform for building, deploying, and managing AI agents with shared context, onboarding, permissions, and governance.
Three additional model-release items are omitted from the main section; see the raw data and source-specific sections below.
🟢 Incremental
Model Releases
🟢 EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models - score 35
Sources: huggingface
Deploying humanoid robots in real-world settings is fundamentally challenging, as it demands tight integration of perception, locomotion, and manipulation under partial-information observations and dynamically changing environments. As well as transitioning robustly between sub-tasks of different ty
🟢 Rethinking the Trust Region in LLM Reinforcement Learning - score 25
Sources: huggingface
Reinforcement learning (RL) has become a cornerstone for fine-tuning Large Language Models (LLMs), with Proximal Policy Optimization (PPO) serving as the de facto standard algorithm. Despite its ubiquity, we argue that the core ratio clipping mechanism in PPO is structurally ill-suited for the large
Developer Tools
🟢 Residual Context Diffusion Language Models - score 15
Sources: huggingface
Diffusion Large Language Models (dLLMs) have emerged as a promising alternative to purely autoregressive language models because they can decode multiple tokens in parallel. However, state-of-the-art block-wise dLLMs rely on a "remasking" mechanism that decodes only the most confident tokens and dis
🟢 TIDE: Trajectory-based Diagnostic Evaluation of Test-Time Improvement in LLM Agents - score 5
Sources: huggingface
Recent advances in autonomous LLM agents demonstrate their ability to improve performance through iterative interaction with the environment. We define this paradigm as Test-Time Improvement (TTI). However, the mechanisms underlying how and why TTI succeeds or fails remain poorly understood, and existing e
New Papers
| Title | Category | Score | Link |
|---|---|---|---|
| ERNIE 5.0 Technical Report | developer_tool | 274 | Open |
| FASA: Frequency-aware Sparse Attention | developer_tool | 166 | Open |
| WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning | developer_tool | 103 | Open |
| Training Data Efficiency in Multimodal Process Reward Models | model_release | 82 | Open |
| OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models | model_release | 53 | Open |
| TIDE: Temporal Incremental Draft Engine for Self-Improving LLM Inference | cs.AI | 0 | Open |
| Cross-talk based multi-task learning for fault classification of machine system influenced by multiple variables | cs.AI | 0 | Open |
| CoSA: Compressed Sensing-Based Adaptation of Large Language Models | cs.AI | 0 | Open |
| Position: Capability Control Should be a Separate Goal From Alignment | cs.AI | 0 | Open |
| EBPO: Empirical Bayes Shrinkage for Stabilizing Group-Relative Policy Optimization | cs.AI | 0 | Open |
| TemporalBench: A Benchmark for Evaluating LLM-Based Agents on Contextual and Event-Informed Time Series Tasks | cs.AI | 0 | Open |
| Total Variation Rates for Riemannian Flow Matching | cs.AI | 0 | Open |
| Benchmarking Artificial Intelligence Models for Daily Coastal Hypoxia Forecasting | cs.AI | 0 | Open |
| Data-Centric Interpretability for LLM-based Multi-Agent Reinforcement Learning | cs.AI | 0 | Open |
| Towards Worst-Case Guarantees with Scale-Aware Interpretability | cs.AI | 0 | Open |
Lab Blog Posts
- OpenAI: GPT-5 lowers the cost of cell-free protein synthesis
- OpenAI: Introducing Trusted Access for Cyber
- OpenAI: Introducing OpenAI Frontier
- OpenAI: Introducing GPT-5.3-Codex
- OpenAI: GPT-5.3-Codex System Card