🔴 High Significance
Model Releases
🔴 Near-Future Policy Optimization - score 85
Sources: huggingface
Reinforcement learning with verifiable rewards (RLVR) has become a core post-training recipe. Introducing suitable off-policy trajectories into on-policy exploration accelerates RLVR convergence and raises the performance ceiling, yet finding a source of such trajectories remains the key challenge.
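As a rough illustration of the idea summarized above (not the paper's algorithm), the sketch below pairs a binary verifiable reward with a batch that mixes a fixed share of off-policy trajectories into on-policy samples. The names `verifiable_reward`, `mix_batch`, and `off_fraction`, the exact-match check, and the toy data are all assumptions made for this example.

```python
import random


def verifiable_reward(answer: str, reference: str) -> float:
    """Binary reward: 1.0 if the normalized answer equals the reference, else 0.0."""
    return 1.0 if answer.strip().lower() == reference.strip().lower() else 0.0


def mix_batch(on_policy, off_policy, off_fraction, batch_size, rng):
    """Build a batch in which about `off_fraction` of items are off-policy (hypothetical helper)."""
    n_off = min(len(off_policy), round(batch_size * off_fraction))
    n_on = min(len(on_policy), batch_size - n_off)
    batch = rng.sample(off_policy, n_off) + rng.sample(on_policy, n_on)
    rng.shuffle(batch)
    return batch


# Toy data: on-policy rollouts that miss the answer, off-policy ones that hit it.
rng = random.Random(0)
on_policy = [{"source": "on", "answer": "41", "reference": "42"} for _ in range(8)]
off_policy = [{"source": "off", "answer": "42", "reference": "42"} for _ in range(8)]

batch = mix_batch(on_policy, off_policy, off_fraction=0.25, batch_size=8, rng=rng)
rewards = [verifiable_reward(t["answer"], t["reference"]) for t in batch]
mean_reward = sum(rewards) / len(rewards)
```

In this toy setup the off-policy samples are the only ones that earn reward, which is the intuition behind injecting them; how to source such trajectories in practice is the open question the item raises.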
🔴 DR-Venus: Towards Frontier Edge-Scale Deep Research Agents with Only 10K Open Data - score 75
Sources: huggingface
Edge-scale deep research agents based on small language models are attractive for real-world deployment due to their advantages in cost, latency, and privacy. In this work, we study how to train a strong small deep research agent under limited open-data by improving both data quality and data utiliz
Developer Tools
🔴 LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model - score 95
Sources: huggingface
We present LLaDA2.0-Uni, a unified discrete diffusion large language model (dLLM) that supports multimodal understanding and generation within a natively integrated framework. Its architecture combines a fully semantic discrete tokenizer, a MoE-based dLLM backbone, and a diffusion decoder. By discre
🟡 Notable
Model Releases
🟡 OpenMobile: Building Open Mobile Agents with Task and Trajectory Synthesis - score 55
Sources: huggingface
Mobile agents powered by vision-language models have demonstrated impressive capabilities in automating mobile tasks, with recent leading models achieving a marked performance leap, e.g., nearly 70% success on AndroidWorld. However, these systems keep their training data closed and remain opaque abo
🟡 Introducing GPT-5.5 - score 50
Sources: lab_blog/OpenAI
Introducing GPT-5.5, our smartest model yet: faster, more capable, and built for complex tasks like coding, research, and data analysis across tools.
🟡 GPT-5.5 System Card - score 50
Sources: lab_blog/OpenAI
🟡 GPT-5.5 Bio Bug Bounty - score 50
Sources: lab_blog/OpenAI
Explore the GPT-5.5 Bio Bug Bounty: a red-teaming challenge to find universal jailbreaks for bio safety risks, with rewards up to $25,000.
Developer Tools
🟡 Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges - score 65
Sources: huggingface
Reinforcement Learning from Human Feedback (RLHF) and related alignment paradigms have become central to steering large language models (LLMs) and multimodal large language models (MLLMs) toward human-preferred behaviors. However, these approaches introduce a systemic vulnerability: reward hacking,
🟡 What is Codex? - score 50
Sources: lab_blog/OpenAI
Learn how Codex helps you go beyond chat by automating tasks, connecting tools, and producing real outputs like docs and dashboards.
🟡 How to get started with Codex - score 50
Sources: lab_blog/OpenAI
Learn how to get started with Codex by setting up projects, creating threads, and completing your first tasks with step-by-step guidance.
🟡 Codex settings - score 50
Sources: lab_blog/OpenAI
Learn how to configure Codex settings, including personalization, detail level, and permissions, to run tasks smoothly and customize your workflow.
🟡 Working with Codex - score 50
Sources: lab_blog/OpenAI
Learn how to set up your Codex workspace, create threads and projects, manage files, and start completing tasks with step-by-step guidance.
Omitted 4 additional developer tools items from the main section; see raw data and source-specific sections below.
Business & Funding
🟡 From building Applied Intuition from YC-era autonomy tooling into a $15B physical AI company, Qasar Younis and Peter Ludwig have spent the last decade living through the full arc of autonomy: from simulation and - score 65
Sources: newsletter/Latent Space
From building Applied Intuition from YC-era autonomy tooling into a $15B physical AI company, Qasar Younis and Peter Ludwig have spent the last decade living through the full arc of autonomy: from simulation and data infrastructure for robotaxi companies, to operating systems for safety-critical machines, to dep
Other Signals
🟡 Applied Intuition’s mission: building physical AI for a safer, more prosperous world, powering cars, trucks, construction and mining equipment, agriculture, defense, and other moving machines - score 65
Sources: newsletter/Latent Space
Why physical AI is different from screen-based AI: learned systems can make mistakes in chat or coding, but safety-critical machines like driverless trucks, autonomous vehicles, and robots need much higher reliability.
🟢 Incremental
Developer Tools
🟢 Exploring Spatial Intelligence from a Generative Perspective - score 35
Sources: huggingface
Spatial intelligence is essential for multimodal large language models, yet current benchmarks largely assess it only from an understanding perspective. We ask whether modern generative or unified multimodal models also possess generative spatial intelligence (GSI), the ability to respect and manipu
🟢 A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression - score 20
Sources: huggingface
As model capabilities advance, research has increasingly shifted toward long-horizon, multi-turn terminal-centric agentic tasks, where raw environment feedback is often preserved in the interaction history to support future decisions. However, repeatedly retaining such feedback introduces substantia
🟢 Expert Upcycling: Shifting the Compute-Efficient Frontier of Mixture-of-Experts - score 20
Sources: huggingface
Mixture-of-Experts (MoE) has become the dominant architecture for scaling large language models: frontier models routinely decouple total parameters from per-token computation through sparse expert routing. Scaling laws show that under fixed active computation, model quality scales predictably with
🟢 Image Generators are Generalist Vision Learners - score 5
Sources: huggingface
Recent works show that image and video generators exhibit zero-shot visual understanding behaviors, in a way reminiscent of how LLMs develop emergent capabilities of language understanding and reasoning from generative pretraining. While it has long been conjectured that the ability to create visual
New Papers
| Title | Category | Score | Link |
|---|---|---|---|
| LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model | developer_tool | 240 | Open |
| Near-Future Policy Optimization | model_release | 76 | Open |
| DR-Venus: Towards Frontier Edge-Scale Deep Research Agents with Only 10K Open Data | model_release | 54 | Open |
| Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges | developer_tool | 34 | Open |
| OpenMobile: Building Open Mobile Agents with Task and Trajectory Synthesis | model_release | 30 | Open |
| Feedback Over Form: Why Execution Feedback Matters More Than Pipeline Topology in 1-3B Code Generation | cs.AI | 0 | Open |
| TAPO-Description Logic for Information Behavior: Refined OBoxes, Inference, and Categorical Semantics | cs.AI | 0 | Open |
| Scaling of Gaussian Kolmogorov--Arnold Networks | cs.AI | 0 | Open |
| Doubly Saturated Ramsey Graphs: A Case Study in Computer-Assisted Mathematical Discovery | cs.AI | 0 | Open |
| How VLAs (Really) Work In Open-World Environments | cs.AI | 0 | Open |
| Trust but Verify: Introducing DAVinCI -- A Framework for Dual Attribution and Verification in Claim Inference for Language Models | cs.AI | 0 | Open |
| On Reasoning Behind Next Occupation Recommendation | cs.AI | 0 | Open |
| IntrAgent: An LLM Agent for Content-Grounded Information Retrieval through Literature Review | cs.AI | 0 | Open |
| Align Generative Artificial Intelligence with Human Preferences: A Novel Large Language Model Fine-Tuning Method for Online Review Management | cs.AI | 0 | Open |
| A Demonstration of SQLyzr: A Platform for Fine-Grained Text-to-SQL Evaluation and Analysis | cs.AI | 0 | Open |
Lab Blog Posts
- OpenAI: Introducing GPT-5.5
- OpenAI: GPT-5.5 System Card
- OpenAI: What is Codex?
- OpenAI: How to get started with Codex
- OpenAI: Codex settings
- OpenAI: Working with Codex
- OpenAI: Plugins and skills
- OpenAI: Top 10 uses for Codex at work
- OpenAI: Automations
- OpenAI: GPT-5.5 Bio Bug Bounty
📰 Newsletter Roundup
Items surfaced by newsletter editors that were not merged with primary sources:
- newsletter/Latent Space: From building Applied Intuition from YC-era autonomy tooling into a $15B physical AI company, Qasar Younis and Peter Ludwig have spent the last decade living through the full arc of autonomy: from simulation and
- newsletter/Latent Space: Applied Intuition’s mission: building physical AI for a safer, more prosperous world, powering cars, trucks, construction and mining equipment, agriculture, defense, and other moving machines