🔴 High Significance
Model Releases
🔴 FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization (score 95)
Sources: huggingface
We present Future-KL Influenced Policy Optimization (FIPO), a reinforcement learning algorithm designed to overcome reasoning bottlenecks in large language models. While GRPO-style training scales effectively, it typically relies on outcome-based rewards (ORM) that distribute a global advantage unif
🔴 CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence (score 85)
Sources: huggingface
The convergence of low-altitude economies, embodied intelligence, and air-ground cooperative systems creates growing demand for simulation infrastructure capable of jointly modeling aerial and ground agents within a single physically coherent environment. Existing open-source platforms remain domain
Developer Tools
🔴 LongCat-Next: Lexicalizing Modalities as Discrete Tokens (score 75)
Sources: huggingface
The prevailing Next-Token Prediction (NTP) paradigm has driven the success of large language models through discrete autoregressive modeling. However, contemporary multimodal systems remain language-centric, often treating non-linguistic modalities as external attachments, leading to fragmented arch
🟡 Notable
Model Releases
🟡 GEMS: Agent-Native Multimodal Generation with Memory and Skills (score 65)
Sources: huggingface
Recent multimodal generation models have achieved remarkable progress on general-purpose generation tasks, yet continue to struggle with complex instructions and specialized downstream tasks. Inspired by the success of advanced agent frameworks such as Claude Code, we propose GEMS (Agent-Native Mult
🟡 Gradient Labs gives every bank customer an AI account manager (score 50)
Sources: lab_blog/OpenAI
Gradient Labs uses GPT-4.1 and GPT-5.4 mini and nano to power AI agents that automate banking support workflows with low latency and high reliability.
🟡 Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development (score 45)
Sources: huggingface
Foundation models have demonstrated remarkable success across diverse domains and tasks, primarily due to the growth of large-scale, diverse, and high-quality datasets. However, in the field of medical imaging, curating and assembling such datasets is highly challenging due to the re
Developer Tools
🟡 Lingshu-Cell: A generative cellular world model for transcriptome modeling toward virtual cells (score 55)
Sources: huggingface
Modeling cellular states and predicting their responses to perturbations are central challenges in computational biology and the development of virtual cells. Existing foundation models for single-cell transcriptomics provide powerful static representations, but they do not explicitly model the dist
🟢 Incremental
Developer Tools
🟢 All Roads Lead to Rome: Incentivizing Divergent Thinking in Vision-Language Models (score 35)
Sources: huggingface
Recent studies have demonstrated that Reinforcement Learning (RL), notably Group Relative Policy Optimization (GRPO), can intrinsically elicit and enhance the reasoning capabilities of Vision-Language Models (VLMs). However, despite the promise, the underlying mechanisms that drive the effectiveness
🟢 VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward (score 25)
Sources: huggingface
Large-scale video diffusion models achieve impressive visual quality, yet often fail to preserve geometric consistency. Prior approaches improve consistency either by augmenting the generator with additional modules or applying geometry-aware alignment. However, architectural modifications can compr
🟢 CutClaw: Agentic Hours-Long Video Editing via Music Synchronization (score 15)
Sources: huggingface
Editing video content in sync with audio has become a form of digital human-made art on today's social media. However, the time-consuming and repetitive nature of manual video editing has long been a challenge for filmmakers and professional content creators alike. In this paper, we introduce CutClaw, an a
🟢 Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis (score 5)
Sources: huggingface
Unified multimodal models provide a natural and promising architecture for understanding diverse and complex real-world knowledge while generating high-quality images. However, they still rely primarily on frozen parametric knowledge, which makes them struggle with real-world image generation involv
New Papers
| Title | Category | Score | Link |
|---|---|---|---|
| FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization | model_release | 356 | Open |
| CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence | model_release | 344 | Open |
| LongCat-Next: Lexicalizing Modalities as Discrete Tokens | developer_tool | 150 | Open |
| GEMS: Agent-Native Multimodal Generation with Memory and Skills | model_release | 89 | Open |
| Lingshu-Cell: A generative cellular world model for transcriptome modeling toward virtual cells | developer_tool | 84 | Open |
| Go Big or Go Home: Simulating Mobbing Behavior with Braitenbergian Robots | cs.AI | 0 | Open |
| Signals: Trajectory Sampling and Triage for Agentic Interactions | cs.AI | 0 | Open |
| In harmony with gpt-oss | cs.AI | 0 | Open |
| Grading the Unspoken: Evaluating Tacit Reasoning in Quantum Field Theory and String Theory with LLMs | cs.AI | 0 | Open |
| RAGShield: Detecting Numerical Claim Manipulation in Government RAG Systems | cs.AI | 0 | Open |
| EvolveTool-Bench: Evaluating the Quality of LLM-Generated Tool Libraries as Software Artifacts | cs.AI | 0 | Open |
| Deep Networks Favor Simple Data | cs.AI | 0 | Open |
| Improving Generalization of Deep Learning for Brain Metastases Segmentation Across Institutions | cs.AI | 0 | Open |
| COTTA: Context-Aware Transfer Adaptation for Trajectory Prediction in Autonomous Driving | cs.AI | 0 | Open |
| Deliberative Alignment is Deep, but Uncertainty Remains: Inference time safety improvement in reasoning via attribution of unsafe behavior to base model | cs.AI | 0 | Open |