🔴 High Significance
Model Releases
🔴 VLANeXt: Recipes for Building Strong VLA Models - score 75
Sources: huggingface
Following the rise of large foundation models, Vision-Language-Action models (VLAs) emerged, leveraging strong visual and language understanding for general-purpose policy learning. Yet, the current VLA landscape remains fragmented and exploratory. Although many groups have proposed their own VLA mo…
Developer Tools
🔴 A Very Big Video Reasoning Suite - score 95
Sources: huggingface
Rapid progress in video models has largely focused on visual quality, leaving their reasoning capabilities underexplored. Video reasoning grounds intelligence in spatiotemporally consistent visual environments that go beyond what text can naturally capture, enabling intuitive reasoning over spatiote…
🔴 SkillOrchestra: Learning to Route Agents via Skill Transfer - score 85
Sources: huggingface
Compound AI systems promise capabilities beyond those of individual models, yet their success depends critically on effective orchestration. Existing routing approaches face two limitations: (1) input-level routers make coarse query-level decisions that ignore evolving task requirements; (2) RL-trai…
🟡 Notable
Model Releases
🟡 TOPReward: Token Probabilities as Hidden Zero-Shot Rewards for Robotics - score 45
Sources: huggingface
While Vision-Language-Action (VLA) models have seen rapid progress in pretraining, their advancement in Reinforcement Learning (RL) remains hampered by low sample efficiency and sparse rewards in real-world settings. Developing generalizable process reward models is essential for providing the fine-…
Developer Tools
🟡 Agents of Chaos - score 65
Sources: huggingface
We report an exploratory red-teaming study of autonomous language-model-powered agents deployed in a live laboratory environment with persistent memory, email accounts, Discord access, file systems, and shell execution. Over a two-week period, twenty AI researchers interacted with the agents under b…
🟡 ManCAR: Manifold-Constrained Latent Reasoning with Adaptive Test-Time Computation for Sequential Recommendation - score 55
Sources: huggingface
Sequential recommendation increasingly employs latent multi-step reasoning to enhance test-time computation. Despite empirical gains, existing approaches largely drive intermediate reasoning states via target-dominant objectives without imposing explicit feasibility constraints. This results in late…
Other Signals
🟡 Arvind KC appointed Chief People Officer - score 50
Sources: lab_blog/OpenAI
OpenAI appoints Arvind KC as Chief People Officer to help scale the company, strengthen its culture, and lead how work evolves in the age of AI.
🟢 Incremental
Developer Tools
🟢 Mobile-O: Unified Multimodal Understanding and Generation on Mobile Device - score 35
Sources: huggingface
Unified multimodal models can both understand and generate visual content within a single architecture. Existing models, however, remain data-hungry and too heavy for deployment on edge devices. We present Mobile-O, a compact vision-language-diffusion model that brings unified multimodal intelligenc…
🟢 Learning Cross-View Object Correspondence via Cycle-Consistent Mask Prediction - score 15
Sources: huggingface
We study the task of establishing object-level visual correspondence across different viewpoints in videos, focusing on the challenging egocentric-to-exocentric and exocentric-to-egocentric scenarios. We propose a simple yet effective framework based on conditional binary segmentation, where an obje…
🟢 DSDR: Dual-Scale Diversity Regularization for Exploration in LLM Reasoning - score 5
Sources: huggingface
Reinforcement learning with verifiers (RLVR) is a central paradigm for improving large language model (LLM) reasoning, yet existing methods often suffer from limited exploration. Policies tend to collapse onto a few reasoning patterns and prematurely stop deep exploration, while conventional entropy…
Infrastructure & Compute
🟢 SimToolReal: An Object-Centric Policy for Zero-Shot Dexterous Tool Manipulation - score 25
Sources: huggingface
The ability to manipulate tools significantly expands the set of tasks a robot can perform. Yet, tool manipulation represents a challenging class of dexterity, requiring grasping thin objects, in-hand object rotations, and forceful interactions. Since collecting teleoperation data for these behavior…
New Papers
| Title | Category | Score | Link |
|---|---|---|---|
| A Very Big Video Reasoning Suite | developer_tool | 525 | Open |
| SkillOrchestra: Learning to Route Agents via Skill Transfer | developer_tool | 64 | Open |
| VLANeXt: Recipes for Building Strong VLA Models | model_release | 56 | Open |
| Agents of Chaos | developer_tool | 38 | Open |
| ManCAR: Manifold-Constrained Latent Reasoning with Adaptive Test-Time Computation for Sequential Recommendation | developer_tool | 33 | Open |
| Can AI be a Teaching Partner? Evaluating ChatGPT, Gemini, and DeepSeek across Three Teaching Strategies | cs.AI | 0 | Open |
| Imputation of Unknown Missingness in Sparse Electronic Health Records | cs.AI | 0 | Open |
| Protein Language Models Diverge from Natural Language: Comparative Analysis and Improved Inference | cs.AI | 0 | Open |
| PreScience: A Benchmark for Forecasting Scientific Contributions | cs.AI | 0 | Open |
| ERA: Evidence-based Reliability Alignment for Honest Retrieval-Augmented Generation | cs.AI | 0 | Open |
| Elimination-compensation pruning for fully-connected neural networks | cs.AI | 0 | Open |
| VINA: Variational Invertible Neural Architectures | cs.AI | 0 | Open |
| Hybrid LLM-Embedded Dialogue Agents for Learner Reflection: Designing Responsive and Theory-Driven Interactions | cs.AI | 0 | Open |
| Wireless Federated Multi-Task LLM Fine-Tuning via Sparse-and-Orthogonal LoRA | cs.AI | 0 | Open |
| KairosVL: Orchestrating Time Series and Semantics for Unified Reasoning | cs.AI | 0 | Open |