🔴 High Significance
Model Releases
🔴 TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents - score 85
Sources: huggingface
Executing complex terminal tasks remains a significant challenge for open-weight LLMs, constrained by two fundamental limitations. First, high-fidelity, executable training environments are scarce: environments synthesized from real-world repositories are not diverse and scalable, while trajectories
🔴 QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining - score 75
Sources: huggingface
Financial markets are noisy and non-stationary, making alpha mining highly sensitive to noise in backtesting results and sudden market regime shifts. While recent agentic frameworks improve alpha mining automation, they often lack controllable multi-round search and reliable reuse of validated exper
Developer Tools
🔴 Weak-Driven Learning: How Weak Agents make Strong Agents Stronger - score 95
Sources: huggingface
As post-training optimization becomes central to improving large language models, we observe a persistent saturation bottleneck: once models grow highly confident, further training yields diminishing returns. While existing methods continue to reinforce target predictions, we find that informative s
🟡 Notable
Model Releases
🟡 MOVA: Towards Scalable and Synchronized Video-Audio Generation - score 65
Sources: huggingface
Audio is indispensable for real-world video, yet generation models have largely overlooked audio components. Current approaches to producing audio-visual content often rely on cascaded pipelines, which increase cost, accumulate errors, and degrade overall quality. While systems such as Veo 3 and Sor
Developer Tools
🟡 Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models - score 55
Sources: huggingface
Despite the success of multimodal contrastive learning in aligning visual and linguistic representations, a persistent geometric anomaly, the Modality Gap, remains: embeddings of distinct modalities expressing identical semantics occupy systematically offset regions. Prior approaches to bridge this
🟡 AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents - score 45
Sources: huggingface
LLM agents hold significant promise for advancing scientific research. To accelerate this progress, we introduce AIRS-Bench (the AI Research Science Benchmark), a suite of 20 tasks sourced from state-of-the-art machine learning papers. These tasks span diverse domains, including language modeling, m
🟢 Incremental
Developer Tools
🟢 InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery - score 35
Sources: huggingface
We introduce InternAgent-1.5, a unified system designed for end-to-end scientific discovery across computational and empirical domains. The system is built on a structured architecture composed of three coordinated subsystems for generation, verification, and evolution. These subsystems are supporte
🟢 LLaDA2.1: Speeding Up Text Diffusion via Token Editing - score 25
Sources: huggingface
While LLaDA2.0 showcased the scaling potential of 100B-level block-diffusion models and their inherent parallelization, the delicate equilibrium between decoding speed and generation quality has remained an elusive frontier. Today, we unveil LLaDA2.1, a paradigm shift designed to transcend this trad
🟢 Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision-Language-Action Models via Latent Iterative Reasoning - score 15
Sources: huggingface
Current Vision-Language-Action (VLA) models rely on fixed computational depth, expending the same amount of compute on simple adjustments and complex multi-step manipulation. While Chain-of-Thought (CoT) prompting enables variable computation, it scales memory linearly and is ill-suited for continuo
🟢 RLinf-USER: A Unified and Extensible System for Real-World Online Policy Learning in Embodied AI - score 5
Sources: huggingface
Online policy learning directly in the physical world is a promising yet challenging direction for embodied intelligence. Unlike simulation, real-world systems cannot be arbitrarily accelerated, cheaply reset, or massively replicated, which makes scalable data collection, heterogeneous deployment, a
New Papers
| Title | Category | Score | Link |
|---|---|---|---|
| Weak-Driven Learning: How Weak Agents make Strong Agents Stronger | developer_tool | 296 | Open |
| TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents | model_release | 212 | Open |
| QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining | model_release | 193 | Open |
| MOVA: Towards Scalable and Synchronized Video-Audio Generation | model_release | 164 | Open |
| Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models | developer_tool | 148 | Open |
| X-Mark: Saliency-Guided Robust Dataset Ownership Verification for Medical Imaging | cs.AI | 0 | Open |
| Human Control Is the Anchor, Not the Answer: Early Divergence of Oversight in Agentic AI Communities | cs.AI | 0 | Open |
| Empowering Contrastive Federated Sequential Recommendation with LLMs | cs.AI | 0 | Open |
| Don't Shoot The Breeze: Topic Continuity Model Using Nonlinear Naive Bayes With Attention | cs.AI | 0 | Open |
| Clarifying Shampoo: Adapting Spectral Descent to Stochasticity and the Parameter Trajectory | cs.AI | 0 | Open |
| Learning to Select Like Humans: Explainable Active Learning for Medical Imaging | cs.AI | 0 | Open |
| A Deep Multi-Modal Method for Patient Wound Healing Assessment | cs.AI | 0 | Open |
| SnareNet: Flexible Repair Layers for Neural Networks with Hard Constraints | cs.AI | 0 | Open |
| GAFR-Net: A Graph Attention and Fuzzy-Rule Network for Interpretable Breast Cancer Image Classification | cs.AI | 0 | Open |
| Beyond Uniform Credit: Causal Credit Assignment for Policy Optimization | cs.AI | 0 | Open |