🔴 High Significance

Model Releases

🔴 DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models - score 95. Sources: huggingface

Data-centric training has emerged as a promising direction for improving large language models (LLMs) by optimizing not only model parameters but also the selection, composition, and weighting of training data during optimization. However, existing approaches to data selection, data mixture optimiza

Developer Tools

🔴 The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook - score 85. Sources: huggingface

Latent space is rapidly emerging as a native substrate for language-based models. While modern systems are still commonly understood through explicit token-level generation, an increasing body of work shows that many critical internal processes are more naturally carried out in continuous latent spa

🔴 Generative World Renderer - score 75. Sources: huggingface

Scaling generative inverse and forward rendering to real-world scenarios is bottlenecked by the limited realism and temporal coherence of existing synthetic datasets. To bridge this persistent domain gap, we introduce a large-scale, dynamic dataset curated from visually complex AAA games. Using a no

🟡 Notable

Developer Tools

🟡 SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization - score 65. Sources: huggingface

Agent skills, structured packages of procedural knowledge and executable resources that agents dynamically load at inference time, have become a reliable mechanism for augmenting LLM agents. Yet inference-time skill augmentation is fundamentally limited: retrieval noise introduces irrelevant guidanc

🟡 VOID: Video Object and Interaction Deletion - score 55. Sources: huggingface

Existing video object removal methods excel at inpainting content "behind" the object and correcting appearance-level artifacts such as shadows and reflections. However, when the removed object has more significant interactions, such as collisions with other objects, current models fail to correct t

🟡 CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery - score 45. Sources: huggingface

Large language model (LLM)-based evolution is a promising approach for open-ended discovery, where progress requires sustained search and knowledge accumulation. Existing methods still rely heavily on fixed heuristics and hard-coded exploration rules, which limit the autonomy of LLM agents. We prese

🟢 Incremental

Developer Tools

🟢 Steerable Visual Representations - score 35. Sources: huggingface

Pretrained Vision Transformers (ViTs) such as DINOv2 and MAE provide generic image features that can be applied to a variety of downstream tasks such as retrieval, classification, and segmentation. However, such representations tend to focus on the most salient visual cues in the image, with no way

🟢 EgoSim: Egocentric World Simulator for Embodied Interaction Generation - score 25. Sources: huggingface

We introduce EgoSim, a closed-loop egocentric world simulator that generates spatially consistent interaction videos and persistently updates the underlying 3D scene state for continuous simulation. Existing egocentric simulators either lack explicit 3D grounding, causing structural drift under view

🟢 Therefore I am. I Think - score 15. Sources: huggingface

We consider the question: when a large language reasoning model makes a choice, did it think first and then decide to, or decide first and then think? In this paper, we present evidence that detectable, early-encoded decisions shape chain-of-thought in reasoning models. Specifically, we show that a

🟢 LatentUM: Unleashing the Potential of Interleaved Cross-Modal Reasoning via a Latent-Space Unified Model - score 5. Sources: huggingface

Unified models (UMs) hold promise for their ability to understand and generate content across heterogeneous modalities. Compared to merely generating visual content, the use of UMs for interleaved cross-modal reasoning is more promising and valuable, e.g., for solving understanding problems that req

📄 New Papers

| Title | Category | Score | Link |
| --- | --- | --- | --- |
| DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models | model_release | 368 | Open |
| The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook | developer_tool | 151 | Open |
| Generative World Renderer | developer_tool | 106 | Open |
| SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization | developer_tool | 104 | Open |
| VOID: Video Object and Interaction Deletion | developer_tool | 59 | Open |
| Moondream Segmentation: From Words to Masks | cs.AI | 0 | Open |
| Explorable Theorems: Making Written Theorems Explorable by Grounding Them in Formal Representations | cs.AI | 0 | Open |
| LitPivot: Developing Well-Situated Research Ideas Through Dynamic Contextualization and Critique within the Literature Landscape | cs.AI | 0 | Open |
| AICCE: AI Driven Compliance Checker Engine | cs.AI | 0 | Open |
| Do Audio-Visual Large Language Models Really See and Hear? | cs.AI | 0 | Open |
| AutoVerifier: An Agentic Automated Verification Framework Using Large Language Models | cs.AI | 0 | Open |
| OntoKG: Ontology-Oriented Knowledge Graph Construction with Intrinsic-Relational Routing | cs.AI | 0 | Open |
| Poison Once, Exploit Forever: Environment-Injected Memory Poisoning Attacks on Web Agents | cs.AI | 0 | Open |
| Smart Transfer: Leveraging Vision Foundation Model for Rapid Building Damage Mapping with Post-Earthquake VHR Imagery | cs.AI | 0 | Open |
| Toys that listen, talk, and play: Understanding Children's Sensemaking and Interactions with AI Toys | cs.AI | 0 | Open |