🔴 High Significance

Developer Tools

🔴 Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning - score 95. Sources: huggingface

Large language models (LLMs) typically receive diverse natural language (NL) feedback through interaction with the environment. However, current reinforcement learning (RL) algorithms rely solely on scalar rewards, leaving the rich information in NL feedback underutilized and leading to inefficient

🔴 OpenClaw-RL: Train Any Agent Simply by Talking - score 85. Sources: huggingface

Every agent interaction generates a next-state signal, namely the user reply, tool output, terminal or GUI state change that follows each action, yet no existing agentic RL system recovers it as a live, online learning source. We present OpenClaw-RL, a framework built on a simple observation: next-s

Infrastructure & Compute

🔴 Flash-KMeans: Fast and Memory-Efficient Exact K-Means - score 75. Sources: huggingface

k-means has historically been positioned primarily as an offline processing primitive, typically used for dataset organization or embedding preprocessing rather than as a first-class component in online systems. In this work, we revisit this classical algorithm under the lens of modern AI system des

🟡 Notable

Developer Tools

🟡 LLM2Vec-Gen: Generative Embeddings from Large Language Models - score 65. Sources: huggingface

LLM-based text embedders typically encode the semantic content of their input. However, embedding tasks require mapping diverse inputs to similar outputs. Typically, this input-output is addressed by training embedding models with paired data using contrastive learning. In this work, we propose a no

🟡 In-Context Reinforcement Learning for Tool Use in Large Language Models - score 55. Sources: huggingface

While large language models (LLMs) exhibit strong reasoning abilities, their performance on complex tasks is often constrained by the limitations of their internal knowledge. A compelling approach to overcome this challenge is to augment these models with external tools -- such as Python interpreter

🟡 MA-EgoQA: Question Answering over Egocentric Videos from Multiple Embodied Agents - score 45. Sources: huggingface

As embodied models become powerful, humans will collaborate with multiple embodied AI agents at their workplace or home in the future. To ensure better communication between human users and the multi-agent system, it is crucial to interpret incoming information from agents in parallel and refer to t

🟢 Incremental

Model Releases

🟢 ID-LoRA: Identity-Driven Audio-Video Personalization with In-Context LoRA - score 25. Sources: huggingface

Existing video personalization methods preserve visual likeness but treat video and audio separately. Without access to the visual scene, audio models cannot synchronize sounds with on-screen actions; and because classical voice-cloning models condition only on a reference recording, a text prompt c

🟢 SVG-EAR: Parameter-Free Linear Compensation for Sparse Video Generation via Error-aware Routing - score 5. Sources: huggingface

Diffusion Transformers (DiTs) have become a leading backbone for video generation, yet their quadratic attention cost remains a major bottleneck. Sparse attention reduces this cost by computing only a subset of attention blocks. However, prior methods often either drop the remaining blocks, which in
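As a rough illustration of the "compute only a subset of attention blocks" idea mentioned above, here is a minimal pure-Python sketch of generic block-sparse attention. It is not SVG-EAR's method: the function name `block_sparse_attention`, the `keep` set of (query block, key block) pairs, and the choice to simply skip unselected blocks are all assumptions made for illustration.

```python
import math

def block_sparse_attention(Q, K, V, block, keep):
    """Attention where each query only sees keys in selected blocks.

    Q, K, V: lists of row vectors (lists of floats).
    block:   block size along the sequence axis.
    keep:    set of (query_block, key_block) index pairs to compute.
             Assumption: every query block keeps at least one key block.
    """
    d = len(Q[0])
    out = []
    for i, q in enumerate(Q):
        q_block = i // block
        # Keys whose block is paired with this query's block.
        cols = [j for j in range(len(K)) if (q_block, j // block) in keep]
        scores = [
            sum(a * b for a, b in zip(q, K[j])) / math.sqrt(d) for j in cols
        ]
        # Numerically stable softmax over the kept keys only.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * V[j][c] for w, j in zip(weights, cols)) / total
            for c in range(len(V[0]))
        ])
    return out
```

With `n` tokens and block size `b`, dense attention touches `(n / b) ** 2` blocks, while this sketch touches only `len(keep)` of them. The skipped blocks contribute nothing here, which is the information loss that compensation schemes try to address.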

Developer Tools

🟢 ReMix: Reinforcement routing for mixtures of LoRAs in LLM finetuning - score 35. Sources: huggingface

Low-rank adapters (LoRAs) are a parameter-efficient finetuning technique that injects trainable low-rank matrices into pretrained models to adapt them to new tasks. Mixture-of-LoRAs models expand neural networks efficiently by routing each layer input to a small subset of specialized LoRAs of the la
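To make the "trainable low-rank matrices" description concrete, the following is a minimal pure-Python sketch of a single LoRA-adapted linear layer. It shows the standard LoRA forward pass only, not ReMix's routing. The class name `LoRALinear`, the helper `matvec`, and the `alpha` scaling parameter are illustrative names rather than anything taken from the paper.

```python
import random

def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

class LoRALinear:
    """y = W x + (alpha / rank) * B (A x), with W frozen and A, B trainable."""

    def __init__(self, W, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # pretrained weight, kept frozen
        # A starts small and random, B starts at zero, so the adapter
        # initially leaves the pretrained layer's output unchanged.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        # Only A (rank x d_in) and B (d_out x rank) would be updated.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])
```

For a `d_out x d_in` weight, the adapter trains `rank * (d_in + d_out)` values instead of `d_in * d_out`, which is where the parameter efficiency comes from when `rank` is small.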

🟢 Can Large Language Models Keep Up? Benchmarking Online Adaptation to Continual Knowledge Streams - score 15. Sources: huggingface

LLMs operating in dynamic real-world contexts often encounter knowledge that evolves continuously or emerges incrementally. To remain accurate and effective, models must adapt to newly arriving information on the fly. We introduce Online Adaptation to Continual Knowledge Streams (OAKS) to evaluate th

📄 New Papers

Title | Category | Score | Link
Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning | developer_tool | 214 | Open
OpenClaw-RL: Train Any Agent Simply by Talking | developer_tool | 160 | Open
Flash-KMeans: Fast and Memory-Efficient Exact K-Means | infrastructure | 85 | Open
LLM2Vec-Gen: Generative Embeddings from Large Language Models | developer_tool | 47 | Open
In-Context Reinforcement Learning for Tool Use in Large Language Models | developer_tool | 46 | Open
Deactivating Refusal Triggers: Understanding and Mitigating Overrefusal in Safety Alignment | cs.AI | 0 | Open
Agentic AI for Embodied-enhanced Beam Prediction in Low-Altitude Economy Networks | cs.AI | 0 | Open
Stop Listening to Me! How Multi-turn Conversations Can Degrade LLM Diagnostic Reasoning | cs.AI | 0 | Open
ARROW: Augmented Replay for RObust World models | cs.AI | 0 | Open
Efficient Cross-View Localization in 6G Space-Air-Ground Integrated Network | cs.AI | 0 | Open
Entropy Guided Diversification and Preference Elicitation in Agentic Recommendation Systems | cs.AI | 0 | Open
Deployment-Time Reliability of Learned Robot Policies | cs.AI | 0 | Open
Speak or Stay Silent: Context-Aware Turn-Taking in Multi-Party Dialogue | cs.AI | 0 | Open
Contextual Graph Representations for Task-Driven 3D Perception and Planning | cs.AI | 0 | Open
Evaluation format, not model capability, drives triage failure in the assessment of consumer health AI | cs.AI | 0 | Open