🔴 High Significance
Developer Tools
🔴 The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping - score 95
Sources: huggingface
Despite the success of reinforcement learning for large language models, a common failure mode is reduced sampling diversity, where the policy repeatedly generates similar erroneous behaviors. Classical entropy regularization encourages randomness under the current policy, but does not explicitly di
🔴 QuanBench+: A Unified Multi-Framework Benchmark for LLM-Based Quantum Code Generation - score 85
Sources: huggingface
Large Language Models (LLMs) are increasingly used for code generation, yet quantum code generation is still evaluated mostly within single frameworks, making it difficult to separate quantum reasoning from framework familiarity. We introduce QuanBench+, a unified benchmark spanning Qiskit, PennyLan
🔴 Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation - score 75
Sources: huggingface
As the foundational architecture of modern machine learning, Transformers have driven remarkable progress across diverse AI domains. Despite their transformative impact, a persistent challenge across various Transformers is Attention Sink (AS), in which a disproportionate amount of attention is focu
🟡 Notable
Model Releases
🟡 Trusted access for the next era of cyber defense - score 50
Sources: lab_blog/OpenAI
OpenAI expands its Trusted Access for Cyber program, introducing GPT-5.4-Cyber to vetted defenders and strengthening safeguards as AI cybersecurity capabilities advance.
Developer Tools
🟡 OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation - score 65
Sources: huggingface
In this work, we study Human-Object Interaction Video Generation (HOIVG), which aims to synthesize high-quality human-object interaction videos conditioned on text, reference images, audio, and pose. This task holds significant practical value for automating content creation in real-world applicatio
🟡 Strips as Tokens: Artist Mesh Generation with Native UV Segmentation - score 55
Sources: huggingface
Recent advancements in autoregressive transformers have demonstrated remarkable potential for generating artist-quality meshes. However, the token ordering strategies employed by existing methods typically fail to meet professional artist standards, where coordinate-based sorting yields inefficientl
🟡 Uni-ViGU: Towards Unified Video Generation and Understanding via A Diffusion-Based Video Generator - score 45
Sources: huggingface
Unified multimodal models integrating visual understanding and generation face a fundamental challenge: visual generation incurs substantially higher computational costs than understanding, particularly for video. This imbalance motivates us to invert the conventional paradigm: rather than extending
🟢 Incremental
Model Releases
🟢 Audio-Omni: Extending Multi-modal Understanding to Versatile Audio Generation and Editing - score 35
Sources: huggingface
Recent progress in multimodal models has spurred rapid advances in audio understanding, generation, and editing. However, these capabilities are typically addressed by specialized models, leaving the development of a truly unified framework that can seamlessly integrate all three tasks underexplored
Developer Tools
🟢 Pseudo-Unification: Entropy Probing Reveals Divergent Information Patterns in Unified Multimodal Models - score 25
Sources: huggingface
Unified multimodal models (UMMs) were designed to combine the reasoning ability of large language models (LLMs) with the generation capability of vision models. In practice, however, this synergy remains elusive: UMMs fail to transfer LLM-like reasoning to image synthesis and exhibit divergent respo
🟢 CodeTracer: Towards Traceable Agent States - score 15
Sources: huggingface
Code agents are advancing rapidly, but debugging them is becoming increasingly difficult. Frameworks orchestrate parallel tool calls and multi-stage workflows over complex tasks, making the agent's state transitions and error propagation hard to observe. In these runs, an early misstep can trap t
🟢 CocoaBench: Evaluating Unified Digital Agents in the Wild - score 5
Sources: huggingface
LLM agents now perform strongly in software engineering, deep research, GUI automation, and various other applications, while recent agent scaffolds and models are increasingly integrating these capabilities into unified systems. Yet, most evaluations still test these capabilities in isolation, whic
New Papers
| Title | Category | Score | Link |
|---|---|---|---|
| The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping | developer_tool | 142 | Open |
| QuanBench+: A Unified Multi-Framework Benchmark for LLM-Based Quantum Code Generation | developer_tool | 130 | Open |
| Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation | developer_tool | 82 | Open |
| OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation | developer_tool | 73 | Open |
| Strips as Tokens: Artist Mesh Generation with Native UV Segmentation | developer_tool | 58 | Open |
| Domain-Specific Latent Representations Improve the Fidelity of Diffusion-Based Medical Image Super-Resolution | cs.AI | 0 | Open |
| Development, Evaluation, and Deployment of a Multi-Agent System for Thoracic Tumor Board | cs.AI | 0 | Open |
| EMBER: Autonomous Cognitive Behaviour from Learned Spiking Neural Network Dynamics in a Hybrid LLM Architecture | cs.AI | 0 | Open |
| Fully Homomorphic Encryption on Llama 3 model for privacy preserving LLM inference | cs.AI | 0 | Open |
| Evaluating Relational Reasoning in LLMs with REL | cs.AI | 0 | Open |
| Policy-Invisible Violations in LLM-Based Agents | cs.AI | 0 | Open |
| CycloneMAE: A Scalable Multi-Task Learning Model for Global Tropical Cyclone Probabilistic Forecasting | cs.AI | 0 | Open |
| Clustering-Enhanced Domain Adaptation for Cross-Domain Intrusion Detection in Industrial Control Systems | cs.AI | 0 | Open |
| TRUST Agents: A Collaborative Multi-Agent Framework for Fake News Detection, Explainable Verification, and Logic-Aware Claim Reasoning | cs.AI | 0 | Open |
| Characterizing Resource Sharing Practices on Underground Internet Forum Synthetic Non-Consensual Intimate Image Content Creation Communities | cs.AI | 0 | Open |