🔴 High Significance

Model Releases

🔴 Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration - score 75; Sources: huggingface

In this paper, we uncover the hidden potential of Diffusion Transformers (DiTs) to significantly enhance generative tasks. Through an in-depth analysis of the denoising process, we demonstrate that introducing a single learned scaling parameter can significantly improve the performance of DiT blocks

Developer Tools

🔴 Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale - score 95; Sources: huggingface

We introduce Intern-S1-Pro, the first one-trillion-parameter scientific multimodal foundation model. Scaling to this unprecedented size, the model delivers a comprehensive enhancement across both general and scientific domains. Beyond stronger reasoning and image-text understanding capabilities, its

🔴 PixelSmile: Toward Fine-Grained Facial Expression Editing - score 85; Sources: huggingface

Fine-grained facial expression editing has long been limited by intrinsic semantic overlap. To address this, we construct the Flex Facial Expression (FFE) dataset with continuous affective annotations and establish FFE-Bench to evaluate structural confusion, editing accuracy, linear controllability,

🟡 Notable

Model Releases

🟡 Voxtral TTS - score 60; Sources: huggingface

We introduce Voxtral TTS, an expressive multilingual text-to-speech model that generates natural speech from as little as 3 seconds of reference audio. Voxtral TTS adopts a hybrid architecture that combines auto-regressive generation of semantic speech tokens with flow-matching for acoustic tokens.

🟡 STADLER reshapes knowledge work at a 230-year-old company - score 50; Sources: lab_blog/OpenAI

Learn how STADLER uses ChatGPT to transform knowledge work, saving time and accelerating productivity across 650 employees.

Developer Tools

🟡 RealRestorer: Towards Generalizable Real-World Image Restoration with Large-Scale Image Editing Models - score 60; Sources: huggingface

Image restoration under real-world degradations is critical for downstream tasks such as autonomous driving and object detection. However, existing restoration models are often limited by the scale and distribution of their training data, resulting in poor generalization to real-world scenarios. Rec

🟡 MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens - score 45; Sources: huggingface

Long-term memory is a cornerstone of human intelligence. Enabling AI to process lifetime-scale information remains a long-standing pursuit in the field. Due to the constraints of full-attention architectures, the effective context length of large language models (LLMs) is typically limited to 1M

🟢 Incremental

Model Releases

🟢 MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data - score 35; Sources: huggingface

Generating images conditioned on multiple visual references is critical for real-world applications such as multi-subject composition, narrative illustration, and novel view synthesis, yet current models suffer from severe performance degradation as the number of input references grows. We identify

🟢 AVControl: Efficient Framework for Training Audio-Visual Controls - score 15; Sources: huggingface

Controlling video and audio generation requires diverse modalities, from depth and pose to camera trajectories and audio transformations, yet existing approaches either train a single monolithic model for a fixed set of controls or introduce costly architectural changes for each new modality. We int

🟢 VFIG: Vectorizing Complex Figures in SVG with Vision-Language Models - score 5; Sources: huggingface

Scalable Vector Graphics (SVG) are an essential format for technical illustration and digital design, offering precise resolution independence and flexible semantic editability. In practice, however, original vector source files are frequently lost or inaccessible, leaving only "flat" rasterized ver

Developer Tools

🟢 SlopCodeBench: Benchmarking How Coding Agents Degrade Over Long-Horizon Iterative Tasks - score 25; Sources: huggingface

Software development is iterative, yet agentic coding benchmarks overwhelmingly evaluate single-shot solutions against complete specifications. Code can pass the test suite but become progressively harder to extend. Recent iterative benchmarks attempt to close this gap, but constrain the agent's des

📄 New Papers

Title | Category | Score | Link
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale | developer_tool | 140 | Open
PixelSmile: Toward Fine-Grained Facial Expression Editing | developer_tool | 121 | Open
Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration | model_release | 73 | Open
Voxtral TTS | model_release | 62 | Open
RealRestorer: Towards Generalizable Real-World Image Restoration with Large-Scale Image Editing Models | developer_tool | 62 | Open
Throughput Optimization as a Strategic Lever in Large-Scale AI Systems: Evidence from Dataloader and Memory Profiling Innovations | cs.LG | 0 | Open
Clinical Reasoning AI for Oncology Treatment Planning: A Multi-Specialty Case-Based Evaluation | cs.LG | 0 | Open
QuitoBench: A High-Quality Open Time Series Forecasting Benchmark | cs.LG | 0 | Open
Central-to-Local Adaptive Generative Diffusion Framework for Improving Gene Expression Prediction in Data-Limited Spatial Transcriptomics | cs.LG | 0 | Open
GLU: Global-Local-Uncertainty Fusion for Scalable Spatiotemporal Reconstruction and Forecasting | cs.LG | 0 | Open
Identification of Bivariate Causal Directionality Based on Anticipated Asymmetric Geometries | cs.LG | 0 | Open
Constitutive parameterized deep energy method for solid mechanics problems with random material parameters | cs.LG | 0 | Open
Accelerating PayPal's Commerce Agent with Speculative Decoding: An Empirical Study on EAGLE3 with Fine-Tuned Nemotron Models | cs.LG | 0 | Open
H-Node Attack and Defense in Large Language Models | cs.LG | 0 | Open
Asymptotic Optimism for Tensor Regression Models with Applications to Neural Network Compression | cs.LG | 0 | Open

๐Ÿข Lab Blog Posts