🔴 High Significance
Developer Tools
🔴 Utonia: Toward One Encoder for All Point Clouds - score 95
Sources: huggingface
We dream of a future where point clouds from all domains can come together to shape a single model that benefits them all. Toward this goal, we present Utonia, a first step toward training a single self-supervised point transformer encoder across diverse domains, spanning remote sensing, outdoor LiD
🔴 Beyond Language Modeling: An Exploration of Multimodal Pretraining - score 85
Sources: huggingface
The visual world offers a critical axis for advancing foundation models beyond language. Despite growing interest in this direction, the design space for native multimodal models remains opaque. We provide empirical clarity through controlled, from-scratch pretraining experiments, isolating the fact
Infrastructure & Compute
🔴 UniG2U-Bench: Do Unified Models Advance Multimodal Understanding? - score 75
Sources: huggingface
Unified multimodal models have recently demonstrated strong generative capabilities, yet whether and when generation improves understanding remains unclear. Existing benchmarks lack a systematic exploration of the specific tasks where generation facilitates understanding. To this end, we introduce U
🟡 Notable
Model Releases
🟡 Qwen3-Coder-Next Technical Report - score 65
Sources: huggingface
We present Qwen3-Coder-Next, an open-weight language model specialized for coding agents. Qwen3-Coder-Next is an 80-billion-parameter model that activates only 3 billion parameters during inference, enabling strong coding capability with efficient inference. In this work, we explore how far strong t
🟡 Extending single-minus amplitudes to gravitons - score 50
Sources: lab_blog/OpenAI
A new preprint extends single-minus amplitudes to gravitons, with GPT-5.2 Pro helping derive and verify nonzero graviton tree amplitudes in quantum gravity.
🟡 Beyond Length Scaling: Synergizing Breadth and Depth for Generative Reward Models - score 45
Sources: huggingface
Recent advancements in Generative Reward Models (GRMs) have demonstrated that scaling the length of Chain-of-Thought (CoT) reasoning considerably enhances the reliability of evaluation. However, current works predominantly rely on unstructured length scaling, ignoring the divergent efficacy of diffe
Developer Tools
🟡 BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing? - score 55
Sources: huggingface
Current benchmarks for code agents primarily assess narrow, repository-specific fixes, overlooking critical real-world challenges such as cross-repository reasoning, domain-specialized problem solving, dependency-driven migration, and full-repository generation. To address this gap, we introduce Bey
Other Signals
🟡 Understanding AI and learning outcomes - score 50
Sources: lab_blog/OpenAI
OpenAI introduces the Learning Outcomes Measurement Suite to assess AI's impact on student learning across diverse educational environments over time.
🟡 How Axios uses AI to help deliver high-impact local journalism - score 50
Sources: lab_blog/OpenAI
Axios COO Allison Murphy explains how the company uses AI to support local reporters, streamline newsroom workflows, and deliver high-impact local journalism at scale.
🟢 Incremental
Model Releases
🟢 Kiwi-Edit: Versatile Video Editing via Instruction and Reference Guidance - score 35
Sources: huggingface
Instruction-based video editing has witnessed rapid progress, yet current methods often struggle with precise visual control, as natural language is inherently limited in describing complex visual nuances. Although reference-guided editing offers a robust solution, its potential is currently bottlen
Developer Tools
🟢 Kling-MotionControl Technical Report - score 20
Sources: huggingface
Character animation aims to generate lifelike videos by transferring motion dynamics from a driving video to a reference image. Recent strides in generative models have paved the way for high-fidelity character animation. In this work, we present Kling-MotionControl, a unified DiT-based framework en
🟢 How Controllable Are Large Language Models? A Unified Evaluation across Behavioral Granularities - score 20
Sources: huggingface
Large Language Models (LLMs) are increasingly deployed in socially sensitive domains, yet their unpredictable behaviors, ranging from misaligned intent to inconsistent personality, pose significant risks. We introduce SteerEval, a hierarchical benchmark for evaluating LLM controllability across thre
🟢 Next Embedding Prediction Makes World Models Stronger - score 5
Sources: huggingface
Capturing temporal dependencies is critical for model-based reinforcement learning (MBRL) in partially observable, high-dimensional domains. We introduce NE-Dreamer, a decoder-free MBRL agent that leverages a temporal transformer to predict next-step encoder embeddings from latent state sequences, d
New Papers
| Title | Category | Score |
|---|---|---|
| Utonia: Toward One Encoder for All Point Clouds | developer_tool | 189 |
| Beyond Language Modeling: An Exploration of Multimodal Pretraining | developer_tool | 109 |
| UniG2U-Bench: Do Unified Models Advance Multimodal Understanding? | infrastructure | 91 |
| Qwen3-Coder-Next Technical Report | model_release | 69 |
| BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing? | developer_tool | 63 |
| RADIANT-LLM: an Agentic Retrieval Augmented Generation Framework for Reliable Decision Support in Safety-Critical Nuclear Engineering | cs.AI | 0 |
| Goal-Driven Risk Assessment for LLM-Powered Systems: A Healthcare Case Study | cs.AI | 0 |
| Image-based Prompt Injection: Hijacking Multimodal LLMs through Visually Embedded Adversarial Instructions | cs.AI | 0 |
| Bridging Pedagogy and Play: Introducing a Language Mapping Interface for Human-AI Co-Creation in Educational Game Design | cs.AI | 0 |
| Field imaging framework for morphological characterization of aggregates with computer vision: Algorithms and applications | cs.AI | 0 |
| Mozi: Governed Autonomy for Drug Discovery LLM Agents | cs.AI | 0 |
| InEdit-Bench: Benchmarking Intermediate Logical Pathways for Intelligent Image Editing Models | cs.AI | 0 |
| Graph Negative Feedback Bias Correction Framework for Adaptive Heterophily Modeling | cs.AI | 0 |
| Can LLM Aid in Solving Constraints with Inductive Definitions? | cs.AI | 0 |
| Local Shapley: Model-Induced Locality and Optimal Reuse in Data Valuation | cs.AI | 0 |