🔴 High Significance

Model Releases

🔴 DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing - score 75. Sources: huggingface

Current unified multimodal models for image generation and editing typically rely on massive parameter scales (e.g., >10B), entailing prohibitive training costs and deployment footprints. In this work, we present DeepGen 1.0, a lightweight 5B unified model that achieves comprehensive capabilities co

Developer Tools

🔴 The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies - score 95. Sources: huggingface

The emergence of multi-agent systems built from large language models (LLMs) offers a promising paradigm for scalable collective intelligence and self-evolution. Ideally, such systems would achieve continuous self-improvement in a fully closed loop while maintaining robust safety alignment--a combin

🔴 Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models - score 85. Sources: huggingface

Large-scale verifiable prompts underpin the success of Reinforcement Learning with Verifiable Rewards (RLVR), but they contain many uninformative examples and are costly to expand further. Recent studies focus on better exploiting limited training data by prioritizing hard prompts whose rollout pass

🟡 Notable

Model Releases

🟡 Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation - score 65. Sources: huggingface

On-policy distillation (OPD), which aligns the student with the teacher's logit distribution on student-generated trajectories, has demonstrated strong empirical gains in improving student performance and often outperforms off-policy distillation and reinforcement learning (RL) paradigms. In this wo

🟡 GPT-5.2 derives a new result in theoretical physics - score 50. Sources: lab_blog/OpenAI

A new preprint shows GPT-5.2 proposing a new formula for a gluon amplitude, later formally proved and verified by OpenAI and academic collaborators.

🟡 Introducing Lockdown Mode and Elevated Risk labels in ChatGPT - score 50. Sources: lab_blog/OpenAI

Lockdown Mode and Elevated Risk labels in ChatGPT are meant to help organizations defend against prompt injection and AI-driven data exfiltration.

🟡 Scaling social science research - score 50. Sources: lab_blog/OpenAI

GABRIEL is a new open-source toolkit from OpenAI that uses GPT to turn qualitative text and images into quantitative data, helping social scientists analyze research at scale.

Developer Tools

🟡 GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning - score 55. Sources: huggingface

Vision-language-action (VLA) models that directly predict multi-step action chunks from current observations face inherent limitations due to constrained scene understanding and weak future anticipation capabilities. In contrast, video world models pre-trained on web-scale video corpora exhibit robu

🟡 Beyond rate limits: scaling access to Codex and Sora - score 50. Sources: lab_blog/OpenAI

How OpenAI built a real-time access system combining rate limits, usage tracking, and credits to power continuous access to Sora and Codex.

🟡 MOSS-Audio-Tokenizer: Scaling Audio Tokenizers for Future Audio Foundation Models - score 45. Sources: huggingface

Discrete audio tokenizers are fundamental to empowering large language models with native audio processing and generation capabilities. Despite recent progress, existing approaches often rely on pretrained encoders, semantic distillation, or heterogeneous CNN-based architectures. These designs intro

🟢 Incremental

Model Releases

🟢 Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning - score 25. Sources: huggingface

Achieving effective test-time scaling requires models to engage in In-Context Exploration -- the intrinsic ability to generate, verify, and refine multiple reasoning hypotheses within a single continuous context. Grounded in State Coverage theory, our analysis identifies a critical bottleneck to e

Developer Tools

🟢 NarraScore: Bridging Visual Narrative and Musical Dynamics via Hierarchical Affective Control - score 35. Sources: huggingface

Synthesizing coherent soundtracks for long-form videos remains a formidable challenge, currently stalled by three critical impediments: computational scalability, temporal coherence, and, most critically, a pervasive semantic blindness to evolving narrative logic. To bridge these gaps, we propose Na

🟢 LawThinker: A Deep Research Legal Agent in Dynamic Environments - score 15. Sources: huggingface

Legal reasoning requires not only correct outcomes but also procedurally compliant reasoning processes. However, existing methods lack mechanisms to verify intermediate reasoning steps, allowing errors such as inapplicable statute citations to propagate undetected through the reasoning chain. To add

🟢 Stroke of Surprise: Progressive Semantic Illusions in Vector Sketching - score 5. Sources: huggingface

Visual illusions traditionally rely on spatial manipulations such as multi-view consistency. In this work, we introduce Progressive Semantic Illusions, a novel vector sketching task where a single sketch undergoes a dramatic semantic transformation through the sequential addition of strokes. We pres

📄 New Papers

Title | Category | Score | Link
The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies | developer_tool | 206 | Open
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models | developer_tool | 95 | Open
DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing | model_release | 87 | Open
Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation | model_release | 67 | Open
GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning | developer_tool | 64 | Open
Favia: Forensic Agent for Vulnerability-fix Identification and Analysis | cs.AI | 0 | Open
Evolving Demonstration Optimization for Chain-of-Thought Feature Transformation | cs.AI | 0 | Open
Monocular Reconstruction of Neural Tactile Fields | cs.AI | 0 | Open
Bench-MFG: A Benchmark Suite for Learning in Stationary Mean Field Games | cs.AI | 0 | Open
Autorubric: Unifying Rubric-based LLM Evaluation | cs.AI | 0 | Open
Exploring Accurate and Transparent Domain Adaptation in Predictive Healthcare via Concept-Grounded Orthogonal Inference | cs.AI | 0 | Open
Scaling Web Agent Training through Automatic Data Generation and Fine-grained Evaluation | cs.AI | 0 | Open
Decoder-only Conformer with Modality-aware Sparse Mixtures of Experts for ASR | cs.AI | 0 | Open
A consequence of failed sequential learning: A computational account of developmental amnesia | cs.AI | 0 | Open
SD-MoE: Spectral Decomposition for Effective Expert Specialization | cs.AI | 0 | Open

๐Ÿข Lab Blog Posts