🔴 High Significance
Model Releases
🔴 DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing - score 75
Sources: huggingface
Current unified multimodal models for image generation and editing typically rely on massive parameter scales (e.g., >10B), entailing prohibitive training costs and deployment footprints. In this work, we present DeepGen 1.0, a lightweight 5B unified model that achieves comprehensive capabilities co
Developer Tools
🔴 The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies - score 95
Sources: huggingface
The emergence of multi-agent systems built from large language models (LLMs) offers a promising paradigm for scalable collective intelligence and self-evolution. Ideally, such systems would achieve continuous self-improvement in a fully closed loop while maintaining robust safety alignment--a combin
🔴 Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models - score 85
Sources: huggingface
Large-scale verifiable prompts underpin the success of Reinforcement Learning with Verifiable Rewards (RLVR), but they contain many uninformative examples and are costly to expand further. Recent studies focus on better exploiting limited training data by prioritizing hard prompts whose rollout pass
🟡 Notable
Model Releases
🟡 Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation - score 65
Sources: huggingface
On-policy distillation (OPD), which aligns the student with the teacher's logit distribution on student-generated trajectories, has demonstrated strong empirical gains in improving student performance and often outperforms off-policy distillation and reinforcement learning (RL) paradigms. In this wo
🟡 GPT-5.2 derives a new result in theoretical physics - score 50
Sources: lab_blog/OpenAI
A new preprint shows GPT-5.2 proposing a new formula for a gluon amplitude, later formally proved and verified by OpenAI and academic collaborators.
🟡 Introducing Lockdown Mode and Elevated Risk labels in ChatGPT - score 50
Sources: lab_blog/OpenAI
Lockdown Mode and Elevated Risk labels in ChatGPT help organizations defend against prompt injection and AI-driven data exfiltration.
🟡 Scaling social science research - score 50
Sources: lab_blog/OpenAI
GABRIEL is a new open-source toolkit from OpenAI that uses GPT to turn qualitative text and images into quantitative data, helping social scientists analyze research at scale.
Developer Tools
🟡 GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning - score 55
Sources: huggingface
Vision-language-action (VLA) models that directly predict multi-step action chunks from current observations face inherent limitations due to constrained scene understanding and weak future anticipation capabilities. In contrast, video world models pre-trained on web-scale video corpora exhibit robu
🟡 Beyond rate limits: scaling access to Codex and Sora - score 50
Sources: lab_blog/OpenAI
How OpenAI built a real-time access system combining rate limits, usage tracking, and credits to power continuous access to Sora and Codex.
🟡 MOSS-Audio-Tokenizer: Scaling Audio Tokenizers for Future Audio Foundation Models - score 45
Sources: huggingface
Discrete audio tokenizers are fundamental to empowering large language models with native audio processing and generation capabilities. Despite recent progress, existing approaches often rely on pretrained encoders, semantic distillation, or heterogeneous CNN-based architectures. These designs intro
🟢 Incremental
Model Releases
🟢 Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning - score 25
Sources: huggingface
Achieving effective test-time scaling requires models to engage in In-Context Exploration -- the intrinsic ability to generate, verify, and refine multiple reasoning hypotheses within a single continuous context. Grounded in State Coverage theory, our analysis identifies a critical bottleneck to e
Developer Tools
🟢 NarraScore: Bridging Visual Narrative and Musical Dynamics via Hierarchical Affective Control - score 35
Sources: huggingface
Synthesizing coherent soundtracks for long-form videos remains a formidable challenge, currently stalled by three critical impediments: computational scalability, temporal coherence, and, most critically, a pervasive semantic blindness to evolving narrative logic. To bridge these gaps, we propose Na
🟢 LawThinker: A Deep Research Legal Agent in Dynamic Environments - score 15
Sources: huggingface
Legal reasoning requires not only correct outcomes but also procedurally compliant reasoning processes. However, existing methods lack mechanisms to verify intermediate reasoning steps, allowing errors such as inapplicable statute citations to propagate undetected through the reasoning chain. To add
🟢 Stroke of Surprise: Progressive Semantic Illusions in Vector Sketching - score 5
Sources: huggingface
Visual illusions traditionally rely on spatial manipulations such as multi-view consistency. In this work, we introduce Progressive Semantic Illusions, a novel vector sketching task where a single sketch undergoes a dramatic semantic transformation through the sequential addition of strokes. We pres
New Papers
| Title | Category | Score | Link |
|---|---|---|---|
| The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies | developer_tool | 206 | Open |
| Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models | developer_tool | 95 | Open |
| DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing | model_release | 87 | Open |
| Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation | model_release | 67 | Open |
| GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning | developer_tool | 64 | Open |
| Favia: Forensic Agent for Vulnerability-fix Identification and Analysis | cs.AI | 0 | Open |
| Evolving Demonstration Optimization for Chain-of-Thought Feature Transformation | cs.AI | 0 | Open |
| Monocular Reconstruction of Neural Tactile Fields | cs.AI | 0 | Open |
| Bench-MFG: A Benchmark Suite for Learning in Stationary Mean Field Games | cs.AI | 0 | Open |
| Autorubric: Unifying Rubric-based LLM Evaluation | cs.AI | 0 | Open |
| Exploring Accurate and Transparent Domain Adaptation in Predictive Healthcare via Concept-Grounded Orthogonal Inference | cs.AI | 0 | Open |
| Scaling Web Agent Training through Automatic Data Generation and Fine-grained Evaluation | cs.AI | 0 | Open |
| Decoder-only Conformer with Modality-aware Sparse Mixtures of Experts for ASR | cs.AI | 0 | Open |
| A consequence of failed sequential learning: A computational account of developmental amnesia | cs.AI | 0 | Open |
| SD-MoE: Spectral Decomposition for Effective Expert Specialization | cs.AI | 0 | Open |