🔴 High Significance
Developer Tools
🔴 From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company — score 95
Sources: huggingface
Individual agent capabilities have advanced rapidly through modular skills and tool integrations, yet multi-agent systems remain constrained by fixed team structures, tightly coupled coordination logic, and session-bound learning. We argue that this reflects a deeper absence: a principled organisati
🔴 World-R1: Reinforcing 3D Constraints for Text-to-Video Generation — score 85
Sources: huggingface
Recent video foundation models demonstrate impressive visual synthesis but frequently suffer from geometric inconsistencies. While existing methods attempt to inject 3D priors via architectural modifications, they often incur high computational costs and limit scalability. We propose World-R1, a fra
Infrastructure & Compute
🔴 Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation — score 75
Sources: huggingface
Unified multimodal models typically rely on pretrained vision encoders and use separate visual representations for understanding and generation, creating misalignment between the two tasks and preventing fully end-to-end optimization from raw pixels. We introduce Tuna-2, a native unified multimodal
🟡 Notable
Model Releases
🟡 ReVSI: Rebuilding Visual Spatial Intelligence Evaluation for Accurate Assessment of VLM 3D Reasoning — score 65
Sources: huggingface
Current evaluations of spatial intelligence can be systematically invalid under modern vision-language model (VLM) settings. First, many benchmarks derive question-answer (QA) pairs from point-cloud-based 3D annotations originally curated for traditional 3D perception. When such annotations are trea
🟡 Claude for Creative Work (Anthropic announcement, Apr 28, 2026) — score 50
Sources: lab_blog/Anthropic
🟡 Anthropic names Theo Hourmouzis General Manager of Australia & New Zealand and officially opens Sydney office (Anthropic announcement, Apr 27, 2026) — score 50
Sources: lab_blog/Anthropic
🟡 An update on our election safeguards (Anthropic announcement, Apr 24, 2026) — score 50
Sources: lab_blog/Anthropic
🟡 Anthropic and NEC collaborate to build Japan’s largest AI engineering workforce (Anthropic announcement, Apr 24, 2026) — score 50
Sources: lab_blog/Anthropic
Omitted 10 additional model-release items from the main section; see the raw data and source-specific sections below.
Developer Tools
🟡 Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms — score 55
Sources: huggingface
Vision-Language-Action (VLA) models are emerging as a unified substrate for embodied intelligence. This shift raises a new class of safety challenges, stemming from the embodied nature of VLA systems, including irreversible physical consequences, a multimodal attack surface across vision, language,
🟢 Incremental
Developer Tools
🟢 Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis — score 10
Sources: huggingface
Process Reward Models (PRMs) have achieved remarkable success in augmenting the reasoning capabilities of Large Language Models (LLMs) within static domains such as mathematics. However, their potential in dynamic data analysis tasks remains underexplored. In this work, we first present an empirical
🟢 Sapiens2 — score 10
Sources: huggingface
We present Sapiens2, a model family of high-resolution transformers for human-centric vision focused on generalization, versatility, and high-fidelity outputs. Our model sizes range from 0.4 to 5 billion parameters, with native 1K resolution and hierarchical variants that support 4K. Sapiens2 substa
Infrastructure & Compute
🟢 Why Fine-Tuning Encourages Hallucinations and How to Fix It — score 25
Sources: huggingface
Large language models are prone to hallucinating factually incorrect statements. A key source of these errors is exposure to new factual information through supervised fine-tuning (SFT), which can increase hallucinations w.r.t. knowledge acquired during pre-training. In this work, we explore whether
📄 New Papers
| Title | Category | Score | Link |
|---|---|---|---|
| From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company | developer_tool | 121 | Open |
| World-R1: Reinforcing 3D Constraints for Text-to-Video Generation | developer_tool | 117 | Open |
| Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation | infrastructure | 70 | Open |
| ReVSI: Rebuilding Visual Spatial Intelligence Evaluation for Accurate Assessment of VLM 3D Reasoning | model_release | 65 | Open |
| Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms | developer_tool | 46 | Open |
| Evaluating Risks in Weak-to-Strong Alignment: A Bias-Variance Perspective | cs.AI | 0 | Open |
| Agentic Architect: An Agentic AI Framework for Architecture Design Exploration and Optimization | cs.AI | 0 | Open |
| Optimally Auditing Adversarial Agents | cs.AI | 0 | Open |
| Cooperate to Compete: Strategic Coordination in Multi-Agent Conquest | cs.AI | 0 | Open |
| Doing More With Less: Revisiting the Effectiveness of LLM Pruning for Test-Time Scaling | cs.AI | 0 | Open |
| Structured Security Auditing and Robustness Enhancement for Untrusted Agent Skills | cs.AI | 0 | Open |
| Knowledge Distillation Must Account for What It Loses | cs.AI | 0 | Open |
| M$^3$-VQA: A Benchmark for Multimodal, Multi-Entity, Multi-Hop Visual Question Answering | cs.AI | 0 | Open |
| Towards Unified Multi-task EEG Analysis with Low-Rank Adaptation | cs.AI | 0 | Open |
| Frictive Policy Optimization for LLMs: Epistemic Intervention, Risk-Sensitive Control, and Reflective Alignment | cs.AI | 0 | Open |
🏢 Lab Blog Posts
- Anthropic (Apr 28, 2026, Announcements): Claude for Creative Work
- Anthropic (Apr 27, 2026, Announcements): Anthropic names Theo Hourmouzis General Manager of Australia & New Zealand and officially opens Sydney office
- Anthropic (Apr 24, 2026, Announcements): An update on our election safeguards
- Anthropic (Apr 24, 2026, Announcements): Anthropic and NEC collaborate to build Japan’s largest AI engineering workforce
- Anthropic (Apr 20, 2026, Announcements): Anthropic and Amazon expand collaboration for up to 5 gigawatts of new compute
- Anthropic (Apr 17, 2026, Product): Introducing Claude Design by Anthropic Labs
- Anthropic (Apr 16, 2026, Product): Introducing Claude Opus 4.7
- Anthropic (Apr 14, 2026, Announcements): Anthropic’s Long-Term Benefit Trust appoints Vas Narasimhan to Board of Directors
- Anthropic (Apr 6, 2026, Announcements): Anthropic expands partnership with Google and Broadcom for multiple gigawatts of next-generation compute
- Anthropic (Mar 31, 2026, Announcements): Australian government and Anthropic sign MOU for AI safety and research
- OpenAI: Our commitment to community safety
- OpenAI: OpenAI models, Codex, and Managed Agents come to AWS