🔴 High Significance

Model Releases

🔴 White House Considers Vetting A.I. Models Before They Are Released - score 81 Sources: reddit/r/LocalLLaMA

🔴 Peanut - Text to Image Model (Open Weights coming soon) - score 73 Sources: reddit/r/LocalLLaMA

A new anonymous model debuts at #8 in the Artificial Analysis Text to Image Arena! Peanut’s weights are expected to be released soon, which would make it the leading Text to Image Open Weights Model.

Peanut is positioned to be the new leading open weights Text to Image model, surpassing Z-I

Developer Tools

🔴 ruvnet/ruflo - 🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade architecture, self-learning swarm intelligence, RAG integration, and native Claude Code / Codex Integration - score 99 Sources: github_trending


🔴 TauricResearch/TradingAgents - TradingAgents: Multi-Agents LLM Financial Trading Framework - score 97 Sources: github_trending


🔴 Hmbown/DeepSeek-TUI - Coding agent for DeepSeek models that runs in your terminal - score 95 Sources: github_trending


🔴 Are modern ML PhDs becoming too incremental, or is this just what research looks like now? [D] - score 94 Sources: reddit/r/MachineLearning

I’ve been thinking about the current state of machine learning PhDs, including my own work, and I’d like to hear how others see it.
My impression is that a large fraction of modern ML PhD work follows a fairly predictable pattern: take an existing idea, connect it to another existing idea, apply i

🔴 My nerd lil brother surpassed me in his monthly income last month! - score 94 Sources: reddit/r/AIAgents

He's kind of a nerd, yeah. I had been teaching him some AI basics since he turned 16, and fast forward, he found out he can make money off this. I replied "I'm sure you do," jokingly, because I didn't think he would actually be able to find clients. He's now 17 and thanks to an automation he built on n8n, he g

Omitted 12 additional developer tools items from the main section; see raw data and source-specific sections below.

Research Papers

🔴 MolmoAct2: Action Reasoning Models for Real-world Deployment - score 95 Sources: huggingface

Vision-Language-Action (VLA) models aim to provide a single generalist controller for robots, but today's systems fall short on the criteria that matter for real-world deployment. Frontier models are closed, open-weight alternatives are tied to expensive hardware, reasoning-augmented policies pay pr

🔴 Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs - score 78 Sources: huggingface · arxiv/cs.AI

While autoregressive Large Vision-Language Models (LVLMs) demonstrate remarkable proficiency in multimodal tasks, they face a "Visual Signal Dilution" phenomenon, where the accumulation of textual history expands the attention partition function, causing visual attention to decay inversely with gene
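The dilution effect described in this abstract can be illustrated with a toy calculation. This is a minimal sketch, not the paper's method: it assumes every visual token shares one attention logit and every text token shares another, so the softmax denominator (the partition function) has only two terms. The function name and the token counts are hypothetical.

```python
import math


def visual_attention_mass(num_visual, num_text, visual_logit=0.0, text_logit=0.0):
    """Share of softmax attention that lands on the visual tokens.

    Toy assumption: all visual tokens have the same logit and all text
    tokens have the same logit, so the softmax sum collapses to two terms.
    """
    visual = num_visual * math.exp(visual_logit)
    text = num_text * math.exp(text_logit)
    return visual / (visual + text)


if __name__ == "__main__":
    # The number of image tokens stays fixed while generated text accumulates.
    for generated in (0, 256, 1024, 4096):
        share = visual_attention_mass(num_visual=256, num_text=generated)
        print(f"{generated:5d} text tokens -> visual share {share:.3f}")
```

With equal logits the visual share is simply `num_visual / (num_visual + num_text)`, so it shrinks roughly in inverse proportion to the amount of generated text once the text dominates.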

🔴 Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling - score 72 Sources: huggingface · arxiv/cs.AI

Recent research has shown that filtering massive English web corpora into high-quality subsets significantly improves training efficiency. However, for high-resource non-English languages like German, French, or Japanese, aggressive filtering creates a strategic dilemma: should practitioners priorit

Other Signals

🔴 it's time to update your Gemma 4 GGUFs - score 88 Sources: reddit/r/LocalLLaMA

Chat Template was fixed a few days ago

choose your fav dealer:

https://huggingface.co/bartowski/google_gemma-4-31B-it-GGUF

https://huggingface.co/bartowski/google_gemma-4-26B-A4B-it-GGUF

🔴 How OpenAI delivers low-latency voice AI at scale - score 88 Sources: hackernews

🟡 Notable

Model Releases

🟡 FastDMS: 6.4X KV-cache compression running faster than vLLM BF16/FP8 - score 65 Sources: reddit/r/LocalLLaMA

Last year researchers affiliated with NVIDIA, University of Warsaw, and University of Edinburgh published Dynamic Memory Sparsification (DMS), a KV-cache sparsification technique using learned per-head token eviction, reporting up to 8x KV-cache compression.

I fo
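To put the compression ratios in context, here is a rough sizing sketch. It uses the standard estimate for a dense KV cache (one key and one value tensor per layer) and then divides by the 6.4x figure from the headline. The model dimensions below are hypothetical and are not taken from the post or from the DMS paper.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Estimate dense KV-cache size for one sequence.

    The leading factor of 2 counts one key tensor plus one value tensor
    per layer; bytes_per_value=2 corresponds to BF16/FP16 storage.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    dense = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=32768)
    compressed = dense / 6.4  # ratio quoted in the headline
    gib = 1024 ** 3
    print(f"dense: {dense / gib:.2f} GiB, compressed: {compressed / gib:.3f} GiB")
```

The estimate ignores any bookkeeping an eviction scheme needs, so real savings would likely be somewhat smaller than the raw ratio suggests.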

🟡 Claude Code structure that didn’t break after 2–3 real projects - score 61 Sources: reddit/r/AIAgents

Been iterating on my Claude Code setup for a while. Most examples online worked… until things got slightly complex. This is the first structure that held up once I added multiple skills, MCP servers, and agents.

What actually made a difference:

  • **If you’re skipping CLAUDE.md, that’s probably the

🟡 **[@xai](https://x.com/xai/status/2051438210065322244)** - score 60 Sources: twitter_rss

Two voices. One human. One AI. Can you guess the AI clone? 👇

Voice cloning, rich with natural emotion, is now live on the Grok Voice API.

http://x.ai/news/grok-custom-voices

🟡 **[@xai](https://x.com/xai/status/2050355373052223585)** - score 60 Sources: twitter_rss

Voice Cloning is now live via the xAI API!

Create a custom voice in less than 2 minutes or select from our library of 80+ voices across 28 languages to personalize your voice agents, audiobooks, video game characters, and more.

http://x.ai/news/grok-custom-voices

🟡 **[@MistralAI](https://x.com/MistralAI/status/2049128071874179091)** - score 60 Sources: twitter_rss

🆕 Today, we're releasing the public preview of Workflows, the orchestration layer for enterprise AI.

🌎 Enterprise teams have capable models. What they don't have is a way to run them reliably in production. That's the gap Workflows fills. It takes AI-powered business processes from prototype to pro

Omitted 2 additional model releases items from the main section; see raw data and source-specific sections below.

Developer Tools

🟡 LearningCircuit/local-deep-research - ~95% on SimpleQA (e.g. Qwen3.6-27B on a 3090). Supports all local and cloud LLMs (llama.cpp, Ollama, Google, ...). 10+ search engines - arXiv, PubMed, your private documents. Everything Local & Encrypted. - score 68 Sources: github_trending


🟡 cocoindex-io/cocoindex - Incremental engine for long horizon agents 🌟 Star if you like it! - score 66 Sources: github_trending


🟡 Agent Skills - score 62 Sources: hackernews

🟡 mnfst/manifest - Smart Model Routing for Agents. Cut Costs up to 70% 🦚 - score 60 Sources: github_trending


🟡 How do you experiment with a (very) large model architecture? [D] - score 56 Sources: reddit/r/MachineLearning

I'm trying to reproduce a paper (a very particular kind of diffusion model), and their training regime is incredibly compute heavy.

In general, how are quick experiments performed to validate hypotheses when the models are large and compute is expensive?

Some cursory browsing yields the following:

Omitted 6 additional developer tools items from the main section; see raw data and source-specific sections below.

Infrastructure & Compute

🟡 Lightricks/LTX-2 - Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model. - score 53 Sources: github_trending


Research Papers

🟡 OceanPile: A Large-Scale Multimodal Ocean Corpus for Foundation Models - score 68 Sources: huggingface · arxiv/cs.CL

The vast and underexplored ocean plays a critical role in regulating global climate and supporting marine biodiversity, yet artificial intelligence has so far delivered limited impact in this domain due to a fundamental data bottleneck. Specifically, ocean data are highly fragmented across disparate

🟡 Graph Rewiring in GNNs to Mitigate Over-Squashing and Over-Smoothing: A Survey - score 60 Sources: arxiv/cs.AI · arxiv/cs.LG

arXiv:2411.17429v2. Graph Neural Networks are powerful models for learning from graph-structured data, yet their effectiveness is often limited by two critical challenges: over-squashing, where information from distant nodes is excessively compressed, and over-

🟡 Representation in large language models - score 60 Sources: arxiv/cs.AI · arxiv/cs.LG

arXiv:2501.00885v2. The extraordinary success of recent Large Language Models (LLMs) on a diverse array of tasks has led to an explosion of scientific and philosophical theorizing aimed at explaining how they do what they do. Unfortunately, disagreement over fu

🟡 AcademiClaw: When Students Set Challenges for AI Agents - score 55 Sources: huggingface

Benchmarks within the OpenClaw ecosystem have thus far evaluated exclusively assistant-level tasks, leaving the academic-level capabilities of OpenClaw largely unexamined. We introduce AcademiClaw, a bilingual benchmark of 80 complex, long-horizon tasks sourced directly from university students' rea

🟡 Hierarchical Abstract Tree for Cross-Document Retrieval-Augmented Generation - score 48 Sources: huggingface · arxiv/cs.AI

Retrieval-augmented generation (RAG) enhances large language models with external knowledge, and tree-based RAG organizes documents into hierarchical indexes to support queries at multiple granularities. However, existing Tree-RAG methods designed for single-document retrieval face critical challeng

Other Signals

🟡 **[@sama](https://x.com/sama/status/2051464865634742334)** - score 50 Sources: twitter_rss

pretty excited for voice models to get great

its interesting to watch how people are already starting to change the way they interface with AI

🟡 DeepSeek V4 Pro matches GPT-5.2 on FoodTruck Bench, our agentic benchmark - 10 weeks later, ~17× cheaper - score 42 Sources: reddit/r/LocalLLaMA

Tested DeepSeek V4 Pro on FoodTruck Bench, our 30-day agentic benchmark where models run a food truck via 34 tools (locations, pricing, inventory, staff, weather, events) with persistent memory and daily reflection.

First Chinese model to land in the frontier tier on our benchmark. Tied with Grok

🟢 Incremental

Model Releases

🟢 As MTP prepares to land in llama.cpp, Models that support MTP - score 19 Sources: reddit/r/LocalLLaMA

DeepSeekv3 OG

DeepSeekv3.2/4

Qwen3.5

GLM4.5+

MiniMax2.5+

Step3.5Flash

Mimo v2+

Until we get mtp weights, you need to download HF weights and convert to gguf. I think I'm going to try either qwen3.5-122b or glm4.5-air first.

Developer Tools

🟢 openclaw/acpx - Headless CLI client for stateful Agent Client Protocol (ACP) sessions - score 36 Sources: github_trending


🟢 danielmiessler/Personal_AI_Infrastructure - Agentic AI Infrastructure for magnifying HUMAN capabilities. - score 29 Sources: github_trending


🟢 vibevoice.cpp: Microsoft VibeVoice (TTS + long-form ASR with diarization) ported to ggml/C++, runs on CPU/CUDA/Metal/Vulkan, no Python at inference - score 27 Sources: reddit/r/LocalLLaMA

A few weeks ago I shipped vibevoice.cpp, a pure-C++ ggml port of Microsoft VibeVoice (the speech-to-speech model with voice cloning, https://github.com/microsoft/VibeVoice). Wanted to post a follow-up here because we're at a point where the engine has gro

🟢 xingkongliang/skills-manager - A lightweight desktop app to manage, sync, and organize AI agent skills across 15+ coding tools - Cursor, Claude Code, Codex, Copilot, and more. - score 27 Sources: github_trending


🟢 My electric bill doubled running local models - score 17 Sources: reddit/r/AIAgents

Been running my side project's social presence using Mangos.ai - it's a no code agent builder for product distribution.

I run these agents 24/7 and they do a phenomenal job. They have access to my browser with all my socials logged in. Every few hours they go into X, Threads,

Omitted 6 additional developer tools items from the main section; see raw data and source-specific sections below.

Research Papers

🟢 Code World Model Preparedness Report - score 25 Sources: huggingface

This report documents the preparedness assessment of Code World Model (CWM), a model for code generation and reasoning about code from Meta. We conducted pre-release testing across domains identified in our Frontier AI Framework as potentially presenting catastrophic risks, and also evaluated the mo

🟢 Motion-Aware Caching for Efficient Autoregressive Video Generation - score 25 Sources: huggingface

Autoregressive video generation paradigms offer theoretical promise for long video synthesis, yet their practical deployment is hindered by the computational burden of sequential iterative denoising. While cache reuse strategies can accelerate generation by skipping redundant denoising steps, existi

🟢 Generative Modeling with Orbit-Space Particle Flow Matching - score 25 Sources: huggingface

We present Orbit-Space Geometric Probability Paths (OGPP), a particle-native flow-matching framework for generative modeling of particle systems. OGPP is motivated by two insights: (i) particles are defined up to permutation symmetries, so anonymous indexing inflates per-index target variance and yi

🟢 Perceptual Flow Network for Visually Grounded Reasoning - score 25 Sources: huggingface

Despite the success of Large-Vision Language Models (LVLMs), general optimization objectives (e.g., standard MLE) fail to constrain visual trajectories, leading to language bias and hallucination. To mitigate this, current methods introduce geometric priors from visual experts as additional supervis

Other Signals

🟢 Is there a notable increase in demand for privacy-preserving AI/ML with the advent of LLMs? [D] - score 39 Sources: reddit/r/MachineLearning

While browsing through this subreddit, I encountered this old discussion post about demand for AI with the rise of privacy regulation. It got me thinking that, 6 years on, the demand for AI

🟢 Train Your Own LLM from Scratch - score 38 Sources: hackernews

🟢 MTPLX | 2.24x faster TPS | The native MTP inference engine for Apple Silicon - score 35 Sources: reddit/r/LocalLLaMA

TLDR: 28 tok/s → 63 tok/s on Qwen3.6-27B on a MacBook Pro M5 Max. 2.24× faster at real temperature 0.6.

Works for coding, creative writing, and chat

https://i.redd.it/i9x794c0q7zg1.gif

  • Works on ANY MTP model: No external drafter. No extra memory usage. Uses the model's own built-in MTP he

🟢 [D] What Happened to Neurips Creative AI Track? [R] - score 31 Sources: reddit/r/MachineLearning

At Neurips 2025, the Creative AI Track was announced as part of the official proceedings:

https://neurips.cc/Conferences/2025/CallForCreativeAI

"Please note that this year the Creative AI track will be part of the NeurIPS conference

🟢 Building a 9-ball AI player: Candidate generation for direct cut shots [P] - score 24 Sources: reddit/r/MachineLearning

I'm building a 9-ball player to help with pattern play. There are many ways to make the next ball, and sometimes more than one obvious pocket. Which one you should choose depends on the probability of making that shot AND ending up in a favorable spot for the next shot, that is also amenable to
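The trade-off described in this post can be sketched as a simple scoring rule. This is only an illustration under stated assumptions: the `Candidate` type, the `position_value` score, and the numbers are hypothetical, and the product rule is one plausible way to combine the two factors, not the poster's actual method.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    p_make: float          # estimated probability of pocketing the ball
    position_value: float  # hypothetical 0..1 score for where the cue ball ends up


def score(candidate):
    """Expected value of a shot: a good leave only counts if the ball drops."""
    return candidate.p_make * candidate.position_value


def best_shot(candidates):
    """Return the candidate with the highest expected value."""
    return max(candidates, key=score)


if __name__ == "__main__":
    shots = [
        Candidate("corner pocket", p_make=0.9, position_value=0.4),
        Candidate("side pocket", p_make=0.7, position_value=0.8),
    ]
    print(best_shot(shots).name)
```

In this made-up example the harder shot wins because its better position outweighs the lower make probability.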

Omitted 4 additional other signals items from the main section; see raw data and source-specific sections below.

| Repo | Description | Stars Today | Language |
|---|---|---|---|
| ruvnet/ruflo | 🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade architecture, self-learning swarm intelligence, RAG integration, and native Claude Code / Codex Integration | 2598 | typescript |
| TauricResearch/TradingAgents | TradingAgents: Multi-Agents LLM Financial Trading Framework | 2182 | python |
| Hmbown/DeepSeek-TUI | Coding agent for DeepSeek models that runs in your terminal | 1274 | rust |
| AIDC-AI/Pixelle-Video | 🚀 AI Fully Automated Short Video Engine | 1153 | python |
| rtk-ai/rtk | CLI proxy that reduces LLM token consumption by 60-90% on common dev commands. Single Rust binary, zero dependencies | 732 | rust |
| 1jehuang/jcode | Coding Agent Harness | 548 | rust |
| czlonkowski/n8n-mcp | A MCP for Claude Desktop / Claude Code / Windsurf / Cursor to build n8n workflows for you | 496 | typescript |
| virattt/dexter | An autonomous agent for deep financial research | 409 | typescript |
| mksglu/context-mode | Context window optimization for AI coding agents. Sandboxes tool output, 98% reduction. 14 platforms | 306 | typescript |
| withastro/flue | The sandbox agent framework. | 290 | typescript |

📄 New Papers

| Title | Category | Hotness | Link |
|---|---|---|---|
| MolmoAct2: Action Reasoning Models for Real-world Deployment | research_paper | 65 | Open |
| Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs | research_paper | 12 | Open |
| Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling | research_paper | 11 | Open |
| OceanPile: A Large-Scale Multimodal Ocean Corpus for Foundation Models | research_paper | 9 | Open |
| Graph Rewiring in GNNs to Mitigate Over-Squashing and Over-Smoothing: A Survey | cs.AI | 0 | Open |
| Representation in large language models | cs.AI | 0 | Open |
| AcademiClaw: When Students Set Challenges for AI Agents | research_paper | 3 | Open |
| TADI: Tool-Augmented Drilling Intelligence via Agentic LLM Orchestration over Heterogeneous Wellsite Data | cs.AI | 0 | Open |
| AgentReputation: A Decentralized Agentic AI Reputation Framework | cs.AI | 0 | Open |
| Minimal, Local, Causal Explanations for Jailbreak Success in Large Language Models | cs.AI | 0 | Open |
| Are Tools All We Need? Unveiling the Tool-Use Tax in LLM Agents | cs.AI | 0 | Open |
| TUR-DPO: Topology- and Uncertainty-Aware Direct Preference Optimization | cs.AI | 0 | Open |
| ARMOR 2025: A Military-Aligned Benchmark for Evaluating Large Language Model Safety Beyond Civilian Contexts | cs.AI | 0 | Open |
| Causal Foundations of Collective Agency | cs.AI | 0 | Open |
| Agentic AI for Trip Planning Optimization Application | cs.AI | 0 | Open |

🏢 Lab Blog Posts

🐦 Twitter/X Highlights

| Account | Tweet Summary |
|---|---|
| xai | Two voices. One human. One AI. Can you guess the AI clone? 👇 Voice cloning, rich with natural emotion, is now live on the Grok Voice API. http://x.ai/news/grok-custom-voices |
| xai | Voice Cloning is now live via the xAI API! Create a custom voice in less than 2 minutes or select from our library of 80+ voices across 28 languages to personalize your voice agents, audiobooks, video game characters, and more. http://x.ai/news/grok-custom-voices |
| MistralAI | 🆕 Today, we're releasing the public preview of Workflows, the orchestration layer for enterprise AI. 🌎 Enterprise teams have capable models. What they don't have is a way to run them reliably in production. That's the gap Workflows fills. It takes AI-powered business processes from prototype to production, with the durability, observability, and fault tolerance that production actually requires. Leading organisations like ASML, ABANCA, CMA-CGM, France Travail, La Banque Postale, Moeve, and many others are already using Workflows to automate critical processes. |
| MistralAI | Mistral AI made the TIME100 Most Influential Companies list for 2026 - and the top 10 for AI. Why we're proud: customers run frontier models in production on their own terms, on their own infrastructure. Thank you to our customers for their trust and for joining us on the journey. Grateful to our incredible team members around the world and congrats to all the businesses recognized this year. Learn more at: https://time.com/collection/time100-most-influential-companies/2026/mistral/ #TIME100Companies #TIME100CompaniesIndustryLeader |
| karpathy | Fireside chat at Sequoia Ascent 2026 from a ~week ago. Some highlights: The first theme I tried to push on is that LLMs are about a lot more than just speeding up what existed before (e.g. coding). Three examples of new horizons: 1. menugen: an app that can be fully engulfed by LLMs, with no classical code needed: input an image, output an image and an LLM can natively do the thing. 2. install .md skills instead of install .sh scripts. Why create a complex Software 1.0 bash script for e.g. installing a piece of software if you can write the installation out in words and say "just show this to your LLM". The LLM is an advanced interpreter of English and can intelligently target installation to your setup, debug everything inline, etc. 3. LLM knowledge bases as an example of something that was impossible with classical code because it's computation over unstructured data (knowledge) from arbitrary sources and in arbitrary formats, including simply text articles etc. I pushed on these because in every new paradigm change, the obvious things are always in the realm of speeding up or somehow improving what existed, but here we have examples of functionality that either suddenly perhaps shouldn't even exist (1,2), or was fundamentally not possible before (3). The second (ongoing) theme is trying to explain the pattern of jaggedness in LLMs. How it can be true that a single artifact will simultaneously 1) coherently refactor a 100,000-line code base and 2) tell you to walk to the car wash to wash your car. I previously wrote about the source of this as having to do with verifiability of a domain, here I expand on this as having to also do with economics because revenue/TAM dictates what the frontier labs choose to package into training data distributions during RL. You're either in the data distribution (on the rails of the RL circuits) and flying or you're off-roading in the jungle with a machete, in relative terms. Still not 100% satisfied with this, but it's an ongoing struggle to build an accurate model of LLM capabilities if you wish to practically take advantage of their power while avoiding their pitfalls, which brings me to... Last theme is the agent-native economy. The decomposition of products and services into sensors, actuators and logic (split up across all of 1.0/2.0/3.0 computing paradigms), how we can make information maximally legible to LLMs, some words on the quickly emerging agentic engineering and its skill set, related hiring practices, etc., possibly even hints/dreams of fully neural computing handling the vast majority of computation with some help from (classical) CPU coprocessors. |
| sama | pretty excited for voice models to get great its interesting to watch how people are already starting to change the way they interface with AI |