🔴 High Significance
Model Releases
🔴 White House Considers Vetting A.I. Models Before They Are Released - score 81
Sources: reddit/r/LocalLLaMA
🔴 Peanut - Text to Image Model (Open Weights coming soon) - score 73
Sources: reddit/r/LocalLLaMA
A new anonymous model debuts at #8 in the Artificial Analysis Text to Image Arena! Peanut's weights are expected to be released soon, which would make it the leading Text to Image Open Weights Model.
Peanut is positioned to be the new leading open weights Text to Image model, surpassing Z-I
Developer Tools
🔴 ruvnet/ruflo - The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade architecture, self-learning swarm intelligence, RAG integration, and native Claude Code / Codex Integration - score 99
Sources: github_trending
🔴 TauricResearch/TradingAgents - TradingAgents: Multi-Agents LLM Financial Trading Framework - score 97
Sources: github_trending
🔴 Hmbown/DeepSeek-TUI - Coding agent for DeepSeek models that runs in your terminal - score 95
Sources: github_trending
🔴 Are modern ML PhDs becoming too incremental, or is this just what research looks like now? [D] - score 94
Sources: reddit/r/MachineLearning
I've been thinking about the current state of machine learning PhDs, including my own work, and I'd like to hear how others see it.
My impression is that a large fraction of modern ML PhD work follows a fairly predictable pattern: take an existing idea, connect it to another existing idea, apply i
🔴 My nerd lil brother surpassed me in his monthly income last month! - score 94
Sources: reddit/r/AIAgents
He's kinda a nerd yeah. I was teaching him some basics in AI since he turned 16 but fast forward he found out he can make money off this. I replied "I'm sure you do", jokingly because I didn't think he actually is able to find some clients. He's now 17 and thanks to an automation he built on n8n, he g
Omitted 12 additional developer tools items from the main section; see raw data and source-specific sections below.
Research Papers
🔴 MolmoAct2: Action Reasoning Models for Real-world Deployment - score 95
Sources: huggingface
Vision-Language-Action (VLA) models aim to provide a single generalist controller for robots, but today's systems fall short on the criteria that matter for real-world deployment. Frontier models are closed, open-weight alternatives are tied to expensive hardware, reasoning-augmented policies pay pr
🔴 Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs - score 78
Sources: huggingface · arxiv/cs.AI
While autoregressive Large Vision-Language Models (LVLMs) demonstrate remarkable proficiency in multimodal tasks, they face a "Visual Signal Dilution" phenomenon, where the accumulation of textual history expands the attention partition function, causing visual attention to decay inversely with gene
🔴 Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling - score 72
Sources: huggingface · arxiv/cs.AI
Recent research has shown that filtering massive English web corpora into high-quality subsets significantly improves training efficiency. However, for high-resource non-English languages like German, French, or Japanese, aggressive filtering creates a strategic dilemma: should practitioners priorit
Other Signals
🔴 it's time to update your Gemma 4 GGUFs - score 88
Sources: reddit/r/LocalLLaMA
Chat Template was fixed a few days ago
choose your fav dealer:
https://huggingface.co/bartowski/google_gemma-4-31B-it-GGUF
https://huggingface.co/bartowski/google_gemma-4-26B-A4B-it-GGUF
🔴 How OpenAI delivers low-latency voice AI at scale - score 88
Sources: hackernews
🟡 Notable
Model Releases
🟡 FastDMS: 6.4X KV-cache compression running faster than vLLM BF16/FP8 - score 65
Sources: reddit/r/LocalLLaMA
Last year researchers affiliated with NVIDIA, University of Warsaw, and University of Edinburgh published Dynamic Memory Sparsification (DMS), a KV-cache sparsification technique using learned per-head token eviction, reporting up to 8x KV-cache compression.
I fo
🟡 Claude Code structure that didn't break after 2-3 real projects - score 61
Sources: reddit/r/AIAgents
Been iterating on my Claude Code setup for a while. Most examples online worked… until things got slightly complex. This is the first structure that held up once I added multiple skills, MCP servers, and agents.
What actually made a difference:
- **If you're skipping CLAUDE MD, that's probably the
🟡 **[@xai: Two voices. One human. One AI. Can you guess the AI clone?
Voice cloning, rich with natural emotion, is now live on the Grok Voice API.
http://x.ai/news/grok-custom-voices](https://x.com/xai/status/2051438210065322244)** - score 60
Sources: twitter_rss
🟡 **[@xai: Voice Cloning is now live via the xAI API!
Create a custom voice in less than 2 minutes or select from our library of 80+ voices across 28 languages to personalize your voice agents, audiobooks, vide](https://x.com/xai/status/2050355373052223585)** - score 60
Sources: twitter_rss
Voice Cloning is now live via the xAI API!
Create a custom voice in less than 2 minutes or select from our library of 80+ voices across 28 languages to personalize your voice agents, audiobooks, video game characters, and more.
http://x.ai/news/grok-custom-voices
🟡 **[@MistralAI: Today, we're releasing the public preview of Workflows, the orchestration layer for enterprise AI.
Enterprise teams have capable models. What they don't have is a way to run them reliably in prod](https://x.com/MistralAI/status/2049128071874179091)** - score 60
Sources: twitter_rss
Today, we're releasing the public preview of Workflows, the orchestration layer for enterprise AI.
Enterprise teams have capable models. What they don't have is a way to run them reliably in production. That's the gap Workflows fills. It takes AI-powered business processes from prototype to pro
Omitted 2 additional model releases items from the main section; see raw data and source-specific sections below.
Developer Tools
🟡 LearningCircuit/local-deep-research - ~95% on SimpleQA (e.g. Qwen3.6-27B on a 3090). Supports all local and cloud LLMs (llama.cpp, Ollama, Google, ...). 10+ search engines - arXiv, PubMed, your private documents. Everything Local & Encrypted. - score 68
Sources: github_trending
🟡 cocoindex-io/cocoindex - Incremental engine for long horizon agents. Star if you like it! - score 66
Sources: github_trending
🟡 Agent Skills - score 62
Sources: hackernews
🟡 mnfst/manifest - Smart Model Routing for Agents. Cut Costs up to 70% - score 60
Sources: github_trending
🟡 How do you experiment with a (very) large model architecture? [D] - score 56
Sources: reddit/r/MachineLearning
I'm trying to reproduce a paper (a very particular kind of diffusion model), and their training regime is incredibly compute heavy.
In general, how are quick experiments performed to validate hypotheses when the models are large and compute is expensive?
Some cursory browsing yields the following:
Omitted 6 additional developer tools items from the main section; see raw data and source-specific sections below.
Infrastructure & Compute
🟡 Lightricks/LTX-2 - Official Python inference and LoRA trainer package for the LTX-2 audio-video generative model. - score 53
Sources: github_trending
Research Papers
🟡 OceanPile: A Large-Scale Multimodal Ocean Corpus for Foundation Models - score 68
Sources: huggingface · arxiv/cs.CL
The vast and underexplored ocean plays a critical role in regulating global climate and supporting marine biodiversity, yet artificial intelligence has so far delivered limited impact in this domain due to a fundamental data bottleneck. Specifically, ocean data are highly fragmented across disparate
🟡 Graph Rewiring in GNNs to Mitigate Over-Squashing and Over-Smoothing: A Survey - score 60
Sources: arxiv/cs.AI · arxiv/cs.LG
arXiv:2411.17429v2. Graph Neural Networks are powerful models for learning from graph-structured data, yet their effectiveness is often limited by two critical challenges: over-squashing, where information from distant nodes is excessively compressed, and over-
🟡 Representation in large language models - score 60
Sources: arxiv/cs.AI · arxiv/cs.LG
arXiv:2501.00885v2. The extraordinary success of recent Large Language Models (LLMs) on a diverse array of tasks has led to an explosion of scientific and philosophical theorizing aimed at explaining how they do what they do. Unfortunately, disagreement over fu
🟡 AcademiClaw: When Students Set Challenges for AI Agents - score 55
Sources: huggingface
Benchmarks within the OpenClaw ecosystem have thus far evaluated exclusively assistant-level tasks, leaving the academic-level capabilities of OpenClaw largely unexamined. We introduce AcademiClaw, a bilingual benchmark of 80 complex, long-horizon tasks sourced directly from university students' rea
🟡 Hierarchical Abstract Tree for Cross-Document Retrieval-Augmented Generation - score 48
Sources: huggingface · arxiv/cs.AI
Retrieval-augmented generation (RAG) enhances large language models with external knowledge, and tree-based RAG organizes documents into hierarchical indexes to support queries at multiple granularities. However, existing Tree-RAG methods designed for single-document retrieval face critical challeng
Other Signals
🟡 **[@sama: pretty excited for voice models to get great
its interesting to watch how people are already starting to change the way they interface with AI](https://x.com/sama/status/2051464865634742334)** - score 50
Sources: twitter_rss
🟡 DeepSeek V4 Pro matches GPT-5.2 on FoodTruck Bench, our agentic benchmark - 10 weeks later, ~17× cheaper - score 42
Sources: reddit/r/LocalLLaMA
Tested DeepSeek V4 Pro on FoodTruck Bench - our 30-day agentic benchmark where models run a food truck via 34 tools (locations, pricing, inventory, staff, weather, events) with persistent memory and daily reflection.
First Chinese model to land in the frontier tier on our benchmark. Tied with Grok
🟢 Incremental
Model Releases
🟢 As MTP prepares to land in llama.cpp, Models that support MTP - score 19
Sources: reddit/r/LocalLLaMA
DeepSeekv3 OG
DeepSeekv3.2/4
Qwen3.5
GLM4.5+
MiniMax2.5+
Step3.5Flash
Mimo v2+
Until we get mtp weights, you need to download HF weights and convert to gguf. I think I'm going to try either qwen3.5-122b or glm4.5-air first.
Developer Tools
🟢 openclaw/acpx - Headless CLI client for stateful Agent Client Protocol (ACP) sessions - score 36
Sources: github_trending
🟢 danielmiessler/Personal_AI_Infrastructure - Agentic AI Infrastructure for magnifying HUMAN capabilities. - score 29
Sources: github_trending
🟢 vibevoice.cpp: Microsoft VibeVoice (TTS + long-form ASR with diarization) ported to ggml/C++, runs on CPU/CUDA/Metal/Vulkan, no Python at inference - score 27
Sources: reddit/r/LocalLLaMA
A few weeks ago I shipped vibevoice.cpp, a pure-C++ ggml port of Microsoft
VibeVoice (the speech-to-speech model with voice cloning, https://github.com/microsoft/VibeVoice). Wanted to post a follow-up here because we're at a point where the engine has gro
🟢 xingkongliang/skills-manager - A lightweight desktop app to manage, sync, and organize AI agent skills across 15+ coding tools - Cursor, Claude Code, Codex, Copilot, and more. - score 27
Sources: github_trending
🟢 My electric bill doubled running local models - score 17
Sources: reddit/r/AIAgents
Been running my side project's social presence using Mangos.ai - it's a no code agent builder for product distribution.
I run these agents 24/7 and they do a phenomenal job. They have access to my browser with all my socials logged in. Every few hours they go into X, Threads,
Omitted 6 additional developer tools items from the main section; see raw data and source-specific sections below.
Research Papers
🟢 Code World Model Preparedness Report - score 25
Sources: huggingface
This report documents the preparedness assessment of Code World Model (CWM), a model for code generation and reasoning about code from Meta. We conducted pre-release testing across domains identified in our Frontier AI Framework as potentially presenting catastrophic risks, and also evaluated the mo
🟢 Motion-Aware Caching for Efficient Autoregressive Video Generation - score 25
Sources: huggingface
Autoregressive video generation paradigms offer theoretical promise for long video synthesis, yet their practical deployment is hindered by the computational burden of sequential iterative denoising. While cache reuse strategies can accelerate generation by skipping redundant denoising steps, existi
🟢 Generative Modeling with Orbit-Space Particle Flow Matching - score 25
Sources: huggingface
We present Orbit-Space Geometric Probability Paths (OGPP), a particle-native flow-matching framework for generative modeling of particle systems. OGPP is motivated by two insights: (i) particles are defined up to permutation symmetries, so anonymous indexing inflates per-index target variance and yi
🟢 Perceptual Flow Network for Visually Grounded Reasoning - score 25
Sources: huggingface
Despite the success of Large-Vision Language Models (LVLMs), general optimization objectives (e.g., standard MLE) fail to constrain visual trajectories, leading to language bias and hallucination. To mitigate this, current methods introduce geometric priors from visual experts as additional supervis
Other Signals
🟢 Is there a notable increase in demand for privacy-preserving AI/ML with the advent of LLMs? [D] - score 39
Sources: reddit/r/MachineLearning
While browsing through this subreddit, I encountered this old discussion post about demand for AI with the rise of privacy regulation. It got me thinking that, 6 years on, the demand for AI
🟢 Train Your Own LLM from Scratch - score 38
Sources: hackernews
🟢 MTPLX | 2.24x faster TPS | The native MTP inference engine for Apple Silicon - score 35
Sources: reddit/r/LocalLLaMA
TLDR: 28 tok/s → 63 tok/s on Qwen3.6-27B on a MacBook Pro M5 Max. 2.24× faster at real temperature 0.6.
Works for coding, creative writing, and chat
https://i.redd.it/i9x794c0q7zg1.gif
- Works on ANY MTP model: No external drafter. No extra memory usage. Uses the model's own built-in MTP he
🟢 [D] What Happened to Neurips Creative AI Track? [R] - score 31
Sources: reddit/r/MachineLearning
At Neurips 2025, the Creative AI Track was announced as part of the official proceedings:
https://neurips.cc/Conferences/2025/CallForCreativeAI
"Please note that this year the Creative AI track will be part of the NeurIPS conference
🟢 Building a 9-ball AI player: Candidate generation for direct cut shots [P] - score 24
Sources: reddit/r/MachineLearning
I'm building a 9-ball-player to help with pattern play. There are many ways to make the next ball, and sometimes in more than one obvious pocket. Which one you should choose depends on the probability of making that shot AND ending up in a favorable spot for the next shot, that is also amenable to
Omitted 4 additional other signals items from the main section; see raw data and source-specific sections below.
Trending Repos
| Repo | Description | Stars Today | Language |
|---|---|---|---|
| ruvnet/ruflo | The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade architecture, self-learning swarm intelligence, RAG integration, and native Claude Code / Codex Integration | 2598 | typescript |
| TauricResearch/TradingAgents | TradingAgents: Multi-Agents LLM Financial Trading Framework | 2182 | python |
| Hmbown/DeepSeek-TUI | Coding agent for DeepSeek models that runs in your terminal | 1274 | rust |
| AIDC-AI/Pixelle-Video | AI Fully Automated Short Video Engine | 1153 | python |
| rtk-ai/rtk | CLI proxy that reduces LLM token consumption by 60-90% on common dev commands. Single Rust binary, zero dependencies | 732 | rust |
| 1jehuang/jcode | Coding Agent Harness | 548 | rust |
| czlonkowski/n8n-mcp | A MCP for Claude Desktop / Claude Code / Windsurf / Cursor to build n8n workflows for you | 496 | typescript |
| virattt/dexter | An autonomous agent for deep financial research | 409 | typescript |
| mksglu/context-mode | Context window optimization for AI coding agents. Sandboxes tool output, 98% reduction. 14 platforms | 306 | typescript |
| withastro/flue | The sandbox agent framework. | 290 | typescript |
New Papers
| Title | Category | Hotness | Link |
|---|---|---|---|
| MolmoAct2: Action Reasoning Models for Real-world Deployment | research_paper | 65 | Open |
| Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs | research_paper | 12 | Open |
| Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling | research_paper | 11 | Open |
| OceanPile: A Large-Scale Multimodal Ocean Corpus for Foundation Models | research_paper | 9 | Open |
| Graph Rewiring in GNNs to Mitigate Over-Squashing and Over-Smoothing: A Survey | cs.AI | 0 | Open |
| Representation in large language models | cs.AI | 0 | Open |
| AcademiClaw: When Students Set Challenges for AI Agents | research_paper | 3 | Open |
| TADI: Tool-Augmented Drilling Intelligence via Agentic LLM Orchestration over Heterogeneous Wellsite Data | cs.AI | 0 | Open |
| AgentReputation: A Decentralized Agentic AI Reputation Framework | cs.AI | 0 | Open |
| Minimal, Local, Causal Explanations for Jailbreak Success in Large Language Models | cs.AI | 0 | Open |
| Are Tools All We Need? Unveiling the Tool-Use Tax in LLM Agents | cs.AI | 0 | Open |
| TUR-DPO: Topology- and Uncertainty-Aware Direct Preference Optimization | cs.AI | 0 | Open |
| ARMOR 2025: A Military-Aligned Benchmark for Evaluating Large Language Model Safety Beyond Civilian Contexts | cs.AI | 0 | Open |
| Causal Foundations of Collective Agency | cs.AI | 0 | Open |
| Agentic AI for Trip Planning Optimization Application | cs.AI | 0 | Open |
Lab Blog Posts
Twitter/X Highlights
| Account | Tweet Summary |
|---|---|
| xai | Two voices. One human. One AI. Can you guess the AI clone? Voice cloning, rich with natural emotion, is now live on the Grok Voice API. http://x.ai/news/grok-custom-voices |
| xai | Voice Cloning is now live via the xAI API! Create a custom voice in less than 2 minutes or select from our library of 80+ voices across 28 languages to personalize your voice agents, audiobooks, video game characters, and more. http://x.ai/news/grok-custom-voices |
| MistralAI | Today, we're releasing the public preview of Workflows, the orchestration layer for enterprise AI. Enterprise teams have capable models. What they don't have is a way to run them reliably in production. That's the gap Workflows fills. It takes AI-powered business processes from prototype to production, with the durability, observability, and fault tolerance that production actually requires. Leading organisations like ASML, ABANCA, CMA-CGM, France Travail, La Banque Postale, Moeve, and many others are already using Workflows to automate critical processes. |
| MistralAI | Mistral AI made the TIME100 Most Influential Companies list for 2026 - and the top 10 for AI. Why we're proud: customers run frontier models in production on their own terms, on their own infrastructure. Thank you to our customers for their trust and for joining us on the journey. Grateful to our incredible team members around the world and congrats to all the businesses recognized this year. Learn more at: https://time.com/collection/time100-most-influential-companies/2026/mistral/ #TIME100Companies #TIME100CompaniesIndustryLeader |
| karpathy | Fireside chat at Sequoia Ascent 2026 from a ~week ago. Some highlights: The first theme I tried to push on is that LLMs are about a lot more than just speeding up what existed before (e.g. coding). Three examples of new horizons: 1. menugen: an app that can be fully engulfed by LLMs, with no classical code needed: input an image, output an image and an LLM can natively do the thing. 2. install .md skills instead of install .sh scripts. Why create a complex Software 1.0 bash script for e.g. installing a piece of software if you can write the installation out in words and say "just show this to your LLM". The LLM is an advanced interpreter of English and can intelligently target installation to your setup, debug everything inline, etc. 3. LLM knowledge bases as an example of something that was impossible with classical code because it's computation over unstructured data (knowledge) from arbitrary sources and in arbitrary formats, including simply text articles etc. I pushed on these because in every new paradigm change, the obvious things are always in the realm of speeding up or somehow improving what existed, but here we have examples of functionality that either suddenly perhaps shouldn't even exist (1,2), or was fundamentally not possible before (3). The second (ongoing) theme is trying to explain the pattern of jaggedness in LLMs. How it can be true that a single artifact will simultaneously 1) coherently refactor a 100,000-line code base and 2) tell you to walk to the car wash to wash your car. I previously wrote about the source of this as having to do with verifiability of a domain, here I expand on this as having to also do with economics because revenue/TAM dictates what the frontier labs choose to package into training data distributions during RL. You're either in the data distribution (on the rails of the RL circuits) and flying or you're off-roading in the jungle with a machete, in relative terms. Still not 100% satisfied with this, but it's an ongoing struggle to build an accurate model of LLM capabilities if you wish to practically take advantage of their power while avoiding their pitfalls, which brings me to... Last theme is the agent-native economy. The decomposition of products and services into sensors, actuators and logic (split up across all of 1.0/2.0/3.0 computing paradigms), how we can make information maximally legible to LLMs, some words on the quickly emerging agentic engineering and its skill set, related hiring practices, etc., possibly even hints/dreams of fully neural computing handling the vast majority of computation with some help from (classical) CPU coprocessors. |
| sama | pretty excited for voice models to get great its interesting to watch how people are already starting to change the way they interface with AI |