🔴 High Significance
Model Releases
🔴 Introducing the Heretic Grimoire: The takedown-resilient, local-first backup system that keeps uncensored models available forever - score 96
Sources: reddit/r/LocalLLaMA
Welcome to another episode of THE HERETIC SHOW, where authoritarian dreams are destroyed by unreasonably effective linear algebra! Let's start with an important announcement: ## Heretic now has an official website at https://heretic-project.org This website contains: * Links to all official resource
🔴 Xiaomi is now serving MiMo V2.5 at 1000-3000 tps using DFlash & Persistent kernel. DFlash model is out, open-source release promised soon - score 75
Sources: reddit/r/LocalLLaMA
Developer Tools
🔴 OpenHands/OpenHands - OpenHands: AI-Driven Development - score 87
Sources: github_trending
🔴 Looking for actual builders: n8n, LangChain & Multi-Agent systems - score 81
Sources: reddit/r/AIAgents
Hey everyone. I'm currently putting together a dedicated technical team focused entirely on heavy AI automation and agentic infrastructure. We are building out complex multi-agent systems, and I'm looking for people who actually know what they're doing under the hood. If you're the kind of engineer
🔴 Ar9av/obsidian-wiki - Framework for AI agents to build and maintain a digital brain through Obsidian wiki using Karpathy's LLM Wiki pattern - score 78
Sources: github_trending
🔴 State sharing between agents is harder than it looks - score 74
Sources: reddit/r/AIAgents
We built a multi-agent demo last month with three agents: one plans architecture, one writes code, and one reviews tests. The theory was clean division of labor. The reality was a mess of context loss as each agent started its own session and lost the accumulated reasoning. Agent A decided to use Pr
🔴 tensorzero/tensorzero - TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation. - score 73
Sources: github_trending
Other Signals
🔴 How does the ML community view evolutionary algorithm research? Career implications of an EA PhD? [D] - score 94
Sources: reddit/r/MachineLearning
How does the ML research community feel about evolutionary algorithms? Should I do a PhD in this area? Quick remark: I know some people in the ML community dunk on evolutionary algorithms because there's often a better optimizer, but they do have their place, which is what researchers in my communit
🔴 z.ai Poll on X: MIT-licensed open weights are losing - score 89
Sources: reddit/r/LocalLLaMA
You can cast your vote here: https://x.com/ZixuanLi_/status/2065646648777416770#m Just to be clear: I am not urging or brigading anyone to vote specifically for MIT-licensed open weights. Please choose the option you genuinely prefer. I previo
🔴 Rio de Janeiro's "homegrown" LLM appears to be a merge of an existing model - score 83
Sources: hackernews
🔴 Nex claims Rio 3.5 is Nex 2.5 PRO in trench coat - score 82
Sources: reddit/r/LocalLLaMA
🔴 Quant firms at ICML 2026 [D] - score 81
Sources: reddit/r/MachineLearning
I noted that in ICML 2026, quant firms are flocking and sponsoring as Diamond sponsors. Any reason? Source: https://icml.cc/sponsors/sponsors-list?year=2026at
🟡 Notable
Model Releases
🟡 Claude Code + web search API provides excellent leads based on signals from Reddit, LinkedIn, G2, and similar platforms - score 69
Sources: reddit/r/AIAgents
Claude Code's built-in web search is great for docs and code lookups, but it starts struggling when you need Reddit discussions, GitHub signals, StackOverflow threads, G2 reviews, and broader web intelligence. Perplexity solves the web access problem. Claude Code solves the agent problem. So we conn
🟡 EAGLE support merged into llama.cpp - score 68
Sources: reddit/r/LocalLLaMA
🟡 Command A Plus GGUFs posted - score 57
Sources: reddit/r/LocalLLaMA
Support for Command A Plus and North Mini Code was added to llama.cpp this weekend. Unsloth has North Mini Code GGUFs, but I didn't find anyone with up to date GGUFs for Command A Plus, so I converted and quantized it!
🟡 Introducing the OpenAI Partner Network - score 50
Sources: lab_blog/OpenAI
OpenAI launches the Partner Network, investing $150M to help global partners accelerate enterprise AI adoption, deployment, and transformation.
Developer Tools
🟡 openinterpreter/openinterpreter - A lightweight coding agent for open models like Deepseek, Kimi, and Qwen - score 54
Sources: github_trending
🟡 AUTOMATIC1111/stable-diffusion-webui - Stable Diffusion web UI - score 51
Sources: github_trending
Research Papers
🟡 RhymeFlow: Training-Free Acceleration for Video Generation with Asynchronous Denoising Flow Scheduling - score 55
Sources: huggingface
Video generation models based on Diffusion Transformers (DiTs) have achieved remarkable performance in video synthesis, yet they suffer from high inference latency and computational costs due to the quadratic complexity of 3D attention. Existing acceleration methods primarily reduce computational co
🟡 P3D-Bench: Benchmarking MLLMs for Parametric 3D Generation and Structural Reasoning - score 45
Sources: huggingface
Multimodal large language models can write code to produce complex programs as well as use programs to do 3D modeling, which opens up a new avenue for 3D generation powered by their priors, world knowledge and reasoning. Yet existing benchmarks rarely evaluate 3D modeling through code. Such modeling
🟡 AdaSR: Adaptive Streaming Reasoning with Hierarchical Relative Policy Optimization - score 45
Sources: huggingface · arxiv/cs.CL
Large reasoning models typically follow a read-then-think paradigm: they observe the complete input, reason over a static context, and then produce the answer. Yet many real-world scenarios are inherently dynamic, such as audio and video streams, where information arrives as a continuous stream and m
Other Signals
🟡 Qwen 3.6 35B-A3B @ Q4 or Gemma 4 12B @ Q8? - score 57
Sources: reddit/r/LocalLLaMA
Wondering how much model quantization matters here. Daily driver on my 32 GB unified memory setup is the Qwen model outputting ~15 tokens a second. Heard good things about the 12B Gemma 4 model so interested in trying it against my codebase. Given its size I can very comfortably fit the Q8 in. Hell,
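A rough sense of the memory trade-off in this question can be had from a weight-only estimate. The sketch below is not from the post: it assumes "Q4" and "Q8" mean the GGUF `Q4_0` and `Q8_0` formats (4.5 and 8.5 bits per weight once the per-block fp16 scale is counted), takes the nominal 35B and 12B parameter counts from the model names, and ignores KV cache and runtime overhead. The helper name `weight_gb` is hypothetical.

```python
# Rough weight-only memory estimate for a quantized model.
# Assumptions (not from the post): "Q4" means GGUF Q4_0, "Q8" means Q8_0,
# and parameter counts are the nominal 35B and 12B from the model names.

BITS_PER_WEIGHT = {
    "Q4_0": 4.5,  # 32 four-bit weights plus one fp16 scale per block
    "Q8_0": 8.5,  # 32 eight-bit weights plus one fp16 scale per block
}


def weight_gb(params_billions: float, quant: str) -> float:
    """Approximate size of the weights in decimal gigabytes."""
    total_bits = params_billions * 1e9 * BITS_PER_WEIGHT[quant]
    return total_bits / 8 / 1e9


if __name__ == "__main__":
    for name, params, quant in [("35B-A3B", 35, "Q4_0"), ("12B", 12, "Q8_0")]:
        print(f"{name} @ {quant}: ~{weight_gb(params, quant):.2f} GB of weights")
```

Under these assumptions the 35B model needs roughly 19.7 GB for weights and the 12B model roughly 12.75 GB, so both fit in 32 GB with different amounts of headroom for context. Other quant variants (for example the K-quants) use different bits per weight, so the numbers shift accordingly.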
🟡 Apple Foundation Models - score 50
Sources: hackernews
🟡 What's the lesson chat? - score 46
Sources: reddit/r/LocalLLaMA
🟡 ICML Poster [D] - score 44
Sources: reddit/r/MachineLearning
Does anyone know when the ICML poster deadline is? It says it's tomorrow but is it AoE?
🟢 Incremental
Model Releases
🟢 An agent that plans with a frontier model but runs most of its tokens locally (built it for my own dual-3090 rig) - score 18
Sources: reddit/r/LocalLLaMA
For the past couple of months, I've been building a tool for my personal use. I have a dual RTX 3090 system which I wanted to use but the qwen 3.5/3.6 27B and Gemma 4 31B while being really good, just didn't have the taste or the ability that a frontier model has. OTOH, frontier models are expensive
🟢 UI/svg block rendering by ServeurpersoCom · Pull Request #24080 · ggml-org/llama.cpp - score 4
Sources: reddit/r/LocalLLaMA
watch the video to see SVG fun
Developer Tools
🟢 I built an open-source Knowledge Graph pipeline with hybrid retrieval to improve LLM multi-hop reasoning [P] - score 36
Sources: reddit/r/MachineLearning
Hey everyone, I built an open-source full-stack pipeline (Django + React) that constructs a Knowledge Graph from raw text, detects thematic communities, and uses hybrid search to solve the "lost in the middle" problem in standard vector retrieval. The Pipeline: 1. Ingestion & Chunking: R
🟢 nrwl/nx - The Monorepo Platform that amplifies both developers and AI agents. Nx optimizes your builds, scales your CI, and fixes failed PRs automatically. Ship in half the time. - score 32
Sources: github_trending
🟢 I stopped connecting my Gmail to AI agents. Gave each agent its own email instead. - score 31
Sources: reddit/r/AIAgents
Was about to plug my Gmail into an AI agent so it could deal with some recurring email for me. Then I actually thought about what I was doing: handing it read access to my entire inbox - every personal thread, every password reset, every "your statement is ready" - just so it could handle maybe thre
🟢 Your AI agent just blamed the network team. Now what? - score 31
Sources: reddit/r/AIAgents
https://leaddev.com/ai/your-ai-agent-just-blamed-the-network-team-now-what
🟢 What can I build on multi agent collab? (That can solve real problems) - score 31
Sources: reddit/r/AIAgents
Hello guys, hope you all are doing well. I've been working on a side project lately and wanted to get some opinions and ideas on what to work on next. It's something around multiagent collab. So far, I was able to build custom agents using LangGraph. My agents have their own custom capabilities,
Omitted 3 additional developer tools items from the main section; see raw data and source-specific sections below.
Research Papers
🟢 WaveDiT: Distribution-Aware Wavelet Flow Matching for Efficient 3D Brain MRI Synthesis - score 20
Sources: huggingface
Large and demographically balanced datasets are essential for reliable neuroimaging biomarkers. Full-resolution 3D brain MRI synthesis can support data augmentation in this setting, but existing approaches either incur prohibitive computational cost at volumetric scale or rely on lossy latent compre
Other Signals
🟢 Voice-to-voice chatbot update - score 39
Sources: reddit/r/LocalLLaMA
I've been working on this after hours for a few months continuously improving it. Now at a point where the chatbot is close to real-time (thanks to SSE streaming) and also interruptible while preserving context of what was last said. 100% local and powered by Qwen3.5-397B (Unsloth's UD-Q3_K_XL), W
🟢 Nemotron - King of the Deep? Comparison of 4 models <=120B - score 32
Sources: reddit/r/LocalLLaMA
Comparison was done on Strix Halo 128 GB shared memory, Ubuntu 26.04, Lemonade Server, Vulkan backend. I often run larger models like gpt-oss 120B or qwen but their performance seems to degrade quickly once in deep waters... ah.. deep context. The most important quality to me is prompt processing - w
🟢 Why do frontier AI labs send so many people to conferences? [D] - score 19
Sources: reddit/r/MachineLearning
In recent years I've seen plenty of folks from OpenAI and Anthropic attending conferences like ICML/NeurIPS, yet obviously few are presenting. Are they mainly recruiting? Following emerging research? Curious if anyone with firsthand experience can shed some light on how attendance is justified internally a
🟢 Worth going to ICML during ACL? [D] - score 19
Sources: reddit/r/MachineLearning
I have a main paper in ACL and a workshop paper in ICML. I'm looking for jobs in U.S. as a graduating student. Would it be worth going to ICML after ACL presentation such that I have more chance to network? ACL is in San Diego and ICML is in Korea, if it changes things.
🟢 Anthropic flies staff to D.C. to clean up White House fight - score 17
Sources: hackernews
Omitted 1 additional other signals item from the main section; see raw data and source-specific sections below.
Trending Repos
| Repo | Description | Stars Today | Language |
|---|---|---|---|
| OpenHands/OpenHands | OpenHands: AI-Driven Development | 230 | python |
| Ar9av/obsidian-wiki | Framework for AI agents to build and maintain a digital brain through Obsidian wiki using Karpathy's LLM Wiki pattern | 135 | python |
| tensorzero/tensorzero | TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation. | 96 | rust |
| openinterpreter/openinterpreter | A lightweight coding agent for open models like Deepseek, Kimi, and Qwen | 31 | python |
| AUTOMATIC1111/stable-diffusion-webui | Stable Diffusion web UI | 30 | python |
| nrwl/nx | The Monorepo Platform that amplifies both developers and AI agents. Nx optimizes your builds, scales your CI, and fixes failed PRs automatically. Ship in half the time. | 15 | typescript |
New Papers
| Title | Category | Hotness | Link |
|---|---|---|---|
| RhymeFlow: Training-Free Acceleration for Video Generation with Asynchronous Denoising Flow Scheduling | research_paper | 6 | Open |
| A Deep Reinforcement Learning (DRL)-Based Transformer Method for Solving the Open Shop Scheduling Problem | cs.AI | 0 | Open |
| UP-NRPA: User Portrait based Nested Rollout Policy Adaptation for Planning with Large Language Models in Goal-oriented Dialogue Systems | cs.AI | 0 | Open |
| History of the Muddy Children Puzzle | cs.AI | 0 | Open |
| Orchestra-o1: Omnimodal Agent Orchestration | cs.AI | 0 | Open |
| Hybrid Open-Ended Tri-Evolution Makes Better Deep Researcher | cs.AI | 0 | Open |
| WorkBench Revisited: Workplace Agents Two Years On | cs.AI | 0 | Open |
| Refusal Beyond a Single Direction: A Preliminary Comparison of Diff-in-Means and INLP | cs.AI | 0 | Open |
| YeasierAgent: Agentic Social Sandbox as a Canvas for Intent-Driven Creation of Platform-Agnostic Symbiotic Agent-Native Applications | cs.AI | 0 | Open |
| TwinBI: An Agentic Digital Twin for Efficient Augmented Interactions with Business Intelligence Dashboards | cs.AI | 0 | Open |
| When Sample Selection Bias Precipitates Model Collapse | cs.AI | 0 | Open |
| AI Receptivity or AI Adoption Breadth? A Tool-Specific Reanalysis of the Lower-Literacy/Higher-Usage Link | cs.AI | 0 | Open |
| MA-ProofBench: A Two-Tiered Evaluation of LLMs for Theorem Proving in Mathematical Analysis | cs.AI | 0 | Open |
| Poker Arena: Multi-Axis Profiling of Strategic Reasoning and Memory in LLMs | cs.AI | 0 | Open |
| Hyperdimensional computing for structured querying on tabular data embeddings | cs.AI | 0 | Open |
Lab Blog Posts
Repeated From Recent Briefings
- NVIDIA/SkillSpector - Security scanner for AI agent skills. Detect vulnerabilities, malicious patterns, and security risks. - first seen 2026-06-10
- APPO: Agentic Procedural Policy Optimization - first seen 2026-06-11
- andrewyng/aisuite - Simple, unified interface to multiple Generative AI providers - first seen 2026-06-14
- LMCache/LMCache - LMCache: Supercharge Your LLM with the Fastest KV Cache Layer - first seen 2026-06-13
- shiyu-coder/Kronos - Kronos: A Foundation Model for the Language of Financial Markets - first seen 2026-05-07
- Rethinking RAG in Long Videos: What to Retrieve and How to Use It? - first seen 2026-06-12
- tinyhumansai/openhuman - Your Personal AI super intelligence. Private, Simple and extremely powerful. - first seen 2026-05-11
- FareedKhan-dev/train-llm-from-scratch - A straightforward method for training your LLM, from downloading data to generating text. - first seen 2026-06-11
- Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO - first seen 2026-06-01
- qdrant/qdrant - Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud: https://cloud.qdrant.io/ - first seen 2026-05-07
- ... plus 86 more repeated items in processed data