AW · AI Watchtower

🔴 High Significance

Model Releases

🔴 Introducing the Heretic Grimoire: The takedown-resilient, local-first backup system that keeps uncensored models available forever — score 96 Sources: reddit/r/LocalLLaMA

Welcome to another episode of THE HERETIC SHOW, where authoritarian dreams are destroyed by unreasonably effective linear algebra! Let's start with an important announcement: Heretic now has an official website at https://heretic-project.org This website contains: links to all official resource

🔴 Xiaomi is now serving MiMo V2.5 at 1000-3000 tps using DFlash & Persistent kernel. DFlash model is out, open-source release promised soon — score 75 Sources: reddit/r/LocalLLaMA

https://mimo.xiaomi.com/blog/mimo-tilert-1000tps

Developer Tools

🔴 OpenHands/OpenHands — 🙌 OpenHands: AI-Driven Development — score 87 Sources: github_trending

🔴 Looking for actual builders: n8n, LangChain & Multi-Agent systems — score 81 Sources: reddit/r/AIAgents

Hey everyone. I’m currently putting together a dedicated technical team focused entirely on heavy AI automation and agentic infrastructure. We are building out complex multi-agent systems, and I'm looking for people who actually know what they're doing under the hood. If you’re the kind of engineer

🔴 Ar9av/obsidian-wiki — Framework for AI agents to build and maintain a digital brain through Obsidian wiki using Karpathy's LLM Wiki pattern — score 78 Sources: github_trending

🔴 State sharing between agents is harder than it looks — score 74 Sources: reddit/r/AIAgents

We built a multi-agent demo last month with three agents: one plans architecture, one writes code, and one reviews tests. The theory was clean division of labor. The reality was a mess of context loss as each agent started its own session and lost the accumulated reasoning. Agent A decided to use Pr

🔴 tensorzero/tensorzero — TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation. — score 73 Sources: github_trending

Other Signals

🔴 How does the ML community view evolutionary algorithm research? Career implications of an EA PhD? [D] — score 94 Sources: reddit/r/MachineLearning

How does the ML research community feel about evolutionary algorithms? Should I do a PhD in this area? Quick remark: I know some people in the ML community dunk on evolutionary algorithms because there’s often a better optimizer, but they do have their place, which is what researchers in my communit

🔴 z.ai Poll on X: MIT-licensed open weights are losing — score 89 Sources: reddit/r/LocalLLaMA

You can cast your vote here: https://x.com/ZixuanLi_/status/2065646648777416770#m Just to be clear: I am not urging or brigading anyone to vote specifically for MIT-licensed open weights. Please choose the option you genuinely prefer. I previo

🔴 Rio de Janeiro's "homegrown" LLM appears to be a merge of an existing model — score 83 Sources: hackernews

🔴 Nex claims Rio 3.5 is Nex 2.5 PRO in trench coat — score 82 Sources: reddit/r/LocalLLaMA

🔴 Quant firms at ICML 2026 [D] — score 81 Sources: reddit/r/MachineLearning

I noted that in ICML 2026, quant firms are flocking and sponsoring as Diamond sponsors. Any reason? Source: https://icml.cc/sponsors/sponsors-list?year=2026at

🟡 Notable

Model Releases

🟡 Claude Code + web search API provides excellent leads based on signals from Reddit, LinkedIn, G2, and similar platforms — score 69 Sources: reddit/r/AIAgents

Claude Code's built-in web search is great for docs and code lookups, but it starts struggling when you need Reddit discussions, GitHub signals, StackOverflow threads, G2 reviews, and broader web intelligence. Perplexity solves the web access problem. Claude Code solves the agent problem. So we conn

🟡 EAGLE support merged into llama.cpp — score 68 Sources: reddit/r/LocalLLaMA

🟡 Command A Plus GGUFs posted — score 57 Sources: reddit/r/LocalLLaMA

Support for Command A Plus and North Mini Code was added to llama.cpp this weekend. Unsloth has North Mini Code GGUFs, but I didn’t find anyone with up to date GGUFs for Command A Plus, so I converted and quantized it!

🟡 Introducing the OpenAI Partner Network — score 50 Sources: lab_blog/OpenAI

OpenAI launches the Partner Network, investing $150M to help global partners accelerate enterprise AI adoption, deployment, and transformation.

Developer Tools

🟡 openinterpreter/openinterpreter — A lightweight coding agent for open models like Deepseek, Kimi, and Qwen — score 54 Sources: github_trending

🟡 AUTOMATIC1111/stable-diffusion-webui — Stable Diffusion web UI — score 51 Sources: github_trending

Research Papers

🟡 RhymeFlow: Training-Free Acceleration for Video Generation with Asynchronous Denoising Flow Scheduling — score 55 Sources: huggingface

Video generation models based on Diffusion Transformers (DiTs) have achieved remarkable performance in video synthesis, yet they suffer from high inference latency and computational costs due to the quadratic complexity of 3D attention. Existing acceleration methods primarily reduce computational co

🟡 P3D-Bench: Benchmarking MLLMs for Parametric 3D Generation and Structural Reasoning — score 45 Sources: huggingface

Multimodal large language models can write code to produce complex programs as well as use programs to do 3D modeling, which opens up a new avenue for 3D generation powered by their priors, world knowledge and reasoning. Yet existing benchmarks rarely evaluate 3D modeling through code. Such modeling

🟡 AdaSR: Adaptive Streaming Reasoning with Hierarchical Relative Policy Optimization — score 45 Sources: huggingface · arxiv/cs.CL

Large reasoning models typically follow a read-then-think paradigm: they observe the complete input, reason over a static context, and then produce the answer. Yet many real-world scenarios are inherently dynamic, such as audio and video streams, where information arrives as a continuous stream and m

Other Signals

🟡 Qwen 3.6 35B-A3B @ Q4 or Gemma 4 12B @ Q8? — score 57 Sources: reddit/r/LocalLLaMA

Wondering how much model quantization matters here. Daily driver on my 32 GB unified memory setup is the Qwen model, outputting ~15 tokens a second. Heard good things about the 12B Gemma 4 model, so I'm interested in trying it against my codebase. Given its size I can very comfortably fit the Q8 in. Hell,

🟡 Apple Foundation Models — score 50 Sources: hackernews

🟡 What's the lesson chat? — score 46 Sources: reddit/r/LocalLLaMA

🟡 ICML Poster [D] — score 44 Sources: reddit/r/MachineLearning

Does anyone know when the ICML poster deadline is? It says it's tomorrow, but is that AoE?

🟢 Incremental

Model Releases

🟢 An agent that plans with a frontier model but runs most of its tokens locally (built it for my own dual-3090 rig) — score 18 Sources: reddit/r/LocalLLaMA

For the past couple of months, I've been building a tool for my personal use. I have a dual RTX 3090 system which I wanted to use but the qwen 3.5/3.6 27B and Gemma 4 31B while being really good, just didn't have the taste or the ability that a frontier model has. OTOH, frontier models are expensive

🟢 UI/svg block rendering by ServeurpersoCom · Pull Request #24080 · ggml-org/llama.cpp — score 4 Sources: reddit/r/LocalLLaMA

watch the video to see SVG fun

Developer Tools

🟢 I built an open-source Knowledge Graph pipeline with hybrid retrieval to improve LLM multi-hop reasoning [P] — score 36 Sources: reddit/r/MachineLearning

Hey everyone, I built an open-source full-stack pipeline (Django + React) that constructs a Knowledge Graph from raw text, detects thematic communities, and uses hybrid search to solve the "lost in the middle" problem in standard vector retrieval. The Pipeline: 1. Ingestion & Chunking: R

🟢 nrwl/nx — The Monorepo Platform that amplifies both developers and AI agents. Nx optimizes your builds, scales your CI, and fixes failed PRs automatically. Ship in half the time. — score 32 Sources: github_trending

🟢 I stopped connecting my Gmail to AI agents. Gave each agent its own email instead. — score 31 Sources: reddit/r/AIAgents

Was about to plug my Gmail into an AI agent so it could deal with some recurring email for me. Then I actually thought about what I was doing: handing it read access to my entire inbox - every personal thread, every password reset, every "your statement is ready" - just so it could handle maybe thre

🟢 Your AI agent just blamed the network team. Now what? — score 31 Sources: reddit/r/AIAgents

https://leaddev.com/ai/your-ai-agent-just-blamed-the-network-team-now-what

🟢 What can I build on multi agent collab? (That can solve real problems) — score 31 Sources: reddit/r/AIAgents

Hello guys, hope you all are doing well. I’ve been working on a side project lately and wanted to get some opinions and ideas on what to work on next. It’s something around multiagent collab. So far, I was able to build custom agents using LangGraph. My agents have their own custom capabilities,

Omitted 3 additional developer tools items from the main section; see raw data and source-specific sections below.

Research Papers

🟢 WaveDiT: Distribution-Aware Wavelet Flow Matching for Efficient 3D Brain MRI Synthesis — score 20 Sources: huggingface

Large and demographically balanced datasets are essential for reliable neuroimaging biomarkers. Full-resolution 3D brain MRI synthesis can support data augmentation in this setting, but existing approaches either incur prohibitive computational cost at volumetric scale or rely on lossy latent compre

Other Signals

🟢 Voice-to-voice chatbot update — score 39 Sources: reddit/r/LocalLLaMA

I've been working on this after hours for a few months continuously improving it. Now at a point where the chatbot is close to real-time (thanks to SSE streaming) and also interruptible while preserving context of what was last said. 100% local and powered by Qwen3.5-397B (Unsloth's UD-Q3_K_XL), W

🟢 Nemotron - King of the Deep? Comparison of 4 models <=120B — score 32 Sources: reddit/r/LocalLLaMA

Comparison was done on Strix Halo 128 GB shared memory, Ubuntu 26.04, Lemonade Server, Vulkan backend. I often run larger models like gpt-oss 120B or qwen but their performance seems to degrade quickly once in deep waters... ah.. deep context. The most important quality to me is prompt processing - w

🟢 Why do frontier AI labs send so many people to conferences? [D] — score 19 Sources: reddit/r/MachineLearning

In recent years I've seen plenty of folks from OpenAI and Anthropic attending conferences like ICML/NeurIPS, yet obviously few are presenting. Are they mainly recruiting? Following emerging research? Curious if anyone with firsthand experience can shed some light on how attendance is justified internally a

🟢 Worth going to ICML during ACL? [D] — score 19 Sources: reddit/r/MachineLearning

I have a main paper at ACL and a workshop paper at ICML. I'm a graduating student looking for jobs in the U.S. Would it be worth going to ICML after my ACL presentation so that I have more chances to network? ACL is in San Diego and ICML is in Korea, if that changes things.

🟢 Anthropic flies staff to D.C. to clean up White House fight — score 17 Sources: hackernews

Omitted 1 additional other signals item from the main section; see raw data and source-specific sections below.

📈 Trending Repos

Repo	Description	Stars Today	Language
OpenHands/OpenHands	🙌 OpenHands: AI-Driven Development	230	python
Ar9av/obsidian-wiki	Framework for AI agents to build and maintain a digital brain through Obsidian wiki using Karpathy's LLM Wiki pattern	135	python
tensorzero/tensorzero	TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.	96	rust
openinterpreter/openinterpreter	A lightweight coding agent for open models like Deepseek, Kimi, and Qwen	31	python
AUTOMATIC1111/stable-diffusion-webui	Stable Diffusion web UI	30	python
nrwl/nx	The Monorepo Platform that amplifies both developers and AI agents. Nx optimizes your builds, scales your CI, and fixes failed PRs automatically. Ship in half the time.	15	typescript

📄 New Papers

Title	Category	Hotness
RhymeFlow: Training-Free Acceleration for Video Generation with Asynchronous Denoising Flow Scheduling	research_paper	6
A Deep Reinforcement Learning (DRL)-Based Transformer Method for Solving the Open Shop Scheduling Problem	cs.AI	0
UP-NRPA: User Portrait based Nested Rollout Policy Adaptation for Planning with Large Language Models in Goal-oriented Dialogue Systems	cs.AI	0
History of the Muddy Children Puzzle	cs.AI	0
Orchestra-o1: Omnimodal Agent Orchestration	cs.AI	0
Hybrid Open-Ended Tri-Evolution Makes Better Deep Researcher	cs.AI	0
WorkBench Revisited: Workplace Agents Two Years On	cs.AI	0
Refusal Beyond a Single Direction: A Preliminary Comparison of Diff-in-Means and INLP	cs.AI	0
YeasierAgent: Agentic Social Sandbox as a Canvas for Intent-Driven Creation of Platform-Agnostic Symbiotic Agent-Native Applications	cs.AI	0
TwinBI: An Agentic Digital Twin for Efficient Augmented Interactions with Business Intelligence Dashboards	cs.AI	0
When Sample Selection Bias Precipitates Model Collapse	cs.AI	0
AI Receptivity or AI Adoption Breadth? A Tool-Specific Reanalysis of the Lower-Literacy/Higher-Usage Link	cs.AI	0
MA-ProofBench: A Two-Tiered Evaluation of LLMs for Theorem Proving in Mathematical Analysis	cs.AI	0
Poker Arena: Multi-Axis Profiling of Strategic Reasoning and Memory in LLMs	cs.AI	0
Hyperdimensional computing for structured querying on tabular data embeddings	cs.AI	0

🏢 Lab Blog Posts

OpenAI: Introducing the OpenAI Partner Network

Repeated From Recent Briefings

NVIDIA/SkillSpector — Security scanner for AI agent skills. Detect vulnerabilities, malicious patterns, and security risks. - first seen 2026-06-10
APPO: Agentic Procedural Policy Optimization - first seen 2026-06-11
andrewyng/aisuite — Simple, unified interface to multiple Generative AI providers - first seen 2026-06-14
LMCache/LMCache — LMCache: Supercharge Your LLM with the Fastest KV Cache Layer - first seen 2026-06-13
shiyu-coder/Kronos — Kronos: A Foundation Model for the Language of Financial Markets - first seen 2026-05-07
Rethinking RAG in Long Videos: What to Retrieve and How to Use It? - first seen 2026-06-12
tinyhumansai/openhuman — Your Personal AI super intelligence. Private, Simple and extremely powerful. - first seen 2026-05-11
FareedKhan-dev/train-llm-from-scratch — A straightforward method for training your LLM, from downloading data to generating text. - first seen 2026-06-11
Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO - first seen 2026-06-01
qdrant/qdrant — Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud: https://cloud.qdrant.io/ - first seen 2026-05-07
... plus 86 more repeated items in processed data

AI Watchtower Briefing — 2026-06-15
