AW · AI Watchtower

🔴 High Significance

Model Releases

🔴 MTP PR Merged!!! — score 97 Sources: reddit/r/LocalLLaMA

Llamas, LFG!!! 🎉🎉🎉

🔴 That's a good news... — score 90 Sources: reddit/r/LocalLLaMA

Looks like it's finally happening... MTP is getting approved for llama.cpp. Time to prepare for the update.

🔴 AI Agents are pushing us towards Transactional Development — Spin Up, Build, Ship, Discard. — score 89 Sources: reddit/r/AIAgents

AI coding agents changed the relationship between developers and their machines. Tools like Claude Code and Cursor don't just suggest code anymore - they run commands, install packages, edit files, and operate your terminal. Your machine is no longer just where you write code. It's part of the execu

🔴 Anthropic and OpenAI claim that their models are so powerful that they can “break” their box… but what is so special about their agent implementation? — score 89 Sources: reddit/r/AIAgents

Is it not just basic ReAct loops with tools? I am wondering what the gap is between my little local Ollama model implementation and their implementation.

🔴 MTP support merged into llama.cpp — score 83 Sources: reddit/r/LocalLLaMA

PR 22673 has been merged into master! 🎉

Omitted 1 additional Model Releases item from the main section; see raw data and source-specific sections below.

Developer Tools

🔴 KeygraphHQ/shannon — Shannon Lite is an autonomous, white-box AI pentester for web applications and APIs. It analyzes your source code, identifies attack vectors, and executes real exploits to prove vulnerabilities before they reach production. — score 81 Sources: github_trending

🔴 HKUDS/CLI-Anything — "CLI-Anything: Making ALL Software Agent-Native" -- CLI-Hub: https://clianything.cc/ — score 79 Sources: github_trending

🔴 dograh-hq/dograh — Open Source Voice Agent Platform — score 77 Sources: github_trending

🔴 Zerostack – A Unix-inspired coding agent written in pure Rust — score 70 Sources: hackernews

Other Signals

🔴 Strix Halo Llama.cpp MTP Benchmarks: 27B Gets Much Faster, 35B Is Mixed — score 70 Sources: reddit/r/LocalLLaMA

TL;DR All models were Qwen3.6

27B-MTP vs Base 27B (15k single-turn): Faster overall
* Total Time (wall): 87.44s → 77.39s (10.05s faster / -11.50%)
* Generation: 7.63 → 16.15 t/s (+111.77% speedup)
* Prompt Processing: 279.75 → 244.90 t/s (-12.46% slowdown)

**35B-MTP vs Ba
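The percentages quoted in this excerpt can be rechecked from the rounded before/after values it gives. The following is a minimal Python sketch; `pct_change` and the `metrics` dictionary keys are names introduced here for illustration and do not come from the original post or any benchmark tool. Because the inputs are the rounded figures shown in the post, the recomputed values may differ slightly in the last digits from the posted ones, which were presumably derived from unrounded measurements.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


# Rounded figures quoted in the post (27B-MTP vs base 27B, 15k single-turn).
# Keys are hypothetical labels; values are (base, MTP).
metrics = {
    "total_time_s": (87.44, 77.39),             # lower is better
    "generation_tps": (7.63, 16.15),            # higher is better
    "prompt_processing_tps": (279.75, 244.90),  # higher is better
}

changes = {name: pct_change(base, mtp) for name, (base, mtp) in metrics.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.2f}%")
```

Note that the sign reads differently per metric: a negative change in wall time is an improvement, while a negative change in prompt-processing throughput is a regression.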

🟡 Notable

Model Releases

🟡 some things i learned the hard way using claude design — score 56 Sources: reddit/r/AIAgents

been using claude design for a few weeks now and figured i'd dump some notes here before i forget. nothing groundbreaking, just stuff that took me way too long to figure out on my own. first thing nobody tells you: do the design system setup BEFORE you build anything. i spent my first session prompt

🟡 G4-Meromero-31B-Uncensored-Heretic Is Out Now, a Finetune of Gemma 4 31B IT Designed for Creative Tasks, With KLD of 0.0100 and 15/100 Refusals! — score 50 Sources: reddit/r/LocalLLaMA

Provided in both Safetensors and GGUFs. Safetensors: llmfan46/G4-MeroMero-31B-uncensored-heretic: https://huggingface.co/llmfan46/G4-MeroMero-31B-uncensored-heretic GGUFs: llmfan46/G4-MeroMero-31B-uncensored-heretic-GGUF: [https:/

🟡 @xai: You can now use X Premium subscriptions in Hermes Agent, and Hermes Agent can now search X posts. https://x.ai/news/grok-hermes — score 50 Sources: twitter_rss

Developer Tools

🟡 luongnv89/claude-howto — A visual, example-driven guide to Claude Code — from basic concepts to advanced agents, with copy-paste templates that bring immediate value. — score 64 Sources: github_trending

🟡 n8n-io/n8n — Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations. — score 61 Sources: github_trending

Other Signals

🟡 Program misleading high school students into paying to perform academic misconduct in ML Research [D] — score 69 Sources: reddit/r/MachineLearning

I was browsing OpenReview and I came across this person called Kevin Zhu https://openreview.net/profile?id=~Kevin_Zhu3, let's say I was impressed when I saw 158 publications and 468 coauthors, and out of curiosity I searched up his affiliation ([htt

🟡 Corsair desktop PC with Ryzen 395 and 128GB of unified RAM, has anyone tested it for LLM? Seems "a good" price — score 63 Sources: reddit/r/LocalLLaMA

🟡 Testing llama.cpp MTP support on Qwen3.6 - RTX 5090 — score 57 Sources: reddit/r/LocalLLaMA

Setup:
- RTX 5090, 32 GB, Linux
- Built llama.cpp from 4f13cb7 (the official ghcr.io/ggml-org/llama.cpp:server-cuda image hasn't picked up the merge yet as of writing; had to docker build from source with CUDA_DOCKER_ARCH=120)
- Unsloth's Qwen3.

🟡 Do you agree with Judea that learning from data is not everything? [D] — score 56 Sources: reddit/r/MachineLearning

Link: Judea Pearl, 2011 ACM Turing Award Recipient (2:18:05) Quote: >There is a limitation to that which people not everybody understand. I already mentioned a limitation that you have a hierarchy here and going from correlation to ca

🟡 Google literally dropped the new SEO playbook for AI — score 56 Sources: reddit/r/AIAgents

so google just published a long piece on how to optimize your site for their generative AI features (AI overviews, AI mode, all of it) this is basically the new SEO playbook straight from the source they break down how the AI search stuff actually works... what kind of content gets pulled into the A

Omitted 1 additional Other Signals item from the main section; see raw data and source-specific sections below.

🟢 Incremental

Model Releases

🟢 webui: support video files as input by foldl · Pull Request #22830 · ggml-org/llama.cpp — score 10 Sources: reddit/r/LocalLLaMA

now you can talk about videos

🟢 OpenAI and Government of Malta partner to roll out ChatGPT Plus to all citizens — score 10 Sources: hackernews

🟢 Looking to migrate off of Ollama and LMStudio — score 3 Sources: reddit/r/LocalLLaMA

Hello, I'm currently using Ollama / LM Studio for things like code inference and proofreading emails, etc. Definitely not experienced in this space but looking to grow. It's been working great but it's a bit slow at times. I use Gemma 4 / Qwen, I also recently tried using OpenbioLLM 70B for some he

Developer Tools

🟢 octos-org/octos — Octos - Agentic Operating Systems — score 34 Sources: github_trending

🟢 Jackrong/Qwopus3.5-9B-Coder-GGUF · Hugging Face — score 33 Sources: reddit/r/LocalLLaMA

Qwopus3.5-9B-coder is specially optimized and fine-tuned for high-performance 🤖 Agentic Coding, complex Tool Calling, and logical reasoning. >💡 Why the 9B Dense Model? We believe that the 9B dense architecture represents the perfect "sweet spot" *for large language mod

🟢 tech-leads-club/agent-skills — The secure, validated skill registry for professional AI coding agents. Extend Antigravity, Claude Code, Cursor, Copilot and more with absolute confidence. — score 32 Sources: github_trending

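🟢 chaitin/MonkeyCode — An AI development platform with a built-in cloud development environment and support for the industry's most complete lineup of top large models. Whether you are building a project, doing research, writing documentation, analyzing data, or handling tasks, just open a browser to start at any time and let AI keep moving your work forward. — score 30 Sources: github_trending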

🟢 What if local AI agents remembered the system, not just the conversation? — score 28 Sources: reddit/r/AIAgents

Title: Since OverCR v1, I Turned It Into a Full AI Orchestration Substrate A little over a week ago I posted OverCR v1. At the time it was mostly the foundation: persistent filesystem-native state, governed orchestration concepts, workflow routing, recovery primitives, and the beginnings of a long-l

Omitted 6 additional developer tools items from the main section; see raw data and source-specific sections below.

Infrastructure & Compute

🟢 NVIDIA/warp — A Python framework for GPU-accelerated simulation, robotics, and machine learning. — score 10 Sources: github_trending

Business & Funding

🟢 Command code - anyone using it? — score 6 Sources: reddit/r/AIAgents

hey hey! I just saw command code has a $1 plan, especially to be used with DeepSeek V4 Pro. Did you guys already check it? Sounds too good to be true lol https://commandcode.ai/

Enterprise Adoption

🟢 Deepseek V4's 1M context window: the breaking point — score 23 Sources: reddit/r/LocalLLaMA

Just ran a test to verify DeepSeek V4's 1M context claim across three production codebases: 45k (microservice), 180k (monorepo backend), and 520k (full-stack app). Tasks included dependency tracing, cross-file refactors, and bug isolation to see where recall keeps up **

Other Signals

🟢 Llama.cpp MTP with Qwen3.6 27B on Headless RTX 3090 — score 33 Sources: reddit/r/LocalLLaMA

Saw some posts around PP being slower, so they were cautious on trying it. Here's a real-world datapoint.

Settings:
* Headless RTX 3090 24G
* OpenCode
* Model unsloth's Qwen3.6-27B-MTP-Q4_K_M.gguf
* 128k context
* q8_0 kv cache
* --spec-draft-n-max: 3
* --draft-p-min: 0

Use Cases:
* Res
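As a rough illustration, the settings listed in this excerpt could be assembled into a server command line as in the Python sketch below. Several things here are assumptions rather than details from the post: that the binary is `llama-server`, that "128k context" means 131072 tokens, that the q8_0 KV cache is set through `--cache-type-k` and `--cache-type-v`, and the helper name `build_server_command`. The two speculative-decoding flags are copied verbatim from the post and have not been checked against any particular llama.cpp build.

```python
import shlex


def build_server_command(model_path: str, ctx_size: int = 131072) -> list[str]:
    """Assemble an argv list mirroring the settings quoted in the post.

    Hypothetical helper: the binary name and the context/KV-cache flags
    are assumptions; the last two flags are taken verbatim from the post.
    """
    return [
        "llama-server",              # assumed binary name
        "-m", model_path,            # GGUF model file
        "-c", str(ctx_size),         # assumed: 128k context = 131072 tokens
        "--cache-type-k", "q8_0",    # assumed mapping of "q8_0 kv cache"
        "--cache-type-v", "q8_0",
        "--spec-draft-n-max", "3",   # as written in the post
        "--draft-p-min", "0",        # as written in the post
    ]


cmd = build_server_command("Qwen3.6-27B-MTP-Q4_K_M.gguf")
print(shlex.join(cmd))
```

Building the command as a list and only joining it for display keeps the arguments safe to hand to `subprocess.run` without shell quoting issues.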

🟢 Ran the same models across Strix Halo, RTX 3090, and RTX 5070 because I wanted my own numbers — score 23 Sources: reddit/r/LocalLLaMA

I kept seeing inference-speed claims for these models and wanting an apples-to-apples comparison on the hardware I actually have. So I built a harness and a public page that dumps every run as YAML. The dataset: 55 runs, three rigs, five backends (rocm, vulkan, cpu, cuda, vllm-cuda), models from 0.3

🟢 Help with CNNs.[D] — score 6 Sources: reddit/r/MachineLearning

So, I’ve learned CNNs theoretically, but now I want to see how they behave in practice, specifically on images: where they work well, where they fail, how to improve their performance, etc. Please suggest some resources or projects through which I can explore this practically.

📈 Trending Repos

Repo	Description	Stars Today	Language
KeygraphHQ/shannon	Shannon Lite is an autonomous, white-box AI pentester for web applications and APIs. It analyzes your source code, identifies attack vectors, and executes real exploits to prove vulnerabilities before they reach production.	335	typescript
HKUDS/CLI-Anything	"CLI-Anything: Making ALL Software Agent-Native" -- CLI-Hub: https://clianything.cc/	333	python
dograh-hq/dograh	Open Source Voice Agent Platform	287	python
luongnv89/claude-howto	A visual, example-driven guide to Claude Code — from basic concepts to advanced agents, with copy-paste templates that bring immediate value.	165	python
n8n-io/n8n	Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.	156	typescript
octos-org/octos	Octos - Agentic Operating Systems	55	rust
tech-leads-club/agent-skills	The secure, validated skill registry for professional AI coding agents. Extend Antigravity, Claude Code, Cursor, Copilot and more with absolute confidence.	44	typescript
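chaitin/MonkeyCode	An AI development platform with a built-in cloud development environment and support for the industry's most complete lineup of top large models. Whether you are building a project, doing research, writing documentation, analyzing data, or handling tasks, just open a browser to start at any time and let AI keep moving your work forward.	38	typescript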
ilysenko/codex-desktop-linux	Run OpenAI Codex Desktop on Linux - automated installer	37	rust
getsentry/XcodeBuildMCP	A Model Context Protocol (MCP) server and CLI that provides tools for agent use when working on iOS and macOS projects.	35	typescript

🐦 Twitter/X Highlights

Account	Tweet Summary
xai	You can now use X Premium subscriptions in Hermes Agent, and Hermes Agent can now search X posts. https://x.ai/news/grok-hermes

Repeated From Recent Briefings

tinyhumansai/openhuman — Your Personal AI super intelligence. Private, Simple and extremely powerful. - first seen 2026-05-11
Long Context Pre-Training with Lighthouse Attention - first seen 2026-05-08
arXiv implements 1-year ban for papers containing incontrovertible evidence of unchecked LLM-generated errors, such as hallucinated references or results. [N] - first seen 2026-05-15
anthropics/skills — Public repository for Agent Skills - first seen 2026-05-11
K-Dense-AI/scientific-agent-skills — A set of ready to use Agent Skills for research, science, engineering, analysis, finance and writing. - first seen 2026-05-14
I believe there are entire companies right now under AI psychosis - first seen 2026-05-16
anomalyco/opencode — The open source coding agent. - first seen 2026-05-09
colbymchenry/codegraph — Pre-indexed code knowledge graph for Claude Code — fewer tokens, fewer tool calls, 100% local - first seen 2026-05-09
rtk-ai/rtk — CLI proxy that reduces LLM token consumption by 60-90% on common dev commands. Single Rust binary, zero dependencies - first seen 2026-05-05
Learning to Communicate Locally for Large-Scale Multi-Agent Pathfinding - first seen 2026-05-11
... plus 27 more repeated items in processed data

AI Watchtower Briefing — 2026-05-17
