AI Watchtower

🔴 High Significance

Model Releases

🔴 PapersWithCode new features - week 1 [P] — score 94 Sources: reddit/r/MachineLearning

Hi, Niels here from the open-source team at Hugging Face. It's been one week since I launched paperswithcode.co, a revival of the website we all loved. It allows us to

🔴 Qwen3.6-35B-A3B vs Gemma4-26B-A4B — score 83 Sources: reddit/r/LocalLLaMA

Just wondering what people's experience has been with both of these models. I've had some nice results with Qwen, but Gemma4 runs so much faster here. I'm using a Radeon 9070 XT and always the latest llama.cpp.

🔴 1000 tps generation on Qwen3.6 27B with V100s — score 77 Sources: reddit/r/LocalLLaMA

I wanted to see what the absolute best case scenario for generation on this setup was and was not disappointed. 128 concurrent requests is so far removed from what I need, but it’s funny to see a big number. For single user (batch 1, not 128) the generation is around 80 t/s with 3000 t/s processing, no mt
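
As a rough reading of the numbers in this post, the sketch below works out what each request would see, assuming the 1000 tps headline is the aggregate across all 128 concurrent requests (the excerpt does not say so explicitly). The function name is illustrative only.

```python
def per_request_tps(aggregate_tps: float, concurrency: int) -> float:
    """Average generation speed per request when the total is shared evenly."""
    if concurrency <= 0:
        raise ValueError("concurrency must be positive")
    return aggregate_tps / concurrency


# Assumption: 1000 tokens/s is the total across 128 concurrent requests.
batched = per_request_tps(1000, 128)
single_user = 80  # tokens/s at batch 1, as quoted in the post

print(f"per request at 128 concurrent: {batched:.1f} t/s")
print(f"aggregate gain over batch 1: {1000 / single_user:.1f}x")
```

Under that assumption, batching trades per-request speed for total throughput, which is why the single-user figure is the more relevant one for interactive use.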

Developer Tools

🔴 I’m a solo dev building TigrimOSR, a Rust-native AI agent workspace for engineering and developer workflows. — score 83 Sources: reddit/r/AIAgents

Repo: https://github.com/Sompote/TigrimOSR The main problem I’m trying to solve is that agentic AI is still too random for serious engineering decisions. For design work, calculations, reports, code changes, or technical review, I don’t want agents just “vibing” through tasks. I want a more solid wo

🔴 DeepSeek reasonix, DeepSeek native coding agent with high caching and low cost — score 83 Sources: hackernews

🔴 anthropics/knowledge-work-plugins — Open source repository of plugins primarily intended for knowledge workers to use in Claude Cowork — score 82 Sources: github_trending

🔴 How do ML practitioners select hyperparameters, architectures, etc for self-supervised representation learning when the loss is non-monotonic? [D] — score 81 Sources: reddit/r/MachineLearning

Non-contrastive SSL methods like BYOL/JEPA/data2vec seem promising, but I have no idea what is being learned, or how well; it’s models all the way down. Maybe I’ve got supervised tasks for which I’d like to see transfer, and I can evaluate linear probe/KNN results during training, but that seems lik

🔴 presenton/presenton — Open-Source AI Presentation Generator and API (Gamma, Beautiful AI, Decktopus Alternative) — score 78 Sources: github_trending

Infrastructure & Compute

🔴 Is NVIDIA still the default best choice for local LLMs in 2026? — score 97 Sources: reddit/r/LocalLLaMA

Research Papers

🔴 SciAtlas: A Large-Scale Knowledge Graph for Automated Scientific Research — score 78 Sources: huggingface · arxiv/cs.AI

The exponential growth of global academic output has confronted researchers and AI agents with an unprecedented "information explosion," where fragmented and unstructured knowledge organization impedes deep interdisciplinary integration. Current academic retrieval tools predominantly rely on super

🔴 See What I Mean: Aligning Vision and Language Representations for Video Fine-grained Object Understanding — score 75 Sources: huggingface

We present SWIM (See What I Mean), a novel training strategy that aligns vision and language representations to enable fine-grained object understanding solely from textual prompts. Unlike existing approaches that require explicit visual prompts, such as masks or points, SWIM leverages mask supervis

Other Signals

🔴 server: fix checkpoints creation by jacekpoplawski · Pull Request #22929 · ggml-org/llama.cpp — score 70 Sources: reddit/r/LocalLLaMA

Imagine you are using a local model for agentic coding. You discuss the idea (50k tokens), then say “implement it”. The agent reads files, writes files, runs commands, produces another 20k tokens and the code is ready. Then your next prompt is just “thank you”, and... nothing happens, you have to wa
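
The delay described here comes down to how much of the conversation can be served from a cached prefix. The sketch below is not llama.cpp's implementation; it is a toy model, with hypothetical function names and made-up token IDs, that counts how many tokens must be reprocessed when a new prompt shares a prefix with a saved checkpoint.

```python
from typing import Sequence


def common_prefix_len(cached: Sequence[int], prompt: Sequence[int]) -> int:
    """Number of leading tokens the cached sequence and the new prompt share."""
    n = 0
    for a, b in zip(cached, prompt):
        if a != b:
            break
        n += 1
    return n


def tokens_to_reprocess(cached: Sequence[int], prompt: Sequence[int]) -> int:
    """Tokens that still need a forward pass after reusing the shared prefix."""
    return len(prompt) - common_prefix_len(cached, prompt)


# Toy IDs standing in for the 50k-token discussion plus 20k tokens of agent output.
history = list(range(70_000))
next_prompt = history + [900_001, 900_002]  # "thank you" as two made-up token IDs

print(tokens_to_reprocess(history, next_prompt))  # checkpoint covers the full history
print(tokens_to_reprocess([], next_prompt))       # no usable checkpoint
```

With a checkpoint covering the whole history, only the two new tokens need processing; without one, the full 70k-token context is recomputed before the reply can start.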

🟡 Notable

Model Releases

🟡 Chromeflow: agentic browser MCP that drives real Chrome with sessions intact (500+ hours hardening) — score 67 Sources: reddit/r/AIAgents

I've spent 500+ hours building Chromeflow, an MCP server + Chrome extension that lets agentic LLM systems drive the user's real Chrome browser instead of a fresh headless one. The motivating problem: agentic browser tools (Playwright, Browser Use, Puppeteer) launch a new browser process with empty c

🟡 hipEngine: Fast Native Qwen 3.6 Inference for RDNA3 (Strix Halo, 7900 XTX) — score 57 Sources: reddit/r/LocalLLaMA

A few weeks ago, after finishing FastDMS, I started toying around writing some RDNA3 kernels again to see how fast I could get Qwen 3.6 MoE running. It turned out well enough, so over the past cou

🟡 What frontend do you guys use? — score 50 Sources: reddit/r/LocalLLaMA

I’m using vim lmao with a custom made plugin for completing text, so I was curious what yall use. Llama-server seems like a sensible default but it seems limited

🟡 I built a mini business consultant using Claude and Lovable as a weekend project, — score 47 Sources: reddit/r/AIAgents

I've been in business for 22 years. COO and CEO across ecommerce, a training company, and real estate. I write about operations and systems, I've been on podcasts, I have a YouTube channel, I run a newsletter with 600+ founders on it. And for the past few weeks I've been spending my spare time talki

🟡 MiMo-V2.5-coder — score 40 Sources: reddit/r/LocalLLaMA

Hi, I've just released MiMo-V2.5-coder. If you have 128 GB, this is an excellent alternative to Qwen3.6 and DS4, especially for coding. Fast, and with reliable tool calling. Give it a try!

Developer Tools

🟡 BitCPM-CANN: Native 1.58-Bit Large Language Model Training on Ascend NPU — score 63 Sources: reddit/r/LocalLLaMA

Paper: https://github.com/OpenBMB/MiniCPM/blob/main/docs/BitCPM_CANN.pdf Abstract: We present BitCPM-CANN, a systematic family-level study of 1.58-bit (ternary) quantization-aware training (QAT) on the Huawei Ascend NPU platform. To address two practical gaps for extreme low-bit LLMs—whether
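
For readers unfamiliar with the term, "1.58-bit" refers to weights restricted to three values, since log2(3) ≈ 1.58. The sketch below illustrates ternary rounding with a per-tensor absmean scale, the scheme described for BitNet b1.58. Whether BitCPM-CANN uses this exact scheme is not stated in the excerpt, so treat that as an assumption; the function names are hypothetical.

```python
import math
from typing import List, Tuple


def ternary_quantize(weights: List[float], eps: float = 1e-8) -> Tuple[List[int], float]:
    """Map weights to {-1, 0, +1} using a per-tensor absmean scale (assumed scheme)."""
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Approximate reconstruction of the original weights."""
    return [c * scale for c in codes]


weights = [0.9, -0.05, 0.4, -1.2, 0.0, 0.3]
codes, scale = ternary_quantize(weights)

print(codes)
print(f"bits per weight: {math.log2(3):.2f}")
```

Quantization-aware training would apply a rounding step like this in the forward pass while keeping full-precision weights for the gradient update; that training loop is outside the scope of this sketch.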

🟡 twentyhq/twenty — The open alternative to Salesforce, designed for AI. — score 58 Sources: github_trending

🟡 MergeNB: An intuitive merge conflict resolver built for Jupyter notebooks in VS Code [P] — score 56 Sources: reddit/r/MachineLearning

I used to work heavily with Jupyter Notebooks + git + VS Code in a collaborative research setting and found nbdime to be somewhat buggy/a hassle to work with in general. So, in typical side project fashion (relevant xkcd) I've been working on MergeNB quite a bit over the la

🟡 gitroomhq/postiz-app — 📨 The ultimate agentic social media scheduling tool 🤖 — score 50 Sources: github_trending

🟡 superset-sh/superset — Code Editor for the AI Agents Era - Run an army of Claude Code, Codex, etc. on your machine — score 42 Sources: github_trending

Omitted 1 additional developer tools item from the main section; see raw data and source-specific sections below.

Infrastructure & Compute

🟡 Memory has grown to nearly two-thirds of AI chip component costs — score 50 Sources: hackernews

🟡 If you use NVIDIA Isaac Sim for reinforcement learning, do you use Isaac Lab with it? Just want to get a sense of what the status quo is. [D] — score 44 Sources: reddit/r/MachineLearning

The reason for this query is that I am in the process of shifting to Isaac Sim / Isaac Lab since that is what seems to be in use nowadays. However, Isaac Lab is proving to be somewhat difficult to handle. While it handles the logging, and the creation of multi-actor systems for algorithms like PPO b

Research Papers

🟡 PhotoFlow: Agentic 3D Virtual Photography Missions — score 68 Sources: huggingface · arxiv/cs.AI

Virtual photography asks an agent to enter a prepared 3D scene with no preselected camera pose or reference image, infer a suitable shot from scene information and a language intent, choose executable camera parameters, and render the final photograph. Recent progress in vision-language models makes

🟡 RankE: End-to-End Post-Training for Discrete Text-to-Image Generation with Decoder Co-Evolution — score 55 Sources: huggingface

Discrete autoregressive (AR) text-to-image (T2I) models pair a VQ tokenizer with an AR policy, and current post-training pipelines optimize only the policy while keeping the VQ decoder frozen. Recent diffusion T2I work, exemplified by REPA-E, has shown that the VAE itself constitutes a key alignment

🟡 SCOPE: Simulating Cross-game Operations in Playable Environments for FPS World Models — score 45 Sources: huggingface

Interactive world models for first-person shooter (FPS) games must resolve high-frequency overlapping control signals at every frame without disrupting unaffected regions. Existing methods inject actions globally and train on single titles, failing under dense FPS inputs. We observe that FPS actions

🟡 Good Token Hunting: A Hitchhiker's Guide to Token Selection for Visual Geometry Transformers — score 42 Sources: huggingface · arxiv/cs.AI

Visual geometry transformers have become powerful architectures for multi-view 3D reconstruction, enabling joint prediction of multiple 3D attributes in a feed-forward manner. However, their computational cost grows quadratically with the input sequence length due to the global attention layers insi
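
To make the quadratic scaling concrete: global self-attention compares every token with every other token, so one score matrix has N² entries. The sketch below uses made-up numbers (view count, tokens per view, keep ratio) that are not from the paper, and it assumes token selection shrinks both the query and key sets, which may differ from the paper's method.

```python
def attention_pairs(num_tokens: int) -> int:
    """Entries in one global attention score matrix (one head, one layer)."""
    return num_tokens * num_tokens


def pairs_after_selection(num_tokens: int, keep_ratio: float) -> int:
    """Score-matrix entries if only a fraction of tokens enters global attention."""
    kept = int(num_tokens * keep_ratio)
    return attention_pairs(kept)


# Hypothetical workload: 32 views, 1024 tokens per view.
n = 32 * 1024
full = attention_pairs(n)
reduced = pairs_after_selection(n, keep_ratio=0.25)

print(full // reduced)  # keeping 1/4 of the tokens cuts the pair count 16x
```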

🟢 Incremental

Model Releases

🟢 I will not promote: After Two Months of RELAUNCHING I HAVE HIT $85 MRR — score 39 Sources: reddit/r/AIAgents

Let me tell you what worked for me, just some points that helped me grow. I have tried both Reddit and TikTok and wouldn't really recommend them, tbh. Reddit is good for posts like these, for example, but it is not the place for promoting. Many are just hateful and will just spam scam scary potential customer aw

🟢 I built an AI coding agent that integrates more than 22 AI providers into a single, native plug-and-play environment — score 0 Sources: reddit/r/AIAgents

Hey, if you’re done dealing with Anthropic billing issues and usage limits, and you want an easy plug-and-play AI agent that fits your needs with free credits, cheap models, and multiple AI providers all in one ecosystem, Tau is your solution; it supports Claude Code skills, agent plugins, and MCP s

Developer Tools

🟢 What are the best AI outbound calling agents in 2026? — score 39 Sources: reddit/r/AIAgents

I’m looking at AI outbound calling agents for lead follow-up, appointment reminders, and basic sales outreach. Has anyone tested the major options in real workflows? I’ve seen tools like LuMay Voice Agent, Bland, Retell, and also Vapi mentioned in a few places. What actually matters most for out

🟢 Aider-AI/aider — aider is AI pair programming in your terminal — score 37 Sources: github_trending

🟢 triggerdotdev/trigger.dev — Trigger.dev – build and deploy fully‑managed AI agents and workflows — score 27 Sources: github_trending

🟢 Open-source music recommendation / playlist, similar to Spotify radio / YT Music mix? — score 23 Sources: reddit/r/LocalLLaMA

Any recommendations for this? Initially, I was thinking that LLMs are probably not the right thing for this (assuming your source data is all listening metrics), HOWEVER, if you combine a) user listening data; AND b) user comments / text data / recs / reviews / forum posts / social media mentions etc an

🟢 katanemo/plano — Plano is an AI-native proxy and data plane for agentic apps — with built-in orchestration, safety, observability, and smart LLM routing so you stay focused on your agents core logic. — score 22 Sources: github_trending

Omitted 2 additional developer tools items from the main section; see raw data and source-specific sections below.

Research Papers

🟢 Geo-Align: Video Generation Alignment via Metric Geometry Reward — score 30 Sources: huggingface

Camera-controlled video generation has achieved remarkable progress in recent years. However, existing video-to-video re-rendering methods primarily rely on Supervised Fine-Tuning using synthetic datasets. At present, there is an extreme scarcity of synchronized, multi-view real-world video data. Co

Other Signals

🟢 qwen3.6-35b-a3b-mtp running on GTX 1060 6GB — score 30 Sources: reddit/r/LocalLLaMA

I have this 10-year-old Dell T5810 workstation with 32GB DDR3(?) memory, an E5-2698v3 (16 cores, 32 threads), and a GTX 1060 6GB that was used for mining back in the old days (paid itself back many times over). I managed to get the model running with LMStudio in Windows(!). My settings are: Model: un

🟢 Working on a cgo-free CUDA binding in Go for ML stuff Week 3 - open source [P] — score 19 Sources: reddit/r/MachineLearning

At our work we use CUDA in Rust since the company switched to it recently. Rust has pretty good Driver API bindings, but it made me wonder why the hell we can't have something decent in Go without cgo. I have mostly been building ML tools over the last month, and Go is my main language for pretty much everything. Pro

🟢 "AI solved one of math's greatest challenges, but it cannot add two numbers reliably?!" [D] — score 19 Sources: reddit/r/MachineLearning

Suppose your friend, a mathematician, woke up from a 5-year coma. How would you explain this to him? Do we even have an explanation other than "it is what it is"? EDIT: Lots of comments that are basically paraphrasing "it is what it is", or denying the discrepancy exists. Wrong subreddit, I guess.

🟢 Qwen 3.6 benchmarks on 2x RTX PRO 6000 — score 17 Sources: reddit/r/LocalLLaMA

Got a chance to play around with a 2x RTX PRO 6000 setup, so sharing some numbers for Qwen 3.6. All of these were run using the latest stable vLLM backend. This was for a personal project. Qwen 3.6 27B BF16 (original, without any quantization) ------ MTP - Off | 64 concurrency | 1600 tps generation MTP - 2 | 3

🟢 I made a local-first MCP tutorial repo with node-llama-cpp and a custom agent loop — score 3 Sources: reddit/r/LocalLLaMA

I just published a repo called MCP from Scratch that teaches the Model Context Protocol by building it step by step in plain Node.js. Most of the repo is about understanding MCP itself, but the later modules may be relevant here: I added a local-first setup using node-llama-cpp, GGUF models, MCP s

📈 Trending Repos

Repo	Description	Stars Today	Language
anthropics/knowledge-work-plugins	Open source repository of plugins primarily intended for knowledge workers to use in Claude Cowork	550	python
presenton/presenton	Open-Source AI Presentation Generator and API (Gamma, Beautiful AI, Decktopus Alternative)	522	typescript
twentyhq/twenty	The open alternative to Salesforce, designed for AI.	154	typescript
gitroomhq/postiz-app	📨 The ultimate agentic social media scheduling tool 🤖	96	typescript
superset-sh/superset	Code Editor for the AI Agents Era - Run an army of Claude Code, Codex, etc. on your machine	60	typescript
Aider-AI/aider	aider is AI pair programming in your terminal	33	python
triggerdotdev/trigger.dev	Trigger.dev – build and deploy fully‑managed AI agents and workflows	26	typescript
katanemo/plano	Plano is an AI-native proxy and data plane for agentic apps — with built-in orchestration, safety, observability, and smart LLM routing so you stay focused on your agents core logic.	19	rust

📄 New Papers

Title	Category	Hotness	Link
SciAtlas: A Large-Scale Knowledge Graph for Automated Scientific Research	research_paper	28	Open
See What I Mean: Aligning Vision and Language Representations for Video Fine-grained Object Understanding	research_paper	26	Open
PhotoFlow: Agentic 3D Virtual Photography Missions	research_paper	20	Open
RankE: End-to-End Post-Training for Discrete Text-to-Image Generation with Decoder Co-Evolution	research_paper	15	Open
BOHM: Zero-Cost Hierarchical Attribution for Compound AI Systems	cs.AI	0	Open
NeuroNL2LTL: A Neurosymbolic Framework for Natural Language Translation of Linear Temporal Logic	cs.AI	0	Open
RMA: an Agentic System for Research-Level Mathematical Problems	cs.AI	0	Open
Energy per Successful Goal: Goal-Level Energy Accounting for Agentic AI Systems	cs.AI	0	Open
ImProver 2: Iteratively Self-Improving LMs for Neurosymbolic Proof Optimization	cs.AI	0	Open
Mediative Fuzzy Logic: From Type-1 Foundations to Type-2, Type-3 and Quantum Extensions	cs.AI	0	Open
EVE-Agent: Evidence-Verifiable Self-Evolving Agents	cs.AI	0	Open
The Deterministic Horizon: Impossibility Results as Design Specifications for Trustworthy AI Systems	cs.AI	0	Open
PathCal: State-Aware Reflection-Marker Calibration for Efficient Reasoning	cs.AI	0	Open
Inductive Deductive Synthesis: Enabling AI to Generate Formally Verified Systems	cs.AI	0	Open
Redrawing the AI Map: A Theory of Accountability Boundaries in Agentic Ecosystems	cs.AI	0	Open

Repeated From Recent Briefings

Lum1104/Understand-Anything — Graphs that teach > graphs that impress. Turn any code into an interactive knowledge graph you can explore, search, and ask questions about. Works with Claude Code, Codex, Cursor, Copilot, Gemini CLI, and more. - first seen 2026-05-21
colbymchenry/codegraph — Pre-indexed code knowledge graph for Claude Code, Codex, Cursor, OpenCode, and Hermes Agent — fewer tokens, fewer tool calls, 100% local - first seen 2026-05-09
Rethinking Cross-Layer Information Routing in Diffusion Transformers - first seen 2026-05-22
rohitg00/ai-engineering-from-scratch — Learn it. Build it. Ship it for others. - first seen 2026-05-21
Built a swarm-style multi-agent system in n8n with Telegram as the entry point - first seen 2026-05-24
anthropics/claude-plugins-official — Official, Anthropic-managed directory of high quality Claude Code Plugins. - first seen 2026-05-09
mukul975/Anthropic-Cybersecurity-Skills — 754 structured cybersecurity skills for AI agents · Mapped to 5 frameworks: MITRE ATT&CK, NIST CSF 2.0, MITRE ATLAS, D3FEND & NIST AI RMF · agentskills.io standard · Works with Claude Code, GitHub Copilot, Codex CLI, Cursor, Gemini CLI & 20+ platforms · 26 security domains · Apache 2.0 - first seen 2026-05-24
Qwen3.6-35B-A3B-Uncensored-Genesis-APEX-MTP - first seen 2026-05-24
farion1231/cc-switch — A cross-platform desktop All-in-One assistant for Claude Code, Codex, OpenCode, OpenClaw, Gemini CLI & Hermes Agent. Only official website: ccswitch.io - first seen 2026-05-08
multica-ai/multica — The open-source managed agents platform. Turn coding agents into real teammates — assign tasks, track progress, compound skills. - first seen 2026-05-24
... plus 105 more repeated items in processed data

AI Watchtower Briefing — 2026-05-25
