AW · AI Watchtower

🔴 High Significance

Model Releases

🔴 Stop wasting electricity — score 88 Sources: reddit/r/LocalLLaMA

Run on my RTX 4090. llama.cpp params: llama-server -m ~/Projects/llm/models/Qwen3.6-27B-UD-Q4_K_XL.gguf --flash-attn on -ngl all -ctk q4_0 -ctv q4_0 -t 32 -c 262144. Power limit was set using sudo nvidia-smi -pl N. From my observation, the GPU is constantly hitting the power limit, so it's safe to say that it actual
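One way to put a number on the power-limit trade-off is tokens generated per joule. The sketch below is an illustration, not the poster's method: it assumes the GPU draws roughly its configured limit while generating (as the post observes), the function names are hypothetical, and the throughput figures in the usage lines are made-up placeholders rather than measurements from the post.

```python
def tokens_per_joule(tokens_per_second: float, power_watts: float) -> float:
    """Energy efficiency in tokens per joule (1 W = 1 J/s)."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return tokens_per_second / power_watts


def most_efficient_limit(measurements: dict) -> int:
    """Return the power limit (W) with the highest tokens-per-joule.

    `measurements` maps a power limit in watts (the N in `nvidia-smi -pl N`)
    to the decode throughput observed at that limit, in tokens/s.
    Assumes the GPU really draws about its limit during generation.
    """
    return max(measurements, key=lambda w: tokens_per_joule(measurements[w], w))


# Placeholder numbers for illustration only; substitute your own runs.
example = {450: 40.0, 300: 36.0, 200: 28.0}
best = most_efficient_limit(example)
```

With real measurements, the limit that maximizes tokens per joule is usually not the one that maximizes raw throughput, so the choice depends on whether latency or the electricity bill matters more.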

🔴 TabPFN-3 just released: a pre-trained tabular foundation model for up to 1M rows [R][N] — score 81 Sources: reddit/r/MachineLearning

TabPFN-3 was released today, the next iteration of the tabular foundation model, originally published in Nature. Quick recap for anyone new to TabPFN: TabPFN predicts on tabular data in a single forward pass - no training, no hyperparameter search, no tuning. Built on TabPFN-2.5 (Nov 2025) and TabPF

Developer Tools

🔴 Steam Recommender using similarity! (Undergraduate Student Project) [P] — score 94 Sources: reddit/r/MachineLearning

(DISCLAIMER: I accidentally deleted the last post on this subreddit; my apologies if this is your second time seeing it.) Last year I made a post about my Steam recommender. The last one was great

🔴 Recommendation for agent stack for enterprise content generation — score 92 Sources: reddit/r/AIAgents

Any recommendation for agent stack for content generation? Looking for end to end agent flow for content generation, compliance and brand review, and persona variation. Currently testing out Writer AI but convinced we can do better for the company. Very interested in hearing what you have done, lear

🔴 Imbad0202/academic-research-skills — Academic Research Skills for Claude Code: research → write → review → revise → finalize — score 72 Sources: github_trending

Infrastructure & Compute

🔴 I got a real transformer language model running locally on a stock Game Boy Color! — score 96 Sources: reddit/r/LocalLLaMA

No phone, PC, Wi-Fi, link cable, or cloud inference.
• The cartridge boots a ROM, and the GBC runs the model itself.
• The model is Andrej Karpathy’s TinyStories-260K, converted to INT8 weights with fixed-point math so it can run without floating point.
• Built with GBDK-2020 as an MBC5 Game Boy ROM
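To make "INT8 weights with fixed-point math" concrete, here is a minimal sketch of an integer-only dot product. It is not the project's code: the symmetric per-tensor quantization scheme, the 24 fractional bits, and all function names are assumptions chosen for illustration. Floats appear only in the offline preparation step and in the final check.

```python
FRAC_BITS = 24  # assumed number of fractional bits in the fixed-point format


def quantize_int8(values):
    """Symmetric quantization: value ~= q * scale, with q in [-127, 127]."""
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 127.0
    return [max(-127, min(127, round(v / scale))) for v in values], scale


def to_fixed(x, frac_bits=FRAC_BITS):
    """Encode a real number as an integer with `frac_bits` fractional bits."""
    return round(x * (1 << frac_bits))


def int_dot(q_weights, q_inputs, scale_fixed):
    """Dot product using integer arithmetic only.

    The result is fixed-point with FRAC_BITS fractional bits.
    """
    acc = 0
    for w, x in zip(q_weights, q_inputs):
        acc += w * x  # integer multiply-accumulate
    return acc * scale_fixed  # apply the combined scale, still an integer


def fixed_to_float(v, frac_bits=FRAC_BITS):
    """Decode fixed-point back to float (used here only to check the result)."""
    return v / (1 << frac_bits)


# Offline (on a PC): quantize and precompute the combined scale.
weights = [0.5, -0.25, 1.0]
inputs = [1.0, 2.0, -1.0]
qw, sw = quantize_int8(weights)
qx, sx = quantize_int8(inputs)
scale_fixed = to_fixed(sw * sx)

# On-device part: integers only.
approx = fixed_to_float(int_dot(qw, qx, scale_fixed))
```

The exact dot product of these toy vectors is -1.0; the integer path lands close to it, with the gap coming from rounding to 8-bit values.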

Research Papers

🔴 Debiased Model-based Representations for Sample-efficient Continuous Control — score 82 Sources: huggingface · arxiv/cs.AI

Model-based representations recently stand out as a promising framework that embeds latent dynamics information into the representations for downstream off-policy actor-critic learning. It implicitly combines the advantages of both model-free and model-based approaches while avoiding the training co

🔴 A Causal Language Modeling Detour Improves Encoder Continued Pretraining — score 78 Sources: huggingface · arxiv/cs.AI

When adapting an encoder to a new domain, the standard approach is to continue training with Masked Language Modeling (MLM). We show that temporarily switching to Causal Language Modeling (CLM) followed by a short MLM decay improves downstream performance. On biomedical texts with ModernBERT, this C
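For readers less familiar with the two objectives, the sketch below shows how training targets differ between CLM and MLM on a toy token sequence, plus a two-phase schedule. It is a simplified illustration, not the paper's recipe: the masking probability, the fraction of steps given to the final MLM phase, and the function names are all assumptions, and real MLM pipelines typically add refinements that are omitted here.

```python
import random

IGNORE = -100  # label meaning "no loss at this position" (common convention, assumed here)


def clm_example(tokens):
    """Causal LM: each position predicts the next token."""
    return tokens[:-1], tokens[1:]


def mlm_example(tokens, mask_id, mask_prob=0.3, rng=None):
    """Masked LM: hide some tokens and predict only those."""
    rng = rng or random.Random(0)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(mask_id)  # model sees the mask token
            labels.append(tok)      # and must recover the original
        else:
            inputs.append(tok)
            labels.append(IGNORE)   # unmasked positions contribute no loss
    return inputs, labels


def objective_at(step, total_steps, mlm_decay_fraction=0.1):
    """Hypothetical schedule: CLM first, then MLM for the final fraction of steps."""
    return "mlm" if step >= total_steps * (1 - mlm_decay_fraction) else "clm"


tokens = [11, 12, 13, 14, 15]
clm_inputs, clm_labels = clm_example(tokens)
mlm_inputs, mlm_labels = mlm_example(tokens, mask_id=0)
```

CLM supervises every position, while MLM supervises only the masked subset, which is one reason the two objectives give the encoder different training signals from the same text.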

Other Signals

🔴 Needle: We Distilled Gemini Tool Calling Into a 26M Model — score 88 Sources: reddit/r/LocalLLaMA · hackernews

We open-sourced Needle, a 26M parameter function-calling (tool use) model. It runs at 6000 tok/s prefill and 1200 tok/s decode on consumer devices. We were always frustrated by the little effort made towards building agentic models that run on budget phones, so we conducted investigations that led t

🔴 Dad, why is my sister's name Lora? — score 81 Sources: reddit/r/LocalLLaMA

🟡 Notable

Model Releases

🟡 Let's build Claude Code from scratch! — score 65 Sources: reddit/r/LocalLLaMA

So, I made this video about how to create Claude Code from scratch. Here's the video: https://youtu.be/8pDfgBEy8bg and GitHub: https://github.com/CohleM/nanoclaude. Feedback is extremely appreciated.

🟡 Is using vLLM actually worth it if you aren't serving the model to other people? — score 50 Sources: reddit/r/LocalLLaMA

So, as most of us here are, I'm a llama.cpp loyalist. Easy to understand, great configuration, relatively stable, etc. But I’ve been increasingly tempted by vLLM, especially since AMD just added it as a built-in inference engine to Lemonade, and I happen to have an AMD GPU. The thing is, I've never

🟡 I think AI businesses are quietly shifting from hype products to operational tools — score 50 Sources: reddit/r/AIAgents

Last year felt dominated by flashy AI launches:
* viral demos
* AI agents doing everything
* futuristic landing pages
* endless “replace your team” messaging
But the businesses I keep seeing actually retain users are much less exciting on the surface. They solve boring operational problems:
* invoic

Developer Tools

🟡 BerriAI/litellm — Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, VLLM, NVIDIA NIM] — score 61 Sources: github_trending

🟡 google-gemini/gemini-cli — An open-source AI agent that brings the power of Gemini directly into your terminal. — score 59 Sources: github_trending

🟡 microsoft/data-formulator — 🪄 Create rich visualizations with AI — score 55 Sources: github_trending

🟡 EveryInc/compound-engineering-plugin — Official Compound Engineering plugin for Claude Code, Codex, Cursor, and more — score 51 Sources: github_trending

🟡 A survey of every open-source "credential vault" for AI agents — score 50 Sources: reddit/r/AIAgents

I went looking for a map of this space a few months ago, could not find one, and eventually wrote it myself. Every week someone asks "is anything other than my .env file the right answer for an agent that runs unattended" and the honest reply is "yes, about a dozen things, and they disagree about wh

Omitted 5 additional developer-tools items from the main section; see raw data and source-specific sections below.

Infrastructure & Compute

🟡 How do you create a memorable poster for top-tier conferences (ICML/ICLR/NeurIPS, etc.) [D] — score 69 Sources: reddit/r/MachineLearning

Hello everyone, Presenting at a top-tier conference for the first time and having a very hard time coming up with an appropriate design for my poster. Everything I do seems basic and banal. My paper is more theory-oriented, and apart from putting math formulas in bold in the middle, I am not sure wh

🟡 My First Official AI Research Paper Accepted on SSRN — score 58 Sources: reddit/r/LocalLLaMA

Today, my research paper “Stable Training with Adaptive Momentum (STAM)” was officially accepted on SSRN — marking my first documented and official publication as

Business & Funding

🟡 Save and invest your money for future rigs — score 42 Sources: reddit/r/LocalLLaMA

I have long had the itch to build out. The plan was a 1 TB Genoa build this year, and what would have been a $6,000 affair is now a $30,000 affair. So I held out for the Mac Studio M5 Ultra, but it looks like Apple is going to get hit with shortages: they have pushed from Q1 to Q2 and now Q3. Meanwhile RTX bl

Research Papers

🟡 Multi-Stream LLMs: Unblocking Language Models with Parallel Streams of Thoughts, Inputs and Outputs — score 52 Sources: huggingface · arxiv/cs.CL

The continued improvements in language model capability have unlocked their widespread use as drivers of autonomous agents, for example in coding or computer use applications. However, the core of these systems has not changed much since early instruction-tuned models like ChatGPT. Even advanced AI

🟡 Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation — score 52 Sources: huggingface · arxiv/cs.LG

We introduce Pion, a spectrum-preserving optimizer for large language model (LLM) training based on orthogonal equivalence transformation. Unlike additive optimizers such as Adam and Muon, Pion updates each weight matrix through left and right orthogonal transformations, preserving its singular valu
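The invariant the abstract relies on, that multiplying a matrix by orthogonal matrices on the left and right leaves its singular values unchanged, can be checked numerically. The sketch below does this for a 2x2 case using rotations as the orthogonal matrices. It only demonstrates the invariant and does not reproduce Pion's actual update rule; the helper names and the example matrix are arbitrary choices.

```python
import math


def rotation(theta):
    """2x2 rotation matrix, a simple example of an orthogonal matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def singular_values_2x2(w):
    """Closed form for 2x2: s1^2 + s2^2 = ||W||_F^2 and s1 * s2 = |det W|."""
    frob2 = sum(x * x for row in w for x in row)
    det = abs(w[0][0] * w[1][1] - w[0][1] * w[1][0])
    disc = math.sqrt(max(frob2 * frob2 - 4.0 * det * det, 0.0))
    return (math.sqrt((frob2 + disc) / 2.0),
            math.sqrt(max((frob2 - disc) / 2.0, 0.0)))


w = [[3.0, 1.0], [0.0, 2.0]]
# "Update" W by a left and a right orthogonal transformation.
w_new = matmul(matmul(rotation(0.4), w), rotation(-1.1))

before = singular_values_2x2(w)
after = singular_values_2x2(w_new)
```

The entries of `w_new` differ from those of `w`, yet `before` and `after` agree up to floating-point error, which is the sense in which such an update is spectrum-preserving. An additive update of the form W + ΔW generally changes the singular values.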

🟡 WildRelight: A Real-World Benchmark and Physics-Guided Adaptation for Single-Image Relighting — score 52 Sources: huggingface · arxiv/cs.AI

Recent single-image relighting methods, powered by advanced generative models, have achieved impressive photorealism on synthetic benchmarks. However, their effectiveness in the complex visual landscape of the real world remains largely unverified. A critical gap exists, as current datasets are typi

Other Signals

🟡 Reimagining the mouse pointer for the AI era — score 50 Sources: hackernews

🟢 Incremental

Model Releases

🟢 I've seen a lot of folks ask "can local LLMs actually do anything useful?" — score 35 Sources: reddit/r/LocalLLaMA

And I'm here to share my experience. The answer is resoundingly 'yes'. Let me start with the local model I use every day in my AI harness: embedding models. I'm using an embedding model to give my AI's persistent memory system a semantic search protocol that makes its memory recall feel seamless to
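The retrieval step behind embedding-based memory search is usually a nearest-neighbor lookup by cosine similarity. The sketch below shows the idea under loud assumptions: the three-dimensional vectors are hand-written toy stand-ins for real embedding-model output, and the function names and memory layout are hypothetical, not taken from the post.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, memory, k=2):
    """Return the k stored texts whose vectors are most similar to the query.

    `memory` is a list of (text, vector) pairs.
    """
    ranked = sorted(memory, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy vectors for illustration; a real system would get these from an embedding model.
memory = [
    ("note about GPUs", [1.0, 0.0, 0.0]),
    ("note about cooking", [0.0, 1.0, 0.0]),
    ("note about CUDA", [0.9, 0.1, 0.0]),
]
results = top_k([1.0, 0.0, 0.0], memory, k=2)
```

A linear scan like this is fine for a small personal memory store; larger stores typically move to an approximate nearest-neighbor index.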

🟢 OpenAI API now requires separate pay-as-you-go billing — what alternatives are you all using? — score 8 Sources: reddit/r/AIAgents

Hey everyone, Just wanted to flag something in case anyone else got caught off guard like I did. As of now, OpenAI's API is completely separate from any ChatGPT subscription. Even if you're paying $20/month for ChatGPT Plus, that gets you zero API access. They're treated as two entirely independent

🟢 Small local model for questions on German grammar — score 4 Sources: reddit/r/LocalLLaMA

I'm trying to learn German. I use Qwen3.5/3.6 locally, but it is pretty bad at German grammar. Has anyone got a recommendation for a small-ish local model that knows German grammar well and can answer questions about it?

Developer Tools

🟢 VoltAgent/voltagent — AI Agent Engineering Platform built on an Open Source TypeScript AI Agent Framework — score 34 Sources: github_trending

🟢 FoundationAgents/MetaGPT — 🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming — score 26 Sources: github_trending

🟢 zhangfengcdt/memoir — Hierarchical Agent Memory with Git-Like Version Control — score 24 Sources: github_trending

🟢 ICML Visa issues [D] — score 19 Sources: reddit/r/MachineLearning

Has anyone applying for a Korean visa for ICML been asked for the conference’s Business Registration Number? The ICML website explicitly states that it cannot provide the BRC, so I wanted to ask how others handled this.

🟢 Show HN: Agentic interface for mainframes and COBOL — score 17 Sources: hackernews

Omitted 1 additional developer-tools item from the main section; see raw data and source-specific sections below.

Infrastructure & Compute

🟢 huggingface/transformers — 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training. — score 38 Sources: github_trending

Other Signals

🟢 I created minimal one-file implementations (160 LOC) of the JEPA family (ijepa, vjepa, vjepa2, cjepa) for educational purposes [P] — score 31 Sources: reddit/r/MachineLearning

Hi all, I made my own minimal implementation of JEPA algorithms. Making things minimal and removing all the things needed for scaling the algorithm always helped me understand the essence. So I stripped everything but the algorithm parts. What's left is 160-200 lines of code that distills the essenc

🟢 AntAngelMed - 100a6b Healthcare LLM — score 19 Sources: reddit/r/LocalLLaMA

🟢 How many of you tried BeeLlama.cpp? How's it? Agentic coding possible with 8GB VRAM? — score 12 Sources: reddit/r/LocalLLaMA

We'll be getting those features (check bottom link) on mainline sooner or later anyway. But for now this fork could be useful to see the full potential of our poor GPUs (and also big, large GPUs). Any 8GB VRAM (and 32GB RAM) folks already doing agentic coding with models (at Q4 at least) like Qwen3.6-35B-A

📈 Trending Repos

Repo	Description	Stars Today	Language
Imbad0202/academic-research-skills	Academic Research Skills for Claude Code: research → write → review → revise → finalize	293	python
BerriAI/litellm	Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, VLLM, NVIDIA NIM]	147	python
google-gemini/gemini-cli	An open-source AI agent that brings the power of Gemini directly into your terminal.	105	typescript
microsoft/data-formulator	🪄 Create rich visualizations with AI	89	typescript
EveryInc/compound-engineering-plugin	Official Compound Engineering plugin for Claude Code, Codex, Cursor, and more	72	typescript
memvid/memvid	Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give your agents instant retrieval and long-term memory.	62	rust
huggingface/transformers	🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.	48	python
VoltAgent/voltagent	AI Agent Engineering Platform built on an Open Source TypeScript AI Agent Framework	38	typescript
FoundationAgents/MetaGPT	🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming	34	python
zhangfengcdt/memoir	Hierarchical Agent Memory with Git-Like Version Control	33	python

📄 New Papers

Title	Category	Hotness
Debiased Model-based Representations for Sample-efficient Continuous Control	research_paper	6
A Causal Language Modeling Detour Improves Encoder Continued Pretraining	research_paper	5
Multi-Stream LLMs: Unblocking Language Models with Parallel Streams of Thoughts, Inputs and Outputs	research_paper	2
Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation	research_paper	2
WildRelight: A Real-World Benchmark and Physics-Guided Adaptation for Single-Image Relighting	research_paper	2
A Cascaded Generative Approach for e-Commerce Recommendations	cs.AI	0
EVOCHAMBER: Test-Time Co-evolution of Multi-Agent System at Individual, Team, and Population Scales	cs.AI	0
RankQ: Offline-to-Online Reinforcement Learning via Self-Supervised Action Ranking	cs.AI	0
OLIVIA: Online Learning via Inference-time Action Adaptation for Decision Making in LLM ReAct Agents	cs.AI	0
The Many Faces of On-Policy Distillation: Pitfalls, Mechanisms, and Fixes	cs.AI	0
Don't Look at the Numbers: Visual Anchoring Bias and Layer-wise Representation in VLMs	cs.AI	0
Do Vision-Language-Models show human-like logical problem-solving capability in point and click puzzle games?	cs.AI	0
PIVOT: Bridging Planning and Execution in LLM Agents via Trajectory Refinement	cs.AI	0
Rethinking LLMOps for Fraud and AML: Building a Compliance-Grade LLM Serving Stack	cs.AI	0
The Semantic Training Gap: Ontology-Grounded Tool Architectures for Industrial AI Agent Systems	cs.AI	0

🏢 Lab Blog Posts

OpenAI: How finance teams use Codex

Repeated From Recent Briefings

NousResearch/hermes-agent — The agent that grows with you - first seen 2026-05-11
farion1231/cc-switch — A cross-platform desktop All-in-One assistant for Claude Code, Codex, OpenCode, OpenClaw, Gemini CLI & Hermes Agent. Only official website: ccswitch.io - first seen 2026-05-08
yikart/AiToEarn — Let's use AI to Earn! - first seen 2026-05-11
rohitg00/agentmemory — #1 Persistent memory for AI coding agents based on real-world benchmarks - first seen 2026-05-09
tinyhumansai/openhuman — Your Personal AI super intelligence. Private, Simple and extremely powerful. - first seen 2026-05-11
millionco/react-doctor — Your agent writes bad React. This catches it - first seen 2026-05-10
garrytan/gstack — Use Garry Tan's exact Claude Code setup: 23 opinionated tools that serve as CEO, Designer, Eng Manager, Release Manager, Doc Engineer, and QA - first seen 2026-05-12
bytedance/UI-TARS-desktop — The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra - first seen 2026-05-09
danielmiessler/Personal_AI_Infrastructure — Agentic AI Infrastructure for magnifying HUMAN capabilities. - first seen 2026-05-02
datawhalechina/hello-agents — 📚 《从零开始构建智能体》——从零开始的智能体原理与实践教程 - first seen 2026-05-09
... plus 140 more repeated items in processed data

AI Watchtower Briefing — 2026-05-13
