AW · AI Watchtower

🔴 High Significance

Model Releases

🔴 Donate your coding sessions to an open CC-BY-4.0 dataset to help train open-weight and open source models — score 97 Sources: reddit/r/LocalLLaMA

Anthropic and OpenAI are getting so much data from Claude Code and Codex usage, and I'm quite scared this will create an oligopoly because only their models will be trained on it, leaving the open-weight and open source models behind. So I'm trying to launch a little initiative called Trace Com

🔴 [ECCV 2026] Final Decisions [D] — score 81 Sources: reddit/r/MachineLearning

ECCV 2026 final decisions are expected to be released on June 17, 2026. Since there was no exact release time specified, results will likely roll out within 48 hours. This thread is for everyone to share updates, discuss outcomes, and support each other through the decisions. Good luck to everyo

🔴 Mistral - New family of open-weight models @ July — score 70 Sources: reddit/r/LocalLLaMA

Tweet: https://xcancel.com/arthurmensch/status/2066913353860018596#m

Developer Tools

🔴 What guardrails are you using around agent tool calls? — score 72 Sources: reddit/r/AIAgents

For people building agents that actually call tools/APIs: where are you putting the execution boundary? The framing I keep coming back to is: treat the LLM as an untrusted planner. The risky part is not only that the model says something wrong. It is what happens when that plan becomes a real tool c
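One way to picture the "untrusted planner" framing is a thin execution boundary that sits between the model's proposed step and the real tool. The sketch below is illustrative only and is not taken from the thread: `ToolSpec`, `execute_plan_step`, the registry layout, and the two toy tools are all hypothetical names. It assumes the plan step arrives as a dict with `tool` and `args` keys.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


class ToolCallRejected(Exception):
    """Raised when a proposed tool call fails a boundary check."""


@dataclass(frozen=True)
class ToolSpec:
    # Hypothetical per-tool policy: the callable, its allowed argument
    # names, and whether a human/approver must sign off first.
    func: Callable[..., Any]
    allowed_args: frozenset
    needs_approval: bool = False


def execute_plan_step(
    registry: Dict[str, ToolSpec],
    step: Dict[str, Any],
    approve: Callable[[str, Dict[str, Any]], bool] = lambda name, args: False,
) -> Any:
    """Validate a model-proposed step before anything real happens."""
    name = step.get("tool")
    args = step.get("args", {})

    spec = registry.get(name)
    if spec is None:
        # Allowlist check: the model cannot invent tools.
        raise ToolCallRejected(f"unknown tool: {name!r}")
    if not isinstance(args, dict):
        raise ToolCallRejected("args must be a dict")

    unexpected = set(args) - spec.allowed_args
    if unexpected:
        # Argument check: reject anything outside the declared schema.
        raise ToolCallRejected(f"unexpected args: {sorted(unexpected)}")

    if spec.needs_approval and not approve(name, args):
        # Side-effecting tools are denied unless explicitly approved.
        raise ToolCallRejected(f"{name} requires approval")

    return spec.func(**args)


# Toy tools standing in for real APIs (assumptions for the example).
def add(a: int, b: int) -> int:
    return a + b


def delete_record(record_id: int) -> str:
    return f"deleted {record_id}"


REGISTRY = {
    "add": ToolSpec(add, frozenset({"a", "b"})),
    "delete_record": ToolSpec(delete_record, frozenset({"record_id"}), needs_approval=True),
}
```

The design choice here is deny-by-default: unknown tools, unknown arguments, and unapproved side effects are all rejected, so a wrong plan fails at the boundary instead of at the API.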

Research Papers

🔴 ACE-Ego-0: Unifying Egocentric Human and Robotic Data for VLA Pretraining — score 85 Sources: huggingface

Vision-Language-Action (VLA) models benefit from large-scale and diverse embodied data, yet scaling robot trajectory collection is costly and labor-intensive. Recent advances show that large-scale egocentric human videos provide complementary real-world supervision in pretraining. However, joint tra

🔴 LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling — score 82 Sources: huggingface · arxiv/cs.AI

Looped Transformers scale latent computation by repeatedly applying shared blocks, but sequential looping increases latency and KV-cache memory with the loop count. Parallel loop Transformers (PLT) alleviate this cost through cross-loop position offsets (CLP) and shared-KV gated sliding-window atten

🔴 Learning from the Self-future: On-policy Self-distillation for dLLMs — score 75 Sources: huggingface

On-policy self-distillation (OPSD) has proven effective for post-training large language models (LLMs), yet its application to diffusion LLMs (dLLMs) remains unexplored. Existing OPSD methods are inherently autoregressive-centric. They inject privileged information via left-to-right prefix condition

Other Signals

🔴 GLM-5.2 is the first open-weights model to cross 80% on Terminal-Bench and beats every other open model available — score 90 Sources: reddit/r/LocalLLaMA

From Source: GLM-5.2 is the first open-weights model to cross 80% on Terminal-Bench, and beats every other open model available. It also beats Gemini, making it a frontier-level model for a fraction of the cost. Open weights is back. This model is a game changer. Source: [Cline](https://x.co

🔴 zai-org/GLM-5.2 is here! — score 83 Sources: reddit/r/LocalLLaMA

🔴 Has AI already killed self-help nonfiction books? — score 83 Sources: hackernews

🔴 HashiCorp founder thinks local models "aren't good ENOUGH yet" — score 77 Sources: reddit/r/LocalLLaMA

Generally, I respect him a lot, but this is a wrong take. People have been doing alright using SLMs for coding for more than a year; only vibecoders might struggle.

🟡 Notable

Model Releases

🟡 GLM-5.2 is now 1st on Design Arena — ahead of the now unavailable Claude Fable 5. — score 63 Sources: reddit/r/LocalLLaMA

https://x.com/Designarena/status/2066940737011560652

🟡 The agent conversation everyone's skipping: sprawl, shadow agents, and the bill that's coming — score 61 Sources: reddit/r/AIAgents

Everyone's talking about AI agents. Very few are talking about agent sprawl. Over the past few weeks we've been comparing notes with people at a bunch of B2B companies rolling out agents across sales, marketing, prod, eng, support, you name it. The same patterns keep coming up: • Agents getting buil

🟡 GPT‑NL: a sovereign language model for the Netherlands — score 50 Sources: hackernews

🟡 @xai: Grok Imagine Video 1.5 is here Our new image-to-video model with sharper realism, better physics and faster generations 🧵 http://grok.com/imagine — score 50 Sources: twitter_rss

🟡 GLM 5.2 API is live, weights are on HF, and ollama has it already — score 43 Sources: reddit/r/LocalLLaMA

GLM 5.2 dropped on Friday locked behind the GLM Coding Plan. That was annoying if you just wanted to test it without subscribing to another IDE tier. Two hours ago today they opened the API and pushed weights to HuggingFace under MIT. Ollama already has it. So now you can actually run it locally or

Developer Tools

🟡 ParthJadhav/app-store-screenshots — end to end app store screenshot creation using AI — score 57 Sources: github_trending

🟡 nocobase/nocobase — NocoBase is an open-source AI + no-code platform for building business systems fast. Instead of generating everything from scratch, AI works on top of production-proven infrastructure and a WYSIWYG no-code interface, so you get both speed and reliability. — score 52 Sources: github_trending

🟡 Gitlawb/openclaude — runs anywhere. uses anything — score 47 Sources: github_trending

Infrastructure & Compute

🟡 microsoft/fara — Fara-7B: An Efficient Agentic Model for Computer Use — score 60 Sources: github_trending

🟡 Next-Latent Prediction Transformers [R] — score 56 Sources: reddit/r/MachineLearning

Microsoft Research preprint. Next-token prediction is myopic. What if transformers learn to predict their own next latent state? Microsoft Research presents **Next-Latent

🟡 What is Speculative Decoding? (trending on paperswithco.de) [R] — score 44 Sources: reddit/r/MachineLearning

A method that is currently trending on Papers with Code is Speculative Decoding. Speculative decoding is an inference optimization technique
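The excerpt above is cut off before it describes the method, so the following is only a toy sketch of the general idea as it is commonly described: a cheap draft model proposes several tokens, and the expensive target model verifies them. This is the simplified greedy variant, not the probabilistic accept/reject scheme used with sampling, and the `Model` type and toy models are assumptions made for the example.

```python
from typing import Callable, List

# Assumption: a "model" maps a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def greedy_decode(target: Model, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one target call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(target(out))
    return out


def speculative_decode(
    target: Model, draft: Model, prompt: List[int], n_new: int, k: int = 4
) -> List[int]:
    """Greedy speculative decoding: draft proposes k tokens, target verifies."""
    out = list(prompt)
    goal = len(prompt) + n_new
    while len(out) < goal:
        # 1. The cheap draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))

        # 2. The target checks each proposed position. A real system would
        #    score all k positions in one batched forward pass; this toy
        #    version calls the target once per position for clarity.
        accepted: List[int] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # replace the first mismatch
                break
        else:
            # All k proposals matched: the target contributes a bonus token.
            accepted.append(target(out + accepted))

        out.extend(accepted)
    return out[:goal]


# Toy models over integer tokens (purely illustrative).
def toy_target(prefix: List[int]) -> int:
    return (prefix[-1] * 3 + 1) % 7


def toy_draft(prefix: List[int]) -> int:
    # Agrees with the target except after token 5, where it guesses wrong.
    return 0 if prefix[-1] == 5 else toy_target(prefix)
```

Because every emitted token is either confirmed or supplied by the target, the greedy variant returns the same sequence as `greedy_decode`; the potential saving comes from verifying several draft tokens per target pass.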

Research Papers

🟡 Show the Signal, Hide the Noise: Spectral Forcing for Pixel-Space Diffusion — score 65 Sources: huggingface

Pixel-space diffusion models are trained on full-bandwidth noisy images, yet the useful signal available to the denoiser is strongly frequency dependent. Under rectified-flow diffusion and natural-image power-law spectra, the per-band data-to-noise contour k^{*}(t) = (1-t)^{-2/α} separates a signal-

🟡 ProCUA-SFT Technical Report — score 52 Sources: huggingface · arxiv/cs.LG

Training computer-use agents (CUAs) -- models that interact with graphical desktops through screenshots and keyboard/mouse actions -- requires large-scale, diverse trajectory data collected in full desktop environments. The largest public resource, AgentNet (22.5K human trajectories), leads to negat

🟡 ChLogic: Evaluating Robustness of Logical Reasoning in Chinese Expressions — score 45 Sources: huggingface

Large language models perform increasingly well on standardized logical reasoning benchmarks, but whether this ability remains robust beyond English is unclear. We introduce ChLogic, an English--Chinese aligned benchmark that tests whether models preserve logical reasoning performance when the same

Other Signals

🟡 Is Le Gros Chaton opensource? — score 57 Sources: reddit/r/LocalLLaMA

so i keep hearing about le gros chaton, the upcoming mistral model that allegedly destroys claude mythos, gpt-5.5, my sleep schedule, and possibly the french economy. people say it has 1b context, self-improves in real time, writes perfect code, and only hallucinates in elegant parisian metaphors. t

🟡 GLM-5.2 (max) is currently the third best model available, across both open and proprietary. — score 50 Sources: reddit/r/LocalLLaMA

🟡 Unlocking UK house-building with AI-accelerated planning — score 50 Sources: lab_blog/DeepMind

UK government partners with Google DeepMind to build a new AI-powered prototype aimed at faster housing decisions.

🟢 Incremental

Model Releases

🟢 Sharing my DIY AI Memory Framework: Giving LLMs human-like memory (and slashing token costs by 90%) — score 28 Sources: reddit/r/AIAgents

Hi everyone, Like many of you, AI has become an absolute necessity in my daily life. Whether I’m working on game development, writing, or just brainstorming ideas, trying to do it without an LLM now feels like going back to the Stone Age. However, as I used these tools more and more, I kept hitting

🟢 Looking for an open-source/free alternative to Eigent for Word document template migration (Gemini API) — score 28 Sources: reddit/r/AIAgents

Hi everyone, I am looking for some architectural guidance on setting up a local AI agent workflow on my laptop for document editing and formatting automation. I am not an expert and have only very basic knowledge of AI. I am fairly new to this, so I am here seeking expert advice

🟢 It looks like Rio 3.5 397B could've simply been a semi-failed embezzling of funding — score 23 Sources: reddit/r/LocalLLaMA

Here is the chain of events: 1. The model training received funding of R$500K (about $100K USD). 2. The [initial model documentation](https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B/blob/0e1bf540744675baac21d6be4

🟢 Cheapest way to run GLM 5.x locally that's not a unified memory system? — score 17 Sources: reddit/r/LocalLLaMA

This is primarily an exercise to determine the possible options, obscure as they might be, to run at least a 4bit quant (let's say roughly IQ4_XS). 1. Got a CPU only setup? Please share your experience. Sapphire Rapids ES 56core + DDR5 might be an option 2. Multi GPU setups with partial or complete

🟢 Qwen-Robot Suite: A Foundation Model Suite for Physical World Intelligence — score 17 Sources: hackernews

Omitted 1 additional model release item from the main section; see raw data and source-specific sections below.

Developer Tools

🟢 looking for agents to spam my platform with food ads. Any help appreciated. — score 28 Sources: reddit/r/AIAgents

I have a platform I want to use as a marketplace. But, first, I need to test the speed and the way agents can buy and sell items on the platform. So, if you have an agent, and want to get some action, let me know.

🟢 Looking for best AI phone agent with CRM integration — score 28 Sources: reddit/r/AIAgents

I run a small service based business and we're constantly missing calls while out on jobs or after hours. When that happens, it often takes longer than it should to get back to potential customers. The biggest challenges are missed calls, slow response times, and having to manually update our CRM af

🟢 Mel AI just shared a demo of video-native AI characters that can talk, react, and respond to camera context in real time [N] — score 12 Sources: reddit/r/MachineLearning

Character AI, founded by former Google/LaMDA developers Noam Shazeer and Daniel De Freitas, proved that text-based character chat can work as a real entertainment category. But the next chapter might not be better text chat. It might be real-time video interaction. Mel AI recently shared a demo of A

🟢 tracel-ai/burn — Burn is a next generation tensor library and Deep Learning Framework that doesn't compromise on flexibility, efficiency and portability. — score 3 Sources: github_trending

Business & Funding

🟢 Zhipu surges 33% as Wall Street raises bets on China AI after Anthropic curbs — score 37 Sources: reddit/r/LocalLLaMA

Research Papers

🟢 Variable-Width Transformers — score 25 Sources: huggingface

Scaling model size, specifically depth and width, has driven significant progress in transformer-based language models. However, most architectures maintain a constant width across all layers, allocating a fixed parameter and computation budget evenly despite different layers potentially playing dis

🟢 Text-Vision Co-Instructed Image Editing — score 10 Sources: huggingface

Existing image editing methods can be generally categorized into textual instruction-based and visual prompt-based ones. Textual instructions are semantically expressive, but are limited by the coarse granularity of spatial control of the editing results. In contrast, visual prompts such as drag and

🟢 MotionVLA: Vision-Language-Action Model for Humanoid Motion — score 10 Sources: huggingface

Generating realistic humanoid motion from scene images and text involves both low-frequency pose semantics and high-frequency physical dynamics. However, many existing methods tokenize motion with a single shared codebook, forcing heterogeneous motion signals into the same quantization space. Our fr

Other Signals

🟢 VibeThinker-3B: what is this witchcraft? Killing it at MathQA like it has ~30B parameters — score 30 Sources: reddit/r/LocalLLaMA

🟢 I am testing an eval harness system and need it to break — score 28 Sources: reddit/r/AIAgents

Long story short: I have extra cloud capacity right now to focus on one very specific use case: LLM as a Judge. I did some stress tests, but they were not enough. I hope to get a real-world firehose if possible. It's API-based and free. Anyone interested in running LLM evals at large scale?

🟢 I built a leakage-clean verifier for robot manipulation, is this useful? Am I solving a non-problem? [D] — score 11 Sources: reddit/r/MachineLearning

Spent the last few weeks on a benchmark/harness that tries to answer one question honestly: did a robot arm actually do the demonstrated task, or did the success metric just get fooled? The setup: compile a human demo into an object-centric graph (what changed in the world: relations, contacts, ev

🟢 GLM-5.2: Built for Long-Horizon Tasks — score 3 Sources: reddit/r/LocalLLaMA

📈 Trending Repos

Repo	Description	Stars Today	Language
microsoft/fara	Fara-7B: An Efficient Agentic Model for Computer Use	97	python
ParthJadhav/app-store-screenshots	end to end app store screenshot creation using AI	94	typescript
nocobase/nocobase	NocoBase is an open-source AI + no-code platform for building business systems fast. Instead of generating everything from scratch, AI works on top of production-proven infrastructure and a WYSIWYG no-code interface, so you get both speed and reliability.	87	typescript
Gitlawb/openclaude	runs anywhere. uses anything	74	typescript
tracel-ai/burn	Burn is a next generation tensor library and Deep Learning Framework that doesn't compromise on flexibility, efficiency and portability.	4	rust

📄 New Papers

Title	Category	Hotness
ACE-Ego-0: Unifying Egocentric Human and Robotic Data for VLA Pretraining	research_paper	35
LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling	research_paper	61
Learning from the Self-future: On-policy Self-distillation for dLLMs	research_paper	18
Show the Signal, Hide the Noise: Spectral Forcing for Pixel-Space Diffusion	research_paper	15
ProCUA-SFT Technical Report	research_paper	4
Beyond Parallel Sampling: Diverse Query Initialization for Agentic Search	cs.AI	0
When Rules Learn: A Self-Evolving Agent for Legal Case Retrieval	cs.AI	0
SkillChain-Gym: A Benchmark for Reskilling-Aware Production-Inventory Control under Disruptions	cs.AI	0
Skill-Constrained Model Predictive Control for Resilient Manufacturing Supply Chains	cs.AI	0
Nothing from Something: Can a Language Model Discover 0?	cs.AI	0
Quantifying Consistency in LLM Logical Reasoning via Structural Uncertainty	cs.AI	0
MemTrace: Probing What Final Accuracy Misses in Long-Term Memory	cs.AI	0
SpeechDx: A Multi-Task Benchmark for Clinical Speech AI	cs.AI	0
Distributed General-Purpose Agent Networks: Architecture, Key Mechanisms, and Prototypes	cs.AI	0
Treatment Response Optimized Clinical Decision Support AI System via Digital Twin Simulation	cs.AI	0

🏢 Lab Blog Posts

DeepMind: Unlocking UK house-building with AI-accelerated planning

🐦 Twitter/X Highlights

Account	Tweet Summary
xai	Grok Imagine Video 1.5 is here Our new image-to-video model with sharper realism, better physics and faster generations 🧵 http://grok.com/imagine

Repeated From Recent Briefings

Panniantong/Agent-Reach — Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu — one CLI, zero API fees. - first seen 2026-06-06
Egonex-AI/Understand-Anything — Graphs that teach > graphs that impress. Turn any code into an interactive knowledge graph you can explore, search, and ask questions about. Works with Claude Code, Codex, Cursor, Copilot, Gemini CLI, and more. - first seen 2026-05-21
AI language models have favorite names, and we mapped them [R] - first seen 2026-06-02
We shipped a customer support agent and our "testing" was basically vibes. Here's what changed after the first real incident. - first seen 2026-06-16
farion1231/cc-switch — A cross-platform desktop All-in-One assistant for Claude Code, Codex, OpenCode, OpenClaw, Gemini CLI & Hermes Agent. Only official website: ccswitch.io - first seen 2026-05-08
rohitg00/ai-engineering-from-scratch — Learn it. Build it. Ship it for others. - first seen 2026-05-21
earendil-works/pi — AI agent toolkit: unified LLM API, agent loop, TUI, coding agent CLI - first seen 2026-05-09
datawhalechina/hello-agents — 📚 "Building Agents from Scratch": a from-scratch tutorial on agent principles and practice - first seen 2026-05-09
What do you think is the biggest unsolved problem in AI agents right now? - first seen 2026-06-16
karpathy/autoresearch — AI agents running research on single-GPU nanochat training automatically - first seen 2026-05-21
... plus 379 more repeated items in processed data

AI Watchtower Briefing — 2026-06-17
