🔴 High Significance

Model Releases

🔴 Waiting for Qwen 3.7 open weight... The new King has arrived... — score 90 Sources: reddit/r/LocalLLaMA

The hype is real! https://qwen.ai/blog?id=qwen3.7

🔴 Qwen3.6 35Ba3 has changed my workflows and even how I use my computer — score 77 Sources: reddit/r/LocalLLaMA

My workflow has changed basically to ask Codex to do certain tasks and then document how to do them (including errors it found on its way) into a skill. I feed that skill to pi, and suddenly my qwen3.6 gets that hard stuff done: - devops on a VPS - using docling to create epubs from old PDFs - us

Developer Tools

🔴 Do VLMs in production still use fixed-patch ViTs for their vision capabilities? [D] — score 81 Sources: reddit/r/MachineLearning

The research community has provided (already for some time) seemingly more efficient and effective tokenizations for vision. Do we have any hint on whether non-fixed-patches tokenization is being applied on the big player models? I imagine not, and I'm trying to think why: - marginal gains? - pipe
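
For readers unfamiliar with the term, a fixed-patch ViT cuts every image into a regular grid of equal-sized patches, so the token count depends only on resolution and not on image content. The sketch below illustrates that arithmetic; the function name and the default patch size of 16 are assumptions chosen for illustration and do not come from the post.

```python
def fixed_patch_token_count(height: int, width: int, patch: int = 16) -> int:
    """Return how many patch tokens a fixed-patch ViT produces for one image.

    Hypothetical helper: assumes square, non-overlapping patches and no
    extra class or register tokens.
    """
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    return (height // patch) * (width // patch)


# Doubling each side of the image quadruples the token count,
# regardless of how much detail the image actually contains.
print(fixed_patch_token_count(224, 224))  # 196
print(fixed_patch_token_count(448, 448))  # 784
```

Content-adaptive tokenizers aim to break this fixed relationship between resolution and token count.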

🔴 A.I. Agents: They’re Fun They’re Useful But Don’t Give Them the Credit Card — score 81 Sources: reddit/r/AIAgents

Infrastructure & Compute

🔴 When your LLM treats data center GPUs like an optional DLC — score 70 Sources: reddit/r/LocalLLaMA

Research Papers

🔴 TransitLM: A Large-Scale Dataset and Benchmark for Map-Free Transit Route Generation — score 82 Sources: huggingface · arxiv/cs.CL

Public transit route planning traditionally depends on structured map infrastructure and complex routing engines, and no existing dataset supports training models to bypass this dependency. We present TransitLM, a large-scale dataset of over 13 million transit route planning records from four Chines

🔴 LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning — score 72 Sources: huggingface · arxiv/cs.CL

Joint audio-visual reasoning is essential for omnimodal understanding, yet current multimodal large language models (MLLMs) still struggle when reasoning requires fine-grained evidence from both modalities. A central limitation is that explicit text-based chain-of-thought (CoT) compresses continuous

Other Signals

🔴 [AMA] Got laid off 3 weeks ago. Instead of updating my resume I went down a rabbit hole. Here's what I found — score 94 Sources: reddit/r/AIAgents

I've been a software engineer for a few years. Worked in bigtech, good salary, stable job, the whole thing. Then my position got cut and I had one of those forced moments of clarity that I think a lot of people in tech are having right now. I could update my LinkedIn, apply to 200 jobs, and land som

🔴 Multi-Stream LLMs: new paper on parallelizing/separating prompts, thinking, I/O — score 88 Sources: hackernews

🔴 110 tok/s with 12GB VRAM on Qwen3.6 35B A3B and ik_llama.cpp — score 83 Sources: reddit/r/LocalLLaMA

Had been getting great MTP performance with llama.cpp on my RTX 4070 Super 12GB, until they actually merged the MTP PR. Then, performance tanked and was bare

🟡 Notable

Model Releases

🟡 LatitudeGames/Equinox-31B · Hugging Face — score 63 Sources: reddit/r/LocalLLaMA

new model from LatitudeGames - Gemma 31B finetune https://huggingface.co/LatitudeGames/Equinox-31B-GGUF Equinox draws its name from the balance between extremes. Trained on a bal

🟡 Launch HN: Runtime (YC P26) – Sandboxed coding agents for everyone on a team — score 62 Sources: hackernews

🟡 For everyone that uses OpenCode / Pi - Heres your promptprocessing fix! — score 50 Sources: reddit/r/LocalLLaMA

This PR deserves much more attention as it fixes the constant promptprocessing that happens when using llama.cpp with Opencode or pi. https://github.com/ggml-org/llama.cpp/pull/22929

🟡 AdventHealth advances whole-person care with OpenAI — score 50 Sources: lab_blog/OpenAI

AdventHealth is using ChatGPT for Healthcare to streamline workflows, reduce administrative burden, and return more time to patient care.

🟡 We’re launching the Google DeepMind Accelerator program in Asia Pacific to tackle environmental risks — score 50 Sources: lab_blog/DeepMind

Omitted 3 additional model releases items from the main section; see raw data and source-specific sections below.

Developer Tools

🟡 Novel Problems in VLA [R] — score 56 Sources: reddit/r/MachineLearning

I'm currently doing a research internship and my supervisor is constantly pushing me to have a novel idea, I've read about 15-20 papers about VLA and I think that most of the things are saturated, I thought about an equivariant VLA based on equivariant CNN which was published in 2016 and successfull

🟡 teng-lin/notebooklm-py — Unofficial Python API and agentic skill for Google NotebookLM. Full programmatic access to NotebookLM's features—including capabilities the web UI doesn't expose—via Python, CLI, and AI agents like Claude Code, Codex, and OpenClaw. — score 50 Sources: github_trending

🟡 In theory, if I have $20k-ish to spend on hardware what would actually get me closest to local coding agent that would allow me to go totally off the social grid? — score 43 Sources: reddit/r/LocalLLaMA

Let's say I'm in the market to buy a studio or RTX 6000's. At what point am I off the grid with a local coding agent? Probably a model question too.

Research Papers

🟡 Q-ARVD: Quantizing Autoregressive Video Diffusion Models — score 65 Sources: huggingface

Autoregressive video diffusion models (ARVDs) have emerged as a promising architecture for streaming video generation, paving the way for real-time interactive video generation and world modeling. Despite their potential, the substantial inference cost of ARVDs remains a major obstacle to practical

🟡 GenEvolve: Self-Evolving Image Generation Agents via Tool-Orchestrated Visual Experience Distillation — score 55 Sources: huggingface

Open-ended image generation is no longer a simple prompt-to-image problem. High-quality generation often requires an agent to combine a model's internal generative ability with external resources. As requests become more diverse and demanding, we aim to develop a general image-generation agent that

🟡 More Context, Larger Models, or Moral Knowledge? A Systematic Study of Schwartz Value Detection in Political Texts — score 42 Sources: huggingface · arxiv/cs.CL

Detecting Schwartz values in political text is difficult because implicit cues often depend on surrounding arguments and fine-grained distinctions between neighboring values. We study when context and explicit moral knowledge help sentence-level value detection. Using the ValuesML/Touché ValueEval

Other Signals

🟡 Heretic has been served a legal notice by Meta, Inc. — score 67 Sources: reddit/r/LocalLLaMA

To Whomsoever it May Concern, The individual behind the Heretic Free Software Project (henceforth called "Heretic", notwithstanding unrelated entities of the same name) has been served a notice by a legal services provider representing Meta Platforms, Inc. (henceforth called "Meta"), via the digital

🟡 We're Thursday and no one claimed AGI yet this week! — score 57 Sources: reddit/r/LocalLLaMA

U guys okay?

🟢 Incremental

Model Releases

🟢 Which agent should I use for coding and notetaking? — score 38 Sources: reddit/r/AIAgents

Hey everyone, I'm a software developer with a strong interest in psychology and photography. I’m currently subscribed to ChatGPT mainly because I use it for coding with agents, but I keep running into the limits of the $20 plan. Because of that, I’m considering other options, including GitHub Copilo

🟢 New Release of ROCm based MLX LLM Engine - lemon-mlx-engine — score 23 Sources: reddit/r/LocalLLaMA

Hey everyone lemon-mlx-engine just got done integrating TheRock / ROCm 7.13 into the lemon-mlx-engine which means you get to try the latest ROCm on your local hardware with the MLX engine! This also includes various bug fixes and kernel fixes we have been seeing in Qwen3, 3.5 and 3.6 MoE and dense.

🟢 Low-level coding dataset — score 10 Sources: reddit/r/LocalLLaMA

Hi all, I've recently been thinking about putting together a community sourced coding dataset for finetuning models, with a heavy focus on cpp and systems programming. My goal is to eventually have a model (say a finetune of Qwen3.6-27b) that is good at stuff like memory ownership, thread safety, op

🟢 ztok — a fast multithreaded tokenizer in Zig that loads tiktoken / HF / SentencePiece and is 2–5× faster — score 0 Sources: reddit/r/LocalLLaMA

I built ztok, a tokenizer library focused on being fast and format-agnostic for local pipelines. - Loads what you already have — .tiktoken, HF tokenizer.json, SentencePiece .model, TokenMonster, Mistral Tekken. Auto-detected. - Bit-identical to tiktoken / HF / SentencePiece on the equivalence gate

Developer Tools

🟢 If you've built an AI agent or chatbot - how do you know what users actually want from it? — score 38 Sources: reddit/r/AIAgents

Real question for anyone running an agent or chat product. When users just talk to your agent in natural language, you lose visibility into what they actually asked for, whether they got it, and what they kept wanting that your agent couldn't do. And when it quietly fails someone, there's no error

🟢 A customer asked why our agent believed something. We had no answer. — score 38 Sources: reddit/r/AIAgents

Prompt was clean. Model was fine. We went digging. No timestamp on the write. No source attached. No way to tell if it came from a user input, a tool call, a summarization pass, or something that got corrupted three updates ago. The memory layer had absorbed it at some point and that was the end of
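
The gap described above (no timestamp, no source, no way to tell which pipeline stage wrote a memory) can be illustrated with a minimal sketch of a memory record that carries its own provenance. Everything here is hypothetical: `Origin`, `MemoryRecord`, `MemoryStore`, and their fields are illustrative names, not the poster's system or any particular library's API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Origin(Enum):
    """Hypothetical set of write paths that can create a memory."""
    USER_INPUT = "user_input"
    TOOL_CALL = "tool_call"
    SUMMARIZATION = "summarization"


@dataclass(frozen=True)
class MemoryRecord:
    content: str
    origin: Origin   # which kind of step wrote this memory
    source_id: str   # e.g. a session, tool-call, or summary-job identifier
    written_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class MemoryStore:
    """In-memory store that refuses writes without provenance."""

    def __init__(self) -> None:
        self._records: list[MemoryRecord] = []

    def write(self, content: str, origin: Origin, source_id: str) -> MemoryRecord:
        record = MemoryRecord(content, origin, source_id)
        self._records.append(record)
        return record

    def explain(self, needle: str) -> list[MemoryRecord]:
        """Return every record whose content mentions `needle`."""
        return [r for r in self._records if needle.lower() in r.content.lower()]


store = MemoryStore()
store.write("User prefers metric units", Origin.USER_INPUT, "session-42")
for hit in store.explain("metric"):
    print(hit.origin.value, hit.source_id, hit.written_at.isoformat())
```

With records shaped like this, the question of why the agent believed something becomes a lookup over `origin`, `source_id`, and `written_at` rather than guesswork.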

🟢 Inline prompt-injection guards need to be fast enough for the agent hot path — score 38 Sources: reddit/r/AIAgents

I wrote up a benchmark note for Armorer Guard here: https://armorerlabs.com/blog/armorer-guard-inline-prompt-injection-defense The practical question I am trying to answer is not just "can we detect prompt injection?" It is: can a guard sit directly before an agent turns context into action without

🟢 Agent reliability is killing the one genius narrative — score 38 Sources: reddit/r/AIAgents

The line from Yao Shunyu's interview that stuck with me was not the spicy "AI does not need brains" bit. It was the claim that the individual hero era in AI is basically over. That sounds harsh, but it matches what the industry looks like now. Frontier model progress is less about one genius idea an

🟢 google-labs-code/stitch-skills — A library of Agent Skills designed to work with the Stitch MCP server. Each skill follows the Agent Skills open standard, for compatibility with coding agents such as Antigravity, Gemini CLI, Claude Code, Cursor. — score 31 Sources: github_trending

Omitted 5 additional developer tools items from the main section; see raw data and source-specific sections below.

Business & Funding

🟢 One thing that's been bothering me lately: benchmark performance often tells me almost nothing about whether a workflow will survive production usage.[D] — score 31 Sources: reddit/r/MachineLearning

I've seen systems score well internally and then immediately fail under: * ambiguous user intent * messy real-world context * contradictory instructions * long-running sessions Feels like evaluation still heavily rewards clean-task optimization instead of behavioral robustness. What are people using

Enterprise Adoption

🟢 Nobody talks about what AI memory looks like after six months in production. — score 38 Sources: reddit/r/AIAgents

Old preferences keep winning retrieval, sarcastic comments get stored as literal truth, and summaries outlive the facts that made them true. You're not running a memory system at that point, you're babysitting one. Your AI context should not be a black box. It should be configurable, correctable, an

Research Papers

🟢 OmniPro: A Comprehensive Benchmark for Omni-Proactive Streaming Video Understanding — score 15 Sources: huggingface

Omni-proactive streaming video understanding, i.e., autonomously deciding when to speak and what to say from continuous audio-visual streams, is an emerging capability of omni-modal large language models. Existing benchmarks fall short in three key aspects: they rely primarily on visual signals, ado

Other Signals

🟢 Can liveness detection models generalise to synthetic media generation techniques they were never trained on? [D] — score 39 Sources: reddit/r/MachineLearning

Most liveness detection systems in production today were built around a threat model where the attacker is submitting a static image or a basic replay video. The generation quality of current synthetic media is categorically different from what those training datasets captured. The question I keep c

🟢 Latest b9274 Addresses MTP VRAM leak — score 37 Sources: reddit/r/LocalLLaMA

B9274 I have been having an issue with MTP models unloading after a couple minutes of use. Can't figure out why. Anyways, I don't think this is relevant to that but I did observe the vram creep so hopefully this helps. > server : free d

🟢 Anyone evaluated the difference between Qwen Code for the local qwen models vs another harness? CC, OC, LC, Aider etc.. — score 17 Sources: reddit/r/LocalLLaMA

For me, opencode doing fantastically but was wondering if qwen code would be more native and have better functionality, since idk which agentic harness they used to get their benchmark results

🟢 Lisbon Machine Learning School (LxMLS 2026) [D] — score 6 Sources: reddit/r/MachineLearning

Hi did anyone apply it, or attended it previously? How was the experience? I got the acceptance but no scholarship, is it worth going self sponsored?

Repo | Description | Stars Today | Language
teng-lin/notebooklm-py | Unofficial Python API and agentic skill for Google NotebookLM. Full programmatic access to NotebookLM's features—including capabilities the web UI doesn't expose—via Python, CLI, and AI agents like Claude Code, Codex, and OpenClaw. | 186 | python
google-labs-code/stitch-skills | A library of Agent Skills designed to work with the Stitch MCP server. Each skill follows the Agent Skills open standard, for compatibility with coding agents such as Antigravity, Gemini CLI, Claude Code, Cursor. | 69 | typescript
software-mansion/argent | An agentic toolkit to control, debug, and profile iOS and Android apps. Made by Software Mansion. | 67 | typescript
ryoppippi/ccusage | Analyze coding (agent) CLI token usage and costs from local data. | 58 | rust
google/adk-samples | A collection of sample agents built with Agent Development Kit (ADK) | 33 | python

📄 New Papers

Title | Category | Hotness | Link
TransitLM: A Large-Scale Dataset and Benchmark for Map-Free Transit Route Generation | research_paper | 106 | Open
LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning | research_paper | 30 | Open
Q-ARVD: Quantizing Autoregressive Video Diffusion Models | research_paper | 14 | Open
GenEvolve: Self-Evolving Image Generation Agents via Tool-Orchestrated Visual Experience Distillation | research_paper | 10 | Open
Tool-Augmented Agent for Closed-loop Optimization, Simulation, and Modeling Orchestration | cs.AI | 0 | Open
OSCToM: RL-Guided Adversarial Generation for High-Order Theory of Mind | cs.AI | 0 | Open
AgentCo-op: Retrieval-Based Synthesis of Interoperable Multi-Agent Workflows | cs.AI | 0 | Open
High Quality Embeddings for Horn Logic Reasoning | cs.AI | 0 | Open
Open-World Evaluations for Measuring Frontier AI Capabilities | cs.AI | 0 | Open
Personality Engineering with AI Agents: A New Methodology for Negotiation Research | cs.AI | 0 | Open
From Automated to Autonomous: Hierarchical Agent-native Network Architecture (HANA) | cs.AI | 0 | Open
COAgents: Multi-Agent Framework to Learn and Navigate Routing Problems Search Space | cs.AI | 0 | Open
Evaluating Temporal Semantic Caching and Workflow Optimization in Agentic Plan-Execute Pipelines | cs.AI | 0 | Open
Declarative Data Services: Structured Agentic Discovery for Composing Data Systems | cs.AI | 0 | Open
VBFDD-Agent for Electric Vehicle Battery Fault Detection and Diagnosis: Descriptive Text Modeling of Battery Digital Signals | cs.AI | 0 | Open

🏒 Lab Blog Posts

🐦 Twitter/X Highlights

Account | Tweet Summary
OpenAI | Highlights from today’s Codex Thursday launches: 1️⃣ Codex can now securely use apps on your Mac from your phone, even when your Mac is locked and the screen is off. http://developers.openai.com/codex/app/computer-use#locked-use
xai | You can now use your @grok or X Premium subscription in @opencode. Use the model powering Grok Build for high speed and codebase intelligence. https://x.ai/news/grok-opencode
simonw | I released the first alpha of Datasette Agent - a conversational AI assistant for Datasette that can answer questions about data in SQLite databases, and can be extended with plugins to add extra tools and features. Here's a demo

Repeated From Recent Briefings