🔴 High Significance

Model Releases

🔴 vLLM ROCm has been added to Lemonade as an experimental backend - score 77 Sources: reddit/r/LocalLLaMA

vLLM can run .safetensors LLMs before they are converted to GGUF, and it represents a new engine to explore. I personally had never tried it out until u/krishna2910-amd, u/mikkoph, and u/sa1sr1 made it as easy as running llama.cpp in Lemonade: ` lemonade backends install vllm:rocm lemonade

🔴 A recent experience with ChatGPT 5.5 Pro - score 75 Sources: hackernews

🔴 I'm kinda good at getting users for ai tools through reddit - could I make money? - score 72 Sources: reddit/r/AIAgents

So I've made and launched my own AI tools and agents before, and I've helped some of my friends too. I learned multiple Reddit post strategies a while ago that, with the right tweaking, usually get me 100+ organic users within a week or two for every project. My last project went crazy I made 2 un

Developer Tools

🔴 bytedance/UI-TARS-desktop - The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra - score 91 Sources: github_trending

🔴 datawhalechina/hello-agents - 📚 "Building Agents from Scratch": a from-scratch tutorial on agent principles and practice - score 89 Sources: github_trending

🔴 Shel Silverstein predicts LLMs (and their hallucinations), circa 1981 - score 86 Sources: reddit/r/LocalLLaMA

Ran across this cartoon / poem by accident as I was reminiscing about my favorite childhood poet, Shel Silverstein, and couldn't help thinking of LLMs, of course!

🔴 earendil-works/pi - AI agent toolkit: coding agent CLI, unified LLM API, TUI & web UI libraries, Slack bot, vLLM pods - score 86 Sources: github_trending

🔴 anomalyco/opencode - The open source coding agent. - score 84 Sources: github_trending

Omitted 3 additional developer tools items from the main section; see raw data and source-specific sections below.

Infrastructure & Compute

🔴 Unpopular Opinion: The DGX Spark Forum community of devs is talented AF and will make the crippled hardware a success through their sheer force of will. - score 95 Sources: reddit/r/LocalLLaMA

There is a lot of disdain for DGX Sparks here on the sub. And I get it. A lot of people say "It could have been great if it had better memory bandwidth", "SM-121 is a fake/second-class Blackwell chip", yadda, yadda. These criticisms are valid. I bought one anyway because I'm pursuing a Masters

Research Papers

🔴 Audio-Visual Intelligence in Large Foundation Models - score 85 Sources: huggingface

Audio-Visual Intelligence (AVI) has emerged as a central frontier in artificial intelligence, bridging auditory and visual modalities to enable machines that can perceive, generate, and interact in the multimodal real world. In the era of large foundation models, joint modeling of audio and vision h

🔴 Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction - score 82 Sources: huggingface · arxiv/cs.AI

Modern retrieval systems, whether lexical or semantic, expose a corpus through a fixed similarity interface that compresses access into a single top-k retrieval step before reasoning. This abstraction is efficient, but for agentic search, it becomes a bottleneck: exact lexical constraints, sparse cl
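The contrast this abstract draws can be illustrated with a toy sketch. It assumes a bag-of-words cosine scorer as a stand-in for the "fixed similarity interface" and a plain regex scan as a stand-in for direct corpus interaction. The corpus, `top_k`, and `grep` are hypothetical names invented here; this is not the paper's method.

```python
import math
import re
from collections import Counter

# Hypothetical three-document corpus, invented for illustration only.
CORPUS = [
    "timeout error in payment service",
    "payment service timeout error retry",
    "error code E1234 raised by billing batch job nightly run on host seven",
]


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, corpus: list[str], k: int) -> list[int]:
    """Fixed similarity interface: indices of the k best-scoring documents."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(doc))), i) for i, doc in enumerate(corpus)]
    # Sort by descending score, breaking ties by document index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]


def grep(pattern: str, corpus: list[str]) -> list[int]:
    """Direct corpus interaction: indices of documents matching a regex."""
    rx = re.compile(pattern)
    return [i for i, doc in enumerate(corpus) if rx.search(doc)]


if __name__ == "__main__":
    query = "payment service error E1234"
    print(top_k(query, CORPUS, k=2))    # similarity ranking, truncated at k
    print(grep(r"\bE1234\b", CORPUS))   # exact lexical constraint
```

In this toy setup, with k=2 the only document carrying the exact identifier falls outside the returned set, because two shorter documents share more query terms; the regex scan finds it directly.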

Other Signals

🔴 OpenAI's WebRTC problem - score 92 Sources: hackernews

🟡 Notable

Model Releases

🟡 Teaching Claude Why - score 58 Sources: hackernews

🟡 Reports suggest DeepSeek is seeking $7.35 billion in funding and plans to release its V4.1 update next month. - score 50 Sources: reddit/r/LocalLLaMA

DeepSeek Reportedly Seeking to Raise Over RMB 50 Billion ($7.35 Billion), Accelerating Its Commercialization and Monetization Strategy. According to two people familiar with the matter, DeepSeek founder and CEO Liang Wenfeng plans to contribute the maximum allowable amount in the company's first fund

🟡 @OpenAI: Chain of thought monitors are a key layer of defense against AI agent misalignment. To preserve monitorability, we avoid penalizing misaligned reasoning during RL. We found a limited amount of accide - score 50 Sources: twitter_rss

Chain of thought monitors are a key layer of defense against AI agent misalignment. To preserve monitorability, we avoid penalizing misaligned reasoning during RL. We found a limited amount of accidental CoT grading which affected released models, and are sharing our analysis. https://alignment.open

🟡 Using Claude Code: The unreasonable effectiveness of HTML - score 42 Sources: hackernews

🟡 new MoE from ai2, EMO - score 41 Sources: reddit/r/LocalLLaMA

New MoE release from ai2: EMO, 1B-active/14B-total, trained on 1T tokens. The interesting thing is document-level routing: experts cluster around domains like health, news, etc. instead of surface patterns. Models: [https://huggingface.co/collections/allenai/emo](https://huggingface.co/collections/allenai

Developer Tools

🟡 CopilotKit/CopilotKit - The Frontend Stack for Agents & Generative UI. React + Angular. Makers of the AG-UI Protocol - score 64 Sources: github_trending

🟡 vercel-labs/skills - The open agent skills tool - npx skills - score 61 Sources: github_trending

🟡 colbymchenry/codegraph - Pre-indexed code knowledge graph for Claude Code: fewer tokens, fewer tool calls, 100% local - score 55 Sources: github_trending

🟡 ChromeDevTools/chrome-devtools-mcp - Chrome DevTools for coding agents - score 53 Sources: github_trending

🟡 PaddlePaddle/PaddleOCR - Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages. - score 51 Sources: github_trending

Omitted 4 additional developer tools items from the main section; see raw data and source-specific sections below.

Infrastructure & Compute

🟡 DeepSeek V4 paper full version is out, FP4 QAT details and stability tricks [D] - score 56 Sources: reddit/r/MachineLearning

DeepSeek dropped the full V4 paper this week. The preview from April was 58 pages; this version adds a lot of technical depth. What stood out for me: FP4 quantization-aware training. They're running FP4 QAT directly in late-stage training. MoE expert weights quantized to FP4 (the main gpu memory consumer
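To make "FP4 QAT" concrete, here is a minimal sketch of the quantize-dequantize ("fake quantization") step that quantization-aware training typically applies to weights in the forward pass. It assumes the E2M1 FP4 value grid and a simple per-block absmax scale; the block size, scale format, and any stability tricks in the DeepSeek paper are not stated in the excerpt, and `fake_quant_fp4` is a hypothetical helper, not their implementation.

```python
import math

# Non-negative magnitudes representable in FP4 E2M1 (sign is handled separately).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def fake_quant_fp4(block: list[float]) -> list[float]:
    """Quantize-dequantize one block of weights onto the scaled E2M1 grid.

    Assumption: one scale per block, chosen so the largest magnitude in the
    block maps to the largest grid value (6.0).
    """
    absmax = max((abs(w) for w in block), default=0.0)
    if absmax == 0.0:
        return [0.0] * len(block)
    scale = absmax / E2M1_GRID[-1]
    out = []
    for w in block:
        # Round the scaled magnitude to the nearest representable grid value.
        mag = min(E2M1_GRID, key=lambda g: abs(abs(w) / scale - g))
        out.append(math.copysign(mag * scale, w))
    return out


if __name__ == "__main__":
    print(fake_quant_fp4([6.0, -3.0, 0.4, 1.2]))
```

In training, a step like this is usually paired with a straight-through estimator so gradients flow to the full-precision weights as if the rounding were the identity; that part is omitted here.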

Other Signals

🟡 Qwen 35B-A3B is very usable with 12GB of VRAM - score 68 Sources: reddit/r/LocalLLaMA

Hardware: RTX 3060 12GB, 32GB DDR4-3200, Windows, CUDA 13.x
Model: Qwen3.6-35B-A3B-MTP-IQ4_XS.gguf

The model is a 35B MoE, so -ncmoe matters a lot. Lower -ncmoe means more MoE blocks stay on GPU.

# Main takeaway

12GB VRAM feels like a very practical size for this model. It lets you keep enough

🟡 I built a platform to run AI employees and companies autonomously. - score 52 Sources: reddit/r/AIAgents

🟡 NeurIPS reviewers, any word after the invite email? [D] - score 44 Sources: reddit/r/MachineLearning

I got a NeurIPS reviewer invite last week and accepted it. It said that bidding for papers would start May 8th (today), but I haven't heard anything yet. Has anyone else heard anything? Did I mess up while accepting the reviewer invite, or is this normal? P.S., thoughts on the AI-assisted reviewing exp

🟢 Incremental

Model Releases

🟢 Formalizing statistical learning theory in Lean 4 [R] - score 31 Sources: reddit/r/MachineLearning

I've been working on a Lean 4 project focused on formalizing parts of statistical learning theory: FormalSLT repository

Current results include:
* finite-class ERM bounds
* Rademacher symmetrization
* high-probability Rademacher bounds

🟢 How long for llama.cpp official support of MTP? - score 14 Sources: reddit/r/LocalLLaMA

Hello there (beginner here). I've been unable to build llama.cpp myself for my Strix Halo (Windows 11) (cmake errors; I have not dug too much into it, already burned hours...), so I was wondering when an official release for Vulkan/HIP with MTP support would be available? Thanks!

🟢 FEEDBACK FOR MY APP - score 0 Sources: reddit/r/AIAgents

I built this app using Lovable as my first AI-powered project. It's a fully functional messaging application with chat, voice calling, and video calling features, and everything is working smoothly. I also converted it into an APK using Android Studio f

Developer Tools

🟢 Code Reviewer can see everything and yet production keeps breaking - score 39 Sources: reddit/r/AIAgents

What's interesting to me about AI code reviews isn't really the code generation part anymore. It's the fact that review tools can now see almost everything inside a codebase, and production incidents are still going up anyway. I came across a stat saying teams using AI coding tools saw PR volume inc

🟢 AI adaptive capability Synthesis?? Thoughts? JL_Engine - score 39 Sources: reddit/r/AIAgents

hey yall. i've been thinking about how autonomous AI agents already operate differently than traditional software systems. normal software usually depends on fixed tools, predefined permissions, and predictable workflows. meanwhile there are already agent systems capable of dynamically creating work

🟢 Would love feedback for this tool that catches failures before deploying - score 39 Sources: reddit/r/AIAgents

Hey everyone, I'm looking for AI agent builders to give feedback on Stratix SDK, an open-source Python SDK for proper pre-depl

🟢 All my clients wanted a carousel, now it's an AI chatbot - score 25 Sources: hackernews

🟢 My experience interviewing with Huawei Vancouver for an ML research role: strong mismatch between how it was pitched and how it was evaluated [D] - score 19 Sources: reddit/r/MachineLearning

I want to share an interview experience anonymously in case it helps others on the job market. I was approached about a Vancouver ML role that was presented to me as research-oriented. The recruiter told me the team had looked at my research and that I should be ready to discuss my projects, so I ex

Omitted 4 additional developer tools items from the main section; see raw data and source-specific sections below.

Infrastructure & Compute

🟢 MTP is all about acceptance rate - score 23 Sources: reddit/r/LocalLLaMA

So I was very excited about the MTP stuff especially since Gemma4 has become my "daily driver" for some stuff. I grabbed the latest mlx-vlm and did some tests and found it disappointing.

| Workload | MTP off | MTP on | Result | Draft accept rate |
|---|---|---|---|---|
| Code generation | 75 tok/s |
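Why the acceptance rate dominates can be sketched with the standard speculative-decoding estimate. This assumes each drafted token is accepted independently with probability `alpha`, and that drafting one token costs a fixed fraction `draft_cost` of a full forward pass. Both parameters and the function names are illustrative assumptions, not measurements from the post or from mlx-vlm.

```python
def expected_tokens_per_step(alpha: float, k: int) -> float:
    """Expected tokens emitted per verification step when k tokens are drafted.

    Assumes i.i.d. acceptance with probability alpha. The verifier always
    contributes one token (a correction, or a bonus token if all k are accepted).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    if alpha == 1.0:
        return float(k + 1)
    # Geometric series: 1 + alpha + alpha^2 + ... + alpha^k
    return (1.0 - alpha ** (k + 1)) / (1.0 - alpha)


def estimated_speedup(alpha: float, k: int, draft_cost: float) -> float:
    """Estimated speedup over plain decoding (one token per forward pass).

    draft_cost is a hypothetical parameter: the cost of drafting one token
    relative to one full forward pass of the main model.
    """
    return expected_tokens_per_step(alpha, k) / (1.0 + k * draft_cost)


if __name__ == "__main__":
    for alpha in (0.3, 0.6, 0.9):
        print(alpha, round(estimated_speedup(alpha, k=3, draft_cost=0.1), 2))
```

Under these assumptions, a low acceptance rate can push the estimate below 1.0: the drafting overhead is paid on every step, while few drafted tokens survive verification.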

Research Papers

🟢 GeoStack: A Framework for Quasi-Abelian Knowledge Composition in VLMs - score 20 Sources: huggingface

We address the challenge of knowledge composition in Vision-Language Models (VLMs), where accumulating expertise across multiple domains or tasks typically leads to catastrophic forgetting. We introduce GeoStack (Geometric Stacking), a modular framework that allows independently trained domain exper

Other Signals

🟢 Got MTP + TurboQuant running - Qwen3.6-27B -- 80+ t/s at 262K context on a single RTX 4090 - score 32 Sources: reddit/r/LocalLLaMA

So I've been messing around trying to get MTP working alongside TBQ4_0 (TurboQuant's lossless 4.25 bpv KV cache) on Qwen3.6-27B for my own use. So after a day of vibecoding I think I may have gotten something viable. Went from about 43 t/s when I first got it compiling to 80-87 t/s after optimizing

🟢 Can LLMs model real-world systems in TLA+? - score 8 Sources: hackernews

🟢 NeurIPS: Pushing anonymous repo after rebuttal [D] - score 6 Sources: reddit/r/MachineLearning

Hi everyone, I have a question about NeurIPS submission/review rules and anonymous code repositories. Suppose a paper was submitted before the deadline, and the anonymous code repo is linked as supplementary/reproducibility material. After the deadline, we notice that one label/name in the paper is

| Repo | Description | Stars Today | Language |
|---|---|---|---|
| bytedance/UI-TARS-desktop | The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra | 850 | typescript |
| datawhalechina/hello-agents | 📚 "Building Agents from Scratch": a from-scratch tutorial on agent principles and practice | 667 | python |
| earendil-works/pi | AI agent toolkit: coding agent CLI, unified LLM API, TUI & web UI libraries, Slack bot, vLLM pods | 638 | typescript |
| anomalyco/opencode | The open source coding agent. | 628 | typescript |
| rohitg00/agentmemory | #1 Persistent memory for AI coding agents based on real-world benchmarks | 400 | typescript |
| Fission-AI/OpenSpec | Spec-driven development (SDD) for AI coding assistants. | 316 | typescript |
| CopilotKit/CopilotKit | The Frontend Stack for Agents & Generative UI. React + Angular. Makers of the AG-UI Protocol | 215 | typescript |
| vercel-labs/skills | The open agent skills tool - npx skills | 208 | typescript |
| colbymchenry/codegraph | Pre-indexed code knowledge graph for Claude Code: fewer tokens, fewer tool calls, 100% local | 161 | typescript |
| ChromeDevTools/chrome-devtools-mcp | Chrome DevTools for coding agents | 145 | typescript |

📄 New Papers

| Title | Category | Hotness | Link |
|---|---|---|---|
| Audio-Visual Intelligence in Large Foundation Models | research_paper | 25 | Open |
| Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction | research_paper | 61 | Open |
| Partial Evidence Bench: Benchmarking Authorization-Limited Evidence in Agentic Systems | cs.AI | 0 | Open |
| Intelligent CCTV for Urban Design: AI-Based Analysis of Soft Infrastructure at Intersections | cs.AI | 0 | Open |
| When Helpfulness Becomes Sycophancy: Sycophancy is a Boundary Failure Between Social Alignment and Epistemic Integrity in Large Language Models | cs.AI | 0 | Open |
| PRISM: Perception Reasoning Interleaved for Sequential Decision Making | cs.AI | 0 | Open |
| LaTA: A Drop-in, FERPA-Compliant Local-LLM Autograder for Upper-Division STEM Coursework | cs.AI | 0 | Open |
| From History to State: Constant-Context Skill Learning for LLM Agents | cs.AI | 0 | Open |
| The Geopolitics of AI Safety: A Causal Analysis of Regional LLM Bias | cs.AI | 0 | Open |
| Authorization Propagation in Multi-Agent AI Systems: Identity Governance as Infrastructure | cs.AI | 0 | Open |
| Agentic Discovery of Exchange-Correlation Density Functionals | cs.AI | 0 | Open |
| Intentionality is a Design Decision: Measuring Functional Intentionality for Accountable AI Systems | cs.AI | 0 | Open |
| LANTERN: LLM-Augmented Neurosymbolic Transfer with Experience-Gated Reasoning Networks | cs.AI | 0 | Open |
| FoodCHA: Multi-Modal LLM Agent for Fine-Grained Food Analysis | cs.AI | 0 | Open |
| Housing Potential Common Data Model and City Digital Twin | cs.AI | 0 | Open |

🏢 Lab Blog Posts

🐦 Twitter/X Highlights

| Account | Tweet Summary |
|---|---|
| OpenAI | Chain of thought monitors are a key layer of defense against AI agent misalignment. To preserve monitorability, we avoid penalizing misaligned reasoning during RL. We found a limited amount of accidental CoT grading which affected released models, and are sharing our analysis. https://alignment.open |

Repeated From Recent Briefings