🔴 High Significance

Model Releases

🔴 Anyone else tired of the “Claude just dropped OPUS 4.8 we’re all cooked” posts on LinkedIn? - score 94 Sources: reddit/r/AIAgents

Every time a new Claude releases, the same cycle plays out. LinkedIn and X are flooded with "I just fired my appointment setter, my graphic designer, my copywriter..." followed by Comment AI and I'll send you the full prompt, and next thing you know you're getting pitched an AI course. The problem i

🔴 nvidia/Qwen3.6-35B-A3B-NVFP4 · Hugging Face - score 90 Sources: reddit/r/LocalLLaMA

The NVIDIA Qwen3.6-35B-A3B-NVFP4 model is the quantized version of Alibaba's Qwen3.6-35B-A3B model, which is an auto-regressive language model that uses an optimized transformer architecture. For more information, please check here. The NVIDIA Qwen3.6-3

Infrastructure & Compute

🔴 My home data center - score 70 Sources: reddit/r/LocalLLaMA

System 1: Threadripper 3960x 24c 4x 3090 ti 128gb ddr4 System 2: Xeon 8352 36c 4x 5070 ti 128gb ddr4 System 3: Intel 14700k 24c 64gb ddr5 5090 System 4: Ryzen 5950x 16c 64gb ddr4 2x 5070 ti The first system uses two PSUs to handle the almost 2000w full load of the 3090s. Was nervous about this but i

Other Signals

🔴 Someone out there likely needs this - score 97 Sources: reddit/r/LocalLLaMA

🔴 125 tok/s for Qwen3.6 q4xl on 2x 4060ti is insane perf/dollar - score 83 Sources: reddit/r/LocalLLaMA

Under $1000 for 32gb vram from 2023, and ~300 watts draw... and this thing is outperforming the latest pick-your-vendor $5k mini pcs from 2026. So.. next question is can I make it squeeze 150 t/s with the same q4xl on cuda 13.3 this weekend. Anyone try it yet? **Edit** llamacpp ini/flags: podma

🔴 Email is still the highest ROI channel for SaaS retention and the tooling hasn't caught up with that reality - score 83 Sources: reddit/r/AIAgents

I went down a rabbit hole on this recently so sharing what I found. Email consistently shows up at the top of ROI benchmarks for direct user communication. Studies across SaaS companies repeatedly show that behavior-triggered emails, specifically ones tied to actual in-app actions, outperform broadc

🔴 It's funny how everything changes, yet somehow stays the same. - score 77 Sources: reddit/r/LocalLLaMA

🟡 Notable

Model Releases

🟡 mudler/Qwen3.6-35B-A3B-Claude-4.7-Opus-Reasoning-Distilled-APEX-MTP-GGUF just released! - score 43 Sources: reddit/r/LocalLLaMA

Description of the module: I host 30+ free APEX MoE quantizations as independent research. My only local hardware is an NVIDIA DGX Spark (122 GB unified memory), enough for ~30-50B-class MoEs, but bigger ones (200B+) require rented compute on H100/H200/Blackwell, typically $20-100 per

Developer Tools

🟡 Been pen-testing AI agents for a while. Same flaws everywhere. Offering a few free audits - score 61 Sources: reddit/r/AIAgents

Spent the last several months breaking AI agents: the kind that actually do stuff, not chatbots that just talk. Customer support bots with refund powers. Internal agents hooked into Slack, Jira, prod databases. Coding assistants with write access. It's wild how similar the failures are. Different s

🟡 how do you catch agent regressions after a change? built a small tool for this - score 61 Sources: reddit/r/AIAgents

fix a failure, ship a new prompt or model, same failure quietly returns. been hitting this a lot. replayd is a small SDK that captures failed runs as tests and replays them before deploy. v0.1.0, early, but works end to end. not a big breakthrough, just a focused tool for one problem. putting it out

🟡 All DGX Station GB300 OEM systems side-by-side in one image (roughly actual size) - score 50 Sources: reddit/r/LocalLLaMA

(Except for HP which I had to guesstimate from some Chinese guy's pic at a showcase because the ZGX Fury AI Station G1N's official page is locked down. Same reason why I didn't include Nvidia's) Most underrated LLM system of 2026 and nothing comes close Assuming budget is infinitely d

🟡 Bayesian Opt. GPs vs Linear models and Neural Networks for parameter optimizations [R] - score 44 Sources: reddit/r/MachineLearning

Hi, Relatively new to deep learning. I wanted some opinions on which of these approaches might be best for time series data and spectral analysis. I currently use a GP and it works pretty well, but I’m wondering what the computational tradeoffs and so forth might be. Any ideas?

Infrastructure & Compute

🟡 Dell confirms XPS laptop with NVIDIA N1X at Computex (basically a DGX Spark GB10 for consumers with Windows) - score 57 Sources: reddit/r/LocalLLaMA

🟡 Why do the output layer weights become word vectors in Word2Vec? [D] - score 56 Sources: reddit/r/MachineLearning

I'm trying to understand the intuition behind Word2Vec training using a neural network. In Word2Vec (CBOW or Skip-gram), we often hear that the weight matrices learned during training contain the vector representations (embeddings) of words. However, I don't understand why the weights of the hidden-

Business & Funding

🟡 Cost Analysis of my $6.4k Local LLM Server - score 63 Sources: reddit/r/LocalLLaMA

I haven't seen any of these done, so I just wanted to share my experience in case it is useful for anyone. The purpose of this post is to show total cost of ownership of my local llm server versus API equivalent. Before you look at the final numbers, note that most people do not do proper financial

Other Signals

🟡 switched email tools three times in one year and i have thoughts - score 61 Sources: reddit/r/AIAgents

first one was too basic, couldn't segment customers properly second one was powerful but every time i wanted to do something slightly custom i needed to call support or hire someone third one looked great in the demo and then the onboarding was a nightmare i think the problem is most of these tools

🟢 Incremental

Model Releases

🟢 Built a minimalist coding agent optimized for memory footprint - score 28 Sources: reddit/r/AIAgents

Hi everybody, I spent the last two weeks building [zerostack](https://gi-dellav.github.io/zerostack/), a coding agent in Rust, focused on memory footprint. I managed to get it to run at ~16MB (with peaks of 24MB) of RAM usage, and no CPU usage when idle.

🟢 Best small model right now (~4B params) that is good with agentic tasks for personal assistant? - score 23 Sources: reddit/r/LocalLLaMA

Looking for suggestions. I have been experimenting with gemma-4-E2B and gemma-4-E4B but the tool calling has been not the best? My tasks are just things like: * Update calendar * Get my schedule * Send a WA message at 4PM etc. Any suggestions? If it helps, here are my server params: ` ./llama-server

🟢 Made a program using LocalLLM based on llama.cpp for fellow Book Lovers! - score 10 Sources: reddit/r/LocalLLaMA

TL;DR: I built an Ebook reader embedded with a compact translation model. Hi! I know this post has a promotional nature, but it contains a concept that I believe readers who love books will appreciate, so please take a look. While talking to an AI developer from an English-speaking country livin

🟢 Speed difference between Windows 11 and Linux with llama.cpp: a myth when using medium and large MoE models - score 7 Sources: reddit/r/LocalLLaMA

As the title says, there is no speed difference between Linux and Windows when using llama.cpp. I myself kept two operating systems on my computer for a long time because of this misconception. But when I got tired of constantly switching, I decided to check how much performance I’d lose if I moved

Developer Tools

🟢 An agentic bank - built, run, and overseen by agents. Banking other agents - score 28 Sources: reddit/r/AIAgents

A thought experiment. What if you don't transform anything, and start a bank with only agents. Not agentic transformation. A full universal bank... built, run, and overseen by agents. Banking other agents. Play around, see what the agents do: bank.unsol.dev Next up, probably the OCC charter and FINR

🟢 Frona v2026.5.5 – self-hosted personal AI assistant - score 28 Sources: reddit/r/AIAgents

Hey, Since LLM tool calling became a thing, the dominant pattern has been: ship an AI assistant that can execute code, browse the web, and hit your APIs, and figure out the security story later. Frona started as a pushback against that pattern. Frona is a personal AI assistant you self-host. You cre

🟢 Use any model and any provider with the official OpenAI Codex Desktop App, without modifying its code, and continue to use the official models in parallel? - score 17 Sources: reddit/r/LocalLLaMA

All in the title. The official OpenAI Codex Desktop App only accepts models that are from OpenAI and from a curated list. But there is a trick, you can make it think you are using the official OpenAI servers and models by just impersonating the model name and using another one. There is three things

🟢 Don’t bite me for that question please… - score 7 Sources: reddit/r/LocalLLaMA

And question is… How you earning money on your local llm setups? (Except coding ofc) I see people spending SO MUCH MONEY on the compute power to run llms locally and many of them saying that their setups already payed themselves or they earning much more (I guess they not mean that they saves the sa

Business & Funding

🟢 [PSA] 5060ti 16GB for $300.99. 5070ti 16GB for $699.99. Best Buy in store clearance. - score 37 Sources: reddit/r/LocalLLaMA

The 5060ti 16GB(SKU 6630626) has been on clearance for a couple of weeks in Best Buy stores for $419.99. A couple of days ago, it dropped to $300.99. The 5070ti 16GB(SKU 6620367) has been on clearance for $699.99. Not all stores will have these prices. Some still have the 5060ti for $419.99 still. T

🟢 Has Anyone Built an Automated Income Stream That Generates Revenue With Little to No Ongoing Work? - score 6 Sources: reddit/r/AIAgents

I’m curious to hear real-world examples from people who have built something that generates revenue mostly on autopilot. I’m not talking about investing in stocks or real estate. I mean businesses, websites, tools, digital products, AI workflows, bots, newsletters, content systems, affiliate sites,

Other Signals

🟢 Workshop submission for main conference paper under review [D] - score 19 Sources: reddit/r/MachineLearning

I have an ECCV paper main conf. Can I submit the same to a workshop at some other place happening before ECCV? The other workshop (non archival) will be after the final decisions of eccv come. Under any result in eccv- acceptance or rejection, how will this affect it? Im not the main author at eccv.

🟢 Query about non-archival workshop at CVPR-2026 [R] - score 19 Sources: reddit/r/MachineLearning

My paper was recently accepted to a workshop at CVPR-2026 as non-archival acceptance. Is it mandatory for me to register to the conference as I won't be able to attend(visa issues), but my friend will be there in the conference and can present on my behalf. I have few questions regarding my situatio

🟢 I built mlx-Chronos - a community benchmark leaderboard for local LLM engines on Apple Silicon (oMLX, Rapid-MLX, mlx-lm, Ollama) [P] - score 0 Sources: reddit/r/MachineLearning

Hey! I'm a CS student and I got tired of not being able to compare MLX inference engines properly: every benchmark out there is either made by the engine's own developers, runs on an M3 Ultra nobody has, or just shows tok/s with zero context. So I built mlx-Chronos, a small open source CLI tool th

Repeated From Recent Briefings