🔴 High Significance
Model Releases
🔴 Qwen Who? DiffusionGemma running at 1,500 tk/s on a Digital Pregnancy Test. - score 96
Sources: reddit/r/LocalLLaMA
First Doom, now DiffusionGemma 4. We are truly living in the future. Who even needs a new Qwen release anymore? /s (Satire - Shaq doesn't actually make a digital pregnancy test capable of running diffusion-based LLMs) Credit to Obvious Plant for the original Shaq pregnancy test box (that I doctored
🔴 Gemma 4 Quadruple Release, 12B, 12B QAT, 26B-A4B QAT and 31B QAT Uncensored Heretics! - score 89
Sources: reddit/r/LocalLLaMA
gemma-4-31B-it-qat-q4_0-unquantized-uncensored-heretic: Safetensors: https://huggingface.co/llmfan46/gemma-4-31B-it-qat-q4_0-unquantized-uncensored-heretic GGUF: [https://huggingface.co/llmfan46/gemma-4-3
🔴 Claude Fable is relentlessly proactive - score 70
Sources: hackernews
Developer Tools
🔴 We spent decades fixing software deployment. Why are we letting AI agents break it all over again? - score 94
Sources: reddit/r/AIAgents
I've been spending a lot of time setting up multi-agent workflows lately, and I can't shake the feeling that we are aggressively re-inventing a bunch of structural problems that software engineering spent thirty years solving. It kinda feels like business bros are creating a problem so that they ca
🔴 AI agent bankrupted their operator while trying to scan DN42 - score 90
Sources: hackernews
🔴 hexo-ai/sia: SIA is a Self Improving AI framework to autonomously improve the performance of any AI system (Model / Agent) on a benchmark task. - score 76
Sources: github_trending
SIA is a Self Improving AI framework to autonomously improve the performance of any AI system (Model / Agent) on a benchmark task.
🔴 Can you realistically start an automation business without a lot of money? - score 72
Sources: reddit/r/AIAgents
I've been thinking about getting into business automation, but most of the content I see makes it sound like you need a bunch of paid tools, subscriptions, software, ads, and a whole setup before you can even get started. For those of you who actually do automation for clients: Can someone start wit
Research Papers
🔴 HYDRA-X: Native Unified Multimodal Models with Holistic Visual Tokenizers - score 82
Sources: huggingface · arxiv/cs.AI
Holistic visual tokenizers are fundamental to unified multimodal models (UMMs) as they map diverse visual inputs into a unified representation space. In this paper, we present HYDRA-X, the first UMM that unifies image and video tokenization within a single Vision Transformer (ViT). Our design is dri
🔴 From 2D Grids to 1D Tokens: Reforming Shared Representations for Multimodal Image Fusion - score 75
Sources: huggingface
Multimodal image fusion aims to integrate complementary information from different modalities into a fused image that preserves rich local details while maintaining globally consistent appearance. Existing approaches build shared representations on 2D feature grids, which excel at modeling local str
Other Signals
🔴 What models you guys running on 8GB? 16GB VRAM? 24GB? 32GB? 48GB? - score 82
Sources: reddit/r/LocalLLaMA
And what are you using for kv cache and context? What kind of performance are you getting? What is your hardware? And what are you using your models for? I figure with how fast everything moves, it's worth asking once in a while to congeal our experiences.
🔴 New models released: Nex-N2 Pro 397B and Nex-N2 Mini 35B - score 75
Sources: reddit/r/LocalLLaMA
They are FTs of Qwen3.5 and the benchmarks look pretty good https://huggingface.co/nex-agi/Nex-N2-mini https://huggingface.co/nex-agi/Nex-N2-Pro
🟡 Notable
Model Releases
🟡 I distilled my 12 years of experience as a product manager and built a free skill that takes you from "I have an app idea" to a real plan and solid MVP - score 63
Sources: reddit/r/AIAgents
I'm a PM. 12 years, mostly zero-to-one. I built a free skill that does the part of app-building everyone skips and then regrets. It's called vibe-check. Open-source, drops into Claude, Codex, or Antigravity. It doesn't write your code. AI does that now. It does the harder thing that comes before the
🟡 EAGLE3 has landed in llama.cpp - score 61
Sources: reddit/r/LocalLLaMA
After half a year of development, EAGLE3 has been merged into llama.cpp. EAGLE3 is similar to MTP, but different: the helper model gets extra guidance from the main model instead of guessing completely on its own.
🟡 Is Gemini your main AI model today, or just a secondary option? - score 61
Sources: reddit/r/AIAgents
I recently had a discussion with a friend who strongly prefers Gemini and Google products in general. His argument is that Google has access to massive amounts of data and arguably the best search engine in the world, so Gemini should have a significant advantage. My opinion and experience has been
🟡 PSA: Test your "threads" argument in llama.cpp (+80% performance in my case) - score 54
Sources: reddit/r/LocalLLaMA
When GPT-OSS 120B was released last year I played around and tried to maximize its performance. One thing that many people pointed out was that for hybrid CPUs (Performance + Efficiency cores) you should use only P-cores with the "--threads" argument and taskset/affinity. Back then I set up that model
🟡 Anthropic apologizes for invisible Claude Fable guardrails - score 50
Sources: hackernews
Omitted 5 additional model releases items from the main section; see raw data and source-specific sections below.
Developer Tools
🟡 I put a hidden instruction in a document. My AI agent followed it. Here's the repo. - score 50
Sources: reddit/r/AIAgents
Cloned a repo, ran an agent against a "research report," watched it comply with instructions embedded in the document instead of summarizing it. The attack is in the repo. Run it yourself. Then run the protected version with Arc Gate and watch it get blocked. https://github.com/9hannahnine-jpg/vulne
🟡 @OpenAI: We heard you wanted to use Codex rate limit resets on your own time. Starting today, we're rolling out the ability to save rate limit resets to use later. We're starting Go, Plus, Pro, and Business - score 50
Sources: twitter_rss
We heard you wanted to use Codex rate limit resets on your own time. Starting today, we're rolling out the ability to save rate limit resets to use later. We're starting Go, Plus, Pro, and Business users with one free reset:
🟡 @xai: Install the @sentry plugin and ask your agent to find and fix errors, analyze stack traces, and triage alerts - score 50
Sources: twitter_rss
Install the @sentry plugin and ask your agent to find and fix errors, analyze stack traces, and triage alerts
🟡 @xai: Use the @vercel plugin to deploy to production, spin up sandboxes, or build apps with Shadcn. - score 50
Sources: twitter_rss
Use the @vercel plugin to deploy to production, spin up sandboxes, or build apps with Shadcn.
Research Papers
🟡 Where, What, Why, and Importance: Structured Defect Grounding for Text-to-Image Feedback - score 65
Sources: huggingface
Despite generating increasingly photorealistic images, text-to-image (T2I) models still exhibit localized, subtle, and structurally complex failures. Diagnosing these failures requires instance-level feedback that answers where a defect occurs, what type it is, why it is defective, and its importanc
🟡 ArogyaSutra: A Multi-Agent Framework for Multimodal Medical Reasoning in Indic Languages - score 45
Sources: huggingface · arxiv/cs.AI
Multimodal Large Language Models (MLLMs) have shown promising reasoning capabilities in general domains, yet their performance remains limited in specialized settings such as healthcare, especially in multilingual and low-resource scenarios. This gap is critical in regions like rural India, where pa
Other Signals
🟡 Is Symbolic Regression still a thing, given LLMs' performance? [D] - score 69
Sources: reddit/r/MachineLearning
I've been teaching myself about Symbolic Regression (SR), which looks like a super exciting field. (A great intro resource below [1]). But then I was wondering: given LLMs' growing power in generating code, which is in a way very similar to Symbolic Regression (or of course, even dire
🟡 Having some fun with LMX-Omni-52B-Halo in Open WebUI - score 68
Sources: reddit/r/LocalLLaMA
🟡 @GoogleDeepMind: Pinned: We're teaming up with @Palmeiras, the first football club to meaningfully build upon TacticAI: our AI system that can help simulate field scenarios and predict open play dynamics up to 8 seconds in - score 50
Sources: twitter_rss
Pinned: We're teaming up with @Palmeiras, the first football club to meaningfully build upon TacticAI: our AI system that can help simulate field scenarios and predict open play dynamics up to 8 seconds in advance. ⚽
🟡 Huawei Released openPangu 2.0 (Will open source on June 30) - score 46
Sources: reddit/r/LocalLLaMA
At the Huawei Developer Conference (HDC 2026) held on June 12, Richard Yu, Executive Director of Huawei, officially launched the brand-new, open-source Pangu large model, openPangu 2.0. The model is fully adapted to the HarmonyOS ecosystem and has achieved deep optimization and performance breakthrou
🟡 Post-docs in ML [D] - score 44
Sources: reddit/r/MachineLearning
Are there any websites listing post-doc job openings in machine learning? Currently I'm using LinkedIn to search for these. When I was a math post-doc, everyone used "MathJobs.org" to find jobs. Is there a similar website for machine learning? Thanks.
🟢 Incremental
Model Releases
🟢 Are AI agents making traditional software interfaces obsolete? - score 33
Sources: reddit/r/AIAgents
I was reading an enterprise tech trend report for 2026 and it got me thinking about how quickly the traditional SaaS GUI (graphical user interface) is losing its utility. For the last fifteen years, software design has been about building pretty, siloed dashboards. We've built our entire workflows a
🟢 Claude Fable 5: mid-tier results on coding tasks - score 30
Sources: hackernews
🟢 PP-OCRv6 is officially released! - score 25
Sources: reddit/r/LocalLLaMA
🔥 PaddleOCR's new OCR model series scales from 1.5M to 34.5M parameters, bringing stronger accuracy, faster inference, and broader deployment options, from browsers and edge devices to servers. What's new: 🔸 Tiny / Small / Medium models: 1.5M, 7.7M, 34.5M params 🔸 +4.9% detection accuracy and +5.1% r
🟢 Has anyone noticed that the behavior of the Kimi model has changed? - score 11
Sources: reddit/r/LocalLLaMA
I have been using Kimi K2.6 in Kimi Code for a while. Although it can complete most tasks, it often requires a long time to think and try. Today the model's CoT has become very short and concise, and it feels much improved on coding tasks compared to before. I heard that GLM 5.2 is also about to be r
Developer Tools
🟢 mlflow/mlflow: The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data. - score 31
Sources: github_trending
The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data.
🟢 always-further/nono: Capability-based agent runtime with fine-grained policies. Brokering access directly within the agent's operating context, with zero setup and zero latency - score 18
Sources: github_trending
Capability-based agent runtime with fine-grained policies. Brokering access directly within the agent's operating context, with zero setup and zero latency
🟢 How would you start selling automations? Where would you even begin? - score 17
Sources: reddit/r/AIAgents
I'm getting into building automations for businesses, but I'm a bit stuck on the first step. Like, I can imagine building solutions for repetitive work, internal processes, data entry, reporting, customer stuff, etc., but I don't really know how people actually start selling this. So I'm curious: If
🟢 anthropics/claude-agent-sdk-python - score 13
Sources: github_trending
🟢 Building an Open Source Edge Semantic Cache for LLMs in Rust/WASM: Sanity check on the architecture? [D] - score 12
Sources: reddit/r/MachineLearning
Hey everyone, I am planning out a new open-source infrastructure project and want to get some brutal feedback on the architecture and use-case validity from people running high volume LLM workloads in production. The Problem: Python-based proxies/gateways introduce too much latency overhead for
Omitted 1 additional developer tools items from the main section; see raw data and source-specific sections below.
Research Papers
🟢 Revisiting Articulated Parts Perception in Robot Manipulation - score 20
Sources: huggingface
We are surrounded by various objects with movable, articulated parts, e.g., box, handle, door. An accurate and generalizable perception of articulated parts is essential to enhance robotic manipulation capabilities. Building on this need, recent efforts in articulated parts perception have followed
🟢 Leveraging Morphology for Historical Script Metrological Analysis - score 20
Sources: huggingface
Advances in handwritten text recognition have enabled large-scale transcription of historical documents, but still provide limited access to interpretable visual measurements for paleography, the study of historical scripts. In this paper, our main insight is that morphological script analysis, in p
Other Signals
🟢 Best LLM for smut stories - score 39
Sources: reddit/r/LocalLLaMA
I'm trying to find the best LLM for writing erotica/smut, but there doesn't seem to be that many good models right now. I'm using Cydonia 24B v4.3, which gives great results, but I was wondering if there were even better models that could fit into 16GB VRAM with quantization. Sadly there doesn't see
🟢 Spent $3 running 4x4090 benchmarks for llama 3 70b (exl2 vs gguf). exl2 generation speed is kind of ridiculous. - score 33
Sources: reddit/r/AIAgents
Hey guys, so I wanted to run some heavy benchmarks comparing GGUF and EXL2 for Llama-3-70B on a 4x4090 setup. Single-card data is everywhere, but 4-way tensor parallel stats are hard to find. The problem is I don't own a 4x4090 rig and normally renting one would immediately eat into my monthly budget
🟢 LLM context compression at 16x beats KV cache - score 32
Sources: reddit/r/LocalLLaMA
🟢 MICCAI 2026 Results [D] - score 31
Sources: reddit/r/MachineLearning
Results are almost here. Good luck to everyone waiting for the final decision.
🟢 Why hasn't any mainstream game integrated LLMs into NPCs yet? - score 18
Sources: reddit/r/LocalLLaMA
Tech demos exist but nothing's actually shipped in a real game. Is it a latency problem or are game studios just not interested?
Omitted 3 additional other signals items from the main section; see raw data and source-specific sections below.
Trending Repos
| Repo | Description | Stars Today | Language |
|---|---|---|---|
| hexo-ai/sia | SIA is a Self Improving AI framework to autonomously improve the performance of any AI system (Model / Agent) on a benchmark task. | 199 | python |
| mlflow/mlflow | The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data. | 24 | python |
| always-further/nono | Capability-based agent runtime with fine-grained policies. Brokering access directly within the agent's operating context, with zero setup and zero latency | 12 | rust |
| anthropics/claude-agent-sdk-python | | 10 | python |
New Papers
| Title | Category | Hotness | Link |
|---|---|---|---|
| HYDRA-X: Native Unified Multimodal Models with Holistic Visual Tokenizers | research_paper | 20 | Open |
| From 2D Grids to 1D Tokens: Reforming Shared Representations for Multimodal Image Fusion | research_paper | 10 | Open |
| Where, What, Why, and Importance: Structured Defect Grounding for Text-to-Image Feedback | research_paper | 7 | Open |
| ToolSense: A Diagnostic Framework for Auditing Parametric Tool Knowledge in LLMs | cs.AI | 0 | Open |
| Arbor: Tree Search as a Cognition Layer for Autonomous Agents | cs.AI | 0 | Open |
| Strategic Decision Support for AI Agents | cs.AI | 0 | Open |
| Pythagoras-Prover: Advancing Efficient Formal Proving via Augmented Lean Formalisation | cs.AI | 0 | Open |
| PersonaDrive: Human-Style Retrieval-Augmented VLA Agents for Closed-Loop Driving Simulation | cs.AI | 0 | Open |
| "Did you lie?" Evaluating Lie Detectors across Model Scale and Belief-Verified Model Organisms | cs.AI | 0 | Open |
| TrajGenAgent: A Hierarchical LLM Agent for Human Mobility Trajectory Generation | cs.AI | 0 | Open |
| Evoflux: Inference-Time Evolution of Executable Tool Workflows for Compact Agents | cs.AI | 0 | Open |
| From AGI to ASI | cs.AI | 0 | Open |
| Deployment-Centered Evaluation: Predicting Query-Level Rejection Risk in a Clinical LLM System | cs.AI | 0 | Open |
| Definitional alignment before capability alignment: a Design-Science framework for adjudicating claims about AGI | cs.AI | 0 | Open |
| The Theory of Mind Utility: Formal Specification of a Mentalizing Mechanism | cs.AI | 0 | Open |
Lab Blog Posts
🐦 Twitter/X Highlights
| Account | Tweet Summary |
|---|---|
| AnthropicAI | We're launching Claude Corps, a national fellowship program matching people early in their careers with US nonprofits. We'll teach 1,000 people to use Claude, and pay them to use AI to advance their hosts' missions. https://www.anthropic.com/claude-corps |
| OpenAI | We heard you wanted to use Codex rate limit resets on your own time. Starting today, we're rolling out the ability to save rate limit resets to use later. We're starting Go, Plus, Pro, and Business users with one free reset: |
| GoogleDeepMind | Pinned: We're teaming up with @Palmeiras, the first football club to meaningfully build upon TacticAI: our AI system that can help simulate field scenarios and predict open play dynamics up to 8 seconds in advance. ⚽ |
| GoogleDeepMind | When millions of AI agents interact with each other, new collective behaviors can emerge. Together with @schmidtsciences, @coop_ai, @ARIA_research and supported by @GoogleOrg, we're launching a $10M research fund to help understand how AI systems behave as a group. https://goo.gle/3Si6rCl |
| xai | Install the @sentry plugin and ask your agent to find and fix errors, analyze stack traces, and triage alerts |
| xai | Use the @vercel plugin to deploy to production, spin up sandboxes, or build apps with Shadcn. |
| simonw | After two days with Claude Fable 5 the best way I can describe it is "relentlessly proactive" - here's an example where I dropped in a screenshot of a bug and it spun up custom CORS Python servers and used pyobjc-framework-Quartz to capture screenshots https://simonwillison.net/2026/Jun/11/fable-is- |
| simonw | New Datasette release: 1.0a33, which finally documents the ?_extra= JSON API mechanism and brings it to the row and query pages in addition to the table pages (Most of the code in this release was built with the help of Claude Fable 5) https://datasette.io/blog/2026/api-extras/ |
Repeated From Recent Briefings
- mvanhorn/last30days-skill: AI agent skill that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web - then synthesizes a grounded summary - first seen 2026-06-05
- Anthropic's new model Fable will silently handicap work on LLMs [D] - first seen 2026-06-11
- anomalyco/opencode: The open source coding agent. - first seen 2026-05-09
- FareedKhan-dev/train-llm-from-scratch: A straightforward method for training your LLM, from downloading data to generating text. - first seen 2026-06-11
- maziyarpanahi/openmed: open-source healthcare ai - first seen 2026-06-10
- activeloopai/hivemind: One brain for all your agents - first seen 2026-06-11
- VIA-SD: Verification via Intra-Model Routing for Speculative Decoding - first seen 2026-06-11
- NVIDIA/SkillSpector: Security scanner for AI agent skills. Detect vulnerabilities, malicious patterns, and security risks. - first seen 2026-06-10
- yikart/AiToEarn: Let's use AI to Earn! - first seen 2026-05-11
- karpathy/autoresearch: AI agents running research on single-GPU nanochat training automatically - first seen 2026-05-21
- ... plus 419 more repeated items in processed data