🔴 High Significance
Model Releases
🔴 The captcha arms race is making autonomous web tasks practically impossible - score 94
Sources: reddit/r/AIAgents
I'm working on a fairly standard procurement agent for a client right now. The stack is just Python, Playwright, and GPT-4o for vision and DOM parsing. The reasoning logic works perfectly on my local machine, but the second I deploy it to a cloud server, it just gets absolutely obliterated by bot pro
🔴 I released Inflect-Nano, an ultra-extreme tiny 4.63m parameter TTS model. - score 89
Sources: reddit/r/LocalLLaMA
I've been experimenting with how small a usable neural TTS model can realistically get, and I just released Inflect-Nano-v1. Inflect-Nano is one of the smallest TTS models, and it performs surprisingly well for its model weight. Even if you have a certified potato computer, it can run on that. I
🔴 Your most useful AI so far? (can be a tool or an agent) - score 83
Sources: reddit/r/AIAgents
ngl AI has really helped me boost my productivity and the more I go deeper with it the more it gets more interesting! I started discovering AI agents wherein you combine all the tools through LLMs, API and such in the company I am working with right now and so far I wish I learned it for a long time
🔴 A robot is sprinting towards you. Do you want it running on Claude or Grok? - score 83
Sources: hackernews
🔴 We need a 80-160B model urgently. The unified memory device market needs more Models. - score 82
Sources: reddit/r/LocalLLaMA
Hello guys, I will keep myself short. There are so many people that have a lot but not enough of "slow" RAM: anybody with an Apple device with >96GB; anybody with a Ryzen AI 395 device with >96GB; anybody with a DGX Spark; even people with RTX 6000 Pros or 4x3090s or other configurations. Or P
Developer Tools
🔴 GLM-5.2 is a win for local AI - score 96
Sources: reddit/r/LocalLLaMA
I know GLM 5.2's massive 753B footprint means none of us are running it at home without an enterprise cluster, but having a true frontier-level, MIT-licensed coding agent out in the wild makes me optimistic. The distillation potential here is massive. Once the community starts fine-tuning smaller 8B
🔴 Multivariate Probability Models in Machine Learning [D] - score 81
Sources: reddit/r/MachineLearning
Hello folks, we start our discussion on Lecture 10 of Probabilistic Machine Learning, now starting with multivariate probability models. Univariate models are toy cases; in real life, ML models are multivariate. To understand the dependence of more than one variable on each other, we study ideas such as Cova
🔴 Full Hermes setup guide with Lm studio - score 72
Sources: reddit/r/AIAgents
Research Papers
🔴 SAE Interventions are Unreliable: Post-Intervention Recovery of Suppressed Behavior - score 70
Sources: huggingface · arxiv/cs.LG
Sparse Autoencoders (SAEs) decompose residual-stream activations into interpretable features. Recent latent-space defenses increasingly rely on these decompositions, assuming that identified "unsafe" SAE features serve as actionable handles for monitoring and intervention. In this paradigm, clamping
Other Signals
🔴 PSA: unsloth/GLM-5.2-GGUF is uploading - score 75
Sources: reddit/r/LocalLLaMA
Went to check Unsloth's HF to see if they uploaded GLM-5.2 GGUFs, and found the repo was created half an hour ago. It only has the readme for now. I suspect GGUFs are uploading
🟡 Notable
Model Releases
🟡 @OpenAI: Introducing LifeSciBench, a benchmark for measuring and improving how well AI supports real-world life science research. Developed with 173 scientists from biotechnology and pharmaceutical research, - score 60
Sources: twitter_rss
Introducing LifeSciBench, a benchmark for measuring and improving how well AI supports real-world life science research. Developed with 173 scientists from biotechnology and pharmaceutical research, LifeSciBench includes 750 expert-authored tasks across seven biological research workflows. https://o
🟡 llama.cpp now supports model management (downloading etc) via API - score 54
Sources: reddit/r/LocalLLaMA
#23976 got merged a couple hours ago, which means llama.cpp can now not only load/unload models on demand from a directory, but also download them on demand. No UI yet, but that's coming pretty soon. This means you can now deploy llama.cpp, expose
🟡 Local Qwen isn't a worse Opus, it's a different tool - score 50
Sources: hackernews
🟡 Lin Junyang AI Lab Closes Round at $2B Valuation - score 46
Sources: reddit/r/LocalLLaMA
A new lab from Lin Junyang can only be good news for open source / weights, I think. Excited to see what the lead responsible for the Qwen line does next.
Developer Tools
🟡 ACL 2026 first author with weak GPA. How should I approach PhD applications? [D] - score 69
Sources: reddit/r/MachineLearning
Hi everyone, I have a fairly weak undergraduate: a 3.3/5 GPA in Computer Engineering from an average Nigerian university. For my Master's, I studied Artificial Intelligence at an average European university, where I finished with an 8/10 GPA. A condensed version of my Master's thesis was recently ac
🟡 bytedance/UI-TARS-desktop - The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra - score 68
Sources: github_trending
🟡 infiniflow/ragflow - RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs - score 61
Sources: github_trending
🟡 calesthio/OpenMontage - World's first open-source, agentic video production system. 12 pipelines, 52 tools, 500+ agent skills. Turn your AI coding assistant into a full video production studio. - score 57
Sources: github_trending
🟡 roboflow/rf-detr - RF-DETR is a real-time object detection and segmentation model architecture developed by Roboflow, SOTA on COCO, designed for fine-tuning. [ICLR 2026] - score 51
Sources: github_trending
Omitted 3 additional developer tools items from the main section; see raw data and source-specific sections below.
Infrastructure & Compute
🟡 Is foundational AI research still something that can be done without access to HPC? [D] - score 50
Sources: reddit/r/MachineLearning
I'm not that well versed in ML yet. I know that "Attention is all you need" was based on work that was done with a couple of high end gaming GPUs at the time. I can afford that. Suppose for argument's sake that I have caught up on ML such that I have the competence to recreate state of the art result
Business & Funding
🟡 Leaked financial docs show OpenAI is losing billions of dollars a year - score 68
Sources: reddit/r/LocalLLaMA
Research Papers
🟡 Native Active Perception as Reasoning for Omni-Modal Understanding - score 62
Sources: huggingface · arxiv/cs.CL
Passive models for long video understanding typically rely on a "watch-it-all" paradigm, processing frames uniformly regardless of query difficulty, causing computational cost to grow with video duration. Although interactive frameworks have emerged, they often rely on global pre-scanning, and their
🟡 Sumi: Open Uniform Diffusion Language Model from Scratch - score 58
Sources: huggingface · arxiv/cs.CL
Diffusion models have become a promising alternative to autoregressive models. Among these, uniform diffusion language models (UDLMs) permit any token to be updated at any step, in principle enabling more flexible generation. However, no UDLM has yet been pretrained from scratch at both large parame
🟡 Learning User Simulators with Turing Rewards - score 45
Sources: huggingface · arxiv/cs.CL
Learning to simulate human users in interactive settings could advance the training of agent assistants, evaluation of personalization systems, research in the social sciences, and more. Existing approaches generally do so by training a large language model (LLM) to match a single ground truth respo
Other Signals
🟡 llama.cpp - how to free up even more space on your GPU - score 61
Sources: reddit/r/LocalLLaMA
For the past week or two, llama.cpp has been working much better from the RAM usage perspective. I no longer see any memory leaks, and everything fits nicely on the GPU - my defaults are --n-gpu-layers 99 --no-mmap --mlock to avoid using the regular RAM, since I use my 3090 with an eGPU setup: Q
🟡 @xai: Grok is now available on Amazon Bedrock. AWS developers can now build with Grok 4.3, the industry leader in hallucination rate and tool calling, powered by Bedrock's secure inference engine. - score 60
Sources: twitter_rss
🟡 @GoogleDeepMind: We're working with @SciTechgovuk, @mhclg and @i_dot_ai on a new AI housing application planning prototype. By cutting down the time spent on repetitive tasks, it could help planning officers focus - score 50
Sources: twitter_rss
We're working with @SciTechgovuk, @mhclg and @i_dot_ai on a new AI housing application planning prototype. By cutting down the time spent on repetitive tasks, it could help planning officers focus their attention on complex projects and reduce processing times by up to 50%. https://goo.gle/4xzq
🟢 Incremental
Model Releases
🟢 Quick thoughts on GLM-5.2 (Bonus: Censorship question answers) - score 11
Sources: reddit/r/LocalLLaMA
I've been working with GLM-5.2 pretty much non-stop since it was released as an API. So yeah, take it with a grain of salt as API inference is not perfectly controllable. I'm calling it through Z.ai - so I'd like to think that it's a high quality iteration of the model, but I can't k
Developer Tools
🟢 promptfoo/promptfoo - Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, DeepSeek, and more. Simple declarative configs with command line and CI/CD integration. Used by OpenAI and Anthropic. - score 36
Sources: github_trending
🟢 microsoft/RD-Agent - Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are committed to automating these high-value generic R&D processes through R&D-Agent, which lets AI drive data-driven AI. https://aka.ms/RD-Agent-Tech-Report - score 31
Sources: github_trending
Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are committed to automating these high-value generic R&D processes through R&D-Agent, which lets AI drive data-d
🟢 Any idea if AAAI will be harsh on computer vision paper as last year? [R] - score 19
Sources: reddit/r/MachineLearning
Hello everyone, I have a computer vision paper ready for submission, and a coauthor has suggested submitting it to AAAI. However, last year computer vision papers got a very small acceptance rate at AAAI, with reviewers receiving emails to specifically tell them that the acceptance rate for comp
🟢 Building an agent that needs live web data, the fetch part keeps killing it - score 17
Sources: reddit/r/AIAgents
Building an agent that needs live web data, the fetch part keeps killing it. Put together an agent that pulls live info off the web to answer queries. The agent logic was honestly the easy part. The data fetching keeps breaking. Soon as it needs to scrape google search results or hit anything with a
🟢 I think most AI voice agent demos hide the hardest part: the listening layer - score 17
Sources: reddit/r/AIAgents
Everyone shows the cool part of AI voice agents: realistic voice, smart answer, booking appointment, CRM update, call summary, "wow it sounds human". But the boring listening layer is where a lot of these products break. If the speech-to-text is bad, the agent becomes dumb before the LLM even
Omitted 1 additional developer tools item from the main section; see raw data and source-specific sections below.
Infrastructure & Compute
🟢 alexzhang13/rlm - General plug-and-play inference library for Recursive Language Models (RLMs), supporting various sandboxes. - score 38
Sources: github_trending
🟢 openobserve/openobserve - Open source observability platform for logs, metrics, traces, frontend monitoring, pipelines and LLM observability. A sophisticated, simple and highly performant alternative to Datadog, Splunk, and Elasticsearch with 140x lower storage costs and single binary deployment. - score 28
Sources: github_trending
🟢 How do you analyze the relative "strength" of probes? [R] - score 6
Sources: reddit/r/MachineLearning
This question is related to topics like language+ models (including multimodal) and things like "circuit" analyses. I think something related might come up in my work (factuality guarantees for model outputs) and I'm trying to orient to the SoTA. I found [this old post](https://www.neelnanda.io/mech
Research Papers
🟢 IndustryBench-MIPU: Benchmarking Multi-Image Attribute Value Extraction for Industrial Products - score 20
Sources: huggingface
Industrial products such as valves and circuit breakers are defined by dense technical specifications that govern procurement, compatibility, and safety across supply chains. These specifications are scattered across multiple heterogeneous product images, including specification tables, nameplates,
Other Signals
🟢 CEOs of Anthropic and Google DeepMind call for U.S.-led AI coalition in meeting at G7 - score 32
Sources: reddit/r/LocalLLaMA
https://www.cnbc.com/2026/06/17/anthropic-amodei-google-hassabis-us-ai-coalition-g7.html https://www.politico.eu/article/ai-artificial-intelligence-anthropic-china-g7/
🟢 What does provisional paper acceptance mean in ECCV? Is that the default message everyone gets? [D] - score 31
Sources: reddit/r/MachineLearning
🟢 price rising effect is wild.. - score 18
Sources: reddit/r/LocalLLaMA
https://preview.redd.it/6f2gghbgqy7h1.png?width=2980&format=png&auto=webp&s=eacde26d8d0154aabc9884d4c31607aa12ace68c Q.01 out soon? i dont really need precision after all..
🟢 I Emailed 12,000 Businesses About Their Websites. Here's What Happened. - score 17
Sources: reddit/r/AIAgents
A few weeks ago I analyzed around 12,000 business websites and emailed each business explaining the issues I found on their website and why those issues could be hurting their business. The interested reply rate was bouncing between 5% and 9%. I've been having a lot of fun lately automating a proces
🟢 I open sourced a vendor-neutral authorization for AI agents. - score 14
Sources: reddit/r/AIAgents
TL;DR: I open-sourced apparitor. It checks every agent action (an MCP tool call, an agent-to-agent invoke, a tool call inside a guardrail) against a policy engine before it runs and blocks the ones that aren't allowed. Fail-closed, works with OPA/Cedar/OpenFGA, Apache-2.0. AI agents act through tool
Omitted 1 additional other signals item from the main section; see raw data and source-specific sections below.
Trending Repos
| Repo | Description | Stars Today | Language |
|---|---|---|---|
| bytedance/UI-TARS-desktop | The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra | 150 | typescript |
| infiniflow/ragflow | RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs | 105 | python |
| calesthio/OpenMontage | World's first open-source, agentic video production system. 12 pipelines, 52 tools, 500+ agent skills. Turn your AI coding assistant into a full video production studio. | 98 | python |
| roboflow/rf-detr | RF-DETR is a real-time object detection and segmentation model architecture developed by Roboflow, SOTA on COCO, designed for fine-tuning. [ICLR 2026] | 80 | python |
| continuedev/continue | open-source coding agent | 49 | typescript |
| alexzhang13/rlm | General plug-and-play inference library for Recursive Language Models (RLMs), supporting various sandboxes. | 43 | python |
| promptfoo/promptfoo | Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, DeepSeek, and more. Simple declarative configs with command line and CI/CD integration. Used by OpenAI and Anthropic. | 41 | typescript |
| microsoft/RD-Agent | Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are committed to automating these high-value generic R&D processes through R&D-Agent, which lets AI drive data-driven AI. https://aka.ms/RD-Agent-Tech-Report | 32 | python |
| openobserve/openobserve | Open source observability platform for logs, metrics, traces, frontend monitoring, pipelines and LLM observability. A sophisticated, simple and highly performant alternative to Datadog, Splunk, and Elasticsearch with 140x lower storage costs and single binary deployment. | 26 | typescript |
| Lampese/codex-switcher | A Desktop Application for Managing Multiple OpenAI Codex CLI Accounts | 8 | rust |
New Papers
| Title | Category | Hotness | Link |
|---|---|---|---|
| SAE Interventions are Unreliable: Post-Intervention Recovery of Suppressed Behavior | research_paper | 13 | Open |
| Native Active Perception as Reasoning for Omni-Modal Understanding | research_paper | 10 | Open |
| Sumi: Open Uniform Diffusion Language Model from Scratch | research_paper | 6 | Open |
| Continuous Audio Thinking for Large Audio Language Models | cs.CL | 0 | Open |
| Redact or Keep? A Fully Local AI Cascade for Educational Dialogue De-Identification | cs.CL | 0 | Open |
| SproutRAG: Attention-Guided Tree Search with Progressive Embeddings for Long-Document RAG | cs.CL | 0 | Open |
| Want Better Synthetic Data? Steer It: Activation Steering for Low-Resource Language Generation | cs.CL | 0 | Open |
| JetFlow: Breaking the Scaling Ceiling of Speculative Decoding with Parallel Tree Drafting | cs.CL | 0 | Open |
| CoreMem: Riemannian Retrieval and Fisher-Guided Distillation for Long-Term Memory in Dialogue Agents | cs.CL | 0 | Open |
| VISUALSKILL: Multimodal Skills for Computer-Use Agents | cs.CL | 0 | Open |
| LLM Parameters for Math Across Languages: Shared or Separate? | cs.CL | 0 | Open |
| Montreal Forced Aligner and the state of speech-to-text alignment in 2026 | cs.CL | 0 | Open |
| Possible or Definite? A Benchmark for Evaluating Diagnostic Uncertainty Preservation in Clinical Text | cs.CL | 0 | Open |
| PreUnlearn: Auditing Collateral Knowledge Damage Before Large Language Model Unlearning | cs.CL | 0 | Open |
| Towards Scalable Customization and Deployment of Multi-Agent Systems for Enterprise Applications | cs.CL | 0 | Open |
🐦 Twitter/X Highlights
| Account | Tweet Summary |
|---|---|
| OpenAI | Introducing LifeSciBench, a benchmark for measuring and improving how well AI supports real-world life science research. Developed with 173 scientists from biotechnology and pharmaceutical research, LifeSciBench includes 750 expert-authored tasks across seven biological research workflows. https://o |
| xai | Grok is now available on Amazon Bedrock. AWS developers can now build with Grok 4.3, the industry leader in hallucination rate and tool calling, powered by Bedrock's secure inference engine. |
| GoogleDeepMind | We're working with @SciTechgovuk, @mhclg and @i_dot_ai on a new AI housing application planning prototype. By cutting down the time spent on repetitive tasks, it could help planning officers focus their attention on complex projects and reduce processing times by up to 50%. https://goo.gle/4xzq |
Repeated From Recent Briefings
- Panniantong/Agent-Reach - Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu - one CLI, zero API fees. - first seen 2026-06-06
- farion1231/cc-switch - A cross-platform desktop All-in-One assistant for Claude Code, Codex, OpenCode, OpenClaw, Gemini CLI & Hermes Agent. Only official website: ccswitch.io - first seen 2026-05-08
- Next-Latent Prediction Transformers [R] - first seen 2026-06-17
- google-research/timesfm - TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting. - first seen 2026-05-02
- anthropics/skills - Public repository for Agent Skills - first seen 2026-05-11
- rohitg00/ai-engineering-from-scratch - Learn it. Build it. Ship it for others. - first seen 2026-05-21
- Kairos: A Native World Model Stack for Physical AI - first seen 2026-06-16
- earendil-works/pi - AI agent toolkit: unified LLM API, agent loop, TUI, coding agent CLI - first seen 2026-05-09
- PaddlePaddle/PaddleOCR - Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages. - first seen 2026-05-09
- openai/codex - Lightweight coding agent that runs in your terminal - first seen 2026-05-10
- ... plus 405 more repeated items in processed data