🔴 High Significance
Model Releases
🔴 Stop wasting electricity - score 88
Sources: reddit/r/LocalLLaMA
Run on my RTX 4090 with llama.cpp, params: llama-server -m ~/Projects/llm/models/Qwen3.6-27B-UD-Q4_K_XL.gguf --flash-attn on -ngl all -ctk q4_0 -ctv q4_0 -t 32 -c 262144 (power limit set with sudo nvidia-smi -pl N). In my observation, the GPU constantly hits the power limit, so it's safe to say that it actual
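One way to compare power limits like these is tokens per joule. The sketch below is a minimal illustration, assuming the GPU draws roughly its configured limit (the post reports it constantly hitting the limit). The function names and the sample numbers are hypothetical placeholders, not measurements from the post.

```python
from typing import Dict


def tokens_per_joule(tokens_per_second: float, power_watts: float) -> float:
    """Energy efficiency: tokens generated per joule (1 W = 1 J/s)."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return tokens_per_second / power_watts


def most_efficient(runs: Dict[int, float]) -> int:
    """Return the power limit (watts) with the highest tokens per joule.

    `runs` maps a power limit in watts to the tokens/s measured at that limit.
    """
    return max(runs, key=lambda watts: tokens_per_joule(runs[watts], watts))


# Placeholder numbers for illustration only, NOT results from the post.
example_runs = {450: 40.0, 300: 36.0, 200: 28.0}
best_limit = most_efficient(example_runs)
```

With real measurements, the limit that maximizes tokens per joule is not necessarily the one that maximizes raw throughput, so both numbers are worth recording for each run.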
🔴 TabPFN-3 just released: a pre-trained tabular foundation model for up to 1M rows [R][N] - score 81
Sources: reddit/r/MachineLearning
TabPFN-3 was released today, the next iteration of the tabular foundation model, originally published in Nature. Quick recap for anyone new to TabPFN: TabPFN predicts on tabular data in a single forward pass - no training, no hyperparameter search, no tuning. Built on TabPFN-2.5 (Nov 2025) and TabPF
Developer Tools
🔴 Steam Recommender using similarity! (Undergraduate Student Project) [P] - score 94
Sources: reddit/r/MachineLearning
(DISCLAIMER: I accidentally deleted the last post on this subreddit; my apologies if this is your second time seeing it.) Last year I made a post about my Steam recommender. The last one was great
🔴 Recommendation for agent stack for enterprise content generation - score 92
Sources: reddit/r/AIAgents
Any recommendation for agent stack for content generation? Looking for end to end agent flow for content generation, compliance and brand review, and persona variation. Currently testing out Writer AI but convinced we can do better for the company. Very interested in hearing what you have done, lear
🔴 Imbad0202/academic-research-skills - Academic Research Skills for Claude Code: research → write → review → revise → finalize - score 72
Sources: github_trending
Infrastructure & Compute
🔴 I got a real transformer language model running locally on a stock Game Boy Color! - score 96
Sources: reddit/r/LocalLLaMA
No phone, PC, Wi-Fi, link cable, or cloud inference. • The cartridge boots a ROM, and the GBC runs the model itself. • The model is Andrej Karpathy's TinyStories-260K, converted to INT8 weights with fixed-point math so it can run without floating point. • Built with GBDK-2020 as an MBC5 Game Boy ROM
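To make "INT8 weights with fixed-point math" concrete, here is a minimal sketch of the general technique, not the project's actual code. It assumes symmetric per-tensor quantization and a 16-bit fractional scale. `FRAC_BITS`, `quantize_int8`, and `fixed_point_dot` are hypothetical names, and the ROM's real number format is not stated in the post.

```python
from typing import List, Tuple

FRAC_BITS = 16  # assumed fixed-point precision; the ROM's real format is not stated


def quantize_int8(values: List[float]) -> Tuple[List[int], int]:
    """Offline step (floats allowed): map values to int8 so that v ~= q * scale.

    Returns the int8 values and the scale stored as an integer with
    FRAC_BITS fractional bits.
    """
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, round(scale * (1 << FRAC_BITS))


def fixed_point_dot(qw: List[int], w_scale_fx: int,
                    qx: List[int], x_scale_fx: int) -> int:
    """Integer-only dot product; the result carries FRAC_BITS fractional bits."""
    acc = sum(a * b for a, b in zip(qw, qx))  # integer multiply-accumulate
    # Each scale carries FRAC_BITS fractional bits, so shift one set away.
    return (acc * w_scale_fx * x_scale_fx) >> FRAC_BITS


qw, sw = quantize_int8([0.5, -0.25, 1.0])
qx, sx = quantize_int8([1.0, 0.5, -0.5])
result_fx = fixed_point_dot(qw, sw, qx, sx)
approx = result_fx / (1 << FRAC_BITS)  # float conversion only to inspect on a host PC
```

The exact dot product of the two example vectors is -0.125, and `approx` lands close to it. On real 8-bit hardware the accumulator and scale widths would need to be chosen explicitly, since Python integers never overflow.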
Research Papers
🔴 Debiased Model-based Representations for Sample-efficient Continuous Control - score 82
Sources: huggingface · arxiv/cs.AI
Model-based representations recently stand out as a promising framework that embeds latent dynamics information into the representations for downstream off-policy actor-critic learning. It implicitly combines the advantages of both model-free and model-based approaches while avoiding the training co
🔴 A Causal Language Modeling Detour Improves Encoder Continued Pretraining - score 78
Sources: huggingface · arxiv/cs.AI
When adapting an encoder to a new domain, the standard approach is to continue training with Masked Language Modeling (MLM). We show that temporarily switching to Causal Language Modeling (CLM) followed by a short MLM decay improves downstream performance. On biomedical texts with ModernBERT, this C
Other Signals
🔴 Needle: We Distilled Gemini Tool Calling Into a 26M Model - score 88
Sources: reddit/r/LocalLLaMA · hackernews
We open-sourced Needle, a 26M parameter function-calling (tool use) model. It runs at 6000 tok/s prefill and 1200 tok/s decode on consumer devices. We were always frustrated by the little effort made towards building agentic models that run on budget phones, so we conducted investigations that led t
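As a back-of-the-envelope reading of those throughput figures, the sketch below estimates end-to-end latency for one tool call. It assumes the quoted 6000 tok/s prefill and 1200 tok/s decode rates are sustained and ignores fixed overheads such as model loading. `estimate_latency_s` is a hypothetical helper, and the token counts are placeholder inputs.

```python
PREFILL_TOK_PER_S = 6000.0  # prefill rate quoted in the post
DECODE_TOK_PER_S = 1200.0   # decode rate quoted in the post


def estimate_latency_s(prompt_tokens: int, output_tokens: int,
                       prefill_rate: float = PREFILL_TOK_PER_S,
                       decode_rate: float = DECODE_TOK_PER_S) -> float:
    """Rough latency: time to read the prompt plus time to generate the output."""
    return prompt_tokens / prefill_rate + output_tokens / decode_rate


# Placeholder request size: a 600-token prompt producing a 60-token tool call.
latency = estimate_latency_s(prompt_tokens=600, output_tokens=60)
```

For that placeholder request the estimate is 0.1 s of prefill plus 0.05 s of decode, about 0.15 s in total.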
🔴 Dad why is my sisters name Lora? - score 81
Sources: reddit/r/LocalLLaMA
🟡 Notable
Model Releases
🟡 Let's build claude code from scratch! - score 65
Sources: reddit/r/LocalLLaMA
So, I made this video about how to create claude code from scratch. Here's the video: https://youtu.be/8pDfgBEy8bg and Github: https://github.com/CohleM/nanoclaude Feedback is extremely appreciated.
🟡 Is using vLLM actually worth it if you aren't serving the model to other people? - score 50
Sources: reddit/r/LocalLLaMA
So, as most of us here are, I'm a llama.cpp loyalist. Easy to understand, great configuration, relatively stable, etc. But I've been increasingly tempted by vLLM, especially since AMD just added it as a built-in inference engine to Lemonade, and I happen to have an AMD GPU. The thing is, I've never
🟡 I think AI businesses are quietly shifting from hype products to operational tools - score 50
Sources: reddit/r/AIAgents
Last year felt dominated by flashy AI launches: * viral demos * AI agents doing everything * futuristic landing pages * endless "replace your team" messaging But the businesses I keep seeing actually retain users are much less exciting on the surface. They solve boring operational problems: * invoic
Developer Tools
🟡 BerriAI/litellm - Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, VLLM, NVIDIA NIM] - score 61
Sources: github_trending
🟡 google-gemini/gemini-cli - An open-source AI agent that brings the power of Gemini directly into your terminal. - score 59
Sources: github_trending
🟡 microsoft/data-formulator - 🪄 Create rich visualizations with AI - score 55
Sources: github_trending
🟡 EveryInc/compound-engineering-plugin - Official Compound Engineering plugin for Claude Code, Codex, Cursor, and more - score 51
Sources: github_trending
🟡 A survey of every open-source "credential vault for AI agents" - score 50
Sources: reddit/r/AIAgents
I went looking for a map of this space a few months ago, could not find one, and eventually wrote it myself. Every week someone asks "is anything other than my .env file the right answer for an agent that runs unattended" and the honest reply is "yes, about a dozen things, and they disagree about wh
Omitted 5 additional developer tools items from the main section; see raw data and source-specific sections below.
Infrastructure & Compute
🟡 How do you create a memorable poster for top-tier conferences (ICML/ICLR/NeurIPS, etc.) [D] - score 69
Sources: reddit/r/MachineLearning
Hello everyone, Presenting at a top-tier conference for the first time and having a very hard time coming up with an appropriate design for my poster. Everything I do seems basic and banal. My paper is more theory-oriented, and apart from putting math formulas in bold in the middle, I am not sure wh
🟡 My First Official AI Research Paper Accepted on SSRN - score 58
Sources: reddit/r/LocalLLaMA
Today, my research paper "Stable Training with Adaptive Momentum (STAM)" was officially accepted on SSRN, marking my first documented and official publication as
Business & Funding
🟡 Save and invest your money for future rigs - score 42
Sources: reddit/r/LocalLLaMA
I have long had the itch to build out. Plan was for a 1tb genoa this year and what would have been a $6,000 affair is now a $30,000 affair. So I held out for the mac M5 studio ultra and then looks like Apple is gonna get hit with shortages, they have pushed from Q1 to Q2 and now Q3, meanwhile RTX bl
Research Papers
🟡 Multi-Stream LLMs: Unblocking Language Models with Parallel Streams of Thoughts, Inputs and Outputs - score 52
Sources: huggingface · arxiv/cs.CL
The continued improvements in language model capability have unlocked their widespread use as drivers of autonomous agents, for example in coding or computer use applications. However, the core of these systems has not changed much since early instruction-tuned models like ChatGPT. Even advanced AI
🟡 Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation - score 52
Sources: huggingface · arxiv/cs.LG
We introduce Pion, a spectrum-preserving optimizer for large language model (LLM) training based on orthogonal equivalence transformation. Unlike additive optimizers such as Adam and Muon, Pion updates each weight matrix through left and right orthogonal transformations, preserving its singular valu
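The property the abstract relies on, that multiplying a matrix by orthogonal matrices on the left and right leaves its singular values unchanged, can be checked numerically. The sketch below does so for a 2x2 case using only the standard library. It illustrates the invariance only, not Pion's update rule, which the truncated abstract does not specify. The helper names and the example matrix are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def rotation(theta: float) -> Matrix:
    """A 2x2 rotation matrix, which is orthogonal."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def singular_values(w: Matrix) -> Tuple[float, float]:
    """Closed form for 2x2: s1^2 + s2^2 = ||W||_F^2 and s1 * s2 = |det W|."""
    frob2 = sum(x * x for row in w for x in row)
    det = abs(w[0][0] * w[1][1] - w[0][1] * w[1][0])
    # s1^2 and s2^2 are the roots of t^2 - frob2 * t + det^2 = 0.
    disc = math.sqrt(max(frob2 * frob2 - 4.0 * det * det, 0.0))
    return (math.sqrt((frob2 + disc) / 2.0),
            math.sqrt(max((frob2 - disc) / 2.0, 0.0)))


w = [[2.0, 1.0], [0.0, 0.5]]  # arbitrary example weight matrix
w_new = matmul(matmul(rotation(0.3), w), rotation(-1.1))  # left and right orthogonal factors
```

Comparing `singular_values(w)` with `singular_values(w_new)` shows agreement up to floating-point error, even though the entries of the two matrices differ.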
🟡 WildRelight: A Real-World Benchmark and Physics-Guided Adaptation for Single-Image Relighting - score 52
Sources: huggingface · arxiv/cs.AI
Recent single-image relighting methods, powered by advanced generative models, have achieved impressive photorealism on synthetic benchmarks. However, their effectiveness in the complex visual landscape of the real world remains largely unverified. A critical gap exists, as current datasets are typi
Other Signals
🟡 Reimagining the mouse pointer for the AI era - score 50
Sources: hackernews
🟢 Incremental
Model Releases
🟢 I've seen a lot of folks ask "can local LLMs actually do anything useful?" - score 35
Sources: reddit/r/LocalLLaMA
And I'm here to share my experience. The answer is resoundingly 'yes'. Let me start with the local model I use every day in my AI harness: embedding models. I'm using an embedding model to give my AI's persistent memory system a semantic search protocol that makes its memory recall feel seamless to
🟢 OpenAI API now requires separate pay-as-you-go billing - what alternatives are you all using? - score 8
Sources: reddit/r/AIAgents
Hey everyone, Just wanted to flag something in case anyone else got caught off guard like I did. As of now, OpenAI's API is completely separate from any ChatGPT subscription. Even if you're paying $20/month for ChatGPT Plus, that gets you zero API access. They're treated as two entirely independent
🟢 Small local model for questions on German grammar - score 4
Sources: reddit/r/LocalLLaMA
I'm trying to learn German. I use Qwen3.5/3.6 locally, but this is pretty bad for German grammar. Has anyone got a recommendation for a small-ish local model that knows German grammar well and can answer questions on this?
Developer Tools
🟢 VoltAgent/voltagent - AI Agent Engineering Platform built on an Open Source TypeScript AI Agent Framework - score 34
Sources: github_trending
🟢 FoundationAgents/MetaGPT - 🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming - score 26
Sources: github_trending
🟢 zhangfengcdt/memoir - Hierarchical Agent Memory with Git-Like Version Control - score 24
Sources: github_trending
🟢 ICML Visa issues [D] - score 19
Sources: reddit/r/MachineLearning
Has anyone applying for a Korean visa for ICML been asked for the conference's Business Registration Number? The ICML website explicitly states that it cannot provide the BRC, so I wanted to ask how others handled this.
🟢 Show HN: Agentic interface for mainframes and COBOL - score 17
Sources: hackernews
Omitted 1 additional developer tools item from the main section; see raw data and source-specific sections below.
Infrastructure & Compute
🟢 huggingface/transformers - 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training. - score 38
Sources: github_trending
Other Signals
🟢 I created minimal one-file implementations (160 LOC) of the JEPA family (ijepa, vjepa, vjepa2, cjepa) for educational purposes [P] - score 31
Sources: reddit/r/MachineLearning
Hi all, I made my own minimal implementation of JEPA algorithms. Making things minimal and removing all the things needed for scaling the algorithm always helped me understand the essence. So I stripped everything but the algorithm parts. What's left is 160-200 lines of code that distills the essenc
🟢 AntAngelMed - 100a6b Healthcare LLM - score 19
Sources: reddit/r/LocalLLaMA
🟢 How many of you tried BeeLlama.cpp? How's it? Agentic coding possible with 8GB VRAM? - score 12
Sources: reddit/r/LocalLLaMA
We'll be getting those features (check bottom link) on mainline sooner or later anyway. But for now this fork could be useful to see the full potential of our poor GPUs (and also big, large GPUs). Any 8GB VRAM (and 32GB RAM) folks already doing agentic coding with models (@ Q4 at least) like Qwen3.6-35B-A
Trending Repos
| Repo | Description | Stars Today | Language |
|---|---|---|---|
| Imbad0202/academic-research-skills | Academic Research Skills for Claude Code: research → write → review → revise → finalize | 293 | python |
| BerriAI/litellm | Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, VLLM, NVIDIA NIM] | 147 | python |
| google-gemini/gemini-cli | An open-source AI agent that brings the power of Gemini directly into your terminal. | 105 | typescript |
| microsoft/data-formulator | 🪄 Create rich visualizations with AI | 89 | typescript |
| EveryInc/compound-engineering-plugin | Official Compound Engineering plugin for Claude Code, Codex, Cursor, and more | 72 | typescript |
| memvid/memvid | Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give your agents instant retrieval and long-term memory. | 62 | rust |
| huggingface/transformers | 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training. | 48 | python |
| VoltAgent/voltagent | AI Agent Engineering Platform built on an Open Source TypeScript AI Agent Framework | 38 | typescript |
| FoundationAgents/MetaGPT | 🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming | 34 | python |
| zhangfengcdt/memoir | Hierarchical Agent Memory with Git-Like Version Control | 33 | python |
New Papers
| Title | Category | Hotness | Link |
|---|---|---|---|
| Debiased Model-based Representations for Sample-efficient Continuous Control | research_paper | 6 | Open |
| A Causal Language Modeling Detour Improves Encoder Continued Pretraining | research_paper | 5 | Open |
| Multi-Stream LLMs: Unblocking Language Models with Parallel Streams of Thoughts, Inputs and Outputs | research_paper | 2 | Open |
| Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation | research_paper | 2 | Open |
| WildRelight: A Real-World Benchmark and Physics-Guided Adaptation for Single-Image Relighting | research_paper | 2 | Open |
| A Cascaded Generative Approach for e-Commerce Recommendations | cs.AI | 0 | Open |
| EVOCHAMBER: Test-Time Co-evolution of Multi-Agent System at Individual, Team, and Population Scales | cs.AI | 0 | Open |
| RankQ: Offline-to-Online Reinforcement Learning via Self-Supervised Action Ranking | cs.AI | 0 | Open |
| OLIVIA: Online Learning via Inference-time Action Adaptation for Decision Making in LLM ReAct Agents | cs.AI | 0 | Open |
| The Many Faces of On-Policy Distillation: Pitfalls, Mechanisms, and Fixes | cs.AI | 0 | Open |
| Don't Look at the Numbers: Visual Anchoring Bias and Layer-wise Representation in VLMs | cs.AI | 0 | Open |
| Do Vision-Language-Models show human-like logical problem-solving capability in point and click puzzle games? | cs.AI | 0 | Open |
| PIVOT: Bridging Planning and Execution in LLM Agents via Trajectory Refinement | cs.AI | 0 | Open |
| Rethinking LLMOps for Fraud and AML: Building a Compliance-Grade LLM Serving Stack | cs.AI | 0 | Open |
| The Semantic Training Gap: Ontology-Grounded Tool Architectures for Industrial AI Agent Systems | cs.AI | 0 | Open |
Lab Blog Posts
- OpenAI: How finance teams use Codex
Repeated From Recent Briefings
- NousResearch/hermes-agent - The agent that grows with you - first seen 2026-05-11
- farion1231/cc-switch - A cross-platform desktop All-in-One assistant for Claude Code, Codex, OpenCode, OpenClaw, Gemini CLI & Hermes Agent. Only official website: ccswitch.io - first seen 2026-05-08
- yikart/AiToEarn - Let's use AI to Earn! - first seen 2026-05-11
- rohitg00/agentmemory - #1 Persistent memory for AI coding agents based on real-world benchmarks - first seen 2026-05-09
- tinyhumansai/openhuman - Your Personal AI super intelligence. Private, Simple and extremely powerful. - first seen 2026-05-11
- millionco/react-doctor - Your agent writes bad React. This catches it - first seen 2026-05-10
- garrytan/gstack - Use Garry Tan's exact Claude Code setup: 23 opinionated tools that serve as CEO, Designer, Eng Manager, Release Manager, Doc Engineer, and QA - first seen 2026-05-12
- bytedance/UI-TARS-desktop - The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra - first seen 2026-05-09
- danielmiessler/Personal_AI_Infrastructure - Agentic AI Infrastructure for magnifying HUMAN capabilities. - first seen 2026-05-02
- datawhalechina/hello-agents - 📚 "Building Agents from Scratch": a from-scratch tutorial on agent principles and practice - first seen 2026-05-09
- ... plus 140 more repeated items in processed data