🔴 High Significance

Model Releases

🔴 Donate your coding sessions to an open CC-BY-4.0 dataset to help train open-weight and open source models - score 97 Sources: reddit/r/LocalLLaMA

Anthropic and OpenAI are getting so much data from Claude Code and Codex usage, and I'm quite scared this will create an oligopoly because only their models will be trained on it, leaving the open-weight and open source models behind. So I'm trying to launch a little initiative called Trace Com

🔴 [ECCV 2026] Final Decisions [D] - score 81 Sources: reddit/r/MachineLearning

ECCV 2026 final decisions are expected to be released on June 17, 2026. Since there was no exact release time specified, results will likely roll out within 48 hours. This thread is for everyone to share updates, discuss outcomes, and support each other through the decisions. Good luck to everyo

🔴 Mistral - New family of open-weight models @ July - score 70 Sources: reddit/r/LocalLLaMA

Tweet : https://xcancel.com/arthurmensch/status/2066913353860018596#m

Developer Tools

🔴 What guardrails are you using around agent tool calls? - score 72 Sources: reddit/r/AIAgents

For people building agents that actually call tools/APIs: where are you putting the execution boundary? The framing I keep coming back to is: treat the LLM as an untrusted planner. The risky part is not only that the model says something wrong. It is what happens when that plan becomes a real tool c
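
A minimal sketch of that execution boundary, assuming a Python agent loop: the model's proposed call is treated as untrusted input and checked against an allowlist before anything runs. `ToolSpec`, `ToolCallRejected`, `execute_tool_call`, and the `approve` callback are hypothetical names used for illustration, not part of any particular agent framework.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet


class ToolCallRejected(Exception):
    """Raised when a model-proposed call fails a guardrail check."""


@dataclass(frozen=True)
class ToolSpec:
    func: Callable[..., Any]       # the real side-effecting function
    allowed_args: FrozenSet[str]   # argument names the planner may supply
    needs_approval: bool = False   # gate risky tools behind an approver


def deny_all(name: str, args: Dict[str, Any]) -> bool:
    """Default approver (assumption): refuse anything that needs approval."""
    return False


def execute_tool_call(
    call: Any,
    registry: Dict[str, ToolSpec],
    approve: Callable[[str, Dict[str, Any]], bool] = deny_all,
) -> Any:
    """Validate an untrusted plan step, then run it only if every check passes."""
    if not isinstance(call, dict):
        raise ToolCallRejected("call must be a JSON-like object")
    name = call.get("name")
    args = call.get("args", {})
    if name not in registry:
        raise ToolCallRejected(f"unknown tool: {name!r}")
    if not isinstance(args, dict):
        raise ToolCallRejected("args must be an object")
    spec = registry[name]
    unexpected = set(args) - spec.allowed_args
    if unexpected:
        raise ToolCallRejected(f"unexpected arguments: {sorted(unexpected)}")
    # Type and value checks for each argument would also belong here.
    if spec.needs_approval and not approve(name, args):
        raise ToolCallRejected(f"approval denied for {name!r}")
    return spec.func(**args)
```

The design choice here is that rejection is the default: an unknown tool, an extra argument, or a missing approval all stop the call before the tool function is reached.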

Research Papers

🔴 ACE-Ego-0: Unifying Egocentric Human and Robotic Data for VLA Pretraining - score 85 Sources: huggingface

Vision-Language-Action (VLA) models benefit from large-scale and diverse embodied data, yet scaling robot trajectory collection is costly and labor-intensive. Recent advances show that large-scale egocentric human videos provide complementary real-world supervision in pretraining. However, joint tra

🔴 LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling - score 82 Sources: huggingface · arxiv/cs.AI

Looped Transformers scale latent computation by repeatedly applying shared blocks, but sequential looping increases latency and KV-cache memory with the loop count. Parallel loop Transformers (PLT) alleviate this cost through cross-loop position offsets (CLP) and shared-KV gated sliding-window atten

🔴 Learning from the Self-future: On-policy Self-distillation for dLLMs - score 75 Sources: huggingface

On-policy self-distillation (OPSD) has proven effective for post-training large language models (LLMs), yet its application to diffusion LLMs (dLLMs) remains unexplored. Existing OPSD methods are inherently autoregressive-centric. They inject privileged information via left-to-right prefix condition

Other Signals

🔴 GLM-5.2 is the first open-weights model to cross 80% on Terminal-Bench and beats every other open model available - score 90 Sources: reddit/r/LocalLLaMA

From Source: GLM-5.2 is the first open-weights model to cross 80% on Terminal-Bench, and beats every other open model available. It also beats Gemini, making it a frontier-level model for a fraction of the cost. Open weights is back. This model is a game changer. Source: [Cline](https://x.co

🔴 zai-org/GLM-5.2 is here! - score 83 Sources: reddit/r/LocalLLaMA

🔴 Has AI already killed self-help nonfiction books? - score 83 Sources: hackernews

🔴 Hashicorp founder thinks local models "aren't good ENOUGH yet" - score 77 Sources: reddit/r/LocalLLaMA

Generally, I respect him a lot, but this is a wrong take. People have been doing alright using SLMs for coding for more than a year; only vibecoders might struggle.

🟡 Notable

Model Releases

🟡 GLM-5.2 is now 1st on Design Arena - ahead of the now unavailable Claude Fable 5. - score 63 Sources: reddit/r/LocalLLaMA

https://x.com/Designarena/status/2066940737011560652

🟡 The agent conversation everyone's skipping: sprawl, shadow agents, and the bill that's coming - score 61 Sources: reddit/r/AIAgents

Everyone's talking about AI agents. Very few are talking about agent sprawl. Over the past few weeks we've been comparing notes with people at a bunch of B2B companies rolling out agents across sales, marketing, prod, eng, support, you name it. The same patterns keep coming up: • Agents getting buil

🟡 GPT-NL: a sovereign language model for the Netherlands - score 50 Sources: hackernews

🟡 @xai: Grok Imagine Video 1.5 is here Our new image-to-video model with sharper realism, better physics and faster generations 🧵 http://grok.com/imagine - score 50 Sources: twitter_rss

🟡 GLM 5.2 API is live, weights are on HF, and ollama has it already - score 43 Sources: reddit/r/LocalLLaMA

GLM 5.2 dropped on Friday locked behind the GLM Coding Plan. That was annoying if you just wanted to test it without subscribing to another IDE tier. Two hours ago today they opened the API and pushed weights to HuggingFace under MIT. Ollama already has it. So now you can actually run it locally or

Developer Tools

🟡 ParthJadhav/app-store-screenshots - end to end app store screenshot creation using AI - score 57 Sources: github_trending

🟡 nocobase/nocobase - NocoBase is an open-source AI + no-code platform for building business systems fast. Instead of generating everything from scratch, AI works on top of production-proven infrastructure and a WYSIWYG no-code interface, so you get both speed and reliability. - score 52 Sources: github_trending

🟡 Gitlawb/openclaude - runs anywhere. uses anything - score 47 Sources: github_trending

Infrastructure & Compute

🟡 microsoft/fara - Fara-7B: An Efficient Agentic Model for Computer Use - score 60 Sources: github_trending

🟡 Next-Latent Prediction Transformers [R] - score 56 Sources: reddit/r/MachineLearning

Microsoft Research Preprint Next-token prediction is myopic. What if transformers learn to predict their own next latent state? Microsoft Research present **Next-Latent

🟡 What is Speculative Decoding? (trending on paperswithco.de) [R] - score 44 Sources: reddit/r/MachineLearning

A method that is currently trending on Papers with Code is Speculative Decoding. https://preview.redd.it/dm4nh4t71o7h1.png?width=3082&format=png&auto=webp&s=b6468668667d4bcfb6c9248d3af7fd09f21fe0da Speculative decoding is an inference optimization technique
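
A toy sketch of the idea, assuming the simplest greedy variant: a cheap draft model proposes a few tokens, the larger target model checks them, and only the prefix the target agrees with is kept. The `target` and `draft` callables and the function names are hypothetical stand-ins for real models. Production systems verify all proposed positions in one batched forward pass and typically use a probabilistic acceptance rule rather than exact matching.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]  # maps a token prefix to the next token


def greedy_decode(target: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one target-model call per generated token."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(target(tokens))
    return tokens


def speculative_decode(
    target: NextToken, draft: NextToken, prompt: List[int], n_new: int, k: int = 4
) -> List[int]:
    """Greedy speculative decoding: the draft proposes k tokens, the target verifies."""
    tokens = list(prompt)
    goal = len(prompt) + n_new
    while len(tokens) < goal:
        # 1. The cheap draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))
        # 2. The target checks each position (done one by one here for clarity).
        for i, guess in enumerate(proposal):
            expected = target(tokens + proposal[:i])
            if guess != expected:
                # Keep the agreed prefix, then the target's own correction.
                tokens.extend(proposal[:i])
                tokens.append(expected)
                break
        else:
            # Every proposed token was accepted; the target adds one bonus token.
            tokens.extend(proposal)
            tokens.append(target(tokens))
    return tokens[:goal]


def toy_target(tokens: List[int]) -> int:
    return (tokens[-1] * 3 + 1) % 7


def toy_draft(tokens: List[int]) -> int:
    # Agrees with the target except after token 5, to force some rejections.
    return 0 if tokens[-1] == 5 else toy_target(tokens)


if __name__ == "__main__":
    print(speculative_decode(toy_target, toy_draft, [1], n_new=12))
```

In this greedy form the output matches what the target model would produce on its own; the potential saving comes from accepting several draft tokens per verification step.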

Research Papers

🟡 Show the Signal, Hide the Noise: Spectral Forcing for Pixel-Space Diffusion - score 65 Sources: huggingface

Pixel-space diffusion models are trained on full-bandwidth noisy images, yet the useful signal available to the denoiser is strongly frequency dependent. Under rectified-flow diffusion and natural-image power-law spectra, the per-band data-to-noise contour k^{*}(t) = (1-t)^{-2/α} separates a signal-
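
As a quick numerical illustration of the contour quoted in the abstract, the snippet below simply evaluates k^{*}(t) = (1-t)^{-2/α} at a few time steps. The function name `k_star`, the unit normalization of k, and the value α = 2 are assumptions chosen for illustration; the excerpt does not state the paper's α or units.

```python
def k_star(t: float, alpha: float) -> float:
    """Evaluate the quoted contour k*(t) = (1 - t) ** (-2 / alpha)."""
    if not 0.0 <= t < 1.0:
        raise ValueError("t must lie in [0, 1)")
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    return (1.0 - t) ** (-2.0 / alpha)


if __name__ == "__main__":
    # alpha = 2.0 is an illustrative assumption, not a value from the paper.
    for t in (0.0, 0.5, 0.9):
        print(f"t={t:.1f}  k*={k_star(t, alpha=2.0):.3f}")
```

Under this formula the contour starts at 1 when t = 0 and grows without bound as t approaches 1.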

🟡 ProCUA-SFT Technical Report - score 52 Sources: huggingface · arxiv/cs.LG

Training computer-use agents (CUAs) -- models that interact with graphical desktops through screenshots and keyboard/mouse actions -- requires large-scale, diverse trajectory data collected in full desktop environments. The largest public resource, AgentNet (22.5K human trajectories), leads to negat

🟡 ChLogic: Evaluating Robustness of Logical Reasoning in Chinese Expressions - score 45 Sources: huggingface

Large language models perform increasingly well on standardized logical reasoning benchmarks, but whether this ability remains robust beyond English is unclear. We introduce ChLogic, an English--Chinese aligned benchmark that tests whether models preserve logical reasoning performance when the same

Other Signals

🟡 Is Le Gros Chaton opensource? - score 57 Sources: reddit/r/LocalLLaMA

so i keep hearing about le gros chaton, the upcoming mistral model that allegedly destroys claude mythos, gpt-5.5, my sleep schedule, and possibly the french economy. people say it has 1b context, self-improves in real time, writes perfect code, and only hallucinates in elegant parisian metaphors. t

🟡 GLM-5.2 (max) is currently the third best model available, across both open and proprietary. - score 50 Sources: reddit/r/LocalLLaMA

🟡 Unlocking UK house-building with AI-accelerated planning - score 50 Sources: lab_blog/DeepMind

UK government partners with Google DeepMind to build a new AI-powered prototype aimed at faster housing decisions.

🟢 Incremental

Model Releases

🟢 Sharing my DIY AI Memory Framework: Giving LLMs human-like memory (and slashing token costs by 90%) - score 28 Sources: reddit/r/AIAgents

Hi everyone, Like many of you, AI has become an absolute necessity in my daily life. Whether I’m working on game development, writing, or just brainstorming ideas, trying to do it without an LLM now feels like going back to the Stone Age. However, as I used these tools more and more, I kept hitting

🟢 Looking for an open-source/free alternative to Eigent for Word document template migration (Gemini API) - score 28 Sources: reddit/r/AIAgents

Hi everyone, I am looking for some architectural guidance on setting up a local AI agent workflow on my laptop for document editing and formatting automation. I am not an expert and have very basic knowledge of AI-related stuff. I am kinda new to this. So I am here seeking expert advice

🟢 It looks like Rio 3.5 397B could've simply been a semi-failed embezzling of funding - score 23 Sources: reddit/r/LocalLLaMA

Here is the chain of events: 1. The model training received funding of R$500K (about $100K USD). 2. The [initial model documentation](https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B/blob/0e1bf540744675baac21d6be4

🟢 Cheapest way to run GLM 5.x locally that's not a unified memory system? - score 17 Sources: reddit/r/LocalLLaMA

This is primarily an exercise to determine the possible options, obscure as they might be, to run at least a 4bit quant (let's say roughly IQ4_XS). 1. Got a CPU only setup? Please share your experience. Sapphire Rapids ES 56core + DDR5 might be an option 2. Multi GPU setups with partial or complete

🟢 Qwen-Robot Suite: A Foundation Model Suite for Physical World Intelligence - score 17 Sources: hackernews

Omitted 1 additional model release item from the main section; see raw data and source-specific sections below.

Developer Tools

🟢 looking for agents to spam my platform with food ads. Any help appreciated. - score 28 Sources: reddit/r/AIAgents

I have a platform I want to use as a marketplace. But, first, I need to test the speed and the way agents can buy and sell items on the platform. So, if you have an agent, and want to get some action, let me know.

🟢 Looking for best AI phone agent with CRM integration - score 28 Sources: reddit/r/AIAgents

I run a small service based business and we're constantly missing calls while out on jobs or after hours. When that happens, it often takes longer than it should to get back to potential customers. The biggest challenges are missed calls, slow response times, and having to manually update our CRM af

🟢 Mel AI just shared a demo of video-native AI characters that can talk, react, and respond to camera context in real time [N] - score 12 Sources: reddit/r/MachineLearning

Character AI, founded by former Google/LaMDA developers Noam Shazeer and Daniel De Freitas, proved that text-based character chat can work as a real entertainment category. But the next chapter might not be better text chat. It might be real-time video interaction. Mel AI recently shared a demo of A

🟢 tracel-ai/burn - Burn is a next generation tensor library and Deep Learning Framework that doesn't compromise on flexibility, efficiency and portability. - score 3 Sources: github_trending

Business & Funding

🟢 Zhipu surges 33% as Wall Street raises bets on China AI after Anthropic curbs - score 37 Sources: reddit/r/LocalLLaMA

Research Papers

🟢 Variable-Width Transformers - score 25 Sources: huggingface

Scaling model size, specifically depth and width, has driven significant progress in transformer-based language models. However, most architectures maintain a constant width across all layers, allocating a fixed parameter and computation budget evenly despite different layers potentially playing dis

🟢 Text-Vision Co-Instructed Image Editing - score 10 Sources: huggingface

Existing image editing methods can be generally categorized into textual instruction-based and visual prompt-based ones. Textual instructions are semantically expressive, but are limited by the coarse granularity of spatial control of the editing results. In contrast, visual prompts such as drag and

🟢 MotionVLA: Vision-Language-Action Model for Humanoid Motion - score 10 Sources: huggingface

Generating realistic humanoid motion from scene images and text involves both low-frequency pose semantics and high-frequency physical dynamics. However, many existing methods tokenize motion with a single shared codebook, forcing heterogeneous motion signals into the same quantization space. Our fr

Other Signals

🟢 VibeThinker-3B: what is this witchcraft? Killing it at MathQA like it has ~30B parameters - score 30 Sources: reddit/r/LocalLLaMA

🟢 I am testing for a eval harness system and need it to break - score 28 Sources: reddit/r/AIAgents

Long story short: I have extra cloud capacity right now to focus on one very specific use case: LLM as a Judge. I did some stress tests but it's not enough. Hope to get real world firehose if possible. It's API based and no cost. Anyone interested in getting LLM evals in large scale?

🟢 I built a leakage-clean verifier for robot manipulation, is this useful? Am I solving a non-problem? [D] - score 11 Sources: reddit/r/MachineLearning

Spent the last few weeks on a benchmark/harness that tries to answer one question honestly: did a robot arm actually do the demonstrated task, or did the success metric just get fooled? The setup: compile a human demo into an object-centric graph (what changed in the world: relations, contacts, ev

🟢 GLM-5.2: Built for Long-Horizon Tasks - score 3 Sources: reddit/r/LocalLLaMA

| Repo | Description | Stars Today | Language |
| --- | --- | --- | --- |
| microsoft/fara | Fara-7B: An Efficient Agentic Model for Computer Use | 97 | python |
| ParthJadhav/app-store-screenshots | end to end app store screenshot creation using AI | 94 | typescript |
| nocobase/nocobase | NocoBase is an open-source AI + no-code platform for building business systems fast. Instead of generating everything from scratch, AI works on top of production-proven infrastructure and a WYSIWYG no-code interface, so you get both speed and reliability. | 87 | typescript |
| Gitlawb/openclaude | runs anywhere. uses anything | 74 | typescript |
| tracel-ai/burn | Burn is a next generation tensor library and Deep Learning Framework that doesn't compromise on flexibility, efficiency and portability. | 4 | rust |

📄 New Papers

| Title | Category | Hotness | Link |
| --- | --- | --- | --- |
| ACE-Ego-0: Unifying Egocentric Human and Robotic Data for VLA Pretraining | research_paper | 35 | Open |
| LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling | research_paper | 61 | Open |
| Learning from the Self-future: On-policy Self-distillation for dLLMs | research_paper | 18 | Open |
| Show the Signal, Hide the Noise: Spectral Forcing for Pixel-Space Diffusion | research_paper | 15 | Open |
| ProCUA-SFT Technical Report | research_paper | 4 | Open |
| Beyond Parallel Sampling: Diverse Query Initialization for Agentic Search | cs.AI | 0 | Open |
| When Rules Learn: A Self-Evolving Agent for Legal Case Retrieval | cs.AI | 0 | Open |
| SkillChain-Gym: A Benchmark for Reskilling-Aware Production-Inventory Control under Disruptions | cs.AI | 0 | Open |
| Skill-Constrained Model Predictive Control for Resilient Manufacturing Supply Chains | cs.AI | 0 | Open |
| Nothing from Something: Can a Language Model Discover 0? | cs.AI | 0 | Open |
| Quantifying Consistency in LLM Logical Reasoning via Structural Uncertainty | cs.AI | 0 | Open |
| MemTrace: Probing What Final Accuracy Misses in Long-Term Memory | cs.AI | 0 | Open |
| SpeechDx: A Multi-Task Benchmark for Clinical Speech AI | cs.AI | 0 | Open |
| Distributed General-Purpose Agent Networks: Architecture, Key Mechanisms, and Prototypes | cs.AI | 0 | Open |
| Treatment Response Optimized Clinical Decision Support AI System via Digital Twin Simulation | cs.AI | 0 | Open |

๐Ÿข Lab Blog Posts

๐Ÿฆ Twitter/X Highlights

| Account | Tweet Summary |
| --- | --- |
| xai | Grok Imagine Video 1.5 is here Our new image-to-video model with sharper realism, better physics and faster generations 🧵 http://grok.com/imagine Post |

Repeated From Recent Briefings