🔴 High Significance
Model Releases
🔴 Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items - score 95
Sources: huggingface
Recent advances in image generation and editing have opened new opportunities for virtual try-on. However, existing methods still struggle to meet complex real-world demands. We present Tstars-Tryon 1.0, a commercial-scale virtual try-on system that is robust, realistic, versatile, and highly effici
Developer Tools
🔴 AgentSPEX: An Agent SPecification and EXecution Language - score 85
Sources: huggingface
Language-model agent systems commonly rely on reactive prompting, in which a single instruction guides the model through an open-ended sequence of reasoning and tool-use steps, leaving control flow and intermediate state implicit and making agent behavior potentially difficult to control. Orchestrat
🔴 CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation - score 75
Sources: huggingface
Synthesizing human-object interaction (HOI) videos has broad practical value in e-commerce, digital advertising, and virtual marketing. However, current diffusion models, despite their photorealistic rendering capability, still frequently fail on (i) the structural stability of sensitive regions su
🟡 Notable
Model Releases
🟡 SmartPhotoCrafter: Unified Reasoning, Generation and Optimization for Automatic Photographic Image Editing - score 65
Sources: huggingface
Traditional photographic image editing typically requires users to possess sufficient aesthetic understanding to provide appropriate instructions for adjusting image quality and camera parameters. However, this paradigm relies on explicit human instruction of aesthetic intent, which is often ambiguo
🟡 Making ChatGPT better for clinicians - score 50
Sources: lab_blog/OpenAI
OpenAI makes ChatGPT for Clinicians free for verified U.S. physicians, nurse practitioners, and pharmacists, supporting clinical care, documentation, and research.
🟡 Workspace agents - score 50
Sources: lab_blog/OpenAI
Learn how to build, use, and scale workspace agents in ChatGPT to automate repeatable workflows, connect tools, and streamline team operations.
🟡 Introducing workspace agents in ChatGPT - score 50
Sources: lab_blog/OpenAI
Workspace agents in ChatGPT are Codex-powered agents that automate complex workflows, run in the cloud, and help teams scale work across tools securely.
🟡 Introducing OpenAI Privacy Filter - score 50
Sources: lab_blog/OpenAI
OpenAI Privacy Filter is an open-weight model for detecting and redacting personally identifiable information (PII) in text with state-of-the-art accuracy.
Omitted 1 additional model release item from the main section; see raw data and source-specific sections below.
Developer Tools
🟡 AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model - score 55
Sources: huggingface
Sparse-view 3D reconstruction is essential for modeling scenes from casual captures, but remains challenging for non-generative reconstruction. Existing diffusion-based approaches mitigate this issue by synthesizing novel views, but they often condition on only one or two capture frames, which rest
🟡 Speeding up agentic workflows with WebSockets in the Responses API - score 50
Sources: lab_blog/OpenAI
A deep dive into the Codex agent loop, showing how WebSockets and connection-scoped caching reduced API overhead and improved model latency.
Infrastructure & Compute
🟡 Decoupled DiLoCo: A new frontier for resilient, distributed AI training - score 50
Sources: lab_blog/DeepMind
🟢 Incremental
Developer Tools
🟢 ShadowPEFT: Shadow Network for Parameter-Efficient Fine-Tuning - score 35
Sources: huggingface
Parameter-efficient fine-tuning (PEFT) reduces the training cost of full-parameter fine-tuning for large language models (LLMs) by training only a small set of task-specific parameters while freezing the pretrained backbone. However, existing approaches, such as Low-Rank Adaptation (LoRA), achieve a
🟢 PlayCoder: Making LLM-Generated GUI Code Playable - score 25
Sources: huggingface
Large language models (LLMs) have achieved strong results in code generation, but their ability to generate GUI applications, especially games, remains insufficiently studied. Existing benchmarks mainly evaluate correctness through test cases, which are inadequate for GUI applications because these
🟢 Chat2Workflow: A Benchmark for Generating Executable Visual Workflows with Natural Language - score 15
Sources: huggingface
At present, executable visual workflows have emerged as a mainstream paradigm in real-world industrial deployments, offering strong reliability and controllability. However, in current practice, such workflows are almost entirely constructed through manual engineering: developers must carefully desi
🟢 CityRAG: Stepping Into a City via Spatially-Grounded Video Generation - score 5
Sources: huggingface
We address the problem of generating a 3D-consistent, navigable environment that is spatially grounded: a simulation of a real location. Existing video generative models can produce a plausible sequence that is consistent with a text (T2V) or image (I2V) prompt. However, the capability to reconstruc
New Papers
| Title | Category | Score | Link |
|---|---|---|---|
| Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items | model_release | 256 | Open |
| AgentSPEX: An Agent SPecification and EXecution Language | developer_tool | 165 | Open |
| CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation | developer_tool | 91 | Open |
| SmartPhotoCrafter: Unified Reasoning, Generation and Optimization for Automatic Photographic Image Editing | model_release | 49 | Open |
| AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model | developer_tool | 43 | Open |
| Planetary Exploration 3.0: A Roadmap for Software-Defined, Radically Adaptive Space Systems | cs.AI | 0 | Open |
| Auditing and Controlling AI Agent Actions in Spreadsheets | cs.AI | 0 | Open |
| Omission Constraints Decay While Commission Constraints Persist in Long-Context LLM Agents | cs.AI | 0 | Open |
| Learning to Solve the Quadratic Assignment Problem with Warm-Started MCMC Finetuning | cs.AI | 0 | Open |
| Meta Additive Model: Interpretable Sparse Learning With Auto Weighting | cs.AI | 0 | Open |
| On the Stability and Generalization of First-order Bilevel Minimax Optimization | cs.AI | 0 | Open |
| Not All Memories Age the Same: Autodiscovery of Adaptive Decay in Knowledge Graphs | cs.AI | 0 | Open |
| Adaptive Conformal Anomaly Detection with Time Series Foundation Models for Signal Monitoring | cs.AI | 0 | Open |
| Absorber LLM: Harnessing Causal Synchronization for Test-Time Training | cs.AI | 0 | Open |
| EvoAgent: An Evolvable Agent Framework with Skill Learning and Multi-Agent Delegation | cs.AI | 0 | Open |
Lab Blog Posts
- OpenAI: Making ChatGPT better for clinicians
- OpenAI: Workspace agents
- OpenAI: Speeding up agentic workflows with WebSockets in the Responses API
- OpenAI: Introducing workspace agents in ChatGPT
- OpenAI: Introducing OpenAI Privacy Filter
- DeepMind: Decoupled DiLoCo: A new frontier for resilient, distributed AI training