VOID, BiRefNet, and Gemma 4 Are Now in ComfyUI
Three significant open-source models arrived in ComfyUI on May 14, 2026: VOID for video object deletion, BiRefNet for image segmentation, and Gemma 4 for multimodal reasoning.
Three significant open-source models arrived in ComfyUI on May 14, 2026: VOID for video object deletion, BiRefNet for image segmentation, and Gemma 4 for multimodal reasoning.
Open-source Claude Code skill turns one image into a fully meshed 3D scene with separate object meshes, Gaussian splat background, and ambient plus physics audio in under five minutes.
LTX Director v1.3.0 dropped today: a single ComfyUI node that replaces your entire LTX 2.3 toolkit. Timeline editing, prompt relay, 8 keyframes, and custom audio, all in one place.
OpenReader v3.0 converts PDF, EPUB, DOCX, TXT, and Markdown files into synchronized read-along sessions or exported audiobooks, with multiple TTS providers and Docker deployment.
ScenemaAI released Scenema Audio on Hugging Face and GitHub, an open-weights expressive TTS and zero-shot voice cloning model built on the audio half of Lightricks LTX-2. MIT inference code, 13 languages, real-time on a 24 GB GPU.
Cline released @cline/sdk on May 13, 2026, an open-source TypeScript runtime that lets developers embed the same coding agent powering Cline.
Nous Research's Hermes Agent reached 140,000 GitHub stars in three months. NVIDIA featured it for RTX creators who want a local AI agent that writes its own skills.
OpenSquilla v0.1.0 launched May 12, 2026: an Apache 2.0 microkernel runtime that smart-routes agent traffic across model tiers for 60 to 80 percent token savings.
TencentARC Pixal3D generates 3D models that stay pixel-aligned with your input image. Accepted to SIGGRAPH 2026, with a live HuggingFace demo and open-source code.
OpenBMB shipped MiniCPM-V 4.6, a 1.3B Apache 2.0 vision-language model that runs on iPhone, Android, and HarmonyOS and leads the sub-2B open-weights pack.
HiDream-O1-Image is an open-source 8B pixel transformer that beats FLUX.2 Dev at 2K resolution, with MIT license and ComfyUI support.
Moonshot AI closed a 2 billion dollar round at a 20 billion dollar valuation led by Meituan May 7. What the open-weights tier consolidation means for creators.
By mid-2026 ComfyUI hosts thousands of community workflows. Here are 12 that have crossed the production-quality threshold, with download links and the use case each one actually serves.
IBM released Granite 4.1 on April 29, 2026 with dense decoder-only 3B, 8B, and 30B Apache 2.0 LLMs that walk back the Granite 4.0 MoE bet. The 8B dense beats the 32B-A9B MoE on most benchmarks. 128K production context, 512K extension.
Zed Industries shipped Zed 1.0 on April 29 2026, a Rust-built, GPU-accelerated code editor with parallel agents, edit prediction, and the open Agent Client Protocol.
Mistral shipped Medium 3.5 as a 128B dense multimodal model under modified MIT, with cloud Vibe agents and Le Chat Work mode on day one.
NVIDIA released Nemotron 3 Nano Omni on April 28: a 30B open-weight model that handles text, image, video, and audio in one architecture, with joint audio-visual reasoning and multi-hour context.
ComfyOrg announced ComfyStudio on April 28: a Los Angeles VFX and animation residency that pairs working artists with the ComfyUI team and releases every workflow open-source.
Poolside released the first two models in its Laguna family on April 28: a 225B-parameter flagship called M.1 and an open-weight 33B model called XS.2 under Apache 2.0. Both are pitched at agentic, long-horizon coding work.
SenseTime open-sourced SenseNova U1 on April 28, 2026, releasing three model variants under Apache 2.0. The architecture drops VAEs entirely, unifying image generation and text reasoning in one space.
The complete creator guide to ComfyUI in 2026: setup, models, partner nodes, working workflows for image, video, audio, and 3D, the ecosystem, and managed-vs-self-host trade-offs.
The complete creator reference to open-source AI in 2026: LLMs (DeepSeek V4, Kimi, Qwen), image (FLUX, Wan 2.7), video (Wan, Skywork, LTX), 3D (HY-World, TRELLIS), audio (Voicebox, OmniVoice, Darwin-TTS), all license-aware.
ComfyUI v0.20.1 lands April 27 with SUPIR upscaling, RIFE and FILM frame interpolation, SAM 3.1 segmentation, plus 4K Veo and Kling and GPT-Image-2 partner nodes.
DeepSeek V4 Preview ships MIT-licensed 1.6T and 284B MoE models with 1M context at sub-$0.30 per million output tokens. What it means for creators.