On June 3, 2026, Ideogram published the weights and inference code for Ideogram 4, its first open-weight text-to-image model. The GitHub release gives any creator or developer direct access to the same model powering the Ideogram API: 9.3 billion parameters, two quantization formats, and support for NVIDIA GPUs and Apple Silicon. ComfyUI v0.24.0 added native support the same day.
What Happened
Ideogram has been one of the strongest closed-source text-to-image APIs, known for its typography and compositional layout capabilities. Until June 3, access required an API subscription. The new release puts open weights and a reference inference library under a non-commercial license, available to anyone.

The architecture is a fully single-stream Diffusion Transformer (DiT) with 34 transformer layers, trained from scratch using flow-matching rather than fine-tuned from an existing base model. Text understanding comes from Qwen3-VL-8B-Instruct, with hidden states extracted from 13 intermediate layers to give the model strong language grounding.
Two weight variants are on HuggingFace: ideogram-4-fp8, which supports all hardware types including Apple Silicon and integrates with the Diffusers library, and ideogram-4-nf4, which targets CUDA GPUs with native Diffusers support. Both quantizations fit within mid-range workstation VRAM budgets.
What Makes Ideogram 4 Different
Most open-weight image models accept a natural-language prompt and leave spatial interpretation to the model. Ideogram 4 adds a structured JSON prompting layer that makes layout, color, and typography explicit inputs rather than implied constraints.
- Bounding-box layout control: Place elements at specific coordinates using a normalized 0-1000 range. A poster design can pin the product image, headline, and logo to exact positions before generation begins.
- Color palette conditioning: Pass up to 16 hex color codes per image (up to 5 per element). Brand color enforcement happens at inference time, not in post-processing.
- In-image text rendering: Supply literal text strings with separate styling descriptions. Ideogram 4 renders readable typography directly in generated scenes, including signage, labels, and multilingual text.
- Native 2K resolution: Output ranges from 256px to 2048px in multiples of 16, with aspect ratios up to 6:1. No upscaler needed for poster and banner work.
This structured control system reflects how the model was trained: Ideogram built its training data around JSON captions that encode layout intent, giving the model strong priors for compositional scenes that respond reliably to structured inputs.
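A structured prompt of this kind could be assembled and checked before it reaches the model. The following is a minimal sketch, not the official schema: the limits (0-1000 coordinates, 16 colors per image, 5 per element) come from the list above, while the key names inside each element (description, bbox, text, text_style), the left/top/right/bottom coordinate order, and the helper functions are all hypothetical. The repository documentation remains the authority on the real format.

```python
import json

MAX_IMAGE_COLORS = 16    # per-image palette limit stated in the article
MAX_ELEMENT_COLORS = 5   # per-element palette limit stated in the article
COORD_MAX = 1000         # bounding boxes use a normalized 0-1000 range


def make_element(description, bbox, colors=(), text=None, text_style=None):
    """Build one layout element. Key names are hypothetical, not the official schema."""
    x0, y0, x1, y1 = bbox  # assumed order: left, top, right, bottom
    if not all(0 <= v <= COORD_MAX for v in bbox):
        raise ValueError(f"bbox values must lie in 0-{COORD_MAX}: {bbox}")
    if x0 >= x1 or y0 >= y1:
        raise ValueError(f"bbox must have positive width and height: {bbox}")
    if len(colors) > MAX_ELEMENT_COLORS:
        raise ValueError(f"at most {MAX_ELEMENT_COLORS} colors per element")
    element = {"description": description, "bbox": list(bbox), "colors": list(colors)}
    if text is not None:
        # Literal text and its styling description are kept as separate fields.
        element["text"] = text
        element["text_style"] = text_style or ""
    return element


def make_prompt(elements, palette=()):
    """Assemble the full prompt and enforce the image-level palette limit."""
    if len(palette) > MAX_IMAGE_COLORS:
        raise ValueError(f"at most {MAX_IMAGE_COLORS} colors per image")
    return {"elements": list(elements), "colors": list(palette)}


poster = make_prompt(
    elements=[
        make_element("product bottle on a marble table", (100, 250, 900, 850),
                     colors=["#1A1A2E"]),
        make_element("headline", (100, 50, 900, 200), colors=["#E94560"],
                     text="SUMMER SALE", text_style="bold condensed sans-serif"),
    ],
    palette=["#1A1A2E", "#E94560", "#FFFFFF"],
)
prompt_json = json.dumps(poster, indent=2)
print(prompt_json)
```

Validating on the client side means a malformed box or an oversized palette fails immediately rather than after a full generation run.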
Benchmark Performance vs. Larger Models
Independent evaluations place Ideogram 4 at the top of the open-weight field, ahead of models two to nine times larger by parameter count.
| Model | Parameters | Typography Win Rate | Design Arena Rank |
|---|---|---|---|
| Ideogram 4 (open weight) | 9.3B | 47.9% | #1 open-weight model |
| Gemini 3.1 (closed source) | undisclosed | 30% | top-3 overall |
| FLUX.2 [dev] | 32B | 15.5% | below Ideogram 4 |
| HunyuanImage 3.0 | 80B MoE | not reported | below Ideogram 4 |
| Qwen-Image | 20B | not reported | below Ideogram 4 |
The typography evaluation was run by ContraLabs using 10 professional designers as judges. Ideogram 4 took first place in 47.9% of head-to-head comparisons. When asked whether they would use the model in real client work, designers rated it 3.55 out of 5, the highest score in the evaluation. On LMArena, Ideogram 4 ranks in the top 5 overall and first among open-weight labs.
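The "two to nine times larger" framing can be checked with simple arithmetic on the parameter counts in the table. This is a small sketch that only divides the published figures; note that HunyuanImage 3.0 is a mixture-of-experts model, so its 80B total is an assumption-laden comparison against a dense 9.3B model.

```python
# Parameter counts in billions, taken from the comparison table.
IDEOGRAM_4 = 9.3
others = {"FLUX.2 [dev]": 32.0, "HunyuanImage 3.0": 80.0, "Qwen-Image": 20.0}

# Ratio of each model's parameter count to Ideogram 4's.
ratios = {name: params / IDEOGRAM_4 for name, params in others.items()}

for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{name}: {ratio:.1f}x the parameters of Ideogram 4")
```

The ratios land between roughly 2x (Qwen-Image) and just under 9x (HunyuanImage 3.0), with FLUX.2 [dev] at about 3.4x.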
How to Run Ideogram 4 Locally
Setup follows four steps using the Ideogram 4 repository. CUDA users should use the nf4 variant; Apple Silicon and other hardware users should use fp8.

- Clone the repository: git clone https://github.com/ideogram-oss/ideogram4 && cd ideogram4
- Install: pip install . (or pip install -e . for editable development mode)
- Download weights: Use the Diffusers library or HuggingFace CLI to pull the fp8 or nf4 model files
- Run inference: python run_inference.py --prompt "your prompt" --output out.png --quantization "nf4"
For 2K output at highest quality, add --height 2048 --width 2048 --sampler-preset V4_QUALITY_48. For JSON-structured prompting, pass a JSON object containing elements, colors, and layout keys per the repository documentation.
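For scripted or batch runs, the command line can be assembled programmatically. The sketch below uses only the flags quoted above and the size rules from earlier in the article (256px to 2048px, multiples of 16, aspect ratio up to 6:1). The helper names are hypothetical, and passing "fp8" as a --quantization value is an assumption based on the variant names, since the article only shows "nf4".

```python
import shlex


def validate_size(width: int, height: int) -> None:
    """Check a requested size against the limits stated in the article."""
    for name, value in (("width", width), ("height", height)):
        if not 256 <= value <= 2048:
            raise ValueError(f"{name} must be between 256 and 2048, got {value}")
        if value % 16 != 0:
            raise ValueError(f"{name} must be a multiple of 16, got {value}")
    if max(width, height) / min(width, height) > 6:
        raise ValueError("aspect ratio must not exceed 6:1")


def build_command(prompt, output, has_cuda, width=None, height=None, quality=False):
    """Return the run_inference.py argument list for one image."""
    # Article guidance: nf4 on CUDA, fp8 on Apple Silicon and other hardware.
    # The "fp8" flag value is assumed, not confirmed by the source.
    quantization = "nf4" if has_cuda else "fp8"
    cmd = ["python", "run_inference.py", "--prompt", prompt,
           "--output", output, "--quantization", quantization]
    if width is not None and height is not None:
        validate_size(width, height)
        cmd += ["--height", str(height), "--width", str(width)]
    if quality:
        cmd += ["--sampler-preset", "V4_QUALITY_48"]
    return cmd


cmd = build_command("neon diner sign reading OPEN LATE", "out.png",
                    has_cuda=True, width=2048, height=2048, quality=True)
print(shlex.join(cmd))  # shell-safe string; pass the list itself to subprocess.run
```

Keeping the command as a list avoids quoting problems when prompts contain spaces or quotation marks.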
Using the Diffusers library directly:
import torch
from diffusers import DiffusionPipeline

# Load the fp8 weights from HuggingFace
pipe = DiffusionPipeline.from_pretrained(
    "ideogram-ai/ideogram-4-fp8",
    dtype=torch.bfloat16,
    device_map="cuda",  # or "mps" for Apple Silicon
)

# Generate one image from a plain-text prompt and save it
image = pipe("product photo on white background, minimal, 8k").images[0]
image.save("output.png")
Day-Zero ComfyUI Support
The ComfyUI team published native Ideogram 4 support on June 3, synchronized with the model release. ComfyUI v0.24.0 or later is required. Model weights for ComfyUI are available through Comfy-Org/Ideogram-4 on HuggingFace.
Within ComfyUI, the JSON prompting system maps to visual node inputs: color conditioning, bounding-box coordinates, and text overlays are exposed as node parameters rather than raw JSON. Creators who prefer a visual workflow can access Ideogram 4's structured controls without writing JSON manually.
To get started in ComfyUI:
- Update ComfyUI to v0.24.0 or later via the Manager or by pulling the latest repository commit
- Download the Ideogram 4 weights from the Comfy-Org HuggingFace repository
- Place model files in your models/ directory
- Load the default Ideogram 4 workflow from ComfyUI's built-in example workflow library
This extends ComfyUI's growing native model support. Earlier updates included v0.23.0, which added six partner nodes and 3D Gaussian Splat rendering, and the Krea 2 style control partner node.
License and Commercial Use
Ideogram 4 is released under the Ideogram 4 Non-Commercial license. Personal projects, research, and educational use are permitted. Commercial use, including client deliverables, SaaS applications, or any service generating revenue from the model output, requires a separate commercial agreement with Ideogram.
For commercial work today, the Ideogram API remains the correct path. The API handles licensing automatically and uses the same underlying model. The open weights enable local inference for non-commercial workloads and lower the barrier for experimentation without per-image API costs.
Creator Outcomes
For designers working on poster design, branded layouts, or any project requiring readable in-image typography, Ideogram 4 removes the closed-API dependency for non-commercial work. The JSON prompting system makes outputs reproducible: specify the same bounding boxes, hex palette, and text strings and the model generates consistent on-brand results across a batch.

At 9.3B parameters in fp8 quantization, the model fits within mid-tier workstation GPU memory budgets. FLUX.2 [dev], by comparison, has 32B parameters and does not offer native bounding-box layout control or structured typography input. In the ContraLabs evaluation, Ideogram 4 delivered better text rendering at roughly one-third the parameter count.
Teams doing high-volume non-commercial image generation can now run Ideogram 4 on local hardware, control the output programmatically via the JSON prompting system, and avoid per-image API costs entirely.
Frequently Asked Questions
What hardware do I need to run Ideogram 4 locally?
The fp8 variant works on any CUDA GPU or Apple Silicon (MPS) device and integrates with the Diffusers library. The nf4 variant is CUDA-only. At 9.3B parameters with fp8 quantization, typical VRAM requirements fall in the 12-16GB range, though Ideogram has not published official minimum requirements. Check the GitHub repository for community hardware reports as testing expands.
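A rough lower bound on memory comes from multiplying the parameter count by the storage cost per weight. The sketch below is back-of-envelope arithmetic, not a measured requirement: the bytes-per-parameter figures are the nominal sizes of each format, and the estimate covers the 9.3B image model only, excluding the text encoder, activations, and framework overhead.

```python
PARAMS = 9.3e9  # Ideogram 4 parameter count from the article

# Nominal storage per weight. The nf4 figure ignores the small overhead
# of quantization constants, so real files are slightly larger.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "nf4": 0.5}


def weight_gb(params: float, fmt: str) -> float:
    """Return the approximate weight size in gigabytes (10^9 bytes)."""
    return params * BYTES_PER_PARAM[fmt] / 1e9


for fmt in ("bf16", "fp8", "nf4"):
    print(f"{fmt}: ~{weight_gb(PARAMS, fmt):.1f} GB of weights")
```

The fp8 weights alone come to about 9.3 GB and the nf4 weights to under 5 GB, so actual VRAM use during inference will sit above those figures.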
Can I use Ideogram 4 for commercial client work?
Not under the open weights license. The Ideogram 4 Non-Commercial license restricts commercial use. For client deliverables and revenue-generating applications, use the Ideogram API, which includes commercial licensing terms automatically. The open weights are for personal projects, research, and non-commercial purposes only.
How does the JSON prompting system work?
Instead of a single text string, pass a structured JSON object with elements (descriptions plus bounding-box coordinates in a 0-1000 normalized grid), color palettes (up to 16 hex codes per image, 5 per element), and text overlays (literal strings with separate styling descriptions). This mirrors how the model's training data was structured, giving it strong priors for compositional and layout-driven scenes.
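Because the grid is normalized, the same layout works at any output size. The sketch below converts between the 0-1000 grid and pixel coordinates; the left/top/right/bottom ordering and both function names are assumptions for illustration, not part of the documented format.

```python
def bbox_to_pixels(bbox, width, height, scale=1000):
    """Map a normalized (left, top, right, bottom) box to pixel coordinates."""
    x0, y0, x1, y1 = bbox
    return (round(x0 * width / scale), round(y0 * height / scale),
            round(x1 * width / scale), round(y1 * height / scale))


def pixels_to_bbox(box, width, height, scale=1000):
    """Map a pixel box from a design mockup back onto the normalized grid."""
    x0, y0, x1, y1 = box
    return (round(x0 * scale / width), round(y0 * scale / height),
            round(x1 * scale / width), round(y1 * scale / height))


# A headline band across the top of a 2048x1024 banner.
print(bbox_to_pixels((100, 50, 900, 200), width=2048, height=1024))
# → (205, 51, 1843, 205)

# A centered region measured in pixels, expressed on the 0-1000 grid.
print(pixels_to_bbox((512, 256, 1536, 768), width=2048, height=1024))
# → (250, 250, 750, 750)
```

Working in normalized units means a layout drafted for a square poster can be reused for a wide banner by changing only the output dimensions.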
Is Ideogram 4 the same as Ideogram 4 Ultra?
No. Ideogram 4 Ultra is a premium API tier on ideogram.ai. The open-weight model released June 3 is the Ideogram 4 base model. It uses the same architecture and training approach but targets local inference hardware budgets rather than the cloud premium tier.
Does ComfyUI expose the JSON prompting controls visually?
Yes. ComfyUI v0.24.0's Ideogram 4 integration exposes color conditioning, bounding-box layout, and text overlay parameters as visual node inputs. No raw JSON required; the nodes handle parameter serialization internally so you interact with standard ComfyUI node connections.
How does Ideogram 4 compare to FLUX models on text rendering?
In the ContraLabs evaluation judged by 10 professional designers, Ideogram 4 achieved a 47.9% first-place win rate on typography tasks compared to FLUX.2 [dev] at 15.5%. FLUX.2 has 32B parameters versus Ideogram 4's 9.3B, meaning Ideogram 4 produced more readable in-image text at roughly one-third the parameter count, with a correspondingly smaller memory footprint at the same precision.