The most downloaded text-to-image model on HuggingFace right now is not from OpenAI, Google, or Midjourney. It is Stable Diffusion XL, a model released in July 2023, still pulling 2.27 million downloads per month in March 2026.
We tracked the top 50 text-to-image models on HuggingFace by downloads and likes, cross-referenced the Artificial Analysis Image Arena ELO rankings, and compared pricing across every major commercial platform. Here is what the numbers actually say about who leads AI image generation in 2026.
1. Open-Source Models Still Dominate Actual Usage
If you look at raw download numbers, the picture is clear: open-weight models crush proprietary ones in real-world adoption. The top 10 most-downloaded text-to-image models on HuggingFace are all open-weight.
| Model | Monthly Downloads | Likes | Released |
|---|---|---|---|
| SDXL Base 1.0 | 2,269,426 | 7,539 | Jul 2023 |
| SD v1.5 | 1,595,405 | 1,049 | Oct 2022 (current repo: Aug 2024) |
| Z-Image Turbo | 876,154 | 4,276 | Nov 2025 |
| SDXL Turbo (Crynux) | 842,081 | 2 | Jul 2025 |
| SD Turbo | 776,718 | 443 | Nov 2023 |
| FLUX.1 [dev] | 754,240 | 12,474 | Jul 2024 |
| FLUX.1 [schnell] | 709,839 | 4,694 | Jul 2024 |
| SDXL Turbo | 607,126 | 2,538 | Nov 2023 |
| SD v1.4 | 497,628 | 6,985 | Aug 2022 |
SDXL alone draws nearly as many downloads as the next two models combined (2.27 million versus 2.47 million for SD v1.5 and Z-Image Turbo). And SD v1.4, released more than three and a half years ago, still pulls nearly half a million downloads monthly. Open-source image generation has deep, sticky infrastructure roots that newer models have not displaced.
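Comparisons like these are easy to check against the table. The sketch below is only an illustration: the download counts are copied from the table above, and the helper name `leader_vs_next` is made up for this example.

```python
# Monthly download counts copied from the table above.
downloads = {
    "SDXL Base 1.0": 2_269_426,
    "SD v1.5": 1_595_405,
    "Z-Image Turbo": 876_154,
    "SDXL Turbo (Crynux)": 842_081,
    "SD Turbo": 776_718,
    "FLUX.1 [dev]": 754_240,
    "FLUX.1 [schnell]": 709_839,
    "SDXL Turbo": 607_126,
    "SD v1.4": 497_628,
}


def leader_vs_next(counts: dict, n: int) -> tuple:
    """Return (downloads of the top model, combined downloads of the next n models)."""
    ranked = sorted(counts.values(), reverse=True)
    return ranked[0], sum(ranked[1:1 + n])


leader, next_two = leader_vs_next(downloads, 2)
listed_total = sum(downloads.values())

print(f"Top model: {leader:,}")
print(f"Next two combined: {next_two:,}")
print(f"Top model share of listed downloads: {leader / listed_total:.1%}")
```

Note that the share is relative to the nine models listed here, not to all text-to-image downloads on HuggingFace.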
2. FLUX Is the Community Favorite (by a Wide Margin)
FLUX.1 [dev] from Black Forest Labs has 12,474 likes on HuggingFace, more than any other text-to-image model. That is 65% more than SDXL (7,539 likes) despite having fewer raw downloads. The FLUX.1 [dev] Space on HuggingFace holds 9,405 likes, making it the fourth most-liked Space on the entire platform.
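One rough way to put likes and downloads on the same scale is likes per 1,000 monthly downloads. This is an illustrative metric of our own, not something HuggingFace reports, and it mixes lifetime likes with a single month of downloads, so treat it as a sketch rather than a rigorous engagement measure.

```python
# (likes, monthly downloads) copied from the table in section 1.
models = {
    "FLUX.1 [dev]": (12_474, 754_240),
    "SDXL Base 1.0": (7_539, 2_269_426),
}


def likes_per_thousand(likes: int, downloads: int) -> float:
    """Illustrative ratio: likes per 1,000 monthly downloads."""
    return 1000 * likes / downloads


# Relative lead in raw likes (the 65% figure quoted above).
like_gap = models["FLUX.1 [dev]"][0] / models["SDXL Base 1.0"][0] - 1
print(f"FLUX.1 [dev] has {like_gap:.0%} more likes than SDXL")

for name, (likes, monthly_downloads) in models.items():
    ratio = likes_per_thousand(likes, monthly_downloads)
    print(f"{name}: {ratio:.1f} likes per 1,000 downloads")
```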
What separates FLUX from older Stable Diffusion models is its architecture. At 12 billion parameters, it uses rectified flow matching rather than traditional diffusion, which translates to better prompt adherence and fewer inference steps. FLUX.1 [schnell] generates usable images in 1 to 4 steps under an Apache 2.0 license, making it the fastest fully open-source option available.
FLUX 1.1 Pro, the commercial variant, sits at the top of the Artificial Analysis text-to-image leaderboard alongside GPT Image 1.5, ahead of Midjourney 6.1 and Ideogram v2 in blind human preference tests. It generates photorealistic images in about 4.5 seconds at $0.04 to $0.05 per image through the API.
3. Alibaba's Z-Image Turbo Is the Breakout Story of 2026
Z-Image Turbo from Alibaba's Tongyi Lab appeared out of nowhere in November 2025 and rocketed to 876,154 monthly downloads, landing third on our list, ahead of SDXL Turbo, SD Turbo, and both FLUX variants.
The model has just 6 billion parameters, yet it is reported to match closed-source models with more than 20 billion parameters. It generates images in 8 inference steps, achieves sub-second latency on H800 GPUs, and runs smoothly on consumer cards with 16GB VRAM. It excels at photorealistic portraits and can accurately render both Chinese and English text. Alibaba followed up in January 2026 with Z-Image-Base, a non-distilled variant that trades speed for maximum output quality at 30 to 50 sampling steps.
With 4,276 likes in under four months, Z-Image Turbo is the fastest-growing open-weight image model since FLUX.
4. The Arena Rankings Tell a Different Story Than Downloads
Downloads measure infrastructure adoption. ELO rankings from Artificial Analysis and LM Arena measure perceived output quality through blind human preference voting. The two metrics diverge sharply.
| Model | ELO Score | Type |
|---|---|---|
| GPT Image 1.5 (high) | 1,265 | Proprietary |
| FLUX 1.1 Pro | ~1,265 | Proprietary |
| Nano Banana 2 (Gemini 3.1 Flash) | 1,258 | Proprietary |
| Nano Banana Pro (Gemini 3 Pro) | 1,214 | Proprietary |
| Midjourney v6.1 | ~1,180 | Proprietary |
| Ideogram v2 | ~1,160 | Proprietary |
The gap between the top score (1,265) and the lowest score in the table (about 1,160) is just over 100 ELO points. For most practical creative work, that difference is negligible. The arena is converging: quality is no longer the differentiator it used to be.
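To see what a gap of that size means head to head, it can be converted into an expected preference rate. The sketch below assumes the standard logistic Elo formula with a 400-point scale, as used in chess. The arenas may compute ratings with a different model, so the result is a ballpark figure only.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected win rate of A over B under the standard logistic Elo formula.

    The 400-point scale is an assumption (the chess convention), not a
    documented parameter of any particular image arena.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


top_score = 1265      # highest score in the table above
lowest_listed = 1160  # lowest (approximate) score in the table above

p = elo_expected_score(top_score, lowest_listed)
print(f"{top_score} vs {lowest_listed}: top model preferred in about {p:.0%} of pairings")
```

Under this assumption, a gap of roughly 100 points still means the lower-rated model wins a substantial minority of blind comparisons.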
5. Pricing Has Fragmented Into Three Tiers
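The commercial image generation market has settled into three pricing brackets: monthly subscriptions (roughly $10 to $30), pay-per-image APIs (around $0.04 per image), and free self-hosted open-weight models. The gap between tiers is widening.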
| Platform | Entry Price | Best Value Plan | Notes |
|---|---|---|---|
| Midjourney | $10/mo | Standard $30/mo | 3.3h Fast GPU (Basic), unlimited Relax (Standard+) |
| DALL-E 3 / GPT Image | $0.04/image | ChatGPT Plus $20/mo | GPT Image 1.5 leads arena; DALL-E 3 legacy at $0.04-0.12 |
| Adobe Firefly | Free (limited) | Standard $9.99/mo | 2,000 premium credits; unlimited standard gens |
| Ideogram | Free (limited) | Plus $15/mo | Strong text rendering; Pro at $20/mo |
| FLUX 1.1 Pro | $0.04/image | API-only | Top arena score; available via Replicate, fal.ai, Together |
| SDXL / FLUX [schnell] | $0 (self-host) | $0 + compute | Run locally on consumer GPU; full control |
The real story: for anyone with a GPU, the cost of generating a single image has effectively hit zero. The free tier of open-source has caught up to where paid models were 18 months ago.
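A simple way to compare the subscription and per-image brackets is the break-even volume: how many images per month you would need to generate before a flat fee matches per-image billing. The sketch below uses prices from the table above, expressed in cents to avoid floating-point rounding. It ignores plan quotas, resolution differences, and self-hosting compute costs, so read it as a rough guide.

```python
def break_even_images(monthly_fee_cents: int, per_image_cents: int) -> int:
    """Smallest monthly image count at which per-image billing costs at least the flat fee."""
    return -(-monthly_fee_cents // per_image_cents)  # ceiling division on integers


PER_IMAGE_CENTS = 4  # $0.04/image, the per-image entry price in the table

# Monthly fees in cents, taken from the table above.
plans = {
    "Midjourney Basic": 1000,
    "ChatGPT Plus": 2000,
    "Midjourney Standard": 3000,
}

for plan, fee_cents in plans.items():
    n = break_even_images(fee_cents, PER_IMAGE_CENTS)
    print(f"{plan}: break-even at {n} images/month")
```

Below the break-even volume, pay-per-image is cheaper; above it, the subscription wins, provided the plan's limits allow that many generations.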
6. Old Models Refuse to Die
One of the most surprising patterns in the data is the persistence of older models. Stable Diffusion v1.4 (August 2022) still records 497,628 monthly downloads. SD v1.5 pulls 1.60 million. Together, models first released before 2024 account for over 60% of downloads among the top models in our table, counting SD v1.5 by its original 2022 release.
This is not nostalgia. These models have massive ecosystems of LoRA adapters, ControlNet integrations, and ComfyUI workflows built on top of them. Switching to a newer architecture means rebuilding that entire stack. For production pipelines and established workflows, the cost of migration outweighs the quality gains.
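The pre-2024 share quoted above depends on how SD v1.5 is dated, so it is worth computing both ways. The sketch below covers only the nine models in the section 1 table, not all text-to-image downloads on HuggingFace. SD v1.5 is counted once by the August 2024 date of its current repository and once by its original 2022 release. The function name `share_before` is made up for this example.

```python
# (name, monthly downloads, release year) from the table in section 1.
models = [
    ("SDXL Base 1.0", 2_269_426, 2023),
    ("SD v1.5", 1_595_405, 2024),  # 2024 = date of the current repository
    ("Z-Image Turbo", 876_154, 2025),
    ("SDXL Turbo (Crynux)", 842_081, 2025),
    ("SD Turbo", 776_718, 2023),
    ("FLUX.1 [dev]", 754_240, 2024),
    ("FLUX.1 [schnell]", 709_839, 2024),
    ("SDXL Turbo", 607_126, 2023),
    ("SD v1.4", 497_628, 2022),
]


def share_before(rows, cutoff_year, overrides=None):
    """Fraction of listed downloads from models released before cutoff_year.

    overrides maps a model name to an alternative release year.
    """
    overrides = overrides or {}
    total = sum(dl for _, dl, _ in rows)
    old = sum(dl for name, dl, year in rows if overrides.get(name, year) < cutoff_year)
    return old / total


by_repo_date = share_before(models, 2024)
by_original_release = share_before(models, 2024, {"SD v1.5": 2022})

print(f"Pre-2024 share, SD v1.5 dated 2024: {by_repo_date:.1%}")
print(f"Pre-2024 share, SD v1.5 dated 2022: {by_original_release:.1%}")
```

The two conventions land on opposite sides of 50%, which is why the dating of SD v1.5 matters for any claim about older models.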
7. The Spaces Leaderboard Reveals What People Actually Use
HuggingFace Spaces shows interactive demos where users generate images directly. The most-liked image generation Spaces tell us what is capturing real user attention.
| Space | Likes | Type |
|---|---|---|
| Kolors Virtual Try-On | 10,011 | Virtual try-on |
| FLUX.1 [dev] | 9,405 | Text-to-image |
| DALL-E Mini | 5,668 | Text-to-image (legacy) |
| IllusionDiffusion | 5,382 | Optical illusion art |
| FLUX.1 [schnell] | 5,046 | Fast text-to-image |
| InstantID | 3,576 | Face-driven generation |
Two things stand out. First, FLUX claims two of the top five image Spaces. Second, the most-liked image Space overall is not a text-to-image generator at all. It is Kolors Virtual Try-On from Kwai, a specialized application that outranks pure generation models. Applied AI tools that solve specific creative problems are attracting more user engagement than general-purpose generators.
Trend Analysis
Rising
- FLUX ecosystem: Combined 1.46M monthly downloads across dev and schnell, top arena rankings via Pro variant, highest community engagement on HuggingFace. Black Forest Labs is winning both the open-source and commercial fronts simultaneously.
- Chinese open-weight models: Z-Image Turbo's rapid ascent shows Alibaba investing heavily in open image generation and suggests that peers such as Tencent and Baidu may follow. Expect more models from Chinese labs targeting the global open-source community in 2026.
- GPT-native image generation: OpenAI's GPT Image 1.5 leading the arena rankings shows that multimodal LLMs generating images natively (rather than calling a separate diffusion model) is becoming the quality standard for consumers.
Stable
- SDXL infrastructure: At 2.27M monthly downloads, SDXL is the backbone of production image pipelines worldwide. Its ecosystem is too entrenched to be displaced quickly, even by superior models.
- Midjourney: Still the default recommendation for non-technical users who want quality without touching a terminal. But the lack of an open model or API flexibility keeps it from competing in the developer ecosystem.
- Subscription pricing: The $10 to $30/month range has become industry standard. No major platform has undercut this range, and none has raised prices significantly.
Emerging
- Applied generation tools: Virtual try-on, illusion art, face-driven generation, and other specialized applications are growing faster than general text-to-image. The next wave of image AI is about solving specific creative problems, not raw generation quality.
- Distilled and turbo models: Z-Image Turbo, SDXL Turbo, and FLUX [schnell] all prioritize speed over maximum quality. For real-time applications and batch processing, generation in 1 to 8 steps is becoming the new normal.
- Local-first workflows: With models like FLUX [schnell] running on consumer GPUs and Z-Image Turbo fitting in 16GB VRAM, the shift from cloud APIs to local generation is accelerating among professional creators.
Predictions
- FLUX 2.0 will ship before July 2026 and will target 20+ billion parameters with native video frame generation. Black Forest Labs has the funding, the team (founders who came from Stability AI), and the momentum.
- At least two more Chinese open-weight image models will crack the HuggingFace top 10 by downloads before the end of 2026. Z-Image Turbo proved the demand exists; Tencent and ByteDance are next.
- Midjourney will release an API by Q3 2026. The pressure from FLUX Pro and GPT Image on the developer side, combined with Ideogram and Adobe Firefly on the consumer side, makes this inevitable.
- SDXL downloads will still exceed 1 million per month in December 2026. Ecosystem lock-in is that powerful. LoRA libraries, ComfyUI nodes, and production pipelines will keep it relevant for at least another year.
- Arena ELO scores for the top 5 models will compress to within 50 points by year-end. Quality convergence is the defining trend, and it will push competition toward speed, price, and specialization instead.
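The last prediction can be tracked with a one-line metric: the spread between the highest and fifth-highest score. The sketch below uses the scores from the arena table in section 4, where values marked "~" are approximate. The helper name `top_n_spread` is made up for this example.

```python
# Scores from the arena table in section 4 ("~" values there are approximate).
arena = {
    "GPT Image 1.5 (high)": 1265,
    "FLUX 1.1 Pro": 1265,
    "Nano Banana 2": 1258,
    "Nano Banana Pro": 1214,
    "Midjourney v6.1": 1180,
    "Ideogram v2": 1160,
}


def top_n_spread(scores: dict, n: int) -> int:
    """Difference between the best score and the n-th best score."""
    ranked = sorted(scores.values(), reverse=True)[:n]
    return ranked[0] - ranked[-1]


spread = top_n_spread(arena, 5)
target = 50  # the threshold named in the prediction

print(f"Current top-5 spread: {spread} points (prediction target: {target})")
print(f"Compression still needed: {max(spread - target, 0)} points")
```

Rerunning this against a year-end snapshot of the leaderboard would show whether the prediction held.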
What This Means for Creators
If you are just starting out: Try Adobe Firefly (free tier) or Ideogram (free tier) to learn prompt craft without spending anything. Graduate to Midjourney Standard ($30/mo) when you need consistent quality for client work.
If you need API access: FLUX 1.1 Pro at $0.04/image offers the best quality-to-price ratio available right now. GPT Image 1.5 is the strongest alternative if you are already in the OpenAI ecosystem.
If you want full control: Run FLUX.1 [schnell] locally (Apache 2.0, no restrictions) or Z-Image Turbo if you have 16GB VRAM. Both generate production-quality images at zero marginal cost.
If you have existing SD pipelines: Do not rush to migrate. SDXL and SD 1.5 still work, and the LoRA/ControlNet ecosystem around them has no equivalent on newer architectures yet. When FLUX LoRA support matures, that will be the time to switch.
The bottom line: Quality differences between the top models have shrunk to the point where they barely matter for most creative work. Choose your image generation tool based on price, speed, workflow integration, and control over your pipeline. The era of one model being clearly "the best" is over.
Full Data Summary
| Model | Org | Downloads/mo | HF Likes | Arena ELO | License |
|---|---|---|---|---|---|
| SDXL Base 1.0 | Stability AI | 2,269,426 | 7,539 | N/A | Open (CreativeML) |
| SD v1.5 | Community | 1,595,405 | 1,049 | N/A | Open (CreativeML) |
| Z-Image Turbo | Alibaba Tongyi | 876,154 | 4,276 | N/A | Open |
| FLUX.1 [dev] | Black Forest Labs | 754,240 | 12,474 | ~1,245 | Non-commercial |
| FLUX.1 [schnell] | Black Forest Labs | 709,839 | 4,694 | N/A | Apache 2.0 |
| SDXL Turbo | Stability AI | 607,126 | 2,538 | N/A | Open |
| SD v1.4 | CompVis | 497,628 | 6,985 | N/A | Open (CreativeML) |
| GPT Image 1.5 | OpenAI | N/A | N/A | 1,265 | Proprietary |
| FLUX 1.1 Pro | Black Forest Labs | N/A | N/A | ~1,265 | Proprietary API |
| Midjourney v6.1 | Midjourney | N/A | N/A | ~1,180 | Proprietary |
This research was produced by Creative AI News.