The most downloaded text-to-image model on HuggingFace right now is not from OpenAI, Google, or Midjourney. It is Stable Diffusion XL, a model released in July 2023, still pulling 2.27 million downloads per month in March 2026.

We tracked the top 50 text-to-image models on HuggingFace by downloads and likes, cross-referenced the Artificial Analysis Image Arena ELO rankings, and compared pricing across every major commercial platform. Here is what the numbers actually say about who leads AI image generation in 2026.

1. Open-Source Models Still Dominate Actual Usage

If you look at raw download numbers, the picture is clear: open-weight models crush proprietary ones in real-world adoption. The top 10 most-downloaded text-to-image models on HuggingFace are all open-weight.

Most downloaded text-to-image models on HuggingFace (March 2026)
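| Model | Monthly Downloads | Likes | Released |
|---|---|---|---|
| SDXL Base 1.0 | 2,269,426 | 7,539 | Jul 2023 |
| SD v1.5 | 1,595,405 | 1,049 | Oct 2022 (re-uploaded Aug 2024) |
| Z-Image Turbo | 876,154 | 4,276 | Nov 2025 |
| SDXL Turbo (Crynux) | 842,081 | 2 | Jul 2025 |
| SD Turbo | 776,718 | 443 | Nov 2023 |
| FLUX.1 [dev] | 754,240 | 12,474 | Jul 2024 |
| FLUX.1 [schnell] | 709,839 | 4,694 | Jul 2024 |
| SDXL Turbo | 607,126 | 2,538 | Nov 2023 |
| SD v1.4 | 497,628 | 6,985 | Aug 2022 |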

SDXL alone draws about 42% more downloads than second-place SD v1.5, and nearly as many as the next two models combined (2.27 million versus 2.47 million). SD v1.4, released more than three and a half years ago, still pulls nearly half a million downloads a month. Open-source image generation has deep, sticky infrastructure roots that newer models have not displaced.
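The relationships in the table can be checked with a few lines of Python. This is a minimal sketch that hard-codes the March 2026 download figures from the table above; the names `DOWNLOADS` and `share_of_listed` are illustrative, and the shares are relative to the listed models only, not to all of HuggingFace.

```python
# Monthly downloads copied from the table above (March 2026 snapshot).
DOWNLOADS = {
    "SDXL Base 1.0": 2_269_426,
    "SD v1.5": 1_595_405,
    "Z-Image Turbo": 876_154,
    "SDXL Turbo (Crynux)": 842_081,
    "SD Turbo": 776_718,
    "FLUX.1 [dev]": 754_240,
    "FLUX.1 [schnell]": 709_839,
    "SDXL Turbo": 607_126,
    "SD v1.4": 497_628,
}


def share_of_listed(model: str) -> float:
    """Return one model's share of downloads among the listed models only."""
    return DOWNLOADS[model] / sum(DOWNLOADS.values())


# Rank models from most to least downloaded.
ranked = sorted(DOWNLOADS.items(), key=lambda item: item[1], reverse=True)
leader, leader_downloads = ranked[0]
runner_up_downloads = ranked[1][1]
next_two_combined = ranked[1][1] + ranked[2][1]

print(f"{leader}: {share_of_listed(leader):.1%} of listed downloads")
print(f"Lead over second place: {leader_downloads / runner_up_downloads - 1:.0%}")
print(f"Next two combined: {next_two_combined:,}")
```

On these figures, SDXL holds roughly a quarter of the listed downloads and leads SD v1.5 by about 42%, while SD v1.5 and Z-Image Turbo together total 2,471,559.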

2. FLUX Is the Community Favorite (by a Wide Margin)

FLUX.1 [dev] from Black Forest Labs has 12,474 likes on HuggingFace, more than any other text-to-image model. That is 65% more than SDXL (7,539 likes) despite having fewer raw downloads. The FLUX.1 [dev] Space on HuggingFace holds 9,405 likes, making it the fourth most-liked Space on the entire platform.
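One way to express the gap between enthusiasm and raw usage is likes per thousand downloads. The sketch below uses only the like and download counts quoted in this article; the metric itself and the function name are illustrative choices, not something HuggingFace reports.

```python
def likes_per_thousand_downloads(likes: int, downloads: int) -> float:
    """Likes received for every 1,000 monthly downloads."""
    return 1000 * likes / downloads


# Figures from the downloads table (March 2026 snapshot).
flux_dev = {"likes": 12_474, "downloads": 754_240}
sdxl = {"likes": 7_539, "downloads": 2_269_426}

# Relative lead in likes: FLUX.1 [dev] over SDXL Base 1.0.
like_lead = flux_dev["likes"] / sdxl["likes"] - 1

flux_rate = likes_per_thousand_downloads(**flux_dev)
sdxl_rate = likes_per_thousand_downloads(**sdxl)

print(f"FLUX.1 [dev] has {like_lead:.0%} more likes than SDXL")
print(f"Likes per 1,000 downloads: FLUX {flux_rate:.1f}, SDXL {sdxl_rate:.1f}")
```

By this measure FLUX.1 [dev] collects roughly five times as many likes per download as SDXL, which fits the picture of a model that is admired more widely than it is deployed.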

What separates FLUX from older Stable Diffusion models is its architecture. At 12 billion parameters, it uses rectified flow matching rather than traditional diffusion, which translates to better prompt adherence and fewer inference steps. FLUX.1 [schnell] generates usable images in 1 to 4 steps under an Apache 2.0 license, making it the fastest fully open-source option available.

FLUX 1.1 Pro, the commercial variant, sits at the top of the Artificial Analysis text-to-image leaderboard, level with GPT Image 1.5 at roughly 1,265 ELO and ahead of Midjourney 6.1 and Ideogram v2 in blind human preference tests. It generates photorealistic images in about 4.5 seconds at $0.04 to $0.05 per image through the API.

3. Alibaba's Z-Image Turbo Is the Breakout Story of 2026

Z-Image Turbo from Alibaba's Tongyi Lab appeared out of nowhere in November 2025 and rocketed to 876,154 monthly downloads, landing third on our list above SDXL Turbo, SD Turbo, and both FLUX variants.

The model has just 6 billion parameters, yet it reportedly matches closed-source models with 20+ billion parameters. It generates images in 8 inference steps, achieves sub-second latency on H800 GPUs, and runs smoothly on consumer cards with 16GB VRAM. It excels at photorealistic portraits and can accurately render both Chinese and English text. Alibaba followed up in January 2026 with Z-Image-Base, a non-distilled variant that trades speed for maximum output quality at 30 to 50 sampling steps.
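A back-of-the-envelope estimate shows why a 6-billion-parameter model is plausible on a 16GB card. The sketch below assumes 16-bit weights (2 bytes per parameter) and counts the weights alone; the text encoder, VAE, and activations are ignored, so real usage will be higher. These are illustrative assumptions, not published figures for Z-Image Turbo.

```python
def weight_memory_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes.

    Assumes every parameter is stored at the same precision
    (2 bytes = 16-bit, 1 byte = 8-bit quantized).
    """
    return params_billions * 1e9 * bytes_per_param / 1e9


VRAM_GB = 16  # the consumer-card budget mentioned in the article

for name, params in [("6B model", 6), ("12B model", 12)]:
    needed = weight_memory_gb(params)
    verdict = "fits" if needed < VRAM_GB else "does not fit"
    print(f"{name}: ~{needed:.0f} GB of 16-bit weights, {verdict} in {VRAM_GB} GB")
```

Under these assumptions a 6B model needs about 12 GB for weights, leaving some headroom on a 16GB card, while a 12B model at the same precision needs about 24 GB and would require quantization or offloading.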

With 4,276 likes in under four months, Z-Image Turbo is the fastest-growing open-weight image model since FLUX.

4. The Arena Rankings Tell a Different Story Than Downloads

Downloads measure infrastructure adoption. ELO rankings from Artificial Analysis and LM Arena measure perceived output quality through blind human preference voting. The two metrics diverge sharply.

Top image models by ELO rating (Artificial Analysis / LM Arena, March 2026)
| Model | ELO Score | Type |
|---|---|---|
| GPT Image 1.5 (high) | 1,265 | Proprietary |
| FLUX 1.1 Pro | ~1,265 | Proprietary |
| Nano Banana 2 (Gemini 3.1 Flash) | 1,258 | Proprietary |
| Nano Banana Pro (Gemini 3 Pro) | 1,214 | Proprietary |
| Midjourney v6.1 | ~1,180 | Proprietary |
| Ideogram v2 | ~1,160 | Proprietary |

The gap between the top model (1,265) and the lowest-ranked model in the table (Ideogram v2, at about 1,160) is just over 100 ELO points. For most practical creative work, that difference is negligible. The arena is converging: quality is no longer the differentiator it used to be.
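To put an ELO gap in concrete terms, the standard Elo expected-score formula converts a rating difference into an expected head-to-head preference rate. The sketch below assumes the arena uses the conventional 400-point logistic scale, which this article does not confirm; `expected_win_rate` is an illustrative name.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Expected share of pairwise votes won by model A against model B.

    Standard Elo expected score on the conventional 400-point scale.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


top = 1265     # GPT Image 1.5 (high), from the table above
lowest = 1160  # Ideogram v2 (approximate), from the table above

print(f"105-point gap: {expected_win_rate(top, lowest):.1%}")
print(f" 50-point gap: {expected_win_rate(top, top - 50):.1%}")
print(f"  0-point gap: {expected_win_rate(top, top):.1%}")
```

Under that assumption, a 105-point gap corresponds to the higher-rated model being preferred in roughly 65% of pairwise votes, and a 50-point gap to roughly 57%.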

5. Pricing Has Fragmented Into Three Tiers

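The commercial image generation market has settled into three pricing brackets: monthly subscriptions (roughly $10 to $30), pay-per-image APIs (about $0.04 per image), and self-hosted open-weight models ($0 plus compute). The gap between tiers is widening.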

Commercial image generation pricing (March 2026)
| Platform | Entry Price | Best Value Plan | Notes |
|---|---|---|---|
| Midjourney | $10/mo | Standard $30/mo | 3.3h Fast GPU (Basic), unlimited Relax (Standard+) |
| DALL-E 3 / GPT Image | $0.04/image | ChatGPT Plus $20/mo | GPT Image 1.5 leads arena; DALL-E 3 legacy at $0.04-0.12 |
| Adobe Firefly | Free (limited) | Standard $9.99/mo | 2,000 premium credits; unlimited standard gens |
| Ideogram | Free (limited) | Plus $15/mo | Strong text rendering; Pro at $20/mo |
| FLUX 1.1 Pro | $0.04/image | API-only | Top arena score; available via Replicate, fal.ai, Together |
| SDXL / FLUX [schnell] | $0 (self-host) | $0 + compute | Run locally on consumer GPU; full control |

The real story: for anyone with a capable GPU, the marginal cost of generating a single image has effectively hit zero, with hardware and electricity the only remaining expenses. Free open-source models have caught up to where paid models were 18 months ago.
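A quick break-even calculation shows where a flat subscription and per-image billing cost the same. This sketch uses only the prices in the table above; it compares billing models rather than output quality, and it does not model plan limits such as Fast GPU hours. The function name is illustrative.

```python
import math


def breakeven_images(monthly_fee_usd: float, price_per_image_usd: float) -> int:
    """Images per month at which a subscription and per-image billing cost the same."""
    # Work in whole cents to avoid floating-point rounding surprises.
    fee_cents = round(monthly_fee_usd * 100)
    price_cents = round(price_per_image_usd * 100)
    return math.ceil(fee_cents / price_cents)


# Subscription prices versus the $0.04 to $0.05 per-image API range.
for plan, fee in [("Midjourney Basic", 10), ("Midjourney Standard", 30)]:
    low = breakeven_images(fee, 0.05)
    high = breakeven_images(fee, 0.04)
    print(f"{plan} (${fee}/mo) breaks even at {low} to {high} API images per month")
```

Below those volumes, paying per image is cheaper; above them, the subscription wins on price alone.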

6. Old Models Refuse to Die

One of the most surprising patterns in the data is the persistence of older models. Stable Diffusion v1.4 (August 2022) still records 497,628 monthly downloads. SD v1.5 pulls 1.60 million. Together, models released before 2024 account for over 60% of total text-to-image downloads on HuggingFace.

This is not nostalgia. These models have massive ecosystems of LoRA adapters, ControlNet integrations, and ComfyUI workflows built on top of them. Switching to a newer architecture means rebuilding that entire stack. For production pipelines and established workflows, the cost of migration outweighs the quality gains.
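The share held by older models can be approximated from the nine models in the downloads table. The sketch below covers only those nine, not the full top 50, so it is an approximation of the figure quoted above. It also assumes SD v1.5 counts by its original October 2022 release rather than its 2024 re-upload, and it takes the Crynux SDXL Turbo upload at its listed 2025 date.

```python
# (model, monthly downloads, release year used for grouping)
MODELS = [
    ("SDXL Base 1.0", 2_269_426, 2023),
    ("SD v1.5", 1_595_405, 2022),  # assumption: original release, not the re-upload
    ("Z-Image Turbo", 876_154, 2025),
    ("SDXL Turbo (Crynux)", 842_081, 2025),  # assumption: listed upload date
    ("SD Turbo", 776_718, 2023),
    ("FLUX.1 [dev]", 754_240, 2024),
    ("FLUX.1 [schnell]", 709_839, 2024),
    ("SDXL Turbo", 607_126, 2023),
    ("SD v1.4", 497_628, 2022),
]


def share_released_before(year: int) -> float:
    """Fraction of listed downloads going to models released before `year`."""
    total = sum(downloads for _, downloads, _ in MODELS)
    older = sum(downloads for _, downloads, released in MODELS if released < year)
    return older / total


print(f"Released before 2024: {share_released_before(2024):.1%} of listed downloads")
```

With these groupings, pre-2024 models take about 64% of downloads among the listed nine, in line with the "over 60%" figure. Counting SD v1.5 by its re-upload date instead would drop the share below half.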

7. The Spaces Leaderboard Reveals What People Actually Use

HuggingFace Spaces hosts interactive demos where users generate images directly. The most-liked image generation Spaces show what is capturing real user attention.

Top image-related HuggingFace Spaces by likes
| Space | Likes | Type |
|---|---|---|
| Kolors Virtual Try-On | 10,011 | Virtual try-on |
| FLUX.1 [dev] | 9,405 | Text-to-image |
| DALL-E Mini | 5,668 | Text-to-image (legacy) |
| IllusionDiffusion | 5,382 | Optical illusion art |
| FLUX.1 [schnell] | 5,046 | Fast text-to-image |
| InstantID | 3,576 | Face-driven generation |

Two things stand out. First, FLUX claims two of the top five image Spaces. Second, the most-liked image Space overall is not a text-to-image generator at all. It is Kolors Virtual Try-On from Kwai, a specialized application that outranks pure generation models. Applied AI tools that solve specific creative problems are attracting more user engagement than general-purpose generators.

Trend Analysis

Rising

  • FLUX ecosystem: Combined 1.46M monthly downloads across dev and schnell, top arena rankings via Pro variant, highest community engagement on HuggingFace. Black Forest Labs is winning both the open-source and commercial fronts simultaneously.
  • Chinese open-weight models: Z-Image Turbo's rapid ascent signals that Alibaba, Tencent, and Baidu are investing heavily in open image generation. Expect more models from Chinese labs targeting the global open-source community in 2026.
  • GPT-native image generation: OpenAI's GPT Image 1.5 leading the arena rankings shows that native image generation by multimodal LLMs (rather than calls to a separate diffusion model) is becoming the quality standard for consumers.

Stable

  • SDXL infrastructure: At 2.27M monthly downloads, SDXL is the backbone of production image pipelines worldwide. Its ecosystem is too entrenched to be displaced quickly, even by superior models.
  • Midjourney: Still the default recommendation for non-technical users who want quality without touching a terminal. But the lack of an open model or API flexibility keeps it from competing in the developer ecosystem.
  • Subscription pricing: The $10 to $30/month range has become industry standard. No major platform has undercut this range, and none has raised prices significantly.

Emerging

  • Applied generation tools: Virtual try-on, illusion art, face-driven generation, and other specialized applications are growing faster than general text-to-image. The next wave of image AI is about solving specific creative problems, not raw generation quality.
  • Distilled and turbo models: Z-Image Turbo, SDXL Turbo, and FLUX [schnell] all prioritize speed over maximum quality. For real-time applications and batch processing, 4 to 8 step generation is becoming the new normal.
  • Local-first workflows: With models like FLUX [schnell] running on consumer GPUs and Z-Image Turbo fitting in 16GB VRAM, the shift from cloud APIs to local generation is accelerating among professional creators.
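The speed advantage of distilled models can be roughed out from step counts alone. The sketch below compares the 8 steps quoted for Z-Image Turbo with the 30 to 50 steps quoted for Z-Image-Base, and it assumes every sampling step costs about the same on both models. That assumption is a simplification (guidance settings and model size also affect per-step cost), so treat the ratios as indicative only.

```python
def step_reduction(base_steps: int, turbo_steps: int) -> float:
    """How many times fewer sampling steps the distilled model needs."""
    return base_steps / turbo_steps


TURBO_STEPS = 8  # Z-Image Turbo, as quoted in the article

for base_steps in (30, 50):  # Z-Image-Base range, as quoted in the article
    ratio = step_reduction(base_steps, TURBO_STEPS)
    print(f"{base_steps} steps vs {TURBO_STEPS}: {ratio:.2f}x fewer steps")
```

On step count alone, the distilled variant does somewhere between a quarter and a sixth of the sampling work per image.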

Predictions

  1. FLUX 2.0 will ship before July 2026 and will target 20+ billion parameters with native video frame generation. Black Forest Labs has the funding, the team (its founders came from Stability AI), and the momentum.
  2. At least two more Chinese open-weight image models will crack the HuggingFace top 10 by downloads before the end of 2026. Z-Image Turbo proved the demand exists; Tencent and ByteDance are next.
  3. Midjourney will release an API by Q3 2026. The pressure from FLUX Pro and GPT Image on the developer side, combined with Ideogram and Adobe Firefly on the consumer side, makes this inevitable.
  4. SDXL downloads will still exceed 1 million per month in December 2026. Ecosystem lock-in is that powerful. LoRA libraries, ComfyUI nodes, and production pipelines will keep it relevant for at least another year.
  5. Arena ELO scores for the top 5 models will compress to within 50 points by year-end. Quality convergence is the defining trend, and it will push competition toward speed, price, and specialization instead.

What This Means for Creators

If you are just starting out: Try Adobe Firefly (free tier) or Ideogram (free tier) to learn prompt craft without spending anything. Graduate to Midjourney Standard ($30/mo) when you need consistent quality for client work.

If you need API access: FLUX 1.1 Pro at $0.04/image offers the best quality-to-price ratio available right now. GPT Image 1.5 is the strongest alternative if you are already in the OpenAI ecosystem.

If you want full control: Run FLUX.1 [schnell] locally (Apache 2.0, no restrictions) or Z-Image Turbo if you have 16GB VRAM. Both generate production-quality images at zero marginal cost.

If you have existing SD pipelines: Do not rush to migrate. SDXL and SD 1.5 still work, and the LoRA/ControlNet ecosystem around them has no equivalent on newer architectures yet. When FLUX LoRA support matures, that will be the time to switch.

The bottom line: Quality differences between the top models have shrunk to the point where they barely matter for most creative work. Choose your image generation tool based on price, speed, workflow integration, and control over your pipeline. The era of one model being clearly "the best" is over.

Full Data Summary

All tracked models and key metrics
| Model | Org | Downloads/mo | HF Likes | Arena ELO | License |
|---|---|---|---|---|---|
| SDXL Base 1.0 | Stability AI | 2,269,426 | 7,539 | N/A | Open (CreativeML) |
| SD v1.5 | Community | 1,595,405 | 1,049 | N/A | Open (CreativeML) |
| Z-Image Turbo | Alibaba Tongyi | 876,154 | 4,276 | N/A | Open |
| FLUX.1 [dev] | Black Forest Labs | 754,240 | 12,474 | ~1,245 | Non-commercial |
| FLUX.1 [schnell] | Black Forest Labs | 709,839 | 4,694 | N/A | Apache 2.0 |
| SDXL Turbo | Stability AI | 607,126 | 2,538 | N/A | Open |
| SD v1.4 | CompVis | 497,628 | 6,985 | N/A | Open (CreativeML) |
| GPT Image 1.5 | OpenAI | N/A | N/A | 1,265 | Proprietary |
| FLUX 1.1 Pro | Black Forest Labs | N/A | N/A | ~1,265 | Proprietary API |
| Midjourney v6.1 | Midjourney | N/A | N/A | ~1,180 | Proprietary |

This research was produced by Creative AI News.
