Researchers from Tongji University, Tencent, and five other institutions released MegaStyle, a 1.4-million-image dataset purpose-built for style transfer, alongside a FLUX-based model that applies artistic styles to new images. The dataset provides 170,000 style prompts combined with 400,000 content prompts, creating up to 68 billion potential training pairs.
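The 68-billion figure is just the cross product of the two prompt pools, which a quick sanity check confirms:

```python
# Cross-product arithmetic behind the "68 billion potential pairs" claim.
style_prompts = 170_000
content_prompts = 400_000
potential_pairs = style_prompts * content_prompts
print(potential_pairs)  # 68000000000, i.e. 68 billion
```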
What Happened
MegaStyle addresses a core problem in AI style transfer: existing datasets are too small, inconsistently labeled, or lacking in diversity. The team built a scalable data curation pipeline that uses text-to-image models to generate images matching specific style descriptions, drawing source material from JourneyDB (1M images), WikiArt (80K), and LAION-Aesthetics (1M).
The project ships two tools. MegaStyle-FLUX is a diffusion model trained on the full dataset that takes a reference style image and applies it to new content. MegaStyle-Encoder is a style-specialized image encoder fine-tuned with contrastive learning for measuring style similarity and retrieving matching styles.
Why It Matters
Style transfer has been possible for years, but quality and consistency have lagged behind other generative AI capabilities. MegaStyle's approach of building a massive, structured dataset first and then training models on it produces measurably better results. The encoder achieves 87.26 mAP@1 on the StyleRetrieval benchmark, with 97.61 Recall@10 for finding similar styles.
For designers and illustrators, the FLUX-based model offers a way to apply an artistic style from a single reference image to new content with higher fidelity than current alternatives. The encoder adds the ability to search large image collections by visual style rather than just by content or keywords.
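Style-based search of this kind typically boils down to nearest-neighbor lookup in the encoder's embedding space. The sketch below assumes a hypothetical gallery of precomputed embeddings (the real system would use MegaStyle-Encoder outputs; the 3-d vectors here are stand-ins):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve_by_style(query_emb, gallery, top_k=10):
    """Rank gallery images by style similarity to the query embedding.

    `gallery` maps image ids to embeddings from a style encoder
    (MegaStyle-Encoder in the paper; any style embedding works here).
    """
    scored = sorted(gallery.items(),
                    key=lambda kv: cosine(query_emb, kv[1]),
                    reverse=True)
    return [img_id for img_id, _ in scored[:top_k]]

# Toy gallery: 3-d "embeddings" standing in for real encoder outputs.
gallery = {
    "ukiyo-e_print": [0.9, 0.1, 0.0],
    "oil_portrait":  [0.1, 0.9, 0.1],
    "watercolor":    [0.8, 0.2, 0.1],
}
print(retrieve_by_style([1.0, 0.0, 0.0], gallery, top_k=2))
# → ['ukiyo-e_print', 'watercolor']
```

Metrics like Recall@10 are computed over exactly this kind of ranked list: a hit is scored when an image of the query's style appears in the top 10.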
Key Details
- Dataset: 1.4M images across 170K style categories, with intra-style consistency and inter-style diversity verified at scale
- MegaStyle-FLUX: Concatenates reference style tokens with noisy image tokens and text inputs in the MM-DiT backbone for style-conditioned generation
- MegaStyle-Encoder: Style-supervised contrastive learning (SSCL) produces embeddings that capture style independently from content
- Contributors: Tongji University, Tencent, NTU Singapore, HKUST, Fuzhou University, HKU, NUS
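The style-supervised contrastive idea behind the encoder can be sketched as a supervised contrastive loss over style labels: embeddings of images sharing a style are pulled together, all others pushed apart. This is an illustrative simplification, not the paper's exact SSCL objective:

```python
import math

def style_contrastive_loss(embeddings, style_labels, temperature=0.1):
    """Supervised contrastive loss over L2-normalized embeddings.

    Images with the same style label act as positives for each other;
    every other image in the batch acts as a negative. A simplified
    stand-in for the paper's style-supervised contrastive learning.
    """
    def normalize(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]

    z = [normalize(v) for v in embeddings]
    n = len(z)
    loss, count = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and style_labels[j] == style_labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarities to every other sample.
        sims = [sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n)]
        denom = sum(math.exp(sims[j]) for j in range(n) if j != i)
        for p in positives:
            loss += -math.log(math.exp(sims[p]) / denom)
            count += 1
    return loss / count

# Embeddings clustered by style label give a much lower loss than
# embeddings scattered across styles.
clustered = style_contrastive_loss(
    [[1, 0], [1, 0], [0, 1], [0, 1]], [0, 0, 1, 1])
scattered = style_contrastive_loss(
    [[1, 0], [0, 1], [1, 0], [0, 1]], [0, 0, 1, 1])
print(clustered < scattered)  # True
```

Training with this kind of objective is what lets the resulting embeddings capture style independently of content: two images of different subjects in the same style land close together.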
What to Do Next
The full research paper details the dataset construction pipeline and benchmark results. The project page provides visual comparisons against existing style transfer methods. Creators working with FLUX-based workflows should watch for code and model weight releases, which would enable integration into existing image generation pipelines.