The market for 3D Gaussian Splatting tools hit $1.2 billion in 2024 and is projected to reach $7.8 billion by 2033. That is a 23.6% compound annual growth rate for a rendering technique first published in 2023.
AI-powered 3D generation has crossed a threshold in early 2026. Open-source models like Microsoft's TRELLIS.2 and Tencent's Hunyuan3D-2.1 now produce production-quality meshes in seconds. Commercial platforms like Rodin Gen-2, Meshy, and Tripo3D have built pricing tiers that make AI 3D accessible to solo creators. Meanwhile, Gaussian splatting has moved from research curiosity to production pipeline, showing up in Hollywood films, real estate platforms, and e-commerce. This report draws on HuggingFace model data, arXiv research trends, product announcements, and pricing pages to map the state of AI 3D generation and spatial computing as of March 2026.
Key Findings
1. TRELLIS.2 Sets a New Open-Source Benchmark
Microsoft's TRELLIS.2 is a 4-billion-parameter model built for image-to-3D generation. It uses a novel "field-free" sparse voxel structure called O-Voxel paired with a large-scale flow-matching transformer. The results are striking: a fully textured 3D asset at 512-cubed resolution in roughly 3 seconds, or 1024-cubed in about 17 seconds. The model outputs complete PBR (Physically Based Rendering) materials including base color, roughness, metallic, and opacity channels.
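Those four channels map cleanly onto glTF 2.0's standard metallic-roughness material model, which is what GLB exports carry. A minimal sketch of that mapping (the `PBRMaterial` class and its field choices are illustrative, not TRELLIS.2's actual output schema):

```python
from dataclasses import dataclass

@dataclass
class PBRMaterial:
    """The four channels named above: base color, roughness, metallic, opacity."""
    base_color: tuple  # linear RGB, each 0-1
    roughness: float   # 0 = mirror-smooth, 1 = fully diffuse
    metallic: float    # 0 = dielectric, 1 = metal
    opacity: float     # 1 = opaque

    def to_gltf(self) -> dict:
        """Map onto glTF 2.0's pbrMetallicRoughness material model."""
        r, g, b = self.base_color
        mat = {
            "pbrMetallicRoughness": {
                "baseColorFactor": [r, g, b, self.opacity],
                "metallicFactor": self.metallic,
                "roughnessFactor": self.roughness,
            }
        }
        if self.opacity < 1.0:
            mat["alphaMode"] = "BLEND"  # tell the renderer to alpha-blend
        return mat

bronze = PBRMaterial((0.8, 0.5, 0.2), roughness=0.4, metallic=1.0, opacity=1.0)
print(bronze.to_gltf())
```

Any engine that reads GLB understands this material block, which is why full-channel PBR output matters more than raw mesh quality for pipeline integration.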
On HuggingFace, TRELLIS-text-xlarge leads the text-to-3D pipeline with 26,659 downloads and 58 likes, while the TRELLIS Space has accumulated 4,776 likes. The original TRELLIS paper earned a CVPR 2025 Spotlight.
| Model | Downloads | Likes | Created |
|---|---|---|---|
| TRELLIS-text-xlarge | 26,659 | 58 | Mar 2025 |
| TRELLIS-text-large | 8,865 | 13 | Mar 2025 |
| OpenAI Shap-E | 3,196 | 260 | Jul 2023 |
| BrickGPT | 2,997 | 15 | May 2025 |
| LLaMA-Mesh | 2,331 | 34 | Nov 2024 |
2. Hunyuan3D-2.1 Brings Full PBR to Open Source
Tencent's Hunyuan3D-2.1 takes a two-stage approach: first generating a bare mesh with Hunyuan3D-DiT, then synthesizing textures with Hunyuan3D-Paint. The system supports text-to-3D, image-to-3D, and sketch-to-3D workflows. Hunyuan3D-Paint projects PBR textures from multiple angles simultaneously, creating seamless UV maps for Albedo, Normal, Roughness, and Metallic channels.
Generation speed sits at roughly 10 to 25 seconds depending on model size. The Hunyuan3D-2 Space on HuggingFace holds 3,236 likes, making it the second most popular 3D generation demo on the platform. Tencent also released HY-Motion-1.0 for 3D motion generation, accumulating 384 likes.
3. Gaussian Splatting Replaces NeRF in Production Pipelines
Gaussian splatting represents 3D scenes as millions of fuzzy ellipsoids that render in real time on consumer hardware. Unlike Neural Radiance Fields (NeRFs), which encode scenes inside neural network weights and render at 1 to 10 FPS, 3D Gaussian Splatting uses explicit point-based representations that train in 7 to 45 minutes and render at interactive framerates.
The keyword "gaussian splatting" appeared 6 times in the top arXiv papers over the past month, while "3D reconstruction" appeared 4 times. The technique does not use neural networks at all, relying instead on mathematical optimization and tile-based GPU rasterization.
| Feature | NeRF | Gaussian Splatting |
|---|---|---|
| Representation | Implicit (neural network) | Explicit (3D Gaussians) |
| Training time | Hours | 7-45 minutes |
| Render speed | 1-10 FPS | Real-time (30+ FPS) |
| Scene editing | Difficult | Direct manipulation |
| Lighting detail | Superior | Improving rapidly |
| File size | Small (network weights) | Larger (point clouds) |
| Industry adoption | Declining | Accelerating |
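The "explicit" representation in the table above can be made concrete: each splat is just a small record of parameters, and rendering reduces to depth-sorting splats and alpha-blending their screen-space contributions per tile. A simplified sketch (real implementations store view-dependent color as spherical harmonics, project 3D covariances to 2D, and run the blend in CUDA, not Python):

```python
from dataclasses import dataclass

@dataclass
class Gaussian3D:
    """One splat: a fuzzy ellipsoid with position, shape, and appearance."""
    mean: tuple      # (x, y, z) center
    scale: tuple     # per-axis extent of the ellipsoid
    rot: tuple       # orientation quaternion (w, x, y, z)
    opacity: float   # base alpha, 0-1
    color: tuple     # RGB (real pipelines store spherical harmonics)

def composite(samples):
    """Front-to-back alpha compositing of (alpha, color) pairs, the blend
    each screen tile performs over its depth-sorted splats."""
    out = [0.0, 0.0, 0.0]
    transmittance = 1.0          # how much light still passes through
    for alpha, (r, g, b) in samples:  # sorted near-to-far
        w = alpha * transmittance
        out[0] += w * r
        out[1] += w * g
        out[2] += w * b
        transmittance *= (1.0 - alpha)
    return tuple(out), transmittance

# A half-opaque red splat in front of a half-opaque blue one:
print(composite([(0.5, (1, 0, 0)), (0.5, (0, 0, 1))]))
```

Because every quantity here is an explicit, differentiable parameter, training is plain gradient descent on these records; there is no network to query at render time.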
Superman became the first major motion picture to ship with dynamic Gaussian Splatting. Zillow launched Gaussian Splatting-powered property tours via SkyTours. Apple released SHARP, a technique that generates high-quality 3D Gaussian Splats from a single image.
4. Commercial Platforms Compete on Speed, Quality, and Price
The commercial AI 3D generation market has matured into distinct tiers. Rodin Gen-2 from Hyper3D leads on quality with a 10-billion-parameter model that produces meshes with a claimed 4x quality improvement. Its Gen-2 Edit feature allows voice-driven editing of 3D models, a first in the industry. Meshy targets teams with strong PBR support and its Meshy-6 model. Tripo3D v3.0 Ultra generates clean quad-based topology ideal for game engines. Luma Genie emphasizes speed, producing models in under 10 seconds.
| Platform | Starting Price | Key Strength | Output Format |
|---|---|---|---|
| Rodin Gen-2 | Free trial + credits | 10B params, voice editing | FBX, GLB, OBJ |
| Meshy | $15/mo | Team workflows, PBR | FBX, GLB, OBJ, USDZ |
| Tripo3D | $11.94/mo | Quad topology for games | FBX, GLB, OBJ |
| Luma Genie | $10/mo | Sub-10s generation | Quad mesh |
| CSM | ~$0.75/model | Part separation | GLB, OBJ |
All major platforms now export directly to Unity, Unreal Engine, and Blender. Rodin Gen-2 grants full commercial rights on all plans including free credits.
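GLB, the common denominator in the table above, is simply a binary container wrapped around glTF JSON. A minimal, spec-level sketch of packing and unpacking that container (a toy round-trip to show the layout, not a full exporter):

```python
import json
import struct

def wrap_glb(gltf: dict) -> bytes:
    """Pack a glTF JSON dict into a minimal binary .glb container.
    GLB layout: 12-byte header (magic 'glTF', version 2, total length),
    then one JSON chunk (4-byte length, 4-byte type 'JSON', padded payload)."""
    payload = json.dumps(gltf).encode("utf-8")
    payload += b" " * (-len(payload) % 4)           # JSON chunks pad with spaces
    chunk = struct.pack("<I4s", len(payload), b"JSON") + payload
    header = struct.pack("<4sII", b"glTF", 2, 12 + len(chunk))
    return header + chunk

def read_glb_json(blob: bytes) -> dict:
    """Validate the header and pull the JSON chunk back out."""
    magic, version, _length = struct.unpack_from("<4sII", blob, 0)
    assert magic == b"glTF" and version == 2
    chunk_len, chunk_type = struct.unpack_from("<I4s", blob, 12)
    assert chunk_type == b"JSON"
    return json.loads(blob[20:20 + chunk_len])

blob = wrap_glb({"asset": {"version": "2.0"}})
print(read_glb_json(blob))  # {'asset': {'version': '2.0'}}
```

Real exports add a second binary chunk for mesh buffers and textures, but the container is this simple, which is a large part of why GLB has become the universal interchange format.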
5. Spatial Computing Splits Into Two Ecosystems
Apple and Meta are building parallel spatial computing ecosystems with very different strategies. The Vision Pro 2 launched in late 2025 with the M5 chip, delivering 2x GPU and AI performance over the original. Its dual 23MP passthrough cameras produce real-time depth mapping with dynamic lighting correction, so placed 3D objects reflect ambient light and cast accurate shadows.
Meta's Quest 3 ecosystem takes a volume approach. The Quest 3S targets enterprise deployments where cost per unit matters, with AI@Meta providing PyTorch Mobile optimizations, pre-trained spatial foundation models (including Segment Anything for 3D), and Unity plugin support.
Enterprise pilots from early 2026 report a 30% reduction in training time with Vision Pro 2 compared to screen-based methods, and a 40% adoption rate in spatial computing pilots across Fortune 500 companies.
6. Research Points Toward Real-Time 3D Understanding
The arXiv data reveals an emerging pattern: spatial reasoning is converging with 3D generation. Papers like "Points-to-3D: Structure-Aware 3D Generation with Point Cloud Priors" and "Perceptio: Perception Enhanced Vision Language Models via Spatial Token Generation" show researchers building bridges between language models and 3D geometry. The keyword "spatial reasoning" appeared 3 times in the latest batch, and "3D reconstruction" appeared 4 times.
The HuggingFace papers trending list includes "LoST: Level of Semantics Tokenization for 3D Shapes" (14 upvotes) and "Stereo World Model: Camera-Guided Stereo Video Generation" (8 upvotes), both tackling the intersection of 2D understanding and 3D output.
7. Game Engines Add Native AI 3D Pipelines
Unity 6.2 introduced Unity AI, a suite of generative AI tools for code and game asset creation. The Unity AI Gateway and Platform Toolkit enable secure integration of third-party AI agents, including 3D generation models. Unity 6.3 LTS added Mesh LOD for automatic Level of Detail generation from imported 3D meshes.
Unreal Engine continues to expand its AI integration surface. Both engines are used by 32% of developers according to recent surveys, and the global game engine market is projected to reach $11.04 billion by 2033 at a 13.9% CAGR.
The global 3D modeling market itself is projected to reach $6.4 billion by 2026, with AI 3D tools reducing asset creation time by up to 70%.
Trend Analysis
Open Source Is Winning the Quality Race
TRELLIS.2 and Hunyuan3D-2.1 are both fully open-source and both produce PBR-ready assets with full material channels. This is a significant shift from 2024, when the best 3D generation required commercial APIs. The open-source models now match or exceed closed alternatives on mesh quality, texture fidelity, and generation speed. Commercial platforms are responding by competing on workflow integration, editing capabilities, and API reliability rather than raw generation quality.
The Gaussian Splatting Standard Is Emerging
3D Gaussian Splatting is becoming what JPEG is to images: a practical, interoperable format for captured 3D content. The software segment already accounts for over 55% of the $1.2 billion market. Apple's SHARP technique for single-image splat generation signals that the major platform holders see Gaussian Splatting as infrastructure, not novelty. Dynamic 4D Gaussian Splatting is the next frontier, with early applications in volumetric video for concerts and sports broadcasts.
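An interoperable format needs a concrete file layout, and most splat viewers today consume `.ply` files. A toy writer for that layout (the field names follow the original 3DGS reference code and are an assumption; viewers vary, and production files are binary rather than ASCII):

```python
def write_splat_ply(path, gaussians):
    """Write Gaussians as an ASCII .ply file in the layout popularised by
    the 3DGS reference implementation. Each gaussian is a flat tuple of
    14 floats: position (3), base color as SH degree 0 (3), opacity (1),
    log-scale (3), rotation quaternion (4)."""
    fields = (["x", "y", "z"]
              + [f"f_dc_{i}" for i in range(3)]
              + ["opacity"]
              + [f"scale_{i}" for i in range(3)]
              + [f"rot_{i}" for i in range(4)])
    with open(path, "w") as f:
        f.write("ply\nformat ascii 1.0\n")
        f.write(f"element vertex {len(gaussians)}\n")
        for name in fields:
            f.write(f"property float {name}\n")
        f.write("end_header\n")
        for g in gaussians:
            f.write(" ".join(f"{v:.6f}" for v in g) + "\n")

# One all-zero splat, just to show the shape of the file:
write_splat_ply("toy_splat.ply", [(0.0,) * 14])
```

The JPEG analogy holds in another way: the payload is a flat array of per-element records, so tooling can stream, crop, and compress splats without understanding scene semantics.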
3D Generation Speed Has Crossed the Interactive Threshold
TRELLIS.2 generates a textured asset in 3 seconds at 512-cubed resolution. Luma Genie produces models in under 10 seconds. This speed makes AI 3D generation viable inside interactive design loops for the first time. A concept artist can type a description, see the 3D result, adjust, and regenerate faster than they could sketch the same object by hand.
Predictions
1. Browser-Based 3D Generation Becomes Standard by Late 2026
With TRELLIS.2 running inference at 3 seconds on a single GPU, expect cloud-hosted 3D generation to appear as a native feature in Figma, Canva, and similar design tools. The technical barrier is no longer model quality but WebGPU rendering support in browsers.
2. Gaussian Splatting Replaces Photogrammetry for Most Capture Use Cases
Photogrammetry requires carefully controlled capture conditions and hours of processing. Gaussian splatting produces comparable results from casual smartphone video in minutes. By the end of 2026, expect the major real estate, e-commerce, and cultural heritage platforms to default to Gaussian Splatting over photogrammetry for new captures.
3. AI 3D Editing Will Matter More Than AI 3D Generation
Rodin Gen-2 Edit already allows voice-driven modifications to existing 3D models. As generation quality plateaus across platforms, the competitive advantage will shift to editing, rigging, animation, and pipeline integration. The model that lets you say "make the chair legs thinner and add a walnut texture" and get a clean result will win over the model that generates a slightly better chair from scratch.
4. Spatial Computing Drives a New Asset Format War
Apple's USDZ and Meta's glTF/GLB ecosystems are creating fragmentation in 3D asset delivery. Expect a push toward universal spatial asset formats by late 2026, likely built on top of OpenUSD, as neither platform can afford to lock out half the content ecosystem.
5. Game Studios Will Ship AI-Generated Background Assets in AAA Titles by Q4 2026
The quad-topology output from Tripo3D v3.0 and the PBR materials from TRELLIS.2 are already game-engine-ready. The remaining gap is art direction consistency across thousands of assets, which fine-tuning and LoRA-style adaptation will close this year.
What This Means for Creators
For 3D Artists
AI 3D generation is not replacing 3D artists. It is eliminating the tedious first 80% of asset creation. The value of a skilled 3D artist is shifting from polygon pushing to art direction, quality control, and the final 20% of polish that makes an asset feel intentional rather than generated. Learn to use TRELLIS.2 and Hunyuan3D-2.1 as starting points. Master the editing workflows in Rodin Gen-2. Your competitive advantage is taste, not technique.
For Game Developers
The practical path right now: use Tripo3D or Meshy for background and prop assets, keep hero assets hand-crafted, and build a QA pipeline for AI-generated meshes. Unity AI and the expanding plugin ecosystem make integration straightforward. Budget for API costs in your production plan. At $0.01 per credit on Tripo or $15/month on Meshy, AI 3D generation is cheaper than a single hour of freelance 3D modeling.
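A back-of-envelope budget helps when writing that production plan. A hedged sketch with illustrative numbers only (real credit costs per asset vary by platform, quality tier, and how often a generation needs to be redone):

```python
def asset_budget(n_assets, credits_per_asset, usd_per_credit, regen_rate=0.3):
    """Rough API budget: assume some fraction of assets need one
    regeneration pass. All parameters are illustrative, not vendor pricing."""
    attempts = n_assets * (1 + regen_rate)
    return attempts * credits_per_asset * usd_per_credit

# e.g. 500 props at 25 credits each, $0.01/credit, 30% regenerated once
print(f"${asset_budget(500, 25, 0.01):.2f}")  # → $162.50
```

Even with a generous regeneration allowance, a full prop library lands in the low hundreds of dollars, which supports the comparison to a single hour of freelance modeling.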
For XR Creators
If you are building for Vision Pro, invest in understanding Apple's USDZ pipeline and Core ML integration. If you are building for Quest, lean into Meta's PyTorch Mobile and glTF ecosystem. Either way, Gaussian Splatting capture is a skill worth learning now. The ability to walk into a space, capture it with a phone, and have a real-time navigable 3D scene in minutes is transformative for location-based XR experiences.
Full Data: AI 3D Generation Landscape
| Tool / Model | Type | Parameters | Speed | Open Source | PBR Support |
|---|---|---|---|---|---|
| TRELLIS.2 | Image-to-3D | 4B | ~3s (512) | Yes | Full |
| Hunyuan3D-2.1 | Multi-modal | N/A | 10-25s | Yes | Full |
| Rodin Gen-2 | Image-to-3D + Edit | 10B | ~30s | No | Full |
| Meshy (Meshy-6) | Text/Image-to-3D | N/A | ~15s | No | Full |
| Tripo3D v3.0 | Text/Image-to-3D | N/A | ~10s | No | Partial |
| Luma Genie | Text-to-3D | N/A | <10s | No | Partial |
| CSM | Image-to-3D | N/A | ~20s | No | Basic |
| Shap-E | Text-to-3D | N/A | ~15s | Yes | No |
| LLaMA-Mesh | Text-to-Mesh | N/A | Variable | Yes | No |
| 3D Gaussian Splatting | Capture/Render | N/A | 7-45min train | Yes | N/A |
This research was produced by Creative AI News.
Subscribe for free to get the weekly digest every Tuesday.