Xiaomi MiMo-V2 Ships Multimodal and TTS Models
Xiaomi has launched the MiMo-V2 family, featuring a multimodal model that processes audio, images, and video alongside a TTS engine capable of contextual emotion and singing synthesis.
Val Kilmer will appear in As Deep as the Grave through a fully AI-generated performance, marking the first movie role created entirely with generative AI.
OpenAI has launched Parameter Golf, an open research competition offering $1M in computing credits to whoever can compress AI models into just 16MB with only 10 minutes of training time.
Anthropic has published the largest multilingual qualitative AI study ever conducted, interviewing 81,000 people across 159 countries in 70 languages about what they want from AI.
Corridor Digital co-founder Niko Pueringer has released CorridorKey, an AI chroma keyer that handles hair, motion blur, and transparency at VFX-production quality.
MiniMax, the company behind the Hailuo video generation platform, released M2.7 with 10 billion activated parameters and built-in self-improvement capabilities.
Google Labs launched Stitch, a free AI-native design platform that translates natural language descriptions into high-fidelity UI designs through what the company calls vibe design.
Apple is blocking App Store updates for AI vibe coding apps like Replit and Vibecode, enforcing existing rules against on-device code execution in a move that could reshape mobile AI development.
Rebel Audio has launched out of stealth with $3.8 million in seed funding and Mark Burnett as an advisor, positioning itself as the Canva of podcasting for first-time creators.
BandM8 debuts at NVIDIA GTC with a music-to-music AI platform that generates real-time MIDI accompaniment from a single instrument.
NVIDIA unveiled Cosmos 3 at GTC, the first world foundation model that unifies synthetic world generation, physical AI reasoning, and action simulation.
SD3.5-Flash generates images in just 4 steps instead of the usual 30 to 50, making on-device AI image generation practical on smartphones and laptops with under 8GB of RAM.
OpenArt Worlds generates navigable 3D environments from text prompts using World Labs' spatial AI, letting creators walk through scenes and capture shots from any angle.
Amazon Ads launches Creative Agent, a free AI tool that handles scriptwriting, image generation, video, voiceovers, and music for ads on Amazon.
Anthropic has launched Dispatch, a new research preview for Claude Cowork that lets users send AI tasks from their phone and have Claude execute them on their Mac desktop.
Google has released an open-source MCP server for Google Colab, letting AI agents like Claude Code and Gemini CLI create notebooks, execute Python code, and access cloud GPUs directly.
Midjourney released V8 Alpha on March 17, bringing 5x faster generation, native 2K rendering, and significantly improved text rendering to its AI image platform.
OpenAI released GPT-5.4 mini and GPT-5.4 nano on March 17, bringing the capabilities of its flagship GPT-5.4 model to significantly smaller, faster, and cheaper packages.
H Company released Holotron-12B on March 17, an open-source computer use agent model that delivers more than 2x the throughput of its predecessor on a single H100 GPU.
Gamma, the AI-powered presentation and design platform with over 100 million users, launched Gamma Imagine on March 17 with AI image generation for brand-specific visual assets.
Mistral launched Forge at NVIDIA GTC, a platform that lets enterprises train frontier-grade AI models from scratch using their own proprietary data.
Maxon has released Redshift 2026.4, introducing real-time architectural visualization that lets architects and 3D artists walk through projects at interactive speed.