Vannarot Roeung
Recent posts by Vannarot Roeung
Grok Imagine API Gets File Storage and Public URLs
xAI added file storage, public URLs, and a Files-to-Imagine pipeline to the Grok Imagine API on June 10, 2026, removing the upload-download shuffle from automated image and video edits.
GitHub Copilot CLI Gets Real Code Intelligence
GitHub Copilot CLI can now use Language Server Protocol servers for real code intelligence, replacing text heuristics with precise, type-aware answers across 14 languages.
DiffusionGemma: Google's 4x Faster Open Text Model
Google DeepMind released DiffusionGemma on June 10, 2026. The Apache 2.0 open model generates text up to 4x faster by denoising blocks of tokens in parallel, and it runs on a single RTX GPU.
Locaible Runs Local AI Agents Inside Cursor
Locaible runs pre-tuned AI coding agents on your own machine and exposes them to Cursor through a local, OpenAI-compatible endpoint, so your code never leaves it.
Scribix Transcribes Audio and Video in Your Browser
Scribix is a browser-based AI transcription tool that turns audio and video into editable text with speaker labels, word-level timestamps, and SRT or VTT export.
Xiaomi Open-Sources MiMo Code, a Claude Code Rival
Xiaomi open-sourced MiMo Code, a free MIT-licensed terminal coding agent with persistent memory that runs on its MiMo-V2.5-Pro model and rivals Claude Code.
Luma Ray3.2 Adds Keyframe Control and HDR Video
Luma AI released Ray3.2 on June 9, 2026, a video generation update built around frame-level creative control, with up to 16 keyframes per clip, native HDR, and 16-bit EXR export.
Xcode 27 Agent Skills Now Work in Claude and Cursor
Apple's Xcode 27 ships seven official Agent Skills, and one command exports them into the ~/.agents folder that Claude, Codex, and Cursor already read.
NY Law Requires AI Synthetic Performer Ad Labels
New York's first-in-the-nation rule for AI actors is now live. As of June 9, 2026, any advertisement featuring an AI-generated synthetic performer must carry a clear disclosure.
Gemini 3.5 Live Translate: 70+ Languages, Real Time
Google launched Gemini 3.5 Live Translate, streaming speech-to-speech translation in 70+ languages while keeping the speaker's own voice.
Claude Fable 5 on Bedrock Forces Data Sharing
Anthropic's Mythos-class models, Claude Fable 5 and Mythos 5, just landed on Amazon Bedrock with a mandatory data-sharing condition. To call them, your prompts and completions must leave AWS's security boundary for 30-day retention and human review.
OpenCV 5.0 Turns Vision Into a Local AI Runtime
OpenCV 5.0 now runs LLMs, vision-language models, diffusion, and inpainting natively, turning the most-used vision library into a local runtime for an entire creative pipeline.
Cohere North Mini Code: Open 30B Coding Model
Cohere released North Mini Code 1.0 on June 9, 2026, an open-weights 30B Mixture-of-Experts coding model that runs on modest hardware under Apache 2.0.
Claude Fable 5 vs Opus 4.8: What Creators Get
Anthropic released Claude Fable 5, its most capable model, on June 9, 2026, free on paid plans through June 22. Here is how it compares to Opus 4.8.
Google AI Plus Drops to $4.99 With Video and Image AI
Google cut the price of its AI Plus subscription to $4.99 a month, down from $7.99, and doubled the included cloud storage to 400GB. The change turns the AI subscription fight toward price.
Apple WWDC 2026: New AI Models, Image Playground, Siri AI
Apple unveiled five new Foundation Models, a rebuilt Siri AI, Image Playground, and Spatial Reframing at WWDC 2026, built in collaboration with Google Gemini.
TTS Benchmark Ranks 46 Voice Models on Blind Tests
A revamped open-source TTS benchmark now compares 46 text-to-speech models using objective scores and blind human voting, so creators can see which voices actually hold up.
NotebookLM Gets Gemini 3.5 and a Cloud Computer Per Notebook
NotebookLM gains Gemini 3.5 plus a per-notebook secure cloud computer with 100+ skills. Outputs include PDFs, spreadsheets, and slide decks.
OpenAI Reportedly Rebuilding ChatGPT Into an App Platform
OpenAI is reportedly planning to turn ChatGPT into an app platform with Canva, Figma, and Spotify inside chat. This is a reported plan and has not shipped yet.
Xiaomi MiMo Hits 1000 Tokens Per Second on 1T Open Model
Xiaomi claims MiMo-V2.5-Pro-UltraSpeed decodes at 1,000 tokens per second on a 1T MoE model. The release includes an open-weights FP4 checkpoint and a 2-week free API trial.
Cursor Canvas Adds Design Mode and Context Usage Report
Cursor shipped two canvas features on June 4: Design Mode for annotating UI elements directly, and a Context Usage Report that audits where the agent spends its tokens.
ChatGPT Dreaming V3: Memory That Updates While You Sleep
ChatGPT Dreaming V3 updates its memory while you sleep, synthesizing conversation patterns into actionable preferences without manual input.
NVIDIA Nemotron 3.5 ASR: 40 Languages at 80ms Latency
NVIDIA released Nemotron 3.5 ASR on June 4: an open 600M streaming speech model covering 40 language-locales with sub-100ms latency for voice agents.
Magenta RealTime 2: Google's Live Music Model Runs on Mac
Google Magenta RealTime 2 runs live music generation locally on Mac and Windows, producing instrument tracks in real time from text prompts.