NVIDIA used Unreal Fest 2026 to put working AI characters in reach of any game creator, not just well-funded studios. On June 16, 2026, the company released the NVIDIA ACE Game Agent SDK in beta alongside three ready-to-use Unreal Engine 5 plugins and an updated DLSS plugin. The pitch is simple: build talking, listening, reasoning non-player characters that run entirely on the player's RTX GPU, with no cloud calls, no per-message API bill, and no network latency.
For creators who have watched AI NPC demos for two years without a practical way to ship one, this is the part that matters. The SDK ships with the models already attached, the Unreal integration already built, and a sample project to copy from. Here is what shipped, how it compares to the cloud approach most teams default to, and how to wire an AI character into a UE5 scene.
What NVIDIA Shipped
The release has three parts. The first is the ACE Game Agent SDK itself, a lightweight C and C++ framework that exposes Agent, Chat, and RAG APIs so a character can hold a conversation, stay in role, and pull in contextual game knowledge. The second is a set of three Unreal Engine 5 plugins that cover the full voice loop. The third is a refreshed DLSS plugin so the AI workload and the renderer share the GPU without tanking frame rate. All of it is available now from NVIDIA's ACE for Games developer page.
The headline design choice is on-device execution. As NVIDIA frames it, "unlike cloud-based services that suffer from high latency and unpredictable operational costs, these plugins offer local, RTX-optimized workflows bundled with ready-to-use models." That bundling is the unlock: a developer does not have to source, quantize, and host a language model before getting a character to talk back. The work that usually eats the first month of an AI-NPC project is done.

The Three Unreal Engine 5 Plugins
Each plugin handles one stage of a spoken interaction, and each arrives with a default model so a creator can hear results on day one. The ACE Unreal Engine plugin documentation covers setup, but the lineup itself tells the story:
Automatic speech recognition. The ASR plugin ships with nemo-conformer-ctc-120m for English transcription, with seven additional language packs available to download. This is what lets a player talk to a character with their voice instead of a text box.
Small language model. The SLM plugin ships with Qwen 3.5 4B, runs from local GGUF files, and supports function calling so a character can trigger in-game actions, not just chat. NVIDIA has been steadily expanding on-device model support, including open-source Qwen models for in-game inferencing, and the 4B size is small enough to leave headroom for the game itself.
Text-to-speech. The TTS plugin ships with Chatterbox Turbo 350M plus example voices, closing the loop so the character speaks its generated reply aloud. Blueprint and C++ support means both technical and visual-scripting teams can use it.
The companion DLSS 4.5 Unreal Engine plugin adds Dynamic Multi Frame Generation, a 6X Multi Frame Generation mode, and

On-Device vs Cloud NPCs: The Real Shift
Most AI NPC prototypes to date call a hosted model over the internet. That works in a demo and breaks in production: every line of dialogue is a billable API request, latency depends on the player's connection, and an offline player gets a mute character. Running the model locally inverts every one of those trade-offs. NVIDIA's approach also targets full digital humans, which outlets like Creative Bloq note is the harder end of the same pipeline.
| Dimension | NVIDIA ACE (on-device) | Cloud-API NPCs |
|---|---|---|
| Per-line cost | None after install | Billed per request |
| Latency | Local, no round trip | Depends on network |
| Offline play | Works fully offline | Character goes silent |
| Player data | Stays on device | Sent to a server |
| Models included | ASR, SLM, TTS bundled | You pick and pay separately |
| Hardware | Requires player GeForce RTX GPU | Runs on any device |
The honest catch is in the last row. On-device inference needs a capable RTX GPU on the player's machine, so a character built this way assumes a PC gaming audience rather than a low-end mobile one. For studios targeting that audience, the cost and latency math is hard to argue with.

How to Add an AI NPC to Your UE5 Project
The intended workflow is to start from the sample and swap in your own character. In practice that looks like this:
1. Install the SDK and plugins. Download the ACE Game Agent SDK and the three UE5 plugins from the ACE for Games page and enable them in your project's plugin settings.
2. Open the sample project. The release includes a working reference scene so you can confirm the ASR, SLM, and TTS chain runs end to end on your hardware before touching your own content.
3. Author the character's role. Use the Agent and Chat APIs to define personality, backstory, and guardrails, then point the RAG API at your game's lore so answers stay in-world.
4. Wire actions with function calling. Because the SLM supports function calling, map intents like "open the gate" or "follow me" to real Blueprint or C++ functions so the character does things, not just talks.
5. Tune for performance. Enable the DLSS 4.5 plugin and profile frame rate with the SLM active, adjusting model and frame-generation settings until dialogue and rendering share the GPU smoothly.
What This Enables for Creators
The immediate win is for small teams. A solo developer or a handful of people can now ship a voice-driven, reactive character without standing up inference infrastructure or signing up for a metered API. That lowers the floor for the kind of emergent, improvisational NPC writing that was previously a research demo. It also fits a broader NVIDIA push into creator and 3D tooling, alongside work like its Cosmos physical-AI tools.
The constraint to plan around is hardware. Because the experience depends on a player-side RTX GPU, this is a fit for PC-first titles, RTX-branded showcases, and indie games whose audience already runs capable graphics cards. If your players are on phones or integrated graphics, the cloud path still has a role. For everyone shipping to the PC gaming audience, the bundled, on-device, no-recurring-cost design removes most of the reasons AI NPCs stayed stuck in prototypes.
Frequently asked questions
What is the NVIDIA ACE Game Agent SDK?
It is a beta C and C++ framework released June 16, 2026 that lets game developers build AI-driven non-player characters with Agent, Chat, and RAG APIs. The characters run on-device on the player's RTX GPU rather than calling a cloud service.
Which models are bundled with the Unreal Engine 5 plugins?
The ASR plugin includes nemo-conformer-ctc-120m for speech recognition, the SLM plugin includes Qwen 3.5 4B for dialogue and function calling, and the TTS plugin includes Chatterbox Turbo 350M for speech output. All three ship ready to use.
Do players need special hardware?
Yes. Because inference runs locally, the experience requires a GeForce RTX GPU on the player's machine. This makes ACE NPCs a fit for PC-first games rather than mobile or low-end systems.
How is this different from cloud-based AI NPCs?
Cloud NPCs bill per request, depend on network latency, and stop working offline. ACE runs the models on the device, so there is no per-line cost, no network round trip, and characters keep working without a connection.
Does it work with Unreal Engine 5 Blueprints?
Yes. The plugins support both Blueprint and C++, so visual-scripting teams and code-first teams can both integrate AI characters and map dialogue intents to in-game actions through function calling.
Where can I download it?
The ACE Game Agent SDK and the Unreal Engine 5 plugins are available from NVIDIA's ACE for Games developer page, with the DLSS 4.5 plugin available from the NVIDIA RTX DLSS page.