On June 11, 2026, ElevenLabs launched Avatars, a talking-head video tool that pairs its speech models with lip-sync in one interface and puts the voice company squarely in HeyGen's territory. If you make spokesperson clips, localized ads, or UGC-style shorts, you now have two serious options with very different DNA. This guide breaks down where each one wins, what they cost, and which fits your workflow.
What Happened
Avatars is a new entry point inside ElevenCreative. You pick or build an avatar, write a script, choose a voice, and generate, with text to speech wired directly into the prompt so the voiceover and lip-synced video are produced together rather than stitched across two tools. ElevenLabs frames an avatar as a persistent visual identity you build from reference images or a text prompt and reuse across any number of videos.
The launch also adds an Avatar node to Flows, ElevenLabs' automation canvas. That node can take a product brief, generate a script, create a voiceover, and produce an avatar-led video, then batch the whole chain across products, languages, and hooks. Partner models are plugging in too: Creatify's Aurora avatar model is already available inside the new interface. Avatars is live on every paid ElevenLabs plan.

ElevenLabs Avatars vs HeyGen: The Comparison
HeyGen has spent years building photorealistic avatars and now runs five public tiers, with its Avatar IV and V models, video translation into 175-plus languages, and a credit system. ElevenLabs arrives voice-first, with best-in-class text to speech and batch automation baked in. Here is how the two stack up on the dimensions that matter to working creators.
| Dimension | ElevenLabs Avatars | HeyGen |
|---|---|---|
| Core strength | Voice-first: top-tier TTS plus lip-sync in one prompt | Photorealistic avatars, mature spokesperson video |
| Entry price | All paid plans, from Starter at $6/mo | Free (3 videos/mo), Creator at $29/mo |
| Avatar creation | Persistent identity from your images or a text prompt | Avatar library plus custom-trained avatars |
| Automation | Avatar node in Flows for batch pipelines | Video Agent and API workflows |
| Languages | Strong multilingual TTS across dozens of languages | 175-plus languages for video translation |
| Best for | Voice-driven content and localized batch UGC ads | Corporate training and polished presenter video |
The practical split: HeyGen is the safer pick when the face has to read as a convincing human spokesperson for training or marketing, and the cost to watch there is its Avatar IV pricing of roughly 20 credits per minute. ElevenLabs wins when the voice carries the content and you need to fan one script out into many languages and variants fast.
Maturity is the other axis. HeyGen has shipped multiple avatar generations, and its lip-sync now reads as convincingly human in most corporate contexts, which is why global marketing teams treat it as a default. ElevenLabs is on day one with avatars, so the visual fidelity is unproven at scale even though its voice layer is already best in class. The bet ElevenLabs is making is that creators care more about a tight script-to-video loop and automation than about squeezing out the last few percent of photorealism, and that the voice is what most viewers actually judge.
Where Each Tool Fits a Real Pipeline
Think in terms of the job, not the brand. For a SaaS company localizing a product explainer into twelve markets, ElevenLabs' Avatar node in Flows turns a single brief into twelve narrated videos overnight, and the multilingual voices are the selling point. For an HR team filming a compliance course where a believable on-screen presenter builds trust, HeyGen's seasoned avatars and translation catalog still set the bar.
Solo creators sit in the middle. A podcaster cutting vertical promos cares about speed and cost more than studio polish, which favors ElevenLabs' low entry tiers. An agency billing clients for branded spokesperson content can justify HeyGen's credits because the output has to survive a client review. The honest answer for most teams is that these tools are now close enough to warrant running the same script through both before a major campaign.

How to Make Your First Avatar Video in ElevenLabs
The workflow is built to collapse scripting, voice, and video into a single pass. Here is the fastest path from idea to a finished clip.
1. Open the Avatars tool. Inside ElevenCreative, choose Avatars as your starting modality. Any paid plan unlocks it.
2. Pick or build an avatar. Select one from the curated library, filterable by age, gender, and use case, or upload reference photos of a person, character, or animal to create a persistent identity you can reuse later.
3. Write the script and choose a voice. Type your script and assign an ElevenLabs voice. Text to speech runs inside the prompt, so the audio and lip-sync are generated together.
4. Generate and refine. Produce the clip, then adjust style variations like camera angle, outfit, or background while keeping the character consistent.
5. Scale it in Flows. Drop the Avatar node into a Flow to batch the same script across products, languages, and hooks without rebuilding each video by hand.
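
To get a feel for how quickly a batch grows in step 5, here is a minimal Python sketch that enumerates every product, language, and hook combination a Flow would have to render. It is a planning aid only and does not call ElevenLabs or Flows; `AvatarJob`, `build_job_matrix`, and the sample values are hypothetical names invented for illustration.

```python
import itertools
from typing import NamedTuple


class AvatarJob(NamedTuple):
    """One video to render (hypothetical structure, not an ElevenLabs type)."""
    product_name: str
    language: str
    hook: str


def build_job_matrix(products, languages, hooks):
    """Return one job per (product, language, hook) combination."""
    return [
        AvatarJob(p, lang, hook)
        for p, lang, hook in itertools.product(products, languages, hooks)
    ]


# Sample inputs are made up for the example.
jobs = build_job_matrix(
    products=["serum", "cleanser"],
    languages=["en", "es", "de"],
    hooks=["problem-first", "testimonial"],
)
print(len(jobs))  # 2 products x 3 languages x 2 hooks = 12 videos
```

Because the count is a straight multiplication, adding one more language or hook scales the whole batch, which is worth checking before you start a large run.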

Why It Matters for Creators
For anyone running a content pipeline, the news is consolidation. A voiceover artist's workflow used to mean writing in one app, generating voice in ElevenLabs, then importing into a separate avatar tool for lip-sync. Avatars folds those three steps into one and adds batch automation on top, which is the part HeyGen users have long paid credit premiums to reach.
It also resets the price floor for talking-head video. HeyGen gates avatar volume behind credits that burn fast, at roughly 20 per minute of Avatar IV video, so a heavy localized campaign can get expensive. ElevenLabs putting avatars on every paid tier, including the $6 Starter plan, pressures that model. For creators producing dozens of short variants a week, the math now favors testing both before committing a budget.
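To make the credit math concrete, here is a rough Python sketch that estimates how many credits a batch of clips would use at the roughly 20 credits per minute quoted above. The function name and the sample campaign are hypothetical, and the sketch assumes credits scale linearly with runtime with no per-clip rounding, which may not match how HeyGen actually bills.

```python
CREDITS_PER_MINUTE = 20  # approximate Avatar IV rate cited in this article


def campaign_credits(clip_seconds, variants, languages,
                     credits_per_minute=CREDITS_PER_MINUTE):
    """Estimate total credits for a batch of equally long clips.

    Assumes linear billing by runtime (an assumption, not a documented rule).
    """
    total_minutes = clip_seconds / 60 * variants * languages
    return total_minutes * credits_per_minute


# Hypothetical campaign: 30-second clips, 10 variants, 12 languages.
print(campaign_credits(30, 10, 12))  # 60 minutes of video -> 1200.0 credits
```

Swap in your own clip length and variant counts, then compare the result against the credit allowance on the plan you are considering.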
What to Do Next
If you already pay for ElevenLabs voices, open ElevenCreative and build one persistent avatar from your own reference images this week, then run a three-language Flow to see how the batch automation holds up. If you live in HeyGen for client work, keep it for high-stakes presenter video but price out a localized campaign in both tools before your next big run.
Frequently Asked Questions
Is ElevenLabs Avatars free?
No. Avatars is available on every paid ElevenLabs plan, starting with the Starter tier at around $6 per month. The free plan does not include it.
How is ElevenLabs Avatars different from HeyGen?
ElevenLabs is voice-first, generating the voiceover and lip-synced video together in one prompt and emphasizing batch automation through Flows. HeyGen is avatar-first, with more mature photorealistic presenter models and 175-plus language video translation.
Can I create an avatar from my own photos?
Yes. You can upload reference images of a person, character, or animal, and ElevenLabs builds a persistent identity you can reuse across any number of future videos.
Which tool is cheaper for talking-head video?
It depends on volume. ElevenLabs puts avatars on low-cost paid tiers, while HeyGen charges credits, roughly 20 per minute of Avatar IV video. For high-volume localized output, ElevenLabs is often cheaper; for occasional polished clips, HeyGen's free and Creator tiers may suffice.
Does ElevenLabs Avatars support batch video generation?
Yes. The new Avatar node in Flows can take a product brief, write a script, generate a voiceover, and produce an avatar video, then run that chain across multiple products, languages, and hooks automatically.
What lip-sync models does ElevenLabs use?
ElevenLabs pairs its speech models with leading lip-syncing models and is opening the interface to partners. Creatify's Aurora avatar model is already available inside ElevenCreative.