Non-streaming lipsync - Realtime Avatar

Lipsync is the simplest way to produce avatar video: pass a public audio URL plus an avatar, and the platform downloads the audio, renders the whole clip in one pass, stores the MP4, and returns its URL. No LLM, no TTS, no streaming — ideal for backend jobs, batch pipelines, and pre-rendered content.

Request

POST /api/v1/lipsync

Field	Required	Description
`audioUrl`	yes	Public URL of the speech audio to lipsync.
`avatarId`	one of	An avatar you created (`ava_…`).
`portraitUrl`	one of	A portrait image to register on the fly.
`backgroundId`	no	Background id (defaults to plain).

Provide either `avatarId` or `portraitUrl`.
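
The field rules above can be checked before a request is sent. The following is a minimal sketch, not part of the SDK: the `LipsyncRequest` type and the `validateLipsyncRequest` helper are hypothetical names, and treating a request that sets both `avatarId` and `portraitUrl` as an error is an assumption, since the page does not say how the platform handles that case.

```typescript
// Hypothetical request shape mirroring the fields table (not an SDK type).
interface LipsyncRequest {
  audioUrl: string;
  avatarId?: string;
  portraitUrl?: string;
  backgroundId?: string;
}

// Returns a list of problems; an empty list means the request looks valid.
function validateLipsyncRequest(req: LipsyncRequest): string[] {
  const problems: string[] = [];

  // audioUrl is required and must be an absolute http(s) URL.
  if (!/^https?:\/\/\S+$/i.test(req.audioUrl)) {
    problems.push("audioUrl must be an absolute http(s) URL");
  }

  // Assumption: exactly one of avatarId / portraitUrl should be set.
  const hasAvatar = Boolean(req.avatarId);
  const hasPortrait = Boolean(req.portraitUrl);
  if (hasAvatar === hasPortrait) {
    problems.push("provide exactly one of avatarId or portraitUrl");
  }

  return problems;
}
```

Running a check like this locally cannot confirm that the audio URL is publicly reachable; it only catches malformed requests before they reach the API.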

Example

import { RealtimeAvatarClient } from "realtime-avatar";

const client = RealtimeAvatarClient.platform({
  apiKey: process.env.REALTIME_AVATAR_API_KEY!,
});

const result = await client.lipsync({
  audioUrl: "https://example.com/speech.mp3",
  avatarId: "ava_…",
});

console.log(result.url);             // public MP4 URL
console.log(result.frames, result.fps, result.durationSeconds);

Response

{
  "url": "https://.../lipsync/lip_….mp4",
  "avatarId": "ava_…",
  "frames": 150,
  "fps": 25,
  "durationSeconds": 6.0
}
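
In the sample response, `frames` divided by `fps` equals `durationSeconds` (150 / 25 = 6.0). The sketch below types the response and checks that relationship as a sanity test on returned metadata. The `LipsyncResponse` interface and `durationMatchesFrames` helper are hypothetical names, and the assumption that the relationship holds for every response (within a small tolerance) is inferred from the single example, not stated by the API.

```typescript
// Hypothetical type mirroring the sample response body (not an SDK type).
interface LipsyncResponse {
  url: string;
  avatarId: string;
  frames: number;
  fps: number;
  durationSeconds: number;
}

// Checks that frames / fps agrees with durationSeconds within a tolerance.
// Assumption: the API reports duration as frames / fps, as in the sample.
function durationMatchesFrames(
  res: LipsyncResponse,
  toleranceSeconds: number = 0.05,
): boolean {
  if (res.fps <= 0) {
    return false; // avoid dividing by zero or a negative frame rate
  }
  const derivedSeconds = res.frames / res.fps;
  return Math.abs(derivedSeconds - res.durationSeconds) <= toleranceSeconds;
}
```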

Rendering scales with clip length and runs on a GPU, so a lipsync call can take several seconds. Keep request timeouts generous (the SDKs default to a long timeout) and run batches concurrently.
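
One way to run a batch concurrently without launching every request at once is a small worker pool. This is a sketch under assumptions: `mapWithConcurrency` and the `Settled` type are hypothetical helpers, not SDK features, and the page does not document a concurrency limit, so the right value for `limit` would need to be found by experiment.

```typescript
// Outcome of one job: either a value or the error it failed with.
type Settled<R> = { ok: true; value: R } | { ok: false; error: unknown };

// Runs `worker` over `items` with at most `limit` calls in flight.
// Results keep the order of `items`; one failure does not stop the batch.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<Settled<R>[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results: Settled<R>[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner claims items one at a time until none remain.
  const runner = async (): Promise<void> => {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { ok: true, value: await worker(items[index], index) };
      } catch (error) {
        results[index] = { ok: false, error };
      }
    }
  };

  const runners: Promise<void>[] = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results;
}
```

With the client from the example above, the worker could be `(job) => client.lipsync(job)`, where each `job` holds an `audioUrl` and an `avatarId`. Collecting failures per job, rather than rejecting the whole batch, lets a pipeline retry only the clips that failed.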

Lipsync bills realtime seconds against your wallet, the same as streaming turns. See Credits and billing.
