Lipsync is the simplest way to produce avatar video: pass a public audio URL plus an avatar, and the platform downloads the audio, renders the whole clip in one pass, stores the MP4, and returns its URL. No LLM, no TTS, no streaming — ideal for backend jobs, batch pipelines, and pre-rendered content.

Request

POST /api/v1/lipsync
Field         Required  Description
audioUrl      yes       Public URL of the speech audio to lipsync.
avatarId      one of    An avatar you created (ava_…).
portraitUrl   one of    A portrait image to register on the fly.
backgroundId  no        Background id (defaults to plain).
Provide either avatarId or portraitUrl.
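
The field rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the SDK: the `LipsyncRequest` type and the `validateLipsyncRequest` helper are hypothetical names introduced for illustration, and the sketch assumes that "either" means exactly one of avatarId and portraitUrl, which the platform may or may not enforce.

```typescript
// Hypothetical request shape, inferred from the field table above.
interface LipsyncRequest {
  audioUrl: string;
  avatarId?: string;
  portraitUrl?: string;
  backgroundId?: string;
}

// Returns a list of problems; an empty list means the body passes these checks.
function validateLipsyncRequest(req: LipsyncRequest): string[] {
  const problems: string[] = [];

  // The audio must be reachable by the platform, so require an absolute http(s) URL.
  if (!/^https?:\/\/\S+$/i.test(req.audioUrl)) {
    problems.push("audioUrl must be an absolute http(s) URL");
  }

  const hasAvatar = typeof req.avatarId === "string" && req.avatarId.length > 0;
  const hasPortrait = typeof req.portraitUrl === "string" && req.portraitUrl.length > 0;

  // Assumption: exactly one of the two avatar sources should be present.
  if (hasAvatar === hasPortrait) {
    problems.push(
      hasAvatar
        ? "provide avatarId or portraitUrl, not both"
        : "provide either avatarId or portraitUrl",
    );
  }

  return problems;
}
```

Calling `validateLipsyncRequest({ audioUrl: "https://example.com/speech.mp3" })` returns one problem, because neither avatar source is set. Local checks like this only catch obvious mistakes early; the server response remains the authority on what is accepted.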

Example

import { RealtimeAvatarClient } from "realtime-avatar";

const client = RealtimeAvatarClient.platform({
  apiKey: process.env.REALTIME_AVATAR_API_KEY!,
});

const result = await client.lipsync({
  audioUrl: "https://example.com/speech.mp3",
  avatarId: "ava_…",
});

console.log(result.url);             // public MP4 URL
console.log(result.frames, result.fps, result.durationSeconds);

Response

{
  "url": "https://.../lipsync/lip_….mp4",
  "avatarId": "ava_…",
  "frames": 150,
  "fps": 25,
  "durationSeconds": 6.0
}
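
The sample values above are consistent with durationSeconds = frames / fps (150 / 25 = 6.0). The sketch below types the response and checks that relationship. The `LipsyncResult` interface and both helper names are hypothetical, and treating the relationship as a guarantee is an assumption drawn from this one example, not a documented contract.

```typescript
// Hypothetical response shape, mirroring the JSON fields shown above.
interface LipsyncResult {
  url: string;
  avatarId: string;
  frames: number;
  fps: number;
  durationSeconds: number;
}

// Duration implied by the frame count and frame rate.
function impliedDurationSeconds(result: LipsyncResult): number {
  return result.frames / result.fps;
}

// True when the reported duration is within `toleranceSeconds` of frames / fps.
// Assumption: the platform reports a duration derived from the rendered frames.
function isDurationConsistent(result: LipsyncResult, toleranceSeconds = 0.05): boolean {
  if (result.fps <= 0) return false; // avoid dividing by zero or a negative rate
  return Math.abs(impliedDurationSeconds(result) - result.durationSeconds) <= toleranceSeconds;
}
```

A check like this can serve as a cheap sanity test in a batch pipeline, for example to flag clips whose reported length differs noticeably from the source audio.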
Rendering runs on a GPU and render time scales with clip length, so a single lipsync call can take several seconds. Keep request timeouts generous (the SDKs default to a long timeout), and run batch jobs concurrently rather than one at a time.
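
One way to run a batch concurrently without starting every job at once is a small bounded-concurrency helper. The sketch below is generic and makes no SDK calls itself: `mapWithConcurrency` and the `Settled` type are hypothetical names, and the source does not state a rate limit or a recommended concurrency, so the limit you pass is your own choice.

```typescript
// Outcome of one job: either its value or the error it threw.
type Settled<R> =
  | { status: "fulfilled"; value: R }
  | { status: "rejected"; reason: unknown };

// Runs `worker` over `items` with at most `limit` calls in flight.
// Results keep the input order; one failed job does not stop the others.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<Settled<R>[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }

  const results: Settled<R>[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item, shared by all lanes

  // Each lane pulls the next unclaimed item until none remain.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      try {
        results[i] = { status: "fulfilled", value: await worker(items[i], i) };
      } catch (reason) {
        results[i] = { status: "rejected", reason };
      }
    }
  }

  const laneCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}
```

With the client from the example above, a batch could then be submitted as `mapWithConcurrency(jobs, 4, (job) => client.lipsync(job))`, where `jobs` is an array of request objects and 4 is an arbitrary limit. Collecting failures per job, instead of rejecting the whole batch, lets a pipeline retry only the clips that failed.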
Lipsync bills realtime seconds against your wallet, the same as streaming turns. See Credits and billing.