AvatarPlayer (from realtime-avatar/browser) decodes the avatar mux stream and renders it to a <canvas>. Its render loop is clocked by audio playback, so audio and video stay in sync. The player handles JPEG/I420 decode, buffering, and frame dropping for you.
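
To make "audio-clocked" concrete, here is a minimal sketch of the general technique, not AvatarPlayer's actual implementation. The TimedFrame shape and the pickFrame helper are hypothetical names invented for illustration. The idea is that on every render tick the audio clock decides which frame is due, and frames that are already late get dropped.

```typescript
// Hypothetical frame shape; the real mux format is not described here.
interface TimedFrame {
  ptsMs: number; // presentation timestamp, in milliseconds
  label: string; // stand-in for decoded pixel data
}

/**
 * Return the newest frame that is due at `audioClockMs`.
 * Older due frames are dropped. Mutates `queue`, which is assumed
 * to be sorted by ascending ptsMs. Returns undefined when no new
 * frame is due, in which case the caller keeps the last frame on screen.
 */
function pickFrame(queue: TimedFrame[], audioClockMs: number): TimedFrame | undefined {
  let due: TimedFrame | undefined;
  while (queue.length > 0 && queue[0].ptsMs <= audioClockMs) {
    due = queue.shift(); // each later due frame replaces (drops) the earlier one
  }
  return due;
}
```

In a browser, audioClockMs would typically be derived from AudioContext.currentTime (which is in seconds) inside a requestAnimationFrame callback, so video follows the audio rather than a wall-clock timer.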

Minimal player

import { RealtimeAvatarClient } from "realtime-avatar";
import { AvatarPlayer } from "realtime-avatar/browser";

const client = RealtimeAvatarClient.webProxy(); // browser-safe; key stays server-side

const player = new AvatarPlayer();
player.attach(document.querySelector("canvas")!);

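// Unlock audio inside a user gesture (browser autoplay policy).
// Assumes the page has a <button id="play">; adjust the selector to your markup.
const playButton = document.querySelector<HTMLButtonElement>("#play")!;
playButton.addEventListener("click", async () => {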
  await player.unlock();
  await client.prepare({ avatar_id: "ava_…" });
  const stream = await client.turn({ avatar_id: "ava_…", mode: "speak_text", text: "Hi!" });
  await player.play(stream);
});

Browsers keep audio suspended until a user gesture. Call player.unlock() from a click or tap handler before the first turn; otherwise the avatar will play without sound.
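
For context, unlocking audio with the standard Web Audio API usually comes down to resuming a suspended AudioContext from inside a gesture handler. The sketch below shows that pattern. Whether player.unlock() does exactly this internally is an assumption, and ensureRunning and ResumableContext are hypothetical names.

```typescript
// Minimal structural view of an AudioContext, so the helper also works with fakes.
interface ResumableContext {
  state: string; // "suspended" | "running" | "closed" on a real AudioContext
  resume(): Promise<void>;
}

/**
 * Resume the context if it is suspended and report whether it is running.
 * Calling it again once the context is running does nothing.
 */
async function ensureRunning(ctx: ResumableContext): Promise<boolean> {
  if (ctx.state === "suspended") {
    await ctx.resume(); // browsers only allow this after a user gesture
  }
  return ctx.state === "running";
}
```

A real AudioContext satisfies ResumableContext structurally, so in a page you could call ensureRunning(new AudioContext()) from a click handler.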

React

import { useAvatarSession, AvatarStage } from "realtime-avatar/react";

function Avatar() {
  const session = useAvatarSession({ avatarId: "ava_…" });
  return (
    <>
      <AvatarStage session={session} />
      <button onClick={() => session.speak("Hello!")}>Speak</button>
    </>
  );
}

Lifecycle

  • attach(canvas) — bind (or rebind) the canvas.
  • unlock() — resume the audio context from a gesture; safe to call repeatedly.
  • play(stream) — resolves only after playout fully drains (no end-of-turn freeze).
  • stop() — stop the current turn.
  • dispose() — release the audio context; call it on unmount.
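
One way to put these calls together is to run one turn at a time and interrupt any turn still in flight. The sketch below is written against a structural PlayerLike interface rather than the real class. The return types, and how play() settles after stop(), are assumptions, and TurnRunner is a hypothetical helper.

```typescript
// Structural view of the lifecycle methods listed above (return types assumed).
interface PlayerLike<S> {
  play(stream: S): Promise<void>;
  stop(): void | Promise<void>;
  dispose(): void | Promise<void>;
}

// Hypothetical helper: starts a new turn, cutting short any turn still playing.
class TurnRunner<S> {
  private readonly player: PlayerLike<S>;
  private inFlight = 0; // number of play() calls that have not settled yet

  constructor(player: PlayerLike<S>) {
    this.player = player;
  }

  async run(stream: S): Promise<void> {
    if (this.inFlight > 0) {
      await this.player.stop(); // interrupt the previous turn
    }
    this.inFlight++;
    try {
      await this.player.play(stream); // resolves once playout has drained
    } finally {
      this.inFlight--;
    }
  }

  // Call from an unmount or teardown hook.
  async close(): Promise<void> {
    await this.player.stop();
    await this.player.dispose();
  }
}
```

Counting in-flight calls instead of using a boolean keeps the check correct when an interrupted turn settles while its replacement is still playing.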