RTVI projects originate with a client that:

  • Configures your AI services and pipeline.
  • Provides a start() method that handles the handshake and authentication with your bot service. Typically, calling this method will make a web request to an endpoint you create, initiating a new bot process and returning authentication credentials to join a session.
  • Manages media transport.
  • Provides methods, callbacks and events for interfacing with your bot.

VoiceClient

import { VoiceClient } from "realtime-ai";

VoiceClient is a base class that serves as a template for building transport-specific implementations. It has no out-of-the-box bindings included.

The core realtime-ai library exports a VoiceClient with no associated transport logic. When building an RTVI application, use the export for your chosen transport layer or provider (see here for available first-party packages).

If you want to use WebRTC as a transport layer, you can use a provider like Daily. In that case, install the Daily package and import the client accordingly:

import { DailyVoiceClient } from "realtime-ai-daily";

const voiceClient = new DailyVoiceClient(...);

All packaged voice clients (such as DailyVoiceClient) extend VoiceClient. You can extend this class to implement your own transport or to add additional functionality.
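
For example, a hypothetical subclass might layer extra behaviour on top of a packaged client. The sketch below is illustrative only and relies solely on the documented start() method; the class and method names are not part of the library.

import { DailyVoiceClient } from "realtime-ai-daily";

// Hypothetical subclass: adds a simple retry wrapper around start().
class RetryingVoiceClient extends DailyVoiceClient {
  async startWithRetry(attempts = 3): Promise<void> {
    for (let i = 0; i < attempts; i++) {
      try {
        await this.start();
        return;
      } catch (e) {
        if (i === attempts - 1) throw e;
        console.warn(`start() failed (attempt ${i + 1}), retrying...`);
      }
    }
  }
}

You would construct RetryingVoiceClient with the same options as DailyVoiceClient and call startWithRetry() in place of start().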

Example

import { VoiceEvent, VoiceMessage } from "realtime-ai";
import { DailyVoiceClient } from "realtime-ai-daily";

const voiceClient = new DailyVoiceClient({
  baseUrl: "https://your-end-point-here",
  enableMic: true,
  timeout: 15 * 1000,
  services: {
    llm: "together",
    tts: "cartesia",
  },
  config: [
    {
      service: "tts",
      options: [
        { name: "voice", value: "79a125e8-cd45-4c13-8a67-188112f4dd22" },
      ],
    },
    {
      service: "llm",
      options: [
        { name: "model", value: "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo" },
        {
          name: "messages",
          value: [
            {
              role: "system",
              content:
                "You are a assistant called ExampleBot. You can ask me anything. Keep responses brief and legible. Your responses will be converted to audio, so please avoid using any special characters except '!' or '?'.",
            },
          ],
        },
      ],
    },
  ],
  callbacks: {
    onConnected: () => {
      console.log("[CALLBACK] User connected");
    },
    onDisconnected: () => {
      console.log("[CALLBACK] User disconnected");
    },
    onTransportStateChanged: (state: string) => {
      console.log("[CALLBACK] State change:", state);
    },
    onBotConnected: () => {
      console.log("[CALLBACK] Bot connected");
    },
    onBotDisconnected: () => {
      console.log("[CALLBACK] Bot disconnected");
    },
    onBotReady: () => {
      console.log("[CALLBACK] Bot ready to chat!");
    },
  },
});

try {
  await voiceClient.start();
} catch (e) {
  console.error(e.message);
}

// Events
voiceClient.on(VoiceEvent.TransportStateChanged, (state) => {
  console.log("[EVENT] Transport state change:", state);
});
voiceClient.on(VoiceEvent.BotReady, () => {
  console.log("[EVENT] Bot is ready");
});
voiceClient.on(VoiceEvent.Connected, () => {
  console.log("[EVENT] User connected");
});
voiceClient.on(VoiceEvent.Disconnected, () => {
  console.log("[EVENT] User disconnected");
});

API reference

Required properties

baseUrl
string
required

URL to a developer-hosted endpoint that triggers authentication, transport session creation and bot instantiation.

By default, the VoiceClient will send a JSON POST request to this URL and pass the local configuration (config) as a body param. You can override this behaviour by defining a customAuthHandler property.

Depending on your transport, this endpoint should return a valid authentication bundle (e.g. a URL and access token) required to join a session.
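
As a rough, illustrative sketch (not part of the library), a minimal Express endpoint might look like the following. The route path and response shape are assumptions; the room URL and token shown here assume a Daily-style WebRTC transport, and your endpoint should return whatever bundle your chosen transport expects.

import express from "express";

const app = express();
app.use(express.json());

// Hypothetical endpoint called by voiceClient.start().
app.post("/start", async (req, res) => {
  // By default the client sends its services and config in the request body.
  const { services, config } = req.body;

  // Inject sensitive values (e.g. API keys) into the config here, then
  // create a transport session and start a bot process for it.
  // ... spawn bot / create session using `services` and `config` ...

  // Return the authentication bundle required to join the session
  // (assumed shape; adjust to your transport).
  res.json({
    room_url: "https://example.daily.co/room",
    token: "BOT_JOIN_TOKEN",
  });
});

app.listen(3000);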

services
Object <{ [key: string]: string }>
required

A key-value object of service registrations, each entry representing an available service (such as OpenAI, ElevenLabs, or a local LLM) that is made available in your bot file.

  • key: exact match for the service type in your bot file, e.g. llm.
  • value: the service you wish to use, matching a registered service in your bot file, e.g. openai.

For more information, please refer to services.

config
Array <VoiceClientConfigOption>
required

Pipeline configuration object for your registered services. Must contain a valid VoiceClientConfig array.

Client config is passed to the bot at startup and can be overridden in your server-side endpoint (where sensitive information, such as API keys, can be provided).

The order of services in your configuration array is important. See configuration.

Optional properties

callbacks
VoiceEventCallbacks

Map of callback functions. See callbacks.

transport
Class<Transport>
default: "undefined"

Optional Transport class. If your transport package (e.g. realtime-ai-daily) exports multiple transport classes, you can specify which to use here.

timeout
number
default: "undefined"

Time (in milliseconds) to wait for the bot to enter a ready state after calling start(). If the timeout elapses, the client will return a ConnectionTimeoutError and disconnect from the transport.

enableMic
boolean
default: "true"

Enable user’s local microphone device.

enableCamera
boolean
default: "false"

Enable user’s local webcam device. Note: please ensure you are using a transport package that supports video.

Start properties

customHeaders
Object <{ [key: string]: string }>

Custom HTTP headers to include in the initial start web request to the baseUrl.

customBodyParams
Object

Custom body parameters that are passed through to the baseUrl endpoint as part of the start() request.
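
For example, assuming these properties are passed in the client constructor alongside baseUrl (as with customAuthHandler below), a sketch with placeholder values might look like this:

import { DailyVoiceClient } from "realtime-ai-daily";

const voiceClient = new DailyVoiceClient({
  baseUrl: "https://your-end-point-here",
  services: { llm: "together", tts: "cartesia" },
  config: [ /* ... */ ],
  customHeaders: {
    Authorization: "Bearer YOUR_TOKEN", // placeholder value
  },
  customBodyParams: {
    session_id: "abc123", // hypothetical field your endpoint expects
  },
});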

customAuthHandler
(baseUrl, timeout, abortController) => Promise<void>

Override the default fetch query called by voiceClient.start().

  • baseUrl:string Endpoint provided in the client constructor.
  • timeout:Timeout | undefined Start timeout. You should clear this once your method resolves, e.g. clearTimeout(timeout).
  • abortController:AbortController Calling abortController.abort() in the event of an error response will clear the connection timeout and set the client to an error state.

const voiceClient = new DailyVoiceClient({
  baseUrl: "...",
  customAuthHandler: async (
    baseUrl: string,
    timeout: number | undefined,
    abortController: AbortController
  ): Promise<void> => {
    return await fetch(`${baseUrl}`, {
      method: "POST",
      mode: "cors",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ...`,
      },
      body: JSON.stringify({
        services: voiceClient.services,
        config: voiceClient.config,
        // ... your custom body params here
      }),
      signal: abortController?.signal,
    }).then((res) => {
      clearTimeout(timeout); // Clear the start timeout (if set)

      if (res.ok) {
        return res.json();
      }
      return Promise.reject(res);
    });
  },
});