RTVI is an open standard that aims to support a wide variety of use cases.

The core functionality of the SDK has a fairly small footprint:

  • Device and media stream management
  • Managing bot configuration
  • Sending generic actions to the bot
  • Handling bot messages and responses
  • Managing session state and errors

To connect to a bot, you will need both this SDK and a transport implementation.
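
The split between SDK and transport can be pictured with a small sketch. The names here (`Transport`, `LoopbackTransport`, `SketchClient`) are hypothetical and are not the SDK's actual API; the loopback transport echoes messages locally instead of using a network, purely to show how the client delegates the wire work to whichever transport it is given.

```typescript
// Hypothetical shapes for illustration only; not the real SDK types.
interface Transport {
  open(): void;
  send(message: string): void;
  onMessage(handler: (message: string) => void): void;
}

// Stand-in transport that echoes messages back locally (no network involved).
class LoopbackTransport implements Transport {
  private handler: (message: string) => void = () => {};
  private opened = false;

  open(): void {
    this.opened = true;
  }

  send(message: string): void {
    if (!this.opened) throw new Error("transport is not open");
    this.handler(`echo: ${message}`);
  }

  onMessage(handler: (message: string) => void): void {
    this.handler = handler;
  }
}

// The client owns session logic and does not know how messages travel.
class SketchClient {
  readonly received: string[] = [];
  private transport: Transport;

  constructor(transport: Transport) {
    this.transport = transport;
    this.transport.onMessage((message) => this.received.push(message));
  }

  start(): void {
    this.transport.open();
  }

  say(text: string): void {
    this.transport.send(text);
  }
}

const demoClient = new SketchClient(new LoopbackTransport());
demoClient.start();
demoClient.say("hello");
```

Because the client depends only on the `Transport` interface, swapping the loopback for a WebSocket-backed or WebRTC-backed implementation would not require changes to the client class in this sketch.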

It is also recommended that you stand up your own server-side endpoints to handle authentication and to pass secrets (such as service API keys) to your bot process, since these would be compromised if they were stored on the client.
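
As a sketch of what such an endpoint might do, the function below checks the caller and then attaches secrets on the server, so the client only ever sends non-sensitive options. Everything named here is an assumption for illustration: `ClientRequest`, `BotStartPayload`, `buildBotStartPayload`, and the `<SERVICE>_API_KEY` environment naming are hypothetical and not part of RTVI.

```typescript
// Hypothetical shapes; the real payload depends on your bot backend.
interface ClientRequest {
  userToken: string;
  services: Record<string, string>; // e.g. { tts: "example-tts" }
}

interface BotStartPayload {
  services: Record<string, string>;
  apiKeys: Record<string, string>;
}

// Runs on the server: validates the caller, then adds secrets the client never sees.
function buildBotStartPayload(
  request: ClientRequest,
  env: Record<string, string | undefined>,
  isValidToken: (token: string) => boolean
): BotStartPayload {
  if (!isValidToken(request.userToken)) {
    throw new Error("unauthorized");
  }
  const apiKeys: Record<string, string> = {};
  for (const service of Object.keys(request.services)) {
    // Assumed convention: the key for service "tts" lives in TTS_API_KEY.
    const envName = `${service.toUpperCase()}_API_KEY`;
    const key = env[envName];
    if (!key) {
      throw new Error(`${envName} is not configured`);
    }
    apiKeys[service] = key;
  }
  return { services: request.services, apiKeys };
}

// Example call with an in-memory environment and a trivial token check.
const demoPayload = buildBotStartPayload(
  { userToken: "demo-token", services: { tts: "example-tts" } },
  { TTS_API_KEY: "secret-value" },
  (token) => token === "demo-token"
);
```

In a real deployment the `env` argument would typically come from the server's environment, and the token check would call your own authentication layer.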

How can RTVI be used?

A client application can use the RTVI SDK to connect to a bot, send actions, and receive messages.

Clients are multi-modal, in that they can send and receive audio and video data, as well as text.

For real-time streaming use-cases, a client connects via the connect() method. For one-off interactions that do not need an ongoing connection, it sends single-turn actions via the action() method.

Single-turn actions

Single-turn actions are a way to run inference on your pipeline whilst not in a connected state. A typical use-case for this would be a text-to-text or text-to-voice chatbot, where the user inputs text and receives a response.
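
A rough sketch of the single-turn pattern for a text-to-text chatbot follows. The `ActionRequest` and `ActionResponse` shapes, the `singleTurn` helper, and the stubbed sender are assumptions made for illustration; the SDK's actual action() signature and message format may differ.

```typescript
// Hypothetical message shapes for illustration; not the RTVI wire format.
interface ActionRequest {
  service: string;
  action: string;
  arguments: { name: string; value: string }[];
}

interface ActionResponse {
  text: string;
}

// Any function that delivers one request and resolves with one response.
type Sender = (request: ActionRequest) => Promise<ActionResponse>;

// Builds a one-off request; no connection is opened or kept alive.
function buildTextAction(userText: string): ActionRequest {
  return {
    service: "llm",
    action: "respond",
    arguments: [{ name: "text", value: userText }],
  };
}

// Sends a single turn and returns the bot's reply text.
async function singleTurn(send: Sender, userText: string): Promise<string> {
  const response = await send(buildTextAction(userText));
  return response.text;
}

// Stub sender standing in for an HTTP call to your own endpoint.
const stubSender: Sender = async (request) => ({
  text: `You said: ${request.arguments[0].value}`,
});
```

Calling `singleTurn(stubSender, "hi")` resolves to the stub's reply for that one input. Each call is independent, which is the property that distinguishes this mode from a connected session.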

Real-time connected use-cases

Connected use-cases require establishing an ongoing connection to your pipeline, such as with WebSockets or WebRTC. This is useful for real-time voice and video chatbots, where the user can speak and receive a response in real time.
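
To make the contrast with single-turn actions concrete, here is a small sketch of a long-lived session. `FakeSocket` stands in for a WebSocket or WebRTC connection, and `Session`, its state names, and its methods are hypothetical rather than the SDK's API. The point it illustrates is that, once connected, the bot can push messages at any time until the session is closed.

```typescript
type SessionState = "disconnected" | "connected";

// Stand-in for a persistent connection; delivers bot messages to a listener.
class FakeSocket {
  listener: (message: string) => void = () => {};
  isOpen = false;

  open(): void {
    this.isOpen = true;
  }

  close(): void {
    this.isOpen = false;
  }

  // Simulates the bot pushing a message; ignored unless the socket is open.
  push(message: string): void {
    if (this.isOpen) this.listener(message);
  }
}

// Tracks connection state and collects everything the bot sends while connected.
class Session {
  state: SessionState = "disconnected";
  readonly transcript: string[] = [];
  private socket: FakeSocket;

  constructor(socket: FakeSocket) {
    this.socket = socket;
    this.socket.listener = (message) => this.transcript.push(message);
  }

  connect(): void {
    this.socket.open();
    this.state = "connected";
  }

  disconnect(): void {
    this.socket.close();
    this.state = "disconnected";
  }
}

const demoSocket = new FakeSocket();
const demoSession = new Session(demoSocket);
demoSocket.push("dropped: not connected yet");
demoSession.connect();
demoSocket.push("bot: hello");
demoSocket.push("bot: how can I help?");
demoSession.disconnect();
demoSocket.push("dropped: already disconnected");
```

After this runs, the transcript holds only the two messages pushed while the session was connected. A real session would also carry audio and video tracks and would usually connect asynchronously; both are left out here to keep the sketch short.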