Actions are service-specific messages that are dispatched to the bot to trigger certain pipeline behaviour.

Some examples of actions include:

  • tts:say - Speak a message using text-to-speech.
  • tts:interrupt - Interrupt the current text-to-speech message.
  • llm:append_to_messages - Append context to the current LLM messages.

Under the hood, actions are blobs of data that are sent via the transport layer to the bot. The bot listens for and then processes the action to perform the necessary operations.

RTVI-enabled bots define a set of actions that users can invoke from their client.

Actions do not trigger events or callbacks in the client, but instead return a Promise that resolves once the bot has processed the action. Actions are a useful way to extend the functionality of an RTVI bot without modifying the client.

Actions differ from messages in that a) they are service-specific, b) they do not trigger callbacks or events, and c) they return a Promise that resolves once the bot has processed the action.

Obtaining available actions

To obtain a list of available actions, you can use the describeActions method on the RTVI client.

const actions = await voiceClient.describeActions();

This will return an array of action objects that you can use to determine which actions are available.

{
  "label": "rtvi-ai",
  "type": "actions-available",
  "id": "UNIQUE_ID_FROM_REQUEST",
  "data": {
    "actions": [
      {
        "service": "tts",
        "action": "say",
        "arguments": [
          { "name": "text", "type": "string" },
          { "name": "interrupt", "type": "bool" },
          ...
        ]
      },
      ...
    ]
  }
}
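
As a minimal sketch, assuming the response has the message shape shown above, a small helper can check whether a given action is offered by the bot before dispatching it. The name findAction and the sample data are hypothetical and are not part of the RTVI client.

```javascript
// Hypothetical helper, not part of the RTVI client API.
// Assumes the "actions-available" message shape shown above.
function findAction(response, service, action) {
  const actions = (response && response.data && response.data.actions) || [];
  // Return the matching description, or undefined if the bot does not offer it.
  return actions.find((a) => a.service === service && a.action === action);
}

// Sample data mirroring the documented shape (values are illustrative).
const sampleResponse = {
  label: "rtvi-ai",
  type: "actions-available",
  id: "UNIQUE_ID_FROM_REQUEST",
  data: {
    actions: [
      {
        service: "tts",
        action: "say",
        arguments: [
          { name: "text", type: "string" },
          { name: "interrupt", type: "bool" },
        ],
      },
    ],
  },
};

const say = findAction(sampleResponse, "tts", "say");
console.log(say ? say.arguments.map((arg) => arg.name) : "tts:say not available");
```

Checking availability first lets the client fail early instead of waiting for the bot to reject an action it does not support.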

Anatomy of an action

An action object has the following properties:

  • service - The service that the action belongs to.
  • action - The name of the action, as defined by the bot.
  • arguments - An array of argument objects that the action accepts.

When the client dispatches an action, it passes the action name and arguments to the bot, alongside a unique ID that the bot echoes in its response so the client can resolve the Promise.

RTVI voice clients maintain a queue of actions that are dispatched to the bot. The action is sent as JSON data via the transport layer to the bot, which processes the action.

Once the action has been processed, the bot sends a response message with the same unique ID. The client uses that ID to look up the action in its queue and resolve the corresponding Promise.
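
The ID-based matching described above can be sketched as follows. This is an illustration of the general pattern only, not the client's actual implementation: the PendingActions class, its method names, and the counter-based IDs are all assumptions.

```javascript
// Illustrative sketch only: the real client internals may differ.
class PendingActions {
  constructor() {
    this.pending = new Map(); // id -> { resolve, reject }
    this.nextId = 0; // assumption: a simple counter stands in for unique IDs
  }

  // Register an outgoing action; return its ID and the Promise to await.
  enqueue() {
    const id = String(++this.nextId);
    const promise = new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
    });
    return { id, promise };
  }

  // Settle the matching Promise when a response with the same ID arrives.
  handleResponse(message) {
    const entry = this.pending.get(message.id);
    if (!entry) return false; // unknown or already settled ID
    this.pending.delete(message.id);
    if (message.type === "error-response") {
      entry.reject(message);
    } else {
      entry.resolve(message);
    }
    return true;
  }
}

// Usage: simulate the bot answering an action.
const registry = new PendingActions();
const { id, promise } = registry.enqueue();
promise.then((response) => console.log(response.data.result));
registry.handleResponse({
  label: "rtvi-ai",
  type: "action-response",
  id,
  data: { result: true },
});
```

Keying the pending entries by ID means responses can arrive in any order and still settle the right Promise.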

Dispatching an action

const someAction = await voiceClient.action({
  service: "tts",
  action: "say",
  arguments: [
    { name: "text", value: "Hello, world!" },
    { name: "interrupt", value: false },
  ],
});

// > Promise<VoiceMessageActionResponse>
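
Writing the arguments array by hand can get verbose. As a small convenience sketch, a plain object can be converted into the name/value pairs shown above. The helper toActionArguments is hypothetical and is not provided by the RTVI client.

```javascript
// Hypothetical helper: converts { text: "Hi", interrupt: false }
// into the [{ name, value }] array used when dispatching an action.
function toActionArguments(params) {
  return Object.entries(params).map(([name, value]) => ({ name, value }));
}

// Build the same payload as the dispatch example above.
const payload = {
  service: "tts",
  action: "say",
  arguments: toActionArguments({ text: "Hello, world!", interrupt: false }),
};

console.log(JSON.stringify(payload.arguments));
```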

An action is resolved or rejected with the following:

{
  "label": "rtvi-ai",
  "type": "action-response",
  "id": "UNIQUE_ID_FROM_REQUEST",
  "data": {
    "result": BOOL | NUMBER | STRING | ARRAY | OBJECT
  }
}

If an action cannot be processed, the bot will return an error-response typed message with the same unique ID assigned by the client, triggering a rejection.

You can handle error responses in multiple ways:

  • onMessageError callback
  • MessageError event
  • Wrapping the awaited Promise in try / catch

try {
  const someAction = await voiceClient.action({...});
} catch(e) {
  console.error(e);
}
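
If many call sites dispatch actions, the try / catch pattern can be wrapped once. The sketch below is one possible approach, not part of the RTVI client: tryAction and fakeClient are hypothetical, and the value the real client rejects with may differ from the Error used here.

```javascript
// Hypothetical wrapper: resolves to a tagged outcome instead of throwing.
// `client` is assumed to expose an async action(payload) method.
async function tryAction(client, payload) {
  try {
    const response = await client.action(payload);
    return { ok: true, result: response.data.result };
  } catch (error) {
    return { ok: false, error };
  }
}

// Stand-in client for illustration; it rejects anything but the "tts" service.
const fakeClient = {
  async action(payload) {
    if (payload.service !== "tts") {
      throw new Error("unsupported service: " + payload.service);
    }
    return {
      label: "rtvi-ai",
      type: "action-response",
      id: "UNIQUE_ID_FROM_REQUEST",
      data: { result: true },
    };
  },
};

tryAction(fakeClient, { service: "tts", action: "say", arguments: [] })
  .then((outcome) => console.log(outcome.ok));
```

Returning a tagged outcome keeps error handling explicit at the call site while still allowing the onMessageError callback or MessageError event to be used for global reporting.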

Action response data

Some actions resolve with data. This data is specific to the action and is defined by the bot.

A successful action will resolve with a VoiceMessageActionResponse:

{
  "label": "rtvi-ai",
  "type": "action-response",
  "id": "UNIQUE_ID_FROM_REQUEST",
  "data": {
    "result": "Hello, world!"
  }
}

Awaiting an action will return the VoiceMessageActionResponse object, which contains the result property.

try {
  const someAction = await voiceClient.action({...});
  console.log(someAction.data.result);
} catch(e) {
  console.error(e);
}