
Voice Conversation Repair

Conversation Repair refers to the ability of an assistant to handle conversations that deviate from the happy path. Rasa handles Conversation Repair automatically using a set of customizable patterns.

New in 3.11

Rasa Pro now comes with voice-first Conversation Repair capabilities. They complement the text-first Conversation Repair capabilities.

Overview

Session Start

Session Start defines the behaviour when an assistant proactively starts the conversation. This pattern is triggered by the message /session_start, which can be sent on behalf of the user in response to certain UI interactions.

This pattern can be used along with a customized action_session_start for dynamic greetings.

Example: Bot starts a conversation

Bot: Hi Rahul, your Stitch Craft order #1377 is on its way. Would you like to track it or need help with something else?

User: ...

Example showcasing a proactive bot

Repeat

Beta feature

This feature is currently released in a beta version and disabled by default. The feature might change in the future. If you want to enable this beta feature, set the environment variable RASA_PRO_BETA_REPEAT_COMMAND=true.

Repeat defines the behaviour when a user asks the assistant to repeat the last message.

Example: User asks the assistant to repeat the last message

Bot: Your Stitch Craft order #1377 is on its way. Would you like to track it or need help with something else?

User: Can you repeat that?

Bot: Of course, your Stitch Craft order #1377 is on its way. Would you like to track it or need help with something else?

Example showcasing a user asking the assistant to repeat the last message

User Silence

Beta Feature

This conversation repair pattern only works with Voice Stream channels.

This feature is currently released in a beta version and disabled by default. The feature might change in the future. If you want to enable this beta feature, use the boolean parameter monitor_silence in the Voice Stream Channel configuration.

User silence on a voice call beyond a certain threshold time can trigger the User Silence pattern. The silence threshold can be configured by the slot silence_timeout.

By default, silence is handled in a three-step approach. If the user is silent for a set amount of time, the assistant first repeats its last message. If the silence continues, the assistant asks if the user is still there. If the user remains silent beyond this point, the assistant informs the user that the call will be ended and then hangs up. This behavior lets the conversation handle periods of user silence gracefully.

Example: User remains silent during the conversation

Bot: Hi Rahul, your Stitch Craft order #1377 is on its way. Would you like to track it or need help with something else?

User: ...

Bot: Hi Rahul, your Stitch Craft order #1377 is on its way. Would you like to track it or need help with something else?

User: ...

Bot: Hello, are you still there?

User: ...

Bot: It seems you are not there anymore. I will hang up shortly.

Example showcasing the assistant handling user silence
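
The three-step escalation described above can be sketched in plain Python. This is only an illustration of the logic, not how Rasa implements pattern_user_silence (which is a flow). The class name, the step names, and the assumption that the counter resets once the user speaks again are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical step names mirroring the default three-step behavior.
SILENCE_STEPS = (
    "repeat_last_message",
    "ask_if_still_there",
    "announce_hangup_and_end_call",
)


@dataclass
class SilenceTracker:
    """Counts consecutive silence timeouts and picks the next repair step."""

    consecutive_silences: int = 0

    def on_silence(self) -> str:
        # Clamp the index so any further timeout maps to the final step.
        index = min(self.consecutive_silences, len(SILENCE_STEPS) - 1)
        self.consecutive_silences += 1
        return SILENCE_STEPS[index]

    def on_user_speech(self) -> None:
        # Assumption: hearing the user again restarts the escalation.
        self.consecutive_silences = 0
```

Calling on_silence() three times in a row walks through the same sequence as the example conversation above: repeat, check in, then announce the hang-up.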

Configurations

Default Behavior

Rasa ships a default behavior for every conversation repair case that works out-of-the-box. Each case is handled by a pattern, a special flow designed specifically for that case:

  • pattern_session_start for starting the session.
  • pattern_repeat_bot_messages for repeating the bot message.
  • pattern_user_silence for handling user silence.

The default patterns in Rasa Pro 3.11 can be found here

Modifying default behaviour

You can override the default behaviour of each conversation repair case by creating a flow with the same name as the pattern that handles that case, such as pattern_session_start. If the pattern uses a default action that needs to be modified, you can override it by implementing a new custom action and using that custom action in the flow.

info

Make sure the assistant is re-trained after the modification is completed.

Common Modifications

In addition to the modifications available to text assistants, here are some common modifications that can be made to voice assistants.

Handling Call Start and Call Metadata

New in 3.11

We have unified the call handling patterns across Voice Channel Connectors: all voice channels handle call starts, ends and metadata in a similar manner.

The assistant will receive the message /session_start when the call is picked up along with the call metadata. This intent triggers the Session Start pattern. Here's a customized pattern that sends utter_greet when the call connects:

flows.yml
flows:
  pattern_session_start:
    description: Flow for starting the conversation
    name: pattern session start
    nlu_trigger:
      - intent: session_start
    steps:
      - action: utter_greet

The following call metadata is received for Twilio:

  • call_id is the unique call identifier from Twilio (CallSid parameter as sent by Twilio Voice)
  • user_phone is the phone number of the user (Caller)
  • bot_phone is the phone number of the bot (Called)
  • direction is the call direction. It can be either inbound or outbound

Action action_session_start is triggered at the beginning of each Rasa session and it can be used to set certain slots based on this metadata. These slots can be used in your utterances for a dynamic greeting. Here is an example:

import logging

from rasa_sdk import Action, Tracker
from rasa_sdk.events import SlotSet
from rasa_sdk.executor import CollectingDispatcher

logger = logging.getLogger(__name__)


class ActionSessionStart(Action):
    def name(self) -> str:
        return "action_session_start"

    def run(self, dispatcher: CollectingDispatcher, tracker: Tracker,
            domain: dict) -> list:
        # get the call metadata from the tracker
        metadata = tracker.get_slot("session_started_metadata")
        logger.info(f"🤙 action_session_start's metadata: {metadata}")
        # set appropriate slots
        if metadata:
            return [
                SlotSet("user_phone", metadata.get("user_phone")),
                SlotSet("bot_phone", metadata.get("bot_phone")),
            ]
        return []
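
If you want to copy more of the metadata fields listed above into slots, it may help to keep the mapping in a small pure function that can be tested without a running assistant. The sketch below is an assumption, not part of Rasa: the helper name slots_from_call_metadata and the choice to skip missing or empty fields are hypothetical, and it presumes your domain defines a slot for each key.

```python
from typing import Any, Dict, Optional

# Metadata keys documented above for Twilio calls.
METADATA_SLOT_KEYS = ("call_id", "user_phone", "bot_phone", "direction")


def slots_from_call_metadata(
    metadata: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Return a slot-name -> value mapping for the known metadata keys.

    Keys that are absent or None are skipped so that existing slot
    values are not overwritten with empty data (a design assumption).
    """
    if not metadata:
        return {}
    return {
        key: metadata[key]
        for key in METADATA_SLOT_KEYS
        if metadata.get(key) is not None
    }
```

Inside the custom action, the result could then be turned into events with a comprehension such as [SlotSet(key, value) for key, value in slots_from_call_metadata(metadata).items()].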

Handling the End of a Call

A call can be ended by the user or by the assistant.

  • When the user ends the call, the assistant receives a /session_end message and a SessionEnded event is added to the conversation.

  • Assistant flows can use the default action action_hangup to disconnect calls. This action also adds a SessionEnded event to the conversation.

Using Responses relevant to Voice Channels

It is recommended to use channel-specific responses with voice channels:

domain.yml
responses:
  utter_setup_guide:
    - text: "Click the 👉 button or visit https://example.com/setup-guide"
    - text: "To continue setup, open our website and go to the setup guide"
      channel: "twilio_media_streams"

In the above example, note that emoji don't translate well to speech. URLs are difficult to comprehend because speech is transient, whereas text stays on screen. Words like "Type" or "Click" assume interactions that aren't possible on a phone call. Special characters are awkward when spoken.
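
The points above can be turned into a rough check for response texts you plan to use on a voice channel. This is a heuristic sketch rather than a Rasa feature: the function name voice_issues, the word list, and the emoji ranges are assumptions, and the patterns will miss some cases and flag some harmless ones (for example "type" used as a noun).

```python
import re
from typing import List

# Simple patterns; intentionally rough and not exhaustive.
URL_RE = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)
UI_VERB_RE = re.compile(r"\b(click|type|tap|scroll)\b", re.IGNORECASE)
# Covers common pictograph and symbol blocks only.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def voice_issues(text: str) -> List[str]:
    """Return labels for content that tends to read poorly over voice."""
    issues = []
    if EMOJI_RE.search(text):
        issues.append("emoji")
    if URL_RE.search(text):
        issues.append("url")
    if UI_VERB_RE.search(text):
        issues.append("ui_verb")
    return issues
```

Run against the two texts of utter_setup_guide, the first is flagged for its emoji, URL and "Click", while the voice variant comes back clean.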

You can also use SSML in responses to customise how the assistant's audio output sounds:

domain.yml
responses:
  utter_contact_support:
    - text: "Call our support team at 1-800-555-0123"
    - text: |
        <speak>
        You can reach our support team at
        <say-as interpret-as="telephone">1 800 555 0123</say-as>
        <break time="500ms"/>
        Our agents are available 24/7.
        </speak>
      channel: "twilio_media_streams"
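
When the spoken text contains dynamic values, for example in a custom action, it can be safer to assemble the SSML in code so that characters such as & or < are escaped. The helper below is a minimal sketch using only the Python standard library. The function name and parameters are hypothetical, and whether a given tag is honoured depends on the text-to-speech provider behind your voice channel.

```python
from xml.sax.saxutils import escape


def ssml_phone_response(
    intro: str, phone: str, outro: str, pause_ms: int = 500
) -> str:
    """Build an SSML string that reads a phone number, pauses, then continues.

    All text parts are XML-escaped so user-provided values cannot
    break the markup.
    """
    return (
        "<speak>"
        f"{escape(intro)} "
        f'<say-as interpret-as="telephone">{escape(phone)}</say-as>'
        f'<break time="{pause_ms}ms"/> '
        f"{escape(outro)}"
        "</speak>"
    )
```

For instance, ssml_phone_response("You can reach our support team at", "1 800 555 0123", "Our agents are available 24/7.") produces markup equivalent to the voice variant of utter_contact_support above, on a single line.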