You can use the Twilio Voice connector to answer phone calls made to your Twilio phone number or Twilio SIP domain.
New in 2.6
The Twilio Voice connector was added.
Running on Twilio
Connect to a Twilio Phone Number
To forward calls from Twilio to your Rasa assistant, the webhook for your phone number needs to be updated. Go to the Phone Numbers section of your Twilio account and select the phone number you want to connect to Rasa. Find the Voice & Fax section on the screen. Under the A CALL COMES IN section, add the webhook URL (e.g. https://<host>:<port>/webhooks/twilio_voice/webhook) for your Rasa bot, replacing the host and port with the appropriate values for your deployment. Click Save at the bottom of the page.
Connect to a Twilio SIP Domain
You can also connect your Twilio Voice channel to a Twilio SIP domain. You can follow the Twilio Sending SIP guide to forward SIP requests to your Twilio SIP domain. Once your SIP domain is configured, you will have to forward incoming calls to Rasa. In the Twilio console, go to the SIP Domains section and select the SIP domain you would like to use. Find the Call Control Configuration section and add the webhook URL (e.g. https://<host>:<port>/webhooks/twilio_voice/webhook) for your Rasa bot, replacing the host and port with the appropriate values for your deployment. Click Save at the bottom of the page.
Configure Channel in Rasa
In your credentials.yml file, make sure the twilio_voice channel is added. Within credentials.yml there are a number of parameters you can set to control the behavior of your assistant; each one is defined below. Unlike the Twilio text channel, there is no need to add your Twilio credentials for the voice channel. Note that changing the value of assistant_voice can result in added costs from Twilio. Review the documentation below for details about these settings.
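As a rough sketch, a credentials.yml entry for the voice channel might look like the following. The parameter names are the ones described in this section, with enhanced assumed to be the name of the premium speech model option. The values are illustrative rather than recommended settings, and a numeric speech_timeout is used here so that it stays compatible with the "default" speech model.

```yaml
# credentials.yml (illustrative values, not recommendations)
twilio_voice:
  # Message sent to the assistant on the caller's behalf when a call starts
  initial_prompt: "hello"
  # Voice used to speak the assistant's responses
  assistant_voice: "woman"
  # Spoken after a long pause when the last bot utterance cannot be identified
  reprompt_fallback_phrase: "I'm sorry I didn't get that could you rephrase"
  # A number, or "auto" to stop collecting speech when a pause is detected
  speech_timeout: "5"
  # Speech-to-text model Twilio should use
  speech_model: "default"
  # Assumed parameter name for the premium speech model option
  enhanced: "false"
```

Note that no Twilio account credentials appear in this entry, in line with the description above.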
Initial Prompt
When Twilio receives a new call and forwards it to Rasa Open Source, Twilio does not provide a user message. In this case Rasa Open Source will act as if the user sent the message "Hello". This behavior can be configured in your credentials.yml file by setting the initial_prompt parameter to the desired input. The initial_prompt value will be sent to Rasa Open Source and the response will be spoken to the user to start the voice conversation. How you greet a user via a voice channel may differ from a text channel. You should review your responses and consider creating channel-specific variations where necessary.
Assistant Voice
You can add personality to your assistant by specifying the type of voice it should speak with. In the credentials.yml file, add the assistant_voice option to choose the voice of your assistant. For a list of supported voices, check the Twilio documentation. By default, the woman voice will be used. Note that some voices incur additional charges. The Twilio documentation has details about pricing.
Reprompt Fallback Phrase
Unlike text channels, where users can review previous messages in the conversation and take their time to reply, voice channels can leave users confused when there is a pause in the conversation. When a long pause is detected during a conversation, Rasa will repeat the last utterance from the bot to re-prompt the user for a response. If the previous bot utterance cannot be identified, the message defined in reprompt_fallback_phrase will be sent. By default this is set to "I'm sorry I didn't get that could you rephrase".
Speech Timeout
The speech_timeout parameter controls how long Twilio will collect speech from the caller. It must be set to a number or "auto". If a number is provided, the assistant will collect speech for the specified amount of time. If speech_timeout is set to "auto", Twilio will collect speech until a pause is detected. The default setting of this parameter is "auto". You can find more details about speech_timeout in the Twilio documentation. Note that speech_timeout="auto" is only compatible with speech_model='numbers_and_commands'. An error will be raised if an incompatible speech_model is used with speech_timeout="auto".
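To illustrate the compatibility rule, the fragment below pairs the "auto" timeout with the one speech model this page names as compatible with it. It is a minimal sketch showing only the two settings involved. Other parameters are omitted.

```yaml
# credentials.yml fragment: pause-based speech collection
twilio_voice:
  speech_timeout: "auto"
  # Per the note on speech_timeout, "auto" is only compatible with this model
  speech_model: "numbers_and_commands"
```

With any other speech_model, a numeric speech_timeout would be the safer choice.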
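Speech Model
Adjusting the speech_model parameter can help Twilio with the accuracy of its speech-to-text transformation. Valid values are default, numbers_and_commands and phone_call. The default setting is "default". You can find more details about speech_model in the Twilio documentation.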
Enhanced
Setting enhanced to true will allow you to use Twilio's premium speech model to improve the accuracy of transcription results. Note that setting this parameter to true will result in higher Twilio transcription costs. This parameter is only supported if you have also set the speech_model parameter to phone_call. By default, enhanced is set to "false". You can find more about the enhanced speech model option in the Twilio documentation.
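The fragment below sketches one way to enable the premium speech model. It assumes the option is named enhanced and takes a quoted "true" or "false" string. It also sets a numeric speech_timeout, since this page describes "auto" as compatible only with numbers_and_commands. The timeout value itself is arbitrary.

```yaml
# credentials.yml fragment: premium transcription (higher Twilio costs)
twilio_voice:
  # Numeric timeout, because "auto" is documented as incompatible with phone_call
  speech_timeout: "5"
  # The premium option is only supported with the phone_call model
  speech_model: "phone_call"
  # Assumed parameter name and value format
  enhanced: "true"
```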
Custom Voice Responses
It is highly recommended that you provide voice channel specific responses for all responses containing images, emojis and/or abbreviations. Visual media like images and emojis do not translate well to voice applications. Having voice alternatives to those responses lets you tailor the user experience to the voice channel. If a response is detected as containing either an image or emoji a warning will be raised. The image or emoji will be removed from the response and only any accompanying text will be included in the response back to the user.
In addition to reviewing responses containing images and emojis you should also review all other responses for their applicability to voice channels. Short-hand abbreviations like "e.g." do not translate well. Voice responses should use the long-hand versions of any abbreviations. Interactions with users can differ between text and voice channels. Reviewing all responses in the scope of a voice channel and providing voice specific responses can make for a better user experience.
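A voice-specific variation can be sketched in the domain file as shown below. The response names, texts and image URL are hypothetical. The channel value is assumed to match the twilio_voice name used in credentials.yml and in the webhook path.

```yaml
# domain.yml fragment (hypothetical responses)
responses:
  utter_greet:
    # Default variation for text channels: emoji and an abbreviation
    - text: "Hey there! 👋 I can help with orders, e.g. tracking or returns."
    # Voice variation: no emoji, abbreviation written out in full
    - text: "Hello. I can help with orders, for example tracking or returns."
      channel: "twilio_voice"
  utter_store_location:
    # Default variation relies on an image
    - text: "Here is a map to our store."
      image: "https://example.com/store-map.png"
    # Voice variation describes the location in words instead
    - text: "Our store is on Main Street, next to the central station."
      channel: "twilio_voice"
```

Providing the second variation for each response means the voice channel does not have to fall back on a text response with its image or emoji stripped out.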