Twilio Voice
You can use the Twilio Voice connector to answer phone calls made to your Twilio phone number or Twilio SIP domain.
Running on Twilio
Connect to a Twilio Phone Number
To forward calls from Twilio to your Rasa assistant, update the webhook for your phone number. Go to the Phone Numbers section of your Twilio account and select the phone number you want to connect to Rasa. Find the Voice & Fax section on the screen. Under the A CALL COMES IN section, add the webhook URL for your Rasa bot (e.g. https://<host>:<port>/webhooks/twilio_voice/webhook), replacing the host and port with the appropriate values for your deployment. Click Save at the bottom of the page.
Connect to a Twilio SIP Domain
You can also connect your Twilio Voice channel to a Twilio SIP domain. Follow the Twilio Sending SIP guide to forward SIP requests to your Twilio SIP domain. Once your SIP domain is configured, forward its incoming calls to Rasa. In the Twilio console, go to the SIP Domains section and select the SIP domain you would like to use. Find the Call Control Configuration section and add the webhook URL for your Rasa bot (e.g. https://<host>:<port>/webhooks/twilio_voice/webhook), replacing the host and port with the appropriate values for your deployment. Click Save at the bottom of the page.
Configure Channel in Rasa
In your credentials.yml file, make sure the twilio_voice channel is added. Within credentials.yml there are a number of parameters you can set to control the behavior of your assistant. Each parameter is defined below. Unlike the Twilio text channel, there is no need to add your Twilio credentials for the voice channel. Note that changing the values of enhanced and assistant_voice can result in added costs from Twilio. Review the documentation below for details about these settings.
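As a rough illustration of the channel configuration described above, a credentials.yml entry might look like the sketch below. The parameter names come from the definitions that follow. The values are illustrative assumptions rather than recommended settings, and quoting every value as a string is also an assumption made here for safety.

```yaml
# credentials.yml (illustrative sketch; adjust values for your deployment)
twilio_voice:
  # Message Rasa treats as the user's first input when a call starts
  initial_prompt: "hello"
  # Voice used to speak responses; Polly voices incur extra Twilio charges
  assistant_voice: "woman"
  # Spoken after a long pause if the last bot utterance cannot be identified
  reprompt_fallback_phrase: "I'm sorry I didn't get that could you rephrase"
  # A number, or "auto" (only with speech_model numbers_and_commands)
  speech_timeout: "5"
  # One of: default, numbers_and_commands, phone_call
  speech_model: "default"
  # "true" requires speech_model phone_call and raises transcription costs
  enhanced: "false"
```

No Twilio account credentials appear in this block, since the voice channel does not need them.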
Parameter Definitions
Initial Prompt
When Twilio receives a new call and forwards it to Rasa, Twilio does not provide a user message. In this case Rasa will act as if the user sent the message "Hello". You can configure this behavior in your credentials.yml file by setting the initial_prompt parameter to the desired input. The initial_prompt value will be sent to Rasa, and the response will be spoken to the user to start the voice conversation. How you greet a user via a voice channel may differ from a text channel, so review your responses and consider creating channel-specific variations where necessary.
Assistant Voice
You can add personality to your assistant by specifying the voice it should speak with. In the credentials.yml file, add the assistant_voice option to specify the voice of your assistant. For a list of supported voices, check the Twilio documentation. By default, Twilio's woman voice will be used. Note that you will incur additional charges for using any of the Polly voices. The Twilio documentation has details about Polly pricing.
Reprompt Fallback Phrase
Unlike text channels, where users can review previous messages in the conversation and take their time to reply, voice channels can leave users confused when there is a pause in the conversation. When a long pause is detected during a conversation, Rasa will repeat the last utterance from the bot to re-prompt the user for a response. If the previous bot utterance cannot be identified, the message defined in reprompt_fallback_phrase will be sent. By default this is set to "I'm sorry I didn't get that could you rephrase".
Speech Timeout
How long Twilio will collect speech from the caller. This parameter must be set to a number or "auto". If a number is provided, the assistant will collect speech for the specified amount of time (in seconds). If speech_timeout is set to "auto", Twilio will collect speech until a pause is detected. The default setting of this parameter is "auto". You can find more details about speech_timeout in the Twilio documentation. Note that speech_timeout="auto" is only compatible with speech_model='numbers_and_commands'. An error will be raised if any other speech_model is used with speech_timeout="auto".
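The compatibility rule above can be sketched as a credentials.yml fragment. This is a minimal example, assuming the values are given as quoted strings; only the two parameters involved in the rule are shown.

```yaml
# credentials.yml fragment (illustrative sketch)
twilio_voice:
  # Let Twilio stop collecting speech when it detects a pause
  speech_timeout: "auto"
  # "auto" is only compatible with this speech model
  speech_model: "numbers_and_commands"
```

With any other speech_model, a numeric speech_timeout would be needed instead.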
Speech Model
Adjusting the speech_model parameter can help improve the accuracy of Twilio's speech-to-text transcription. Valid values are default, numbers_and_commands, and phone_call. The default setting is "default". You can find more details about speech_model in the Twilio documentation.
Enhanced
Setting enhanced to true lets you use Twilio's premium speech model to improve the accuracy of transcription results. Note that setting this parameter to true will result in higher Twilio transcription costs. This parameter is only supported if you have also set the speech_model parameter to phone_call. By default, enhanced is set to "false". You can find more about the enhanced speech model option in the Twilio documentation.
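As a sketch of the constraint described above, enabling the enhanced model might look like the fragment below. The numeric speech_timeout value is an illustrative assumption; it is included because speech_timeout "auto" is only compatible with the numbers_and_commands model.

```yaml
# credentials.yml fragment (illustrative sketch)
twilio_voice:
  # enhanced is only supported together with the phone_call model
  speech_model: "phone_call"
  # Premium transcription; results in higher Twilio costs
  enhanced: "true"
  # Numeric timeout, because "auto" requires numbers_and_commands
  speech_timeout: "5"
```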
Custom Voice Responses
It is highly recommended that you provide voice-channel-specific versions of all responses containing images, emojis, and/or abbreviations. Visual media like images and emojis do not translate well to voice applications. Having voice alternatives to those responses lets you tailor the user experience to the voice channel. If a response is detected as containing an image or emoji, a warning will be raised. The image or emoji will be removed from the response, and only the accompanying text will be included in the response back to the user.
In addition to reviewing responses containing images and emojis, you should also review all other responses for their applicability to voice channels. Shorthand abbreviations like "e.g." do not translate well. Voice responses should use the long-hand versions of any abbreviations. Interactions with users can differ between text and voice channels. Reviewing all responses in the scope of a voice channel and providing voice-specific responses can make for a better user experience.
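One way to provide such a voice alternative is a channel-specific response variation in your domain file. The sketch below assumes a hypothetical response named utter_help and uses the channel key with the connector name twilio_voice; the wording of both variations is illustrative.

```yaml
# domain.yml fragment (illustrative sketch; utter_help is a hypothetical response)
responses:
  utter_help:
    # Default variation, used on text channels
    - text: "Need a hand? 👋 You can ask about billing, e.g. your last invoice."
    # Voice variation: no emoji, abbreviation spelled out
    - text: "I can help with billing. For example, you can ask about your last invoice."
      channel: "twilio_voice"
```

Callers on the voice channel would then hear the second variation, while text channels keep the first.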