

A Rasa Pro deployment consists of various components. Some are provided by Rasa, others are developed by you, and a number of them are cloud services that need to be provisioned. This page offers a diagram illustrating all these components and how they interact.

The diagram below provides an overview of the Rasa Pro Architecture.


Cloud Services

Your deployed assistant depends on a number of prerequisite cloud services that you need to provision.

Some of these prerequisite cloud services are required (green color), while others are optional (yellow color).

You are strongly advised to use managed cloud services for their stability, performance, and backup capabilities. Each cloud provider offers many options, and the table below lists just a few of the available choices.

| Service | AWS | Azure | Google Cloud |
|---|---|---|---|
| Model Storage | Amazon S3 | Azure Blob Storage | Google Cloud Storage |
| Lock Store | Amazon ElastiCache for Redis | Azure Cache for Redis | Memorystore for Redis |
| Tracker Store | Amazon RDS for PostgreSQL | Azure Database for PostgreSQL | Google Cloud SQL for PostgreSQL |
| Event Queue (Kafka) | Amazon Managed Streaming for Apache Kafka (MSK) | Azure Kafka Service | Confluent Cloud on Google Cloud Platform |
| Secrets Manager | HashiCorp Vault on AWS | HashiCorp Vault on Azure | HashiCorp Vault with Google Cloud |

Note that the NLG Server is an optional cloud service that you have to create and deploy yourself.

Next you will find a short description of the function of each cloud service.

Model Storage

Model Storage is the cloud service where the trained model is stored. Upon initialization or restart, Rasa Pro downloads the trained model and loads it into memory.

Lock Store

The Lock Store is needed when you have a high-load scenario that requires the Rasa server to be replicated across multiple instances. It ensures that even with multiple servers, the messages for each conversation are handled in the correct sequence without any loss or overlap.
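As an illustration, a Redis-backed lock store could be declared in endpoints.yml roughly as follows. This is a minimal sketch, not a definitive configuration: the host name and the REDIS_PASSWORD environment variable are hypothetical placeholders, and the key names should be checked against the lock store reference for your Rasa Pro version.

```yaml
# endpoints.yml (sketch; host and env var name are assumed placeholders)
lock_store:
  type: redis
  url: my-redis.example.com      # hypothetical host of the managed Redis instance
  port: 6379                     # default Redis port
  password: ${REDIS_PASSWORD}    # read from the environment rather than hard-coded
  db: 1                          # number of the Redis database to use
```

Every replicated Rasa server instance would need to point at the same Redis instance for the locking to have any effect.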

Tracker Store

Your assistant's conversations are stored within a tracker store.
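For a managed PostgreSQL database, the tracker store entry in endpoints.yml might look like the sketch below. The host, database name, and the DB_USER and DB_PASSWORD environment variables are assumptions made for illustration; verify the key names against the tracker store reference for your version.

```yaml
# endpoints.yml (sketch; host, db name, and env var names are assumed placeholders)
tracker_store:
  type: SQL
  dialect: postgresql
  url: my-postgres.example.com   # hypothetical host of the managed database
  port: 5432                     # default PostgreSQL port
  db: rasa                       # database that holds the conversation trackers
  username: ${DB_USER}
  password: ${DB_PASSWORD}
```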

Secrets Manager

The HashiCorp Vault Secrets Manager is integrated with Rasa to securely store and manage sensitive credentials.
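A Vault connection could be sketched in endpoints.yml as shown below. The URL, the VAULT_TOKEN variable, and the secrets path are placeholder assumptions, and the exact set of supported keys should be confirmed in the secrets manager documentation for your Rasa Pro version.

```yaml
# endpoints.yml (sketch; all values are assumed placeholders)
secrets_manager:
  type: vault
  url: https://vault.example.com:8200   # hypothetical address of the Vault server
  token: ${VAULT_TOKEN}                 # token used to authenticate against Vault
  secrets_path: rasa-secrets            # path in Vault under which the secrets live
```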

NLG Server

The NLG Server in Rasa is used to outsource the response generation and separate it from the dialogue learning process. The benefit of using an NLG Server is that it allows for the dynamic generation of responses without the need to retrain the bot, optimizing workflows by decoupling response text from training data.
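Pointing Rasa at a self-hosted NLG server is typically a single entry in endpoints.yml. The sketch below assumes a hypothetical service reachable at nlg.example.com that accepts response requests on the /nlg path; both the host and the path depend on how you build and deploy your own server.

```yaml
# endpoints.yml (sketch; the URL is an assumed placeholder for your own service)
nlg:
  url: http://nlg.example.com:5055/nlg
```

With this in place, response texts can be changed on the NLG server side without retraining the assistant.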

Event Queue (Kafka)

The Kafka Event Broker in Rasa is used to stream all events from the Rasa server to a Kafka topic for robust, scalable message handling and further processing.
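A Kafka event broker entry in endpoints.yml could look roughly like this. The broker address and topic name are hypothetical, and PLAINTEXT is used only to keep the sketch short; a managed Kafka service will usually require an authenticated, encrypted protocol, so consult the event broker reference for the options your version supports.

```yaml
# endpoints.yml (sketch; broker address and topic name are assumed placeholders)
event_broker:
  type: kafka
  url: my-kafka.example.com:9092   # hypothetical bootstrap server
  topic: rasa_events               # hypothetical topic that receives all events
  partition_by_sender: True        # key events by conversation (sender) id
  security_protocol: PLAINTEXT     # unauthenticated; for local experiments only
```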


The rest of this Rasa Pro Architecture section uses the endpoints.yml file as the reference and explains how to start the Rasa Pro web server directly from the Python package, via the CLI command:

rasa run --endpoints endpoints.yml

When deploying to your production environment, you deploy the Rasa Pro Docker image to a Kubernetes or OpenShift cluster via Helm. The same endpoint configurations can be set in the values.yml file for the Rasa Pro Helm chart and passed in via the command:

helm install <release-name> <chart> --values values.yml