Rasa Pro Infrastructure Requirements

Minimum Hardware Requirements

The minimum requirements suggested to run the Rasa Pro server are:

8 vCPU
16 GB RAM
50 GB Disk

These requirements are for a single instance of Rasa Pro server running CALM (Conversational AI Lifecycle Management) only. These will ensure that your assistant can handle on average 24 concurrent requests (i.e. user messages) per second (RPS) with a response time of between 100 and 500 ms, provided you are using multiple Sanic workers (max 8).

If you are installing on Kubernetes using our Helm charts, you can optionally customise the resource requests and limits for individual pods using the values.yaml file.

Infrastructure Guidelines

We recommend starting off with these baseline recommendations and then scaling up as needed based on your usage patterns.

AWS

Instance Type: c5.2xlarge
Operating System: Ubuntu LTS 22.04 or 24.04
Disk: GP3 SSD EBS volume

Azure

Instance Type: Fsv2_8s_v2
Operating System: Ubuntu LTS 22.04 or 24.04
Disk: Premium SSD

Google Cloud

Instance Type: c2-standard-8
Operating System: Ubuntu LTS 22.04 or 24.04
Disk: Balanced Persistent Disk

info

The above guidelines are the minimum requirements for a CALM-only assistant.

If you are using additional components such as NLU, PII, or local embeddings, that could require using additional models other than the Rasa trained model, you will need to scale up your infrastructure accordingly.

If you choose to use multiple Sanic workers, you should ensure that:

the number of workers does not exceed the number of CPU cores available on your machine.
each worker has enough memory allocated to load any additional models. This is particularly important if you are using large models in custom NLU components, local embedding models or PII management.

Minimum Hardware Requirements​

Infrastructure Guidelines​

AWS​

Azure​

Google Cloud​

Minimum Hardware Requirements

Infrastructure Guidelines

AWS

Azure

Google Cloud