Deployment Environments

Architecture

Rasa X allows multiple versions of an assistant to be run simultaneously and served to different users. By default, two environments are defined:

  • production
  • worker

This page provides a walkthrough of the best practices for managing the lifecycle of a Rasa model.


[Figure: deployment environments architecture]

Each environment corresponds to a Rasa server which uses the model with the corresponding tag (e.g. production). Model training happens on the worker instance, so that training does not degrade the performance of the servers that handle user conversations.

A model can be given multiple environment tags, but each environment can only have a single model associated with it. If model A is tagged experimental and you promote it to production, the model will now have two tags, and the two environments will use the same model. This ensures that you can promote a model to an environment without affecting any others.
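The tagging rule can be sketched as a small in-memory registry. This is only an illustration of the invariant (many tags per model, one model per environment); the class and method names are hypothetical and are not part of Rasa X.

```python
class EnvironmentTags:
    """Toy registry in which each environment points at exactly one model."""

    def __init__(self):
        # environment name -> model name (at most one model per environment)
        self._model_for_env = {}

    def promote(self, model, environment):
        """Tag `model` with `environment`, replacing any model tagged there before."""
        self._model_for_env[environment] = model

    def tags_of(self, model):
        """Return every environment tag currently attached to `model`."""
        return sorted(env for env, m in self._model_for_env.items() if m == model)

    def model_in(self, environment):
        """Return the model serving `environment`, or None if it has none."""
        return self._model_for_env.get(environment)


registry = EnvironmentTags()
registry.promote("model_A", "experimental")
registry.promote("model_A", "production")    # model_A now carries two tags
registry.promote("model_B", "experimental")  # only experimental changes
```

After the last call, production still serves model_A while experimental serves model_B, which mirrors the point that promoting a model to one environment leaves the others untouched.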

The name and number of environments can be customized when deploying Rasa X.

Note

In a typical web application, the experimental and production environments are often isolated from each other (e.g. running on separate machines). In a machine-learning application, you want to seamlessly collect training data from your production users, integrate it into the development models, and promote those models into production to get a continuously improving system. For this reason, we recommend running all of the environments on the same cluster.

Workflow

Note

This workflow assumes that you have already added an additional deployment environment called experimental. If you are running only the production environment and have no experimental environment, follow the guide in the next section, which explains how to add a new environment.

  1. The first step is to move a trained model into the experimental environment. To do this, first navigate to the models view. Hover over a model, and click the three dots menu which appears on the table row. Now you can select the experimental environment and click ‘Save’.

[Figure: multiple environments in Rasa Enterprise]

This model will now be used to serve users at the endpoint /conversations/{sender_id}/messages?environment=experimental.

Note

To promote a model to experimental you need to have the environments.experimental.write permission. Check with your admin if you do not have sufficient permissions.

  2. Now that your model is running in the experimental environment, you can do some testing. Navigate to the ‘Talk to your bot’ view and select ‘experimental’ from the dropdown on the right. If you make any improvements, like adding more training data or updating the response templates, you need to use the ‘train’ button to create a new model and promote that model to experimental.
  3. Once you are satisfied with your model’s performance, repeat step 1, this time selecting the production environment.

Note

By default, only the admin role has the permission to modify the production environment. You can customize this in Role-Based Access Control.
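Once a model has been promoted, the environment that answers a message is selected through the environment query parameter of the messages endpoint mentioned in step 1. The following sketch only builds that URL. The helper name, the base URL, and the sender ID are assumptions for illustration; authentication and the request body are deliberately left out.

```python
from urllib.parse import quote, urlencode


def messages_endpoint(base_url, sender_id, environment):
    """Build the messages URL for one conversation in one environment."""
    # Percent-encode the sender ID so that it is safe to use as a path segment.
    path = f"/conversations/{quote(sender_id, safe='')}/messages"
    query = urlencode({"environment": environment})
    return f"{base_url.rstrip('/')}{path}?{query}"


# Hypothetical base URL; replace it with the address of your own deployment.
url = messages_endpoint("http://localhost/api", "user-42", "experimental")
print(url)
# prints http://localhost/api/conversations/user-42/messages?environment=experimental
```

Passing "production" instead of "experimental" yields the URL for the production environment.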

Customizing Deployment Environments

You currently cannot modify your deployment environments via the API or the user interface. To customize the environments (e.g. to create a new one), you have to edit your deployment setup. If you are using Docker Compose, edit the environments key in your docker-compose.yml file.

Rasa X lets you run multiple Rasa models in parallel, allowing you to test models in different environments before promoting them to production. In this section we’ll show you how to:

  1. Launch additional Rasa servers that run your models
  2. Make the additional servers accessible in the /conversations tab
  3. Assign tags to the Rasa models so they’re run by the right server

Starting additional Rasa Servers

By default, Rasa X runs two Rasa servers: one that should be used in production and one worker that executes certain tasks like training and evaluating models. Let’s look at an example where we add a third Rasa server in addition to the predefined rasa-production and rasa-worker servers. Let’s call it rasa-experimental. Create a file called docker-compose.override.yml in /etc/rasa containing the following:

version: "3.4"

x-rasa-services: &default-rasa-service
  restart: always
  image: "rasa/rasa:${RASA_VERSION:-stable}-full"
  volumes:
  - ./credentials.yml:/app/credentials.yml
  - ./endpoints.yml:/app/endpoints.yml
  expose:
  - "5005"
  command: >
    x
    --no-prompt
    --production
    --port 5005
    --jwt-method HS256
    --jwt-secret ${JWT_SECRET}
    --auth-token '${RASA_TOKEN}'
    --cors "*"
  depends_on:
  - rasa-x
  - rabbit

x-rabbitmq-credentials: &rabbitmq-credentials
  RABBITMQ_HOST: "rabbit"
  RABBITMQ_USERNAME: "user"
  RABBITMQ_PASSWORD: ${RABBITMQ_PASSWORD}

x-rasax-credentials: &rasax-credentials
  RASA_X_HOST: ${RASA_X_HOST:-http://rasa-x:5002}
  RASA_X_TOKEN: ${RASA_X_TOKEN}
  JWT_SECRET: ${JWT_SECRET}
  RASA_USER_APP: "http://app:5055"

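# Defined before x-rasa-credentials because a YAML alias must follow its anchor.
x-database-credentials: &database-credentials
  DB_HOST: "db"
  DB_PORT: "5432"
  DB_USER: "${DB_USER:-admin}"
  DB_PASSWORD: "${DB_PASSWORD}"
  DB_LOGIN_DB: "${DB_LOGIN_DB:-rasa}"

x-rasa-credentials: &rasa-credentials
  # Merge all three credential maps with a single merge key, so that the
  # mapping does not contain duplicate "<<" keys.
  <<: [*rabbitmq-credentials, *rasax-credentials, *database-credentials]
  RASA_TOKEN: ${RASA_TOKEN}
  RASA_MODEL_PULL_INTERVALL: 10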

Right below, make an entry for the new Rasa service rasa-experimental:

services:
  rasa-experimental:
    <<: *default-rasa-service
    environment:
      <<: *rasa-credentials
      DB_DATABASE: "rasa-experimental"
      RABBITMQ_QUEUE: "rasa_experimental_events"
      RASA_MODEL_SERVER: "http://rasa-x:5002/projects/default/models/tags/experimental"

It’s important that the RABBITMQ_QUEUE variable is unique, and that RASA_MODEL_SERVER requests a unique tag, in this case experimental.
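A quick way to catch a violation of this uniqueness rule is to compare the values across all Rasa services before starting the stack. The sketch below works on a plain dictionary rather than on the compose file itself; the helper name is hypothetical, and the values listed for rasa-production are assumed for illustration.

```python
from collections import Counter

# Assumed excerpt of each service's `environment` block, keyed by service name.
services = {
    "rasa-production": {
        "RABBITMQ_QUEUE": "rasa_production_events",
        "RASA_MODEL_SERVER": "http://rasa-x:5002/projects/default/models/tags/production",
    },
    "rasa-experimental": {
        "RABBITMQ_QUEUE": "rasa_experimental_events",
        "RASA_MODEL_SERVER": "http://rasa-x:5002/projects/default/models/tags/experimental",
    },
}


def duplicated_values(services, key):
    """Return the values of `key` that are used by more than one service."""
    counts = Counter(env[key] for env in services.values())
    return sorted(value for value, n in counts.items() if n > 1)


for key in ("RABBITMQ_QUEUE", "RASA_MODEL_SERVER"):
    clashes = duplicated_values(services, key)
    if clashes:
        raise ValueError(f"{key} must be unique per service, but found: {clashes}")
```

With the values above the loop finishes silently. Copying a service block without changing its queue name or model tag would raise the error.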

Updating the environments config

We need to let Rasa X know about the newly defined Rasa servers, so that you can talk to the models running on them in the /conversations view. Following the example of our rasa-experimental service, add an entry to your environments settings in the environments.yml file:

rasa:
  production: (...)
  worker: (...)
  experimental:
    url: http://rasa-experimental:5005
    token: ${RASA_TOKEN}

The hostname part of url has to match the service name defined in docker-compose.override.yml (in this case rasa-experimental).
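The hostname rule can be checked mechanically. The sketch below compares the host of every environment URL against a set of compose service names; both inputs are written out by hand here, and the function name is hypothetical.

```python
from urllib.parse import urlparse


def unknown_hosts(environments, service_names):
    """Map each environment to its URL host when that host is not a known service."""
    mismatches = {}
    for name, config in environments.items():
        host = urlparse(config["url"]).hostname
        if host not in service_names:
            mismatches[name] = host
    return mismatches


# Values taken from the example above; the set of service names is assumed.
environments = {"experimental": {"url": "http://rasa-experimental:5005"}}
service_names = {"rasa-production", "rasa-worker", "rasa-experimental"}

print(unknown_hosts(environments, service_names))  # prints {}
```

An empty result means every URL points at a defined service. A typo such as http://rasa-experimentl:5005 would show up as an entry in the returned dictionary.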

Note

By default, all Rasa servers share the same token, but you are free to define a separate token for each service. To achieve this, replace ${RASA_TOKEN} above with your <TOKEN>, and add an entry in the environment section of the new service in docker-compose.override.yml: RASA_TOKEN: "<TOKEN>".

Tagging a model

The final step is to upload a Rasa model and assign the right tag. You can upload models and tag them via the HTTP API. If you have uploaded a model via the Rasa X API, you can also tag it in the interface.