Version Migration Guide
This page contains information about changes between major versions and how you can migrate from one version to another.
Rasa Pro 3.9 to Rasa Pro 3.10
LLM/Embedding Configuration
The LLM and embedding configurations have been updated to use the provider
key instead of the type
key.
These changes apply to all providers, with some examples provided for reference.
Cohere
llm:
  provider: "cohere" # instead of "type: cohere"
  model: "command-r"
Vertex AI
llm:
  provider: "vertex_ai" # instead of "type: vertexai"
  model: "gemini-pro"
Hugging Face Hub
llm:
  provider: "huggingface" # instead of "type: huggingface_hub"
  model: "HuggingFaceH4/zephyr-7b-beta" # instead of "repo_id: HuggingFaceH4/zephyr-7b-beta"
llama.cpp
Support for loading models directly has been removed. You need to deploy the model to a server and use the server URL to load the model.
For instance, a llama.cpp server can be run using the following command:
./llama-server -m your_model.gguf --port 8080
For more information on the llama.cpp server, refer to the llama.cpp documentation. The assistant can be configured as:
llm:
  provider: "self-hosted" # instead of "type: llamacpp"
  api_base: "http://localhost:8080/v1" # instead of "model_path: "/path/to/model.bin""
  model: "ggml-org/Meta-Llama-3.1-8B-Instruct-Q4_0-GGUF"
vLLM
The model can be deployed and served through vLLM==0.6.0.
For instance, a vLLM server can be run using the following command:
vllm serve your_model
For more information on the vLLM server, refer to the vLLM documentation. The assistant can be configured as:
llm:
  provider: "self-hosted" # instead of "type: vllm_openai"
  api_base: "http://localhost:8000/v1"
  model: "NousResearch/Meta-Llama-3-8B-Instruct" # the name of the model you have deployed
CALM exclusively uses the chat completions endpoint of the model server, so it is essential that the model's tokenizer includes a chat template. Models lacking a chat template are no longer compatible with CALM.
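If you are unsure whether a self-hosted model ships a chat template, you can check its tokenizer before deploying. The snippet below is a minimal sketch that assumes the Hugging Face transformers library is installed and reuses an example model name from this page:
# Minimal sketch: verify that a model's tokenizer defines a chat template.
# Requires the transformers package; the model name is only an example.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Meta-Llama-3-8B-Instruct")
if tokenizer.chat_template is None:
    print("No chat template found: this model cannot serve chat completion requests for CALM.")
else:
    print("Chat template found: the model can serve chat completion requests.")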
Backward compatibility has been maintained for OpenAI and Azure configurations. For all other providers, ensure the provider key is used and review the configuration against the documentation.
Disabling the cache
For Rasa Pro versions <= 3.9.x, the correct way to disable the cache was:
llm:
  model: ...
  cache: false
From Rasa Pro 3.10.0 onwards, this has changed since we rely on LiteLLM to manage caching. To avoid errors, change your configuration to:
llm:
  model: ...
  cache:
    no-cache: true
Custom Components using an LLM
As of Rasa Pro 3.10, the backend for sending LLM and Embedding API requests has undergone a significant change. The previous LangChain version 0.0.329 has been replaced with LiteLLM.
This shift can potentially break custom implementations of components that configure and send API requests to chat completion and embedding endpoints. Specifically, the following components are impacted:
- SingleStepLLMCommandGenerator
- MultiStepLLMCommandGenerator
- ContextualResponseRephraser
- EnterpriseSearchPolicy
- IntentlessPolicy
- FlowRetrieval
- LLMBasedRouter
If your project contains custom components based on any of the affected components listed above, you will need to verify and possibly refactor your code to ensure compatibility with LiteLLM.
Changes to llm_factory
The llm_factory is used across all components that configure and send API requests to an LLM. Previously, the llm_factory relied on LangChain's mapping to instantiate LangChain clients.
From Rasa Pro 3.10 onwards, the llm_factory returns clients that conform to the new LLMClient protocol. This impacts any custom component that was previously relying on LangChain types.
If you have overridden components, such as a command generator, you will need to update your code to handle the new LLMClient return type. This includes adjusting method calls and ensuring compatibility with the new protocol.
The following method calls will need to be adjusted if you have overridden them:
- SingleStepLLMCommandGenerator.invoke_llm
- MultiStepLLMCommandGenerator.invoke_llm
- ContextualResponseRephraser.rephrase
- EnterpriseSearchPolicy.predict_action_probabilities
- IntentlessPolicy.generate_answer
- LLMBasedRouter.predict_commands
Here’s an example of how to update your code:
- Rasa 3.9 - LangChain
- Rasa 3.10 - LiteLLM
from rasa.shared.utils.llm import llm_factory
# get the llm client via factory
llm = llm_factory(config, default_config)
# get the llm response synchronously
sync_completion: str = llm.predict(prompt)
# get the llm response asynchronously
async_completion: str = await llm.apredict(prompt)
from rasa.shared.utils.llm import llm_factory
from rasa.shared.providers.llm.llm_client import LLMClient
from rasa.shared.providers.llm.llm_response import LLMResponse
# get the llm client via factory
llm: LLMClient = llm_factory(config, default_config)
# get the llm response synchronously
sync_response: LLMResponse = llm.completion(prompt) # or llm.completion([prompt_1, prompt_2,..., prompt_n])
sync_completion: str = sync_response.choices[0]
# get the llm response asynchronously
async_response: LLMResponse = await llm.acompletion(prompt) # or llm.acompletion([prompt_1, prompt_2,..., prompt_n])
async_completion: str = async_response.choices[0]
Changes to embedder_factory
The embedder_factory is used across all components that configure and send API requests to an embedding model.
Previously, the embedder_factory returned LangChain embedding clients of the Embeddings type.
From Rasa Pro 3.10 onwards, the embedder_factory returns clients that conform to the new EmbeddingClient protocol. This change is part of the move to LiteLLM, and it impacts any custom components that were previously relying on LangChain types.
If you have overridden components that rely on instantiating clients with embedder_factory, you will need to update your code to handle the new EmbeddingClient return type. This includes adjusting method calls and ensuring compatibility with the new protocol.
The following method calls will need to be adjusted if you have overridden them:
- FlowRetrieval.load
- FlowRetrieval.populate
- EnterpriseSearchPolicy.load
- EnterpriseSearchPolicy.train
- IntentlessPolicy.load
- Or if you have overridden the IntentlessPolicy.embedder attribute.
Here’s an example of how to update your code:
- Rasa 3.9 - LangChain
- Rasa 3.10 - LiteLLM
from rasa.shared.utils.llm import embedder_factory
# get the embedding client via factory
embedder = embedder_factory(config, default_config)
# get the embedding response synchronously
vectors: List[List[float]] = embedder.embed_documents([doc_1, doc_2])
# get the embedding response asynchronously
vectors: List[List[float]] = await embedder.aembed_documents([doc_1, doc_2])
from rasa.shared.utils.llm import embedder_factory
from rasa.shared.providers.embedding.embedding_client import EmbeddingClient
from rasa.shared.providers.embedding.embedding_response import EmbeddingResponse
# get the embedding client via factory
embedder: EmbeddingClient = embedder_factory(config, default_config)
# get the embedding response synchronously
sync_response: EmbeddingResponse = embedder.embed([doc_1, doc_2])
vectors: List[List[float]] = sync_response.data
# get the embedding response asynchronously
async_response: EmbeddingResponse = await embedder.aembed([doc_1, doc_2])
vectors: List[List[float]] = async_response.data
Changes to invoke_llm
The previous implementation of the invoke_llm method in SingleStepLLMCommandGenerator, MultiStepLLMCommandGenerator, and the deprecated LLMCommandGenerator used llm_factory to instantiate LangChain clients. Since the factory now returns clients that conform to the new LLMClient protocol, any custom overrides of the invoke_llm method will need to be updated to accommodate the new return type.
Below you can find the invoke_llm method from Rasa Pro 3.9 and its updated version in Rasa Pro 3.10:
- Rasa 3.9
- Rasa 3.10
async def invoke_llm(self, prompt: Text) -> Optional[Text]:
    """Use LLM to generate a response.

    Args:
        prompt: The prompt to send to the LLM.

    Returns:
        The generated text.

    Raises:
        ProviderClientAPIException if an error during API call.
    """
    llm = llm_factory(self.config.get(LLM_CONFIG_KEY), DEFAULT_LLM_CONFIG)
    try:
        return await llm.apredict(prompt)
    except Exception as e:
        structlogger.error("llm_based_command_generator.llm.error", error=e)
        raise ProviderClientAPIException(
            message="LLM call exception", original_exception=e
        )
async def invoke_llm(self, prompt: Text) -> Optional[Text]:
    """Use LLM to generate a response.

    Args:
        prompt: The prompt to send to the LLM.

    Returns:
        The generated text.

    Raises:
        ProviderClientAPIException if an error during API call.
    """
    llm = llm_factory(self.config.get(LLM_CONFIG_KEY), DEFAULT_LLM_CONFIG)
    try:
        llm_response = await llm.acompletion(prompt)
        return llm_response.choices[0]
    except Exception as e:
        structlogger.error("llm_based_command_generator.llm.error", error=e)
        raise ProviderClientAPIException(
            message="LLM call exception", original_exception=e
        )
Changes to SingleStepLLMCommandGenerator.predict_commands
For SingleStepLLMCommandGenerator, the predict_commands method now includes a call to self._update_message_parse_data_for_fine_tuning(message, commands, flow_prompt). This call is essential for enabling the fine-tuning recipe.
If you have overridden the predict_commands method, you need to add this call manually to ensure proper functionality:
async def predict_commands(
    self,
    message: Message,
    flows: FlowsList,
    tracker: Optional[DialogueStateTracker] = None,
    **kwargs: Any,
) -> List[Command]:
    ...
    action_list = await self.invoke_llm(flow_prompt)
    commands = self.parse_commands(action_list, tracker, flows)
    self._update_message_parse_data_for_fine_tuning(message, commands, flow_prompt)
    return commands
Changes to the default configuration dictionary
The default configurations for the following components have been updated:
- SingleStepLLMCommandGenerator
- MultiStepLLMCommandGenerator
- ContextualResponseRephraser
- EnterpriseSearchPolicy
- IntentlessPolicy
- FlowRetrieval
- LLMBasedRouter
If you have custom implementations based on the default configurations for any of these components, ensure that your configuration dictionary aligns with the updates shown in the tables below, as the defaults have changed.
Default LLM configuration keys have been updated from:
DEFAULT_LLM_CONFIG = {
"_type": "openai",
"model_name": ...,
"request_timeout": ...,
"temperature": ...,
"max_tokens": ...,
}
to:
DEFAULT_LLM_CONFIG = {
"provider": "openai",
"model": ...,
"temperature": ...,
"max_tokens": ...,
"timeout": ...,
}
Similarly, default embedding configuration keys have been updated from:
DEFAULT_EMBEDDINGS_CONFIG = {
"_type": "openai",
"model": ...,
}
to:
DEFAULT_EMBEDDINGS_CONFIG = {
"provider": "openai",
"model": ...,
}
Be sure to update your custom configurations to reflect these changes in order to ensure continued functionality.
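As an illustration, a custom component that overrides get_default_config should now return the new key names. The component name, import path, and values below are assumptions made for the sake of the example, not the definitive Rasa Pro API:
# Illustrative sketch only: the import path and config values are assumptions.
# Adjust them to your project and Rasa Pro version.
from typing import Any, Dict

from rasa.dialogue_understanding.generator import SingleStepLLMCommandGenerator


class MyCommandGenerator(SingleStepLLMCommandGenerator):
    @staticmethod
    def get_default_config() -> Dict[str, Any]:
        config = SingleStepLLMCommandGenerator.get_default_config()
        # Use the 3.10 key names (provider/model/timeout) instead of the old
        # _type/model_name/request_timeout keys.
        config["llm"] = {
            "provider": "openai",
            "model": "gpt-4",
            "temperature": 0.0,
            "timeout": 7,
        }
        return config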
Dropped support for Python 3.8
Support for Python 3.8 has been dropped ahead of Python 3.8's end of life in October 2024.
In Rasa Pro versions 3.10.0, 3.9.11, and 3.8.13, we needed to pin the TensorFlow library version to 2.13.0rc1 in order to remove critical vulnerabilities; this resulted in a poor user experience when installing these versions of Rasa Pro with uv pip.
Removing support for Python 3.8 makes it possible to upgrade to a more stable version of TensorFlow.
Rasa Pro 3.8 to Rasa Pro 3.9
LLMCommandGenerator
Starting from Rasa Pro 3.9, the former LLMCommandGenerator is replaced by SingleStepLLMCommandGenerator. The LLMCommandGenerator is now deprecated and will be removed in version 4.0.0.
The SingleStepLLMCommandGenerator differs from the LLMCommandGenerator in how it handles failures of the invoke_llm method. Specifically, if the invoke_llm call fails in SingleStepLLMCommandGenerator, it raises a ProviderClientAPIException. In contrast, the LLMCommandGenerator simply returns None when the call fails.
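If your custom code relied on the old behaviour of receiving None on failure, you now have to catch the exception yourself. The sketch below shows one way to preserve the old behaviour in a custom subclass; the import paths are assumptions and may differ in your Rasa Pro version:
# Sketch only: preserves the old "return None on failure" behaviour.
# The import paths below are assumptions; adjust them to your Rasa Pro version.
from typing import Optional, Text

from rasa.dialogue_understanding.generator import SingleStepLLMCommandGenerator
from rasa.shared.exceptions import ProviderClientAPIException


class TolerantCommandGenerator(SingleStepLLMCommandGenerator):
    async def invoke_llm(self, prompt: Text) -> Optional[Text]:
        try:
            return await super().invoke_llm(prompt)
        except ProviderClientAPIException:
            # LLMCommandGenerator used to return None here instead of raising.
            return None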
Slot Mappings
If you had been using the custom slot mapping type for slots set by the prediction of the LLM-based command generator, you need to update your assistant's slot configuration to use the new from_llm slot mapping type.
Note that even if you have written custom slot validation actions (following the validate_<slot_name> convention) for slots set by the LLM-based command generator, you still need to update your assistant's slot configuration to use the new from_llm slot mapping type.
For slots that are set only via a custom action, e.g. slots set by external sources only, you must add the action name to the slot mapping:
slots:
  slot_name:
    type: text
    mappings:
      - type: custom
        action: custom_action_name
Rasa Pro 3.8.0 to Rasa Pro 3.8.1
Poetry Installation
Starting from Rasa Pro 3.8.1 in the 3.8.x minor series, we have upgraded the version of Poetry used for managing dependencies in the Rasa Pro Python package to 1.8.2.
To install the latest micro versions of Rasa Pro in your project, you must first upgrade Poetry to version 1.8.2:
poetry self update 1.8.2
Rasa Pro 3.7 to 3.8
Starting from 3.8.0, Rasa and Rasa Plus have been merged into a single artifact, named Rasa Pro.
Installation
Following the merge, we renamed the resulting Python package and Docker image to rasa-pro.
.
Python package
The Rasa Pro Python package, for 3.8.0 and onwards, is located at:
https://europe-west3-python.pkg.dev/rasa-releases/rasa-pro-python
The name of the package is rasa-pro.
Example of how to install the package:
pip install --extra-index-url=https://europe-west3-python.pkg.dev/rasa-releases/rasa-pro-python/simple rasa-pro==3.8.0
While the Python package name has changed, the import process remains the same:
import rasa.core
from rasa import train
For more information on how to install Rasa Pro, please refer to the Python installation guide.
Helm Chart / Docker Image
The Rasa Pro Docker image, for 3.8.0 and onwards, is located at:
europe-west3-docker.pkg.dev/rasa-releases/rasa-pro/rasa-pro
Example of how to pull the image:
docker pull europe-west3-docker.pkg.dev/rasa-releases/rasa-pro/rasa-pro:3.8.0
For more information on how to install Rasa Pro Docker image, please refer to the Docker installation guide.
Component YAML Configuration Changes
Follow the instructions below to update the configuration of Rasa Pro components in version 3.8:
ConcurrentRedisLockStore
- update endpoints.yml to type: concurrent_redis:
lock_store:
  type: concurrent_redis
ContextualResponseRephraser
- update endpoints.yml to either type: rephrase or type: rasa.core.ContextualResponseRephraser:
nlg:
  type: rephrase
- Audiocodes and Vier CVG channels can be specified in credentials.yml directly using their channel name:
audiocodes:
  token: "sample_token"
vier_cvg:
  ...
EnterpriseSearchPolicy and IntentlessPolicy
- update config.yml to only use the policy class name:
policies:
- name: EnterpriseSearchPolicy
- name: IntentlessPolicy
Changes to default behaviour
With Rasa Pro 3.8, we introduced a couple of changes that rectify the default behaviour of certain components. We believe these changes align better with the principles of CALM. If you are migrating an assistant built with Rasa Pro 3.7, please check whether these changes affect your assistant.
Prompt Rendering
Rasa Pro 3.8 introduces a new feature, flow retrieval, which ensures that only the flows relevant to the conversation context are included in the prompt sent to the LLM in the LLMCommandGenerator. This helps the assistant scale to a higher number of flows and also reduces LLM costs.
This feature is enabled by default, and we recommend using it if the assistant has more than 40 flows. By default, the feature uses embedding models from OpenAI, but if you are using a different provider (e.g. Azure), please ensure:
- An embedding model is configured with the provider.
- LLMCommandGenerator has been configured correctly to connect to the embedding provider. For example, see the section on the configuration required to connect to the Azure OpenAI service.
If you wish to disable the feature, you can configure the LLMCommandGenerator as:
pipeline:
- name: SingleStepLLMCommandGenerator
  ...
  flow_retrieval:
    active: false
  ...
Processing Chitchat
The default behaviour in Rasa Pro 3.7 for handling chitchat utterances was to rely on free-form generative responses. This can lead to the assistant sending unwanted responses or responding to out-of-scope user utterances. The new default behaviour in Rasa Pro 3.8 is to rely on the IntentlessPolicy to respond to chitchat utterances using pre-defined responses only.
If you were relying on free-form generative responses to handle chitchat in Rasa Pro 3.7, you will now see a warning message when you train the same assistant with Rasa Pro 3.8: "pattern_chitchat has an action step with action_trigger_chitchat, but IntentlessPolicy is not configured". This appears because the default definition of pattern_chitchat has been modified in Rasa Pro 3.8 to:
pattern_chitchat:
  description: handle interactions with the user that are not task-oriented
  name: pattern chitchat
  steps:
    - action: action_trigger_chitchat
For the assistant to be able to handle chitchat utterances, you have two options:
- If you are happy with free-form generative responses for such user utterances, then you can override pattern_chitchat to:
pattern_chitchat:
  description: handle interactions with the user that are not task-oriented
  name: pattern chitchat
  steps:
    - action: utter_free_chitchat_response
- If you want to switch to using pre-defined responses, you should first add IntentlessPolicy to the policies section of the config:
policies:
  - name: IntentlessPolicy
Next, you should add response templates for the pre-defined responses you want the assistant to consider when responding to a chitchat user utterance.
Handling of categorical slots
Rasa Pro versions <= 3.7.8 used to store the value of a categorical slot in the same casing as it was either specified in the user message or predicted by the LLM in a SetSlot command. This was not necessarily the same as the casing used in the corresponding possible value defined for that slot in the domain. For example, if the categorical slot was defined to have [A, B, C] as the possible values and the prediction was to set it to a, then the slot would be set to a. This led to problems downstream when that slot had to be used in other primitives, i.e. flows or custom actions.
Rasa Pro 3.7.9 fixes this by always storing the slot value in the same casing as defined in the domain. So, in the above example, the slot would now be stored as A instead of a. This ensures that users write business logic for slot comparisons, e.g. if conditions in flows, using the same casing as they defined in the domain.
If you are migrating from Rasa Pro versions <= 3.7.8, please double-check your flows and custom actions to make sure none of them break because of this change.
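For example, a custom action that previously compared against the lowercase value should now compare against the casing defined in the domain. The action and slot names below are illustrative only:
# Illustrative custom action: slot values now keep the domain's casing,
# so compare against "A" rather than "a". Names are examples only.
from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher


class ActionCheckCategory(Action):
    def name(self) -> Text:
        return "action_check_category"

    def run(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[Dict[Text, Any]]:
        category = tracker.get_slot("category")
        # Before 3.7.9 this could be "a"; from 3.7.9 it matches the domain value "A".
        if category == "A":
            dispatcher.utter_message(text="Category A selected.")
        return []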
Update default signature of LLM calls
In Rasa Pro >= 3.8, we switched from synchronous to asynchronous LLM calls.
We updated all components that use an LLM, e.g.:
- LLMCommandGenerator
- ContextualResponseRephraser
- EnterpriseSearchPolicy
- IntentlessPolicy
This can potentially break assistants migrating to 3.8 that have sub-classed one of these components in their own custom components.
For example, the method predict_commands in the LLMCommandGenerator is now async and needs to await the methods _generate_action_list_using_llm and flow_retrieval.filter_flows, as these methods are also async; see the sketch below.
For more information on asyncio, please check its documentation.
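As a sketch of the required change, an overridden predict_commands that used to call these helpers synchronously must now be declared async and await them. The argument lists are elided and the exact signatures are simplified assumptions:
# Simplified sketch: argument lists are elided and signatures are assumptions.
# Rasa Pro 3.7 - synchronous override
def predict_commands(self, message, flows, tracker=None, **kwargs):
    filtered_flows = self.flow_retrieval.filter_flows(...)
    action_list = self._generate_action_list_using_llm(...)
    return self.parse_commands(action_list, tracker, flows)

# Rasa Pro 3.8 - asynchronous override: the method is async and awaits the helpers
async def predict_commands(self, message, flows, tracker=None, **kwargs):
    filtered_flows = await self.flow_retrieval.filter_flows(...)
    action_list = await self._generate_action_list_using_llm(...)
    return self.parse_commands(action_list, tracker, flows)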
Dependency Upgrades
We've updated our core dependencies to enhance functionality and performance across our platform.
Spacy 3.7.x
Upgraded from >=3.6 to >=3.7.
We have transitioned to spaCy 3.7.x to benefit from the latest enhancements in natural language processing. If you're using any spaCy models with your assistant, please update them to models compatible with spaCy 3.7.x.
Pydantic 2.x
Upgraded from >=1.10.9,<1.10.10 to ^2.0.
Along with the Spacy upgrade, we have moved to Pydantic version 2.x, which necessitates updates to Pydantic models. For assistance with updating your models, please refer to the Pydantic Migration Guide. This ensures compatibility with the latest improvements in data validation and settings management.
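As an illustration of the kind of change involved, a model written against Pydantic 1.x typically needs renamed decorators and methods under 2.x. The model below is an example only and is not taken from Rasa Pro:
# Example only: typical Pydantic 1.x -> 2.x changes in a custom model.
from pydantic import BaseModel, field_validator  # Pydantic 2.x


class SlotValue(BaseModel):
    name: str
    value: str

    # Pydantic 1.x used @validator("name"); Pydantic 2.x uses @field_validator("name").
    @field_validator("name")
    @classmethod
    def name_must_not_be_empty(cls, v: str) -> str:
        if not v:
            raise ValueError("name must not be empty")
        return v


# Pydantic 1.x: SlotValue(...).dict() / .json()
# Pydantic 2.x: SlotValue(...).model_dump() / .model_dump_json()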
Rasa Pro 3.7.9 to Rasa Pro 3.7.10
Poetry Installation
Starting from Rasa Pro 3.7.10 in the 3.7.x minor series, we have upgraded the version of Poetry used for managing dependencies in the Rasa Pro Python package to 1.8.2.
To install Rasa Pro in your project, you must first upgrade Poetry to version 1.8.2:
poetry self update 1.8.2
Rasa Pro 3.7.8 to Rasa Pro 3.7.9
Changes to default behaviour
Handling of categorical slots
Rasa Pro versions <= 3.7.8 used to store the value of a categorical slot in the same casing as it was either specified in the user message or predicted by the LLM in a SetSlot command. This was not necessarily the same as the casing used in the corresponding possible value defined for that slot in the domain. For example, if the categorical slot was defined to have [A, B, C] as the possible values and the prediction was to set it to a, then the slot would be set to a. This led to problems downstream when that slot had to be used in other primitives, i.e. flows or custom actions.
Rasa Pro 3.7.9 fixes this by always storing the slot value in the same casing as defined in the domain. So, in the above example, the slot would now be stored as A instead of a. This ensures that users write business logic for slot comparisons, e.g. if conditions in flows, using the same casing as they defined in the domain.
If you are migrating from Rasa Pro versions <= 3.7.8, please double-check your flows and custom actions to make sure none of them break because of this change.
Rasa 3.6 to Rasa Pro 3.7
Installation
Starting from Rasa 3.7.0, Rasa has moved to a new package registry and Docker registry. You will need to update your package registry to install Rasa 3.7.0 and later versions. If you are a Rasa customer, please reach out to your Rasa account manager or support to obtain a license.
Python package
The Rasa Python package for 3.7.0 has been moved to a new package registry:
https://europe-west3-python.pkg.dev/rasa-releases/rasa-plus-py
The name of the package is rasa.
Example of how to install the package:
pip install --extra-index-url=https://europe-west3-python.pkg.dev/rasa-releases/rasa-plus-py/simple rasa==3.7.0
For more information on how to install Rasa Pro, please refer to the Python installation guide.
Helm Chart / Docker Image
The Rasa Docker image for 3.7.0 is located at:
europe-west3-docker.pkg.dev/rasa-releases/rasa-docker/rasa
Example of how to pull the image:
docker pull europe-west3-docker.pkg.dev/rasa-releases/rasa-docker/rasa:3.7.0
For more information on how to install Rasa Pro Docker image, please refer to the Docker installation guide.
Migrating from older versions
For migrating from Rasa Open Source versions, please refer to the migration guide.