Rasa Pro Change Log
All notable changes to Rasa Pro will be documented on this page. This product adheres to Semantic Versioning starting with version 3.3 (initial version).
Rasa Pro consists of two deployable artifacts: Rasa Pro and Rasa Pro Services. You can read the change log for both artifacts below.
[3.11.3] - 2025-01-14
Rasa Pro 3.11.3 (2025-01-14)
Improvements
- Enhances YAML parser to validate environment variable resolution for sensitive keys.
Bugfixes
- Add flow YAML validation when using the HTTP API /model/train endpoint. An invalid flow YAML will return a 400 response status code with a message describing the error.
- Make pattern_session_start work with rasa inspector to allow the assistant to proactively start the conversation with a user.
- Fix writing the test cases obtained via the e2e test case conversion command to file, where the test_cases key was written as a list item instead of a dict key. This caused running the test cases to fail because the file didn't comply with the e2e test schema. The test cases are now written under a dict key.
- Fixed Inspector's Tracker State view not updating in real time by moving the story fetch logic into the WebSocket message handler. Previously, story updates were only triggered on session ID changes, causing stale tracker state after the first conversation turn.
- Add pre-training custom validation to the domain responses that raises a Rasa Pro validation error when a domain response is an empty sequence.
- Fixes a critical security vulnerability in the jsonpickle dependency by upgrading to the patched version.
- Updated pymilvus and minio to address a security vulnerability.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.11.2] - 2024-12-19
Rasa Pro 3.11.2 (2024-12-19)
Bugfixes
- Validate that the api_type key is only used for supported providers (Azure and OpenAI).
- Enable asserting events returned by action_session_start when running end-to-end testing with the assertions format. The following assertions can be used:
  - slot_was_set
  - slot_was_not_set
  - bot_uttered
  - bot_did_not_utter
  - action_executed
- Fixed voice inspector to work with any URL by dynamically constructing the WebSocket URL from the current domain. This enables voice testing in GitHub Codespaces and other remote environments.
- Fixed an error in rasa llm finetune prepare-data when using a subclass of SingleStepLLMCommandGenerator.
- Resolved an issue where rasa llm finetune prepare-data did not support model groups.
- Fix AvailableEndpoints to read from the default endpoints.yaml, if no endpoint is specified.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.11.1] - 2024-12-13
Rasa Pro 3.11.1 (2024-12-13)
Bugfixes
- Add the possibility to pass a transform callable parameter when writing YAML. This allows passing a custom function to transform endpoints before uploading to Studio. This was required to fix the issue where YAML wraps in quotes any string that doesn't start with an alphabetic character, such as unexpanded environment variables in the endpoints YAML file.
- Pass the flow's human-readable name instead of the flow id when the cancel pattern stack frame is pushed during flow policy validation checks of collect steps.
- Fixed the accuracy calculation to prevent 100% assertion reporting when a test case fails before any assertions are reached.
- Fixed regression on training time for projects with a lot of YAML files.
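The accuracy fix listed above can be illustrated with a minimal sketch. It assumes, hypothetically, that accuracy is the share of passed assertions among the assertions that were evaluated; the function name and its parameters are illustrative and are not Rasa Pro's actual implementation.

```python
def assertion_accuracy(passed: int, failed: int, test_case_failed: bool) -> float:
    """Return the share of passed assertions as a value between 0.0 and 1.0.

    Hypothetical helper: names and scoring rule are assumptions for illustration.
    """
    evaluated = passed + failed
    if evaluated == 0:
        # No assertion was reached. Treating "nothing failed" as a perfect
        # score would report 100%, so a failed test case is scored 0.0 instead.
        return 0.0 if test_case_failed else 1.0
    return passed / evaluated


# A test case that failed before reaching any assertion no longer counts as 100%.
print(assertion_accuracy(0, 0, test_case_failed=True))
# A test case with three passed and one failed assertion.
print(assertion_accuracy(3, 1, test_case_failed=True))
```

The key design point is that an empty denominator is handled explicitly rather than falling through to a default that looks like success.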
[3.11.0] - 2024-12-11
Rasa Pro 3.11.0 (2024-12-11)
Deprecations and Removals
- Removed UnexpecTEDIntentPolicy from the default config.yml. It is an experimental policy and not suitable for the default configuration.
- The reset_after_flow_ends property of collect steps is now deprecated and will be removed in Rasa Pro 4.0.0. Please use the persisted_slots property at the flow level instead.
Features
- Added Twilio Media Streams channel, which can be configured to use arbitrary Text-To-Speech and Speech-To-Text services. Added Voice Stream Channel Interface, which makes it easier to add voice channels that directly integrate with audio streams. Added support for Deepgram Speech-To-Text and Azure Text-To-Speech in Voice Stream Channels.
- Added default action action_hangup, which can be used to hang up a phone call from a flow. Added SessionEnded event and SessionEndCommand command. Updated Audiocodes, Jambonz and Twilio Voice channels to send /session_end if the phone call is disconnected by the user.
- Added support for Cartesia Text-To-Speech in Voice Stream Channels.
- Implement Rasa Pro native model service that takes care of training and running an assistant model in Studio. To find out more about this service, see the Studio documentation.
- Added the ability to use voice to interact with the bot in the inspector.
- Multi-LLM Routing:
  - Decoupled LLM Configuration from Components
    - Previously, the integration of LLMs within CALM was closely tied to the components where they are used. This is no longer necessary, as training is no longer performed within the individual components that interact with external LLM endpoints.
    - As a result, LLM and embedding client configurations have been moved to endpoints.yml. To define LLM configurations in endpoints.yml, use model_groups as shown below:

        model_groups:
          - id: gpt-4-direct
            models:
              - provider: openai
                model: gpt-4
                timeout: 7
                temperature: 0.0
          - id: text-embedding-3-small-direct
            models:
              - provider: openai
                model: text-embedding-3-small

    - These model_groups can then be referenced in config.yml as follows:

        pipeline:
          ...
          - name: SingleStepLLMCommandGenerator
            llm:
              model_group: gpt-4-direct
            flow_retrieval:
              embeddings:
                model_group: text-embedding-3-small-direct
          ...
  - Support for Multiple Subscription Deployments
    - Allows customers to use deployments from different subscriptions for the same provider.
    - Resolved the limitation of API key configuration being tied exclusively to a single environment variable.

    Example configuration in endpoints.yml for Azure deployments:

        model_groups:
          - id: azure-gpt-model-eu
            models:
              - provider: azure
                deployment: azure-eu-deployment
                api_base: https://api.azure-europe.example.com
                api_version: 2024-08-01-preview
                api_key: ${AZURE_API_KEY_EU}
                timeout: 7
                temperature: 0.0
                ...
          - id: azure-gpt-model-us
            models:
              - provider: azure
                deployment: azure-us-deployment
                api_base: https://api.azure-us.example.com
                api_version: 2024-08-01-preview
                api_key: ${AZURE_API_KEY_US}
                timeout: 7
                temperature: 0.0
                ...
          ...

  - Seamless Model Configuration Across Environments Without Retraining
    - Added support for using different model configurations in different environments, such as dev, staging, and prod, without requiring the bot to be retrained for each environment.
    - Extended the ${...} syntax to deployment, api_base, and api_version in model_groups, allowing these values to change dynamically based on the environment.

        model_groups:
          - id: azure-gpt-4
            models:
              - provider: azure
                deployment: ${AZURE_DEPLOYMENT_GPT4}
                api_base: ${AZURE_API_BASE_GPT4}
                api_key: ${AZURE_API_KEY_GPT4}
                ...
          - id: azure-text-embeddings-3-small
            models:
              - provider: azure
                deployment: ${AZURE_DEPLOYMENT_EMBEDDINGS_3_SMALL}
                api_base: ${AZURE_API_BASE_EMBEDDINGS_3_SMALL}
                api_key: ${AZURE_API_EMBEDDINGS_3_SMALL}
                ...
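The ${...} placeholders described above can be thought of as being filled in from the environment when the configuration is loaded, which is why the same trained model can run against different deployments. The following is a minimal sketch of that idea, not Rasa Pro's implementation: the function name resolve_placeholders, the regular expression, and the example values are all assumptions made for illustration.

```python
import re
from typing import Any, Mapping

# Matches ${NAME} where NAME looks like an environment variable name.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_placeholders(value: Any, env: Mapping[str, str]) -> Any:
    """Recursively replace ${NAME} in strings with env[NAME] (hypothetical helper)."""
    if isinstance(value, str):
        def _substitute(match: "re.Match[str]") -> str:
            name = match.group(1)
            if name not in env:
                raise KeyError(f"environment variable {name!r} is not set")
            return env[name]

        return PLACEHOLDER.sub(_substitute, value)
    if isinstance(value, dict):
        return {key: resolve_placeholders(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_placeholders(item, env) for item in value]
    return value


# One model group definition, mirroring the structure of the example above.
model_group = {
    "id": "azure-gpt-4",
    "models": [
        {
            "provider": "azure",
            "deployment": "${AZURE_DEPLOYMENT_GPT4}",
            "api_base": "${AZURE_API_BASE_GPT4}",
        }
    ],
}

# Illustrative values only; in practice os.environ would be passed instead.
dev_env = {
    "AZURE_DEPLOYMENT_GPT4": "gpt4-dev",
    "AZURE_API_BASE_GPT4": "https://dev.example.com",
}
prod_env = {
    "AZURE_DEPLOYMENT_GPT4": "gpt4-prod",
    "AZURE_API_BASE_GPT4": "https://prod.example.com",
}

dev_group = resolve_placeholders(model_group, dev_env)
prod_group = resolve_placeholders(model_group, prod_env)
print(dev_group["models"][0]["deployment"])
print(prod_group["models"][0]["deployment"])
```

Passing the environment as an explicit mapping keeps the sketch easy to test; a real loader would typically read from os.environ.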
  - Supporting Multiple Deployments for Load Balancing
    - Enabled targeting of multiple LLM deployments for a single Rasa component.
    - Implemented the routing feature that supports load balancing to handle rate limits and improve scalability. When multiple models are defined within a model group, you can specify the router key with a routing_strategy to control how requests are distributed among the models.

    Example configuration in endpoints.yml for Azure deployments with load balancing:

        model_groups:
          - id: azure-gpt-models
            models:
              - provider: azure
                deployment: azure-eu-deployment
                api_base: https://api.azure-europe.example.com
                api_version: 2024-08-01-preview
                api_key: ${AZURE_API_KEY_EU}
                timeout: 7
                temperature: 0.0
                ...
              - provider: azure
                deployment: azure-us-deployment
                api_base: https://api.azure-us.example.com
                api_version: 2024-08-01-preview
                api_key: ${AZURE_API_KEY_US}
                timeout: 7
                temperature: 0.0
                ...
            router:
              routing_strategy: least-busy
              ...

    Example of usage in config.yml:

        pipeline:
          ...
          - name: SingleStepLLMCommandGenerator
            llm:
              model_group: azure-gpt-models
          ...

  - Backward Compatibility
    - Existing configurations that couple LLMs to specific Rasa components remain unaffected by this change.
    - However, this configuration method is now deprecated and scheduled for removal in version 4.0.0.
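To give an intuition for the least-busy routing strategy used in the load-balancing example above, here is a minimal sketch. It assumes that "least-busy" means choosing the deployment with the fewest requests currently in flight; the Deployment and LeastBusyRouter classes are hypothetical and do not represent the router Rasa Pro actually uses.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Deployment:
    """Hypothetical stand-in for one entry under `models` in a model group."""
    name: str
    in_flight: int = 0  # number of requests currently being served


class LeastBusyRouter:
    """Pick the deployment with the fewest in-flight requests (assumed semantics)."""

    def __init__(self, deployments: Iterable[Deployment]) -> None:
        self.deployments: List[Deployment] = list(deployments)

    def acquire(self) -> Deployment:
        # min() returns the first deployment on ties, so order acts as a tie-breaker.
        chosen = min(self.deployments, key=lambda d: d.in_flight)
        chosen.in_flight += 1
        return chosen

    def release(self, deployment: Deployment) -> None:
        # Call this when the request has finished.
        deployment.in_flight -= 1


router = LeastBusyRouter(
    [Deployment("azure-eu-deployment"), Deployment("azure-us-deployment")]
)
first = router.acquire()    # both idle, the first one listed is chosen
second = router.acquire()   # the EU deployment is busy, so the US one is chosen
router.release(first)       # the EU request completes
third = router.acquire()    # the EU deployment is idle again
print(first.name, second.name, third.name)
```

Tracking in-flight requests rather than total requests lets a slow deployment shed load naturally, which is the property that helps with rate limits.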
- Added support for Azure Speech-To-Text in Voice Stream Channels.
- Added UserSilenceCommand and pattern_user_silence, which is triggered by Voice Stream channels when the user is silent for longer than a silence timeout. These values are configurable with the newly added slots silence_timeout and consecutive_silence_timeouts. Silence monitoring is disabled by default and can be enabled using the configuration monitor_silence: true in the relevant Voice Stream Channel configuration.
- The inspector is no longer its own input/output channel. Instead, it can be attached to other channels. This way, it is no longer limited to conversations going through the socketio channel, but can be used with other text channels or voice channels.
  You can attach it to any channel(s) configured in your credentials.yml by adding a flag to rasa run: rasa run --inspect.
  In addition, the convenience CLI command rasa inspect is retained, which starts the inspector with the socketio channel as usual.
Improvements
- In the Audiocodes channel, /vaig_event_start is replaced by /session_start. This intent marks the beginning of the conversation and is sent when the phone call is connected.
- Introduced the environment variable MAX_NUMBER_OF_PREDICTIONS_CALM to configure the CALM-specific limit for the number of predictions. This variable defaults to 1000, providing a higher prediction limit compared to the default value of 10 for NLU-based assistants.
- In the Audiocodes and Twilio Voice channel connectors, the call metadata received from the providers can be accessed in the slot session_started_metadata. The call metadata parameter names have been standardised with the CallParameters dataclass. The Twilio Voice channel connector sends the /session_start intent at the beginning of the conversation, and the channel parameter initial_prompt has been removed.
- Enable configurability of Vault secret manager's mount point property in the endpoints YAML file or as an environment variable.
- In the Twilio Media Streams channel connector, call metadata is available in the session_start_metadata slot. It also supports the default action action_hangup.
- Catch API connection errors, and validate the correctness of the values present in the model configuration at model training time by making a test API request. This feature is enabled by default and can be disabled by setting the environment variable LLM_API_HEALTH_CHECK to False.
- The Socketio channel connector now sends the websocket messages tracker_state and rasa_events with each bot response. tracker_state contains the tracker store state at that point in the conversation and includes slots, events, stack, latest message and latest action. rasa_events contains a list of new events that have happened since the last message.
- Speech-To-Text and Text-To-Speech services can be configured for Voice Stream Channel Connectors. Added tests for voice components and redefined code structure.
- Add support for Python 3.11.
- Removed JSON response validation except when HTTP protocol and E2E Stub is used for Custom Action execution.
- Optimized JSON response validation by initializing the Draft202012Validator once and caching it.
- Add an optional property persisted_slots at the flow level. This property configures whether slots collected or set across any of the flow steps should be persisted after the flow ends. This property expects a list of slot names.
- Added support for custom Automatic Speech Recognition (ASR) or Text To Speech (TTS) providers to a Rasa Assistant. This allows developers to bring their own speech providers to Rasa by subclassing the ASREngine and TTSEngine classes.
- If flow retrieval is disabled, a warning is raised only if the number of user flows exceeds 20.
- Added validation to the TestCase class to issue a warning when duplicate user messages lack metadata or have incorrect metadata. This enhancement provides clear guidance to users on the issue and how to resolve it.
- Fixed the global should-hangup variable in Voice Stream Channels by moving to a context variable CallState that stores the session variables.
- Run Rasa Pro data validation before uploading to Studio. This is to avoid uploading invalid assistant data that would raise errors during Rasa Pro model training in Studio.
- Added vector_name to Qdrant's configuration to enable customization of the vector field name for storing embeddings.
- Enhanced YamlValidationException error messages to include the line number and a relevant YAML snippet showing where the validation error occurred. Line numbers start from 1 (1-based indexing).
  The error-handling behavior has been modified so that only one validation error is displayed. This exception is raised when the YAML content does not comply with the defined YAML schema.
- Added a new assertion type bot_did_not_utter to allow testing that the bot does not utter specific messages or include certain buttons during conversations.
- Ensure that the model service fails properly if the minimum disk space requirement is not met.
- Do not expand environment variables when reading YAML files during rasa studio upload execution.
- Stream model files to Studio rather than providing full files. Provide a HEAD endpoint for Studio to check if a model is available and what its size is. Add an environment variable to set the port of the model service. This makes development with Studio easier: previously the port was hard-coded, which made it harder to use a separately deployed model service now that Studio includes one in its development deployment.
- Add flag --skip-yaml-validation to skip YAML validation during rasa run. Users can use it to skip domain YAML validation during rasa run. Do not instantiate multiple instances of the TrainingDataImporter class for validation and training.
- Introduced a summarize_history flag for the contextual response rephraser, defaulting to True. When set to False, the conversation transcript instead of the summary is included in the prompt of the contextual response rephraser. This saves a separate summarization call to an LLM. The number of conversation turns to be used when summarize_history is set to False can be set via max_historical_turns. By default this value is set to 5.
  Example:

    nlg:
      - type: rephrase
        summarize_history: False
        max_historical_turns: 5
Bugfixes
- Fix OpenAI LLM client ignoring API base and API version arguments if set.
- Fix AttributeError with the instrumentation of the run method of the CustomActionExecutor class.
- Throw DuplicatedFlowIdException during rasa data validate and rasa train if there are duplicate flows defined.
- Replace pickle and joblib with safer alternatives, e.g. json, safetensors, and skops, for serializing components.
  Note: This is a model breaking change. Please retrain your model.
  If you have a custom component that inherits from one of the components listed below and modified the persist or load method, make sure to update your code. Please contact us in case you encounter any problems.
  Affected components:
  - CountVectorFeaturizer
  - LexicalSyntacticFeaturizer
  - LogisticRegressionClassifier
  - SklearnIntentClassifier
  - DIETClassifier
  - CRFEntityExtractor
  - TrackerFeaturizer
  - TEDPolicy
  - UnexpectedIntentTEDPolicy
- Avoid filling slots that have ask_before_filling = True and utilize a from_text slot mapping during other steps in the flow. Ensure that the NLUCommandAdapter only fills these types of slots when the flow reaches the designated collection step.
- Check for the metadata's step_id and active_flow keys when adding the ActionExecuted event to the flows paths stack.
- Fixed a bug on Windows where flow files with names starting with 'u' would fail to load due to improper path escaping in YAML content processing.
- Fixes OpenAIException - AsyncClient.__init__() got an unexpected keyword argument 'proxies'.
- Fix retrieval of the model file stored in cloud storage by the model service. This change consisted of uploading only the model file instead of the full model path during training when the --remote-storage CLI flag is used.
- Fix issue in e2e testing where customising action_session_start would lead to an AttributeError because the output_channel was not set. This is now fixed by setting the output_channel to CollectingOutputChannel().
Miscellaneous internal changes
Miscellaneous internal changes.
[3.10.15] - 2024-12-18
Rasa Pro 3.10.15 (2024-12-18)
Bugfixes
- Validate that the api_type key is only used for supported providers (Azure and OpenAI).
- Fix issue in e2e testing where customising action_session_start would lead to an AttributeError because the output_channel was not set. This is now fixed by setting the output_channel to CollectingOutputChannel().
- Fixed the accuracy calculation to prevent 100% assertion reporting when a test case fails before any assertions are reached.
- Pass the flow's human-readable name instead of the flow id when the cancel pattern stack frame is pushed during flow policy validation checks of collect steps.
- Try to instantiate LLM/embeddings client when loading component to validate environment variables.
- Enable asserting events returned by action_session_start when running end-to-end testing with the assertions format. The following assertions can be used:
  - slot_was_set
  - slot_was_not_set
  - bot_uttered
  - action_executed
[3.10.14] - 2024-12-04
Rasa Pro 3.10.14 (2024-12-04)
Bugfixes
- Avoid filling slots that have ask_before_filling = True and utilize a from_text slot mapping during other steps in the flow. Ensure that the NLUCommandAdapter only fills these types of slots when the flow reaches the designated collection step.
- Fixes OpenAIException - AsyncClient.__init__() got an unexpected keyword argument 'proxies'.
- Fix validation for LLM/Embedding clients when the api_base is configured in the config itself but not as an environment variable.
[3.10.13] - 2024-11-29
Rasa Pro 3.10.13 (2024-11-29)
Bugfixes
- Implement __eq__ and __hash__ functions for ChangeFlowCommand to fix the error=unhashable type: 'ChangeFlowCommand' error in MultiStepCommandGenerator.
- Fixed an issue on Windows where flow files with names starting with 'u' would fail to load due to improper path escaping in YAML content processing.
- Store the value of the --disable-verify CLI flag in the disable_verify attribute of the StudioConfig object, so it can be reused across other studio commands.
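The "unhashable type" error mentioned in the first bugfix above follows from a general Python rule: a class that defines equality without defining a hash cannot be placed in a set or used as a dict key. The sketch below demonstrates the rule with two hypothetical classes; they are stand-ins for illustration and are not Rasa's ChangeFlowCommand.

```python
class CommandWithoutHash:
    """Defines __eq__ only, so Python sets __hash__ to None for this class."""

    def __init__(self, flow: str) -> None:
        self.flow = flow

    def __eq__(self, other: object) -> bool:
        return isinstance(other, CommandWithoutHash) and self.flow == other.flow


class CommandWithHash:
    """Defines both __eq__ and __hash__, so instances can be stored in sets."""

    def __init__(self, flow: str) -> None:
        self.flow = flow

    def __eq__(self, other: object) -> bool:
        return isinstance(other, CommandWithHash) and self.flow == other.flow

    def __hash__(self) -> int:
        # Hash the same data that __eq__ compares, so equal objects hash equally.
        return hash(self.flow)


try:
    {CommandWithoutHash("transfer_money")}
    unhashable_error = None
except TypeError as error:
    unhashable_error = str(error)

unique_commands = {CommandWithHash("transfer_money"), CommandWithHash("transfer_money")}
print(unhashable_error)
print(len(unique_commands))
```

Keeping __hash__ consistent with __eq__ is what allows duplicate commands to collapse into a single set entry.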
[3.10.12] - 2024-11-25
Rasa Pro 3.10.12 (2024-11-25)
Bugfixes
- Replace pickle and joblib with safer alternatives, e.g. json, safetensors, and skops, for serializing components.
  Note: This is a model breaking change. Please retrain your model.
  If you have a custom component that inherits from one of the components listed below and modified the persist or load method, make sure to update your code. Please contact us in case you encounter any problems.
  Affected components:
  - CountVectorFeaturizer
  - LexicalSyntacticFeaturizer
  - LogisticRegressionClassifier
  - SklearnIntentClassifier
  - DIETClassifier
  - CRFEntityExtractor
  - TrackerFeaturizer
  - TEDPolicy
  - UnexpectedIntentTEDPolicy
[3.10.11] - 2024-11-20
Rasa Pro 3.10.11 (2024-11-20)
Bugfixes
- Fix parsing of commands in case the LLM response surrounds flow names, slot names, or slot values with single or double quotes.
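As an illustration of the kind of normalization this fix describes, the sketch below strips a matching pair of single or double quotes from each argument of a command string. The textual form SetSlot(name, value), the regular expression, and the helper names are assumptions made for this example; Rasa Pro's actual parser may work differently, and the naive comma split does not handle values that themselves contain commas.

```python
import re
from typing import List, Optional, Tuple

# Assumed textual form of a command: Name(arg1, arg2, ...)
COMMAND = re.compile(r"^(?P<name>\w+)\((?P<args>.*)\)$")


def strip_quotes(token: str) -> str:
    """Remove one pair of matching single or double quotes around a token."""
    token = token.strip()
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "'\"":
        return token[1:-1]
    return token


def parse_command(line: str) -> Optional[Tuple[str, List[str]]]:
    """Split a command string into its name and unquoted arguments (hypothetical)."""
    match = COMMAND.match(line.strip())
    if match is None:
        return None
    arguments = [strip_quotes(arg) for arg in match.group("args").split(",")]
    return match.group("name"), arguments


# Quoted and unquoted variants are parsed to the same result.
quoted = parse_command("SetSlot(\"amount\", '100')")
unquoted = parse_command("SetSlot(amount, 100)")
print(quoted)
print(unquoted)
```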
Miscellaneous internal changes
Miscellaneous internal changes.
[3.10.10] - 2024-11-14
Rasa Pro 3.10.10 (2024-11-14)
Bugfixes
- Check for the metadata's step_id and active_flow keys when adding the ActionExecuted event to the flows paths stack.
[3.10.9] - 2024-11-13
Rasa Pro 3.10.9 (2024-11-13)
Bugfixes
- Introduced the environment variable MAX_NUMBER_OF_PREDICTIONS_CALM to configure the CALM-specific limit for the number of predictions. This variable defaults to 1000, providing a higher prediction limit compared to the default value of 10 for NLU-based assistants.
- Filter out comments from e2e test input files when writing e2e results to file.
- Specified UTF-8 encoding to correctly read test cases on Windows.
[3.10.8] - 2024-10-24
Rasa Pro 3.10.8 (2024-10-24)
Bugfixes
- The user message "/restart" now restarts the session again after adding a proper implementation (stack frame and command) for pattern_restart.
- Only infer and set the provider to azure for our LLM clients if no provider is specified but the deployment key is set.
- Fix OPENAI_API_KEY authentication error when using self-hosted provider.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.10.7] - 2024-10-17
Rasa Pro 3.10.7 (2024-10-17)
Improvements
- Change default response of utter_free_chitchat_response from "placeholder_this_utterance_needs_the_rephraser" to "Sorry, I'm not able to answer that right now.".
Bugfixes
- Disallow using the command payload syntax to set slots not filled by any of the active or startable flow(s) collect steps.
- Add flow name to error message validator.verify_flows_steps_against_domain.collect_step.
- Update e2e test results output files on each test run so that, for example, when all tests pass on subsequent runs after failing previously, the failed results output file is emptied.
- Disable strict SSL verification to the Rasa Studio authentication server via the --disable-verify or -x CLI argument added to the rasa studio config command.
- Upgrade zipp dependency version to fix a security vulnerability: CVE-2024-5569.
[3.10.6] - 2024-10-04
Rasa Pro 3.10.6 (2024-10-04)
Bugfixes
- Fix cleanup of SetSlot commands issued by the LLM-based command generator for slots that define a slot mapping other than the from_llm slot mapping. The command processor now correctly removes the SetSlot command in these scenarios and instead adds a CannotHandleCommand.
- Fix UnicodeDecodeError while reading Windows path from YAML files.
- Fix model loading from remote storage by correcting the handling of remote storage enum during the creation of the persistor object.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.10.5] - 2024-10-01
Rasa Pro 3.10.5 (2024-10-01)
Bugfixes
- Fix the case where IntentlessPolicy is triggered while no e2e stories were written to guide it. In this situation a CannotHandleCommand will be issued.
- Update litellm to version 1.45.0 to fix security vulnerability (CVE-2024-6587). Update gitpython to version 3.1.41 to fix security vulnerability (CVE-2024-22190). Update certifi to version 2024.07.04 to fix security vulnerability (CVE-2024-39689).
- Prevent invalid domain with incorrectly defined intent from throwing stack trace. Throw InvalidDomain exception and send message to the user instead. The message looks like this:

    Detected invalid intent definition: {'intent': 'ask_help'}.
    Please make sure all intent definitions are valid.

- Support text completions endpoint when using self-hosted models.
  The use_chat_completions_endpoint parameter is now supported when using self-hosted models. This parameter is used to enable the use of the chat completions endpoint when using a self-hosted model. This parameter is set to True by default. To use the text completions endpoint, set use_chat_completions_endpoint to False in the llm section of the component.
  Usage:

    llm:
      provider: self-hosted
      model: meta-llama/Meta-Llama-3-8B
      api_base: "https://my-endpoint/v1"
      use_chat_completions_endpoint: false

- Fixes an issue where the CountVectorsFeaturizer and LogisticRegressionClassifier would throw an error during inference when no NLU training data is provided.
- Added tracing explicitly to GRPCCustomActionExecutor.run in order to pass the tracing context to the action server.
[3.10.4] - 2024-09-25
Rasa Pro 3.10.4 (2024-09-25)
Bugfixes
- Fix failing validation of categorical slots when slot values contain an apostrophe.
[3.10.3] - 2024-09-20
Rasa Pro 3.10.3 (2024-09-20)
No significant changes.
[3.10.2] - 2024-09-19
Rasa Pro 3.10.2 (2024-09-19)
Deprecations and Removals
- Dropped support for Python 3.8 ahead of Python 3.8 End of Life in October 2024. In Rasa Pro versions 3.10.0, 3.9.11 and 3.8.13, we needed to pin the TensorFlow library version to 2.13.0rc1 in order to remove critical vulnerabilities; this resulted in a poor user experience when installing these versions of Rasa Pro with uv pip. Removing support for Python 3.8 will make it possible to upgrade to a more stable version of TensorFlow.
Improvements
- Update Keras and TensorFlow to version 2.14. This will eliminate the need to use the --prerelease allow flag when installing Rasa Pro using the uv pip tool.
Bugfixes
- Revert to the old behavior when loading a trained model by supplying a path to the model on the remote storage: when the REMOTE_STORAGE_PATH environment variable is not set, the model path (-m) argument is used. The resulting path on the remote storage will be the same as the model path (-m) argument.
  Additionally, the entire model path (-m) argument will be used when a trained model is uploaded to the remote storage with the REMOTE_STORAGE_PATH environment variable not set. The resulting path on the remote storage will be the same as the model path (-m) argument.
  If the REMOTE_STORAGE_PATH environment variable is set, only the file name part of the model path (-m) argument is used in both loading from and storing to the remote storage. The resulting path on the remote storage will be: REMOTE_STORAGE_PATH + file name part of the model path (-m) argument.
- Fixed UnexpecTEDIntentlessPolicy training errors that resulted from a change to batching behavior. Changed the batching behavior back to the original for all components. Made the changed batching behavior accessible in DietClassifier using drop_small_last_batch: True.
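The remote storage path rules in the first bugfix above can be summarized in a short sketch. The function name remote_model_path and the example paths are hypothetical, and joining the two parts with a forward slash is an assumption; the change log only states that the result is REMOTE_STORAGE_PATH plus the file name part of the model path.

```python
import posixpath
from typing import Optional


def remote_model_path(model_path: str, remote_storage_path: Optional[str]) -> str:
    """Return the path used on remote storage for a given -m argument.

    Hypothetical helper mirroring the rules described in the change log.
    """
    if not remote_storage_path:
        # REMOTE_STORAGE_PATH is not set: the full -m argument is used as is.
        return model_path
    # REMOTE_STORAGE_PATH is set: only the file name part of -m is kept.
    # Assumption: the two parts are joined with "/" as in object storage keys.
    return posixpath.join(remote_storage_path, posixpath.basename(model_path))


without_env = remote_model_path("models/prod/model.tar.gz", None)
with_env = remote_model_path("models/prod/model.tar.gz", "assistants/v1")
print(without_env)
print(with_env)
```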
Miscellaneous internal changes
Miscellaneous internal changes.
[3.10.1] - 2024-09-11
Rasa Pro 3.10.1 (2024-09-11)
Bugfixes
- Fix OpenAI LLM client ignoring API base and API version arguments if set.
- Fix FileNotFound error when running rasa studio commands and no pre-existing local assistant project exists.
- Fixed telemetry collection for the components Rephraser, LLM Intent Classifier, Intentless Policy and Enterprise Search Policy to ensure that the telemetry data is only collected when it is enabled.
- Update the default config for E2E test conversion to use the provider key instead of api_type.
- Fix inconsistent recording of telemetry events for LLM-based command generators.
- Throw deprecation warning when REQUESTS_CA_BUNDLE env var is used.
[3.10.0] - 2024-09-04
Rasa Pro 3.10.0 (2024-09-04)
Deprecations and Removals
- Remove experimental LLMIntentClassifier. Use Rasa CALM instead.
Features
- Implement the shell output of accuracy rate by assertion type as a table when running end-to-end testing with assertions.
- Implement E2E testing assertions that measure metrics such as groundedness and answer relevance of generative responses issued by either Enterprise Search or the Contextual Response Rephraser.
  You must specify a threshold which must be reached for the generative evaluation assertion to pass. In addition, you can also specify ground_truth if you prefer providing this in the E2E test rather than relying on the retrieved context from the vector store (in the case of Enterprise Search) or from the domain (in the case of Contextual Response Rephraser) that is stored in the bot utterance event metadata. For rephrased answers, you must specify utter_name to run the assertion.
  These assertions can be specified for user steps only and cannot be used alongside the former E2E test format. You can learn more about this new feature in the documentation sections for grounded and relevant assertion types.
  To enable this feature, please set the environment variable RASA_PRO_BETA_E2E_ASSERTIONS to true.

    export RASA_PRO_BETA_E2E_ASSERTIONS=true
- You can now produce a coverage report of your e2e tests via the following command:

    rasa test e2e <e2e-test-folder> --coverage-report [--coverage-output-path <output-folder>]

  The coverage report contains the number of steps and the number of tested steps per flow. Untested steps are referenced by line numbers.

    Flow Name   Coverage   Num Steps   Missing Steps   Line Numbers
    flow_1        0.00%    1           1               [10-10]
    flow_2      100.00%    4           0               []
    Total        80.00%    5           1

  Additionally, we also create a histogram of command coverage showing how many and what commands are produced in your e2e tests.
  To enable this feature, please set the environment variable RASA_PRO_BETA_FINETUNING_RECIPE to true.

    export RASA_PRO_BETA_FINETUNING_RECIPE=true

  More information can be found in the documentation of the feature.
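The percentages in the sample coverage report above are consistent with a simple calculation: tested steps divided by total steps, per flow and overall. The sketch below reproduces the sample figures under that assumption; the function name flow_coverage and its input format are hypothetical and are not part of the Rasa CLI.

```python
from typing import Dict, Tuple


def flow_coverage(steps: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Compute coverage percentages from (num_steps, missing_steps) per flow.

    Hypothetical helper; the total is computed over all steps, not as an
    average of the per-flow percentages.
    """

    def percent(num_steps: int, missing: int) -> float:
        if num_steps == 0:
            # Assumption: a flow without steps is reported as 0% covered.
            return 0.0
        return round(100.0 * (num_steps - missing) / num_steps, 2)

    report = {flow: percent(num, missing) for flow, (num, missing) in steps.items()}
    total_steps = sum(num for num, _ in steps.values())
    total_missing = sum(missing for _, missing in steps.values())
    report["Total"] = percent(total_steps, total_missing)
    return report


# Same numbers as the sample report: flow_1 has 1 step (1 missing),
# flow_2 has 4 steps (0 missing).
report = flow_coverage({"flow_1": (1, 1), "flow_2": (4, 0)})
print(report)
```

Note that the total of 80% comes from 4 tested steps out of 5, which differs from the 50% that averaging the two per-flow percentages would give.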
- Create a self-hosted LLM client compatible with OpenAI format. Users can connect to their own self-hosted LLM server that is compatible with OpenAI format.
  Sample basic usage:

    llm:
      provider: self-hosted
      model: <deployment_name>
      api_base: <deployment_url>
      api_type: openai [Optional]

- Add a new CLI command rasa llm finetune prepare-data to create a dataset from e2e tests that can be used to fine-tune a base model for the task of command generation.
  To enable this feature, please set the environment variable RASA_PRO_BETA_FINETUNING_RECIPE to true.

    export RASA_PRO_BETA_FINETUNING_RECIPE=true
- It is now allowed to link to pattern_human_handoff from any pattern and user flow.
- Allow links from all patterns to user flows except for pattern_internal_error.
- LiteLLM Integration & Reduced LangChain Reliance:
  - Introduced LLMClient and EmbeddingClient protocols for standardized client interfaces.
  - Created lightweight client wrappers for LiteLLM to streamline model instantiation, management, and inference.
  - Updated llm_factory and embedder_factory to utilize these LiteLLM client wrappers.
  - Added dedicated clients for Azure OpenAI and OpenAI to support both LLMs and embedding models.
  - Added a HuggingFace client to compute embeddings using locally stored transformer models via the sentence-transformers package.
- LangChain Update: Upgraded to the latest version (0.2.x) for improved compatibility and features. To understand the implications on your assistant, please refer to the feature documentation and the migration guide.
- Implement as part of E2E testing a new type of evaluation specifically designed to increase confidence in CALM. This evaluation runs assertions on the assistant's actual events and generative responses. New assertions include the ability to check for the presence of specific events, such as:
  - flow started, flow completed or flow cancelled events
  - whether pattern_clarification was triggered for specific flows
  - whether buttons rendered well as part of the bot uttered event
  - whether slots were set correctly or not
  - whether the bot text response matches a provided regex pattern
  - whether the bot response matches a provided domain response name
  These assertions can be specified for user steps only and cannot be used alongside the former E2E test format. You can learn more about this new feature in the documentation.
  To enable this feature, please set the environment variable RASA_PRO_BETA_E2E_ASSERTIONS to true.

    export RASA_PRO_BETA_E2E_ASSERTIONS=true
-
Configure LLM-as-Judge settings in the
llm_as_judge
section of theconftest.yml
file. These settings will be used to evaluate the groundedness and relevance of generated bot responses. Theconftest.yml
is discoverable as long as it is in the root directory of the assistant project, at the same level as theconfig.yml
file.If the
conftest.yml
file is not present in the root directory, the default LLM judge settings will be used. -
Implement automatic E2E test case conversion from sample conversation data.
This feature includes:
- A CLI command to convert sample conversation data (CSV, XLSX) into executable E2E test cases.
- Conversion of sample data using an LLM to generate YAML formatted test cases.
- Export of generated test cases into a specified YAML file.
Usage:
rasa data convert e2e <path>
To enable this feature, please set the environment variable `RASA_PRO_BETA_E2E_CONVERSION` to `true`:
export RASA_PRO_BETA_E2E_CONVERSION=true
For more details, please refer to this documentation page.
Improvements
-
Implemented custom action stubbing for E2E test cases. To define custom action stubs, add `stub_custom_actions` to the test case file.
Stubs can be defined in two ways:
- Test file level: Define each action by its name (`action_name`).
- Test case level: Define the stub using the test case ID as a prefix (`test_case_id::action_name`).
To learn more about this feature, please refer to the documentation.
To enable this feature, set the environment variable `RASA_PRO_BETA_STUB_CUSTOM_ACTION` to `true`:
export RASA_PRO_BETA_STUB_CUSTOM_ACTION=true
-
Add `max_messages_in_query` parameter to Enterprise Search Policy. It controls the number of past messages that are used in the search query for retrieval.
-
Configure LLM E2E test converter settings in the `llm_e2e_test_conversion` section of the `conftest.yml` file.
These settings will be used to configure the LLM used to convert sample conversation data into E2E test cases.
The `conftest.yml` is discoverable as long as it is in the root directory of the tests output path.
If the `conftest.yml` file is not present in the root directory, the default LLM settings will be used.
-
Add the datetime of Rasa Pro license expiry to the `rasa --version` command. Add a `/license` API endpoint that also returns the same information.
-
Suppress LiteLLM info and debug log messages in the console.
-
Cache llm_factory and embedder_factory methods to avoid client instantiation and validation for every user utterance.
-
Added E2E Test Conversion Completed telemetry event with file type and test case count properties.
-
Separate writing of failed and passed e2e test results to distinct file paths.
-
Implement support for evaluating IntentlessPolicy responses with generative response assertions.
-
Use direct custom action execution in tutorial and CALM templates. Skip action server health check in e2e testing if direct custom action execution is configured.
-
Modified the type of flows which are included into the import CLI (previously only user flows were enabled, now patterns are included). Use case: This is needed for Studio 1.7, since that release is enabling modification and management of patterns inside Studio, and needs the ability to import patterns from yaml files.
-
Improve events and responses sub-schemas used by the `stub_custom_actions` sub-schema of end-to-end testing. The events sub-schema only allows the usage of events which are supported by the `rasa-sdk`. These are documented in the action server API documentation.
-
Change default model of conversation rephraser to 'gpt-4o-mini'.
-
Add `file_path` to `Flow` so that we can show the full name, e.g. `path/to/flow.py::flow name`, in the e2e test coverage report.
-
Introduced remote storage to upload trained models to persistors (AWS, GCP, Azure).
-
Add ability to download training data from remote storage (GCS, AWS, Azure).
-
Allow saving models to and retrieving from sub folders in cloud storage.
-
Introduced `DirectCustomActionExecutor` for executing custom actions directly through the assistant.
Introduced `actions_module` variable under `action_endpoint` in `endpoints.yml` to explicitly specify the path to custom actions module.
If `actions_module` is set, custom actions will be executed directly through the assistant.
-
Add validation for the values against which categorical and boolean slots are checked in the if conditional steps. An error will be thrown when a slot is compared to an invalid/non-existent value for boolean and categorical slots.
-
Add user query and retrieved document results to the metadata of `action_send_text` predicted by EnterpriseSearchPolicy. In addition, add domain ground truth responses to the `BotUttered` event metadata when rephrasing is enabled. These changes were required to allow evaluations of generative responses against the ground truth stored in the metadata of `BotUttered` events.
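The two stub naming levels described in the custom action stubbing entry above (`action_name` and `test_case_id::action_name`) can be illustrated with a small lookup sketch. This is not Rasa's test runner code: the function name `resolve_stub`, the dictionary layout, and the rule that a test-case-level stub takes precedence over a file-level one are all assumptions made for illustration.

```python
from typing import Dict, Optional

# Separator taken from the `test_case_id::action_name` convention above.
STUB_SEPARATOR = "::"


def resolve_stub(
    stubs: Dict[str, dict], test_case_id: str, action_name: str
) -> Optional[dict]:
    """Return the stub for an action (hypothetical helper).

    Assumption: a stub scoped to the test case wins over a file-level stub.
    """
    scoped_key = f"{test_case_id}{STUB_SEPARATOR}{action_name}"
    if scoped_key in stubs:
        return stubs[scoped_key]
    return stubs.get(action_name)


# Illustrative stub definitions; the keys and payloads are made up.
stubs = {
    "action_check_balance": {"responses": [{"text": "file-level stub"}]},
    "test_premium_user::action_check_balance": {
        "responses": [{"text": "case-level stub"}]
    },
}
```

With these definitions, the test case `test_premium_user` would receive the case-level stub, while any other test case in the file would fall back to the file-level stub.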
Bugfixes
-
Fix problem with custom action invocation when model is loaded from remote storage.
-
Ensure certificates for openai based clients.
-
Mark the first slot event as seen when the user turn in an E2E test case contains multiple slot events for the same slot. Previously, when `assertion_order_enabled` was set to `true` and the user step in a test case contained multiple `slot_was_set` assertions for the same slot, the last slot event was marked as seen when the first assertion was running. This caused the test to fail for subsequent `slot_was_set` assertions for the same slot with the error `Slot <slot_name> was not set`.
-
Validate the LLM configuration during training for the following components:
- Contextual Response Rephraser
- Enterprise Search Policy
- Intentless Policy
- LLM Based Command Generator
- LLM Based Router
Additionally, update the `get_provider_from_config` method to retrieve the provider using both the `model` and `model_name` configuration parameters.
-
Fixes throwing the deprecation warning if the setting for Azure OpenAI Embedding Client was not set through the deprecated environment variable.
-
Fix execution of stub custom actions when they contain test case name and the separator in its provided stub name. Test runner will now correctly execute the correct stub implementation for the same custom action dependent on the test name.
-
Add validation to conversation rephraser.
-
Ensure YAML files with datetime-formatted strings are read as plain strings instead of being converted to datetime objects.
-
Deprecate 'request_timeout' for OpenAI and Azure OpenAI clients in favor of 'timeout'
-
Forbid `stream` and `n` parameters for clients. Having these parameters within `llm` and `embeddings` configuration will result in an error.
-
Raise deprecation warning if `api_type` is set to `huggingface` instead of `huggingface_local` for HuggingFace local embeddings.
-
Fix resolving aliases for deprecated keys when instantiating LLM and embedding clients.
-
Fix detection of conftest file which contained custom LLM judge configuration.
-
Fix issue with Rasa Pro Studio download command exporting default flows which had not been customized by the Studio user. Rasa Pro Studio download command only exports user defined flows, customized patterns and user defined domain locally from the Studio instance.
Similarly, fix issue with Rasa Pro Studio upload command importing default flows which had not been customized to Studio. Rasa Pro Studio upload command only imports user defined flows, customized patterns and user defined domain to the Studio instance.
-
Disable auto-inferring provider from the config. Ensure the provider is explicitly read from the `provider` key.
-
Fix writing e2e test cases to disk. `slot_was_set` and `slot_was_not_set` are now written down correctly.
-
The rephraser of the `rasa llm finetune prepare-data` command now compares the original user message and the user message returned in the LLM output case-insensitively.
-
[rasa llm finetune prepare-data] Do not rephrase user messages that come from a button payload.
-
Separate commands in the expected LLM output by newlines.
-
Fix TypeError in PatternClarificationContainsAssertion hash function by converting sets to lists for successful JSON serialization.
-
Fix validation in case a link to `pattern_human_handoff` is used.
-
[`rasa llm finetune prepare-data`] Skip paraphrasing module in case `num-rephrases` is set to 0.
-
Update the handling of incorrect use of slash syntax. Messages with undefined intents do not automatically trigger `pattern_cannot_handle`; instead, they are sanitized (prepended slash(es) are removed) and passed through the graph.
-
Allow suitable patterns to be properly started using nlu triggers
-
Fix API connection error for bedrock embedding endpoint.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.9.17] - 2024-12-05
Rasa Pro 3.9.17 (2024-12-05)
Bugfixes
- Implement `__eq__` and `__hash__` functions for `ChangeFlowCommand` to fix the `error=unhashable type: 'ChangeFlowCommand'` error in `MultiStepCommandGenerator`.
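The `unhashable type` error mentioned above follows from a general Python rule: a class that defines `__eq__` without also defining `__hash__` has its `__hash__` set to `None`. The sketch below demonstrates that rule with two hypothetical classes; it is not the actual `ChangeFlowCommand` implementation, and all names in it are illustrative.

```python
class CommandWithoutHash:
    """Defines __eq__ only, so Python makes instances unhashable."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, other: object) -> bool:
        return isinstance(other, CommandWithoutHash) and self.name == other.name


class CommandWithHash:
    """Defines __eq__ and a matching __hash__, so instances work in sets."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, other: object) -> bool:
        return isinstance(other, CommandWithHash) and self.name == other.name

    def __hash__(self) -> int:
        # Hash the same fields that __eq__ compares.
        return hash(self.name)


def is_hashable(obj: object) -> bool:
    """Return True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True
```

Keeping `__hash__` consistent with `__eq__` (hashing exactly the compared fields) is what allows equal commands to be deduplicated in sets or used as dictionary keys.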
[3.9.16] - 2024-11-26
Rasa Pro 3.9.16 (2024-11-26)
Bugfixes
-
Replace `pickle` and `joblib` with safer alternatives, e.g. `json`, `safetensors`, and `skops`, for serializing components.
Note: This is a model breaking change. Please retrain your model.
If you have a custom component that inherits from one of the components listed below and modified the `persist` or `load` method, make sure to update your code. Please contact us in case you encounter any problems.
Affected components:
CountVectorsFeaturizer
LexicalSyntacticFeaturizer
LogisticRegressionClassifier
SklearnIntentClassifier
DIETClassifier
CRFEntityExtractor
TrackerFeaturizer
TEDPolicy
UnexpectedIntentTEDPolicy
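For custom components that overrode `persist` or `load`, the general idea of swapping `pickle` for a data-only format can be sketched as follows. This is a simplified illustration: `VocabularyFeaturizer` is a hypothetical class, and the `persist`/`load` signatures here take a plain directory path, which is an assumption and not the signature used by Rasa's components.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


class VocabularyFeaturizer:
    """Hypothetical custom component; not a Rasa class."""

    FILE_NAME = "vocabulary.json"

    def __init__(self, vocabulary: Dict[str, int]) -> None:
        self.vocabulary = vocabulary

    def persist(self, directory: Path) -> None:
        # JSON stores plain data only, so loading it cannot execute code,
        # unlike unpickling an untrusted file.
        payload = json.dumps(self.vocabulary, sort_keys=True)
        (directory / self.FILE_NAME).write_text(payload, encoding="utf-8")

    @classmethod
    def load(cls, directory: Path) -> "VocabularyFeaturizer":
        payload = (directory / cls.FILE_NAME).read_text(encoding="utf-8")
        return cls(json.loads(payload))


def round_trip(vocabulary: Dict[str, int]) -> Dict[str, int]:
    """Persist and reload a vocabulary through a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        directory = Path(tmp)
        VocabularyFeaturizer(vocabulary).persist(directory)
        return VocabularyFeaturizer.load(directory).vocabulary
```

JSON only covers simple attributes; numeric arrays or fitted scikit-learn estimators would need the other formats named above (`safetensors`, `skops`).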
[3.9.15] - 2024-10-18
Rasa Pro 3.9.15 (2024-10-18)
Improvements
- Change default response of `utter_free_chitchat_response` from `"placeholder_this_utterance_needs_the_rephraser"` to `"Sorry, I'm not able to answer that right now."`.
Bugfixes
- Fix cleanup of `SetSlot` commands issued by the LLM-based command generator for slots that define a slot mapping other than the `from_llm` slot mapping. The command processor now correctly removes the `SetSlot` command in these scenarios and instead adds a `CannotHandleCommand`.
- Disallow using the command payload syntax to set slots that are not filled by the `collect` steps of any active or startable flow.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.9.14] - 2024-10-02
Rasa Pro 3.9.14 (2024-10-02)
No significant changes.
[3.9.13] - 2024-10-01
Rasa Pro 3.9.13 (2024-10-01)
Bugfixes
- Fix inconsistent recording of telemetry events for llm-based command generators.
- Added tracing explicitly to `GRPCCustomActionExecutor.run` in order to pass the tracing context to the action server.
- Fixes an issue where the `CountVectorsFeaturizer` and `LogisticRegressionClassifier` would throw an error during inference when no NLU training data is provided.
[3.9.12] - 2024-09-20
Rasa Pro 3.9.12 (2024-09-20)
Deprecations and Removals
- Dropped support for Python 3.8 ahead of Python 3.8 End of Life in October 2024. In Rasa Pro versions 3.10.0, 3.9.11 and 3.8.13, we needed to pin the TensorFlow library version to 2.13.0rc1 in order to remove critical vulnerabilities; this resulted in poor user experience when installing these versions of Rasa Pro with `uv pip`. Removing support for Python 3.8 will make it possible to upgrade to a more stable version of TensorFlow.
Improvements
- Update Keras and TensorFlow to version 2.14. This will eliminate the need to use the `--prerelease allow` flag when installing Rasa Pro using the `uv pip` tool.
Bugfixes
- Fix `AttributeError` with the instrumentation of the `run` method of the `CustomActionExecutor` class.
- Fixed UnexpecTEDIntentlessPolicy training errors that resulted from a change to batching behavior. Changed the batching behavior back to the original for all components. Made the changed batching behavior accessible in DietClassifier using `drop_small_last_batch: True`.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.9.11] - 2024-09-13
Rasa Pro 3.9.11 (2024-09-13)
Bugfixes
- Update Keras to 2.13.1 and Tensorflow to 2.13.0rc0 to fix critical vulnerability (CVE-2024-3660).
[3.9.10] - 2024-09-12
Rasa Pro 3.9.10 (2024-09-12)
Bugfixes
- Fix `FileNotFound` error when running `rasa studio` commands and no pre-existing local assistant project exists.
- Fixed telemetry collection for the components Rephraser, LLM Intent Classifier, Intentless Policy and Enterprise Search Policy to ensure that the telemetry data is only collected when it is enabled.
[3.9.9] - 2024-08-23
Rasa Pro 3.9.9 (2024-08-23)
Bugfixes
-
Updated behaviour of policies in coexistence:
- CALM policies run in case the routing slot is set to `True` (routing to CALM).
- Policies of the nlu-based system run in case the routing slot is set to `False` (routing to NLU-based system) or `None` (non-sticky routing).
-
Don't create an instance of `FlowRetrieval` in the command generators in case no flows exist.
-
Patterns do not count as active flows in `MultiStepLLMCommandGenerator` anymore.
-
Make sure that all e2e test cases in rasa inspector are valid.
-
Downloading of CALM Assistants from Studio improved:
- Downloading CALM assistants from Studio now includes `config` and `endpoints` files
- Downloading CALM assistants from Studio now doesn't require `config.yml` and `data` folder to exist
[3.9.8] - 2024-08-21
Rasa Pro 3.9.8 (2024-08-21)
Bugfixes
- Fix problem with custom action invocation when model is loaded from remote storage.
[3.9.7] - 2024-08-15
Rasa Pro 3.9.7 (2024-08-15)
Bugfixes
- Fix extraction of tracing context from the request headers and injection into the Rasa server tracing context.
- `YamlValidationException` will correctly return line number of the element where the error occurred when line number of that element is not returned by `ruamel.yaml` (for elements of primitive types, e.g. `str`, `int`, etc.), instead of returning the line number of the parent element.
- Updated `setuptools` to fix security vulnerability.
- Fix tracing context propagation to work for all external service calls.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.9.6] - 2024-08-07
Rasa Pro 3.9.6 (2024-08-07)
Miscellaneous internal changes
Miscellaneous internal changes.
[3.9.5] - 2024-08-01
Rasa Pro 3.9.5 (2024-08-01)
Improvements
- Enabled generative chitchat in the `tutorial` template with instructions on how to turn it off added to the documentation.
Bugfixes
- Replace the usage of `time.process_time_ns` with `time.perf_counter_ns` to fix the inconsistencies between duration metrics and trace spans duration.
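The difference between the two clocks can be seen with a short standard-library sketch: `time.process_time_ns` counts only CPU time used by the process and excludes time spent sleeping or waiting, whereas `time.perf_counter_ns` measures elapsed time including waits. The helper name `measure_sleep` is illustrative and not part of Rasa.

```python
import time
from typing import Tuple


def measure_sleep(seconds: float) -> Tuple[int, int]:
    """Return (elapsed_ns, cpu_ns) measured around a sleep call."""
    wall_start = time.perf_counter_ns()
    cpu_start = time.process_time_ns()

    # Stand-in for waiting on I/O, e.g. an LLM or action server call.
    time.sleep(seconds)

    wall_ns = time.perf_counter_ns() - wall_start
    cpu_ns = time.process_time_ns() - cpu_start
    return wall_ns, cpu_ns


wall_ns, cpu_ns = measure_sleep(0.05)
print(f"perf_counter_ns: {wall_ns} ns, process_time_ns: {cpu_ns} ns")
```

For operations dominated by waiting, a duration based on process time comes out far smaller than the elapsed time, which is consistent with the mismatch between duration metrics and span durations described in the entry above.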
[3.9.4] - 2024-07-25
Rasa Pro 3.9.4 (2024-07-25)
Bugfixes
- Fix instrumentation not accounting for `kwargs` that are passed to `NLUCommandAdapter.predict_commands`.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.9.3] - 2024-07-18
Rasa Pro 3.9.3 (2024-07-18)
Bugfixes
- Refactor the supported remote storage (AWS, GCS, Azure) verification check before downloading Rasa model by fixing the initial implementation which attempted to create the object storage to check existence.
- Fix `TypeError: InformationRetrieval.search() got an unexpected keyword argument` when tracing is enabled with `EnterpriseSearchPolicy`.
- Change `warning` log level to `error` log level for `Validator` methods that verify that forms and actions used in stories and rules are present in the domain.
[3.9.2] - 2024-07-09
Rasa Pro 3.9.2 (2024-07-09)
Bugfixes
- Add key-word arguments in the predict_commands method of LLM-based CommandGenerator class to ensure custom components are not impacted by changes to the signature of the base classes.
[3.9.1] - 2024-07-04
Rasa Pro 3.9.1 (2024-07-04)
Bugfixes
- Modify the validation to throw an error for a missing associated action/utterance in a collect step only if the slot does not have a defined initial value.
- Modify the collect step validation in flow executor to trigger `pattern_internal_error` for a missing associated action/utterance in a collect step only if the slot does not have a defined initial value.
[3.9.0] - 2024-07-03
Rasa Pro 3.9.0 (2024-07-03)
Features
-
Introduce a new response button payload format that runs set slot CALM commands directly by skipping the user message processing pipeline.
-
Added support for Information Retrieval custom components. It allows Enterprise Search Policy to be used with arbitrary search systems. Custom Information Retrievals can be implemented as a subclass of `rasa.core.information_retrieval.InformationRetrieval`.
-
Enable slot filling in a CALM assistant to be configurable:
- either use NLU-based predefined slot mappings that instruct `NLUCommandAdapter` to issue SetSlot commands with values extracted from the user input via an entity extractor or intent classifier
- or use the new predefined slot mapping `from_llm` which enables LLM-based command generators to issue SetSlot commands
If no slot mapping is defined, the default behavior is to use the `from_llm` slot mapping.
In case you had been using `custom` slot mapping type for slots set with the prediction of the LLM-based command generator, you need to update your assistant configuration to use the new `from_llm` slot mapping type. Note that even if you have written custom slot validation actions (following the `validate_<slot_name>` convention) for slots set by the LLM-based command generator, you need to update your assistant configuration to use the new `from_llm` slot mapping type.
For slots that are set only via the custom action e.g. slots set by external sources only, you need to add the action name to the slot mapping:
slots:
  slot_name:
    type: text
    mappings:
    - type: custom
      action: custom_action_name
-
Skip `SetSlot` commands issued by LLM based command generators for slots with NLU-based predefined slot mappings. Instead, the command processor component will issue a `CannotHandle` command to trigger `pattern_cannot_handle` if no other valid command is found.
-
Rasa now supports gRPC protocol for custom actions. This allows users to use gRPC to invoke custom actions. To connect to the action server using gRPC, specify the following in `endpoints.yml`:
action_endpoint:
  url: "grpc://<rasa-grpc-action-server>:<port>"
Users can use secure (TLS) and insecure connections to communicate over gRPC. To use TLS specify the following in `endpoints.yml`:
action_endpoint:
  url: "grpc://<rasa-grpc-action-server>:<port>"
  cafile: "<ca_file>"
-
Add `MultiStepLLMCommandGenerator` as an alternative LLM based command generator. `MultiStepLLMCommandGenerator` breaks down the task of dialogue understanding into two steps: handling the flows and filling the slots. The component was designed to enable cheaper and smaller LLMs, such as `gpt-3.5-turbo`, as viable alternatives to costlier but more powerful models such as `gpt-4`. To use the `MultiStepLLMCommandGenerator` add it to your pipeline:
pipeline:
...
- name: MultiStepLLMCommandGenerator
...
Improvements
-
Improve diagram display in the inspector by adding a horizontal scroll and an auto scroll to the active step.
-
Create a separate default prompt for Enterprise Search with source citation enabled and revert the default Enterprise Search prompt to that of `3.7.x`.
-
Refactored `RemoteAction` to utilize a new `CustomActionExecutor` interface by implementing `HTTPCustomActionExecutor` to handle HTTP requests for custom actions.
-
Implemented an optimization to reduce payload size by ensuring the Assistant sends the domain dictionary to the Action Server only once, which the server then stores. If the Action Server responds with a 449 status code indicating a missing domain context, the Assistant will repeat the API request including the domain dictionary in the payload, ensuring the server properly saves this data.
-
Integrate the capability of testing scenarios that reflect actual operational environments where conversations can be influenced by real-time external data. This is done by injecting metadata when running end-to-end tests.
-
Introduced LRU caching for reading and parsing YAML files to enhance performance by avoiding multiple reads of the same file. Added `READ_YAML_FILE_CACHE_MAXSIZE` environment variable with a default value of 256 to configure the cache size.
-
Add validations for flow ID to allow only alphanumeric characters, underscores, and hyphens except for the first character.
-
The `LLMCommandGenerator` component has been renamed to `SingleStepLLMCommandGenerator`. There is no change to the functionality.
Using `LLMCommandGenerator` as the name of the component results in a deprecation warning as it will be permanently renamed to `SingleStepLLMCommandGenerator` in 4.0.0. Please modify the assistant's configuration to use `SingleStepLLMCommandGenerator` instead of `LLMCommandGenerator` to avoid seeing the deprecation warning.
-
Make improvements to `rasa data validate` that check if the usage of slot mappings in a CALM assistant is valid:
- a slot cannot have both a `from_llm` mapping and either a nlu-predefined mapping or a custom slot mapping
- a slot collected in a flow by a custom action has an associated `action_ask_` defined in the domain
- a CALM assistant with slots that have nlu-based predefined mappings include `NLUCommandAdapter` in the config pipeline
- a NLU-based assistant cannot have slots that have a `from_llm` mapping
-
Modify post processing of commands - Clarify command with single option is converted into a StartFlow command.
-
Improve debug logging for predicate evaluation.
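The YAML read cache listed among the improvements above can be approximated with `functools.lru_cache`. The sketch below is an assumption-laden illustration, not Rasa's implementation: the function `read_file_cached` is hypothetical, and parsing is replaced by a plain text read so that the example needs only the standard library. Only the environment variable name and its default of 256 come from the entry above.

```python
import os
import tempfile
from functools import lru_cache
from pathlib import Path

# Cache size read from the documented environment variable (default 256).
CACHE_MAXSIZE = int(os.environ.get("READ_YAML_FILE_CACHE_MAXSIZE", "256"))


@lru_cache(maxsize=CACHE_MAXSIZE)
def read_file_cached(path: str) -> str:
    """Read a file once per path; repeated calls are served from the cache.

    A real implementation would parse the YAML here; this stand-in only
    returns the raw text.
    """
    return Path(path).read_text(encoding="utf-8")


def demo() -> tuple:
    """Read the same temporary file twice and report cache hits and misses."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "config.yml"
        path.write_text("recipe: default.v1\n", encoding="utf-8")
        read_file_cached.cache_clear()
        read_file_cached(str(path))
        read_file_cached(str(path))
        info = read_file_cached.cache_info()
        return info.hits, info.misses
```

One design consequence worth noting: a cache keyed only by path keeps returning the first content even if the file changes on disk, so callers that edit files during a run would need to clear the cache.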
Bugfixes
- Properly handle projects where `rasa studio download` is run in a project with no NLU data.
- Tracing is supported for actions called over gRPC protocol.
- Fix the hash function of ClarifyCommand to return a hashed list of options.
- Raise an error if action_reset_routing is used without the defined ROUTE_TO_CALM_SLOT / router.
- Add a few bugfixes to the CALM slot mappings feature:
  - Coexistence bot should ignore `NoOpCommand` when checking if the processed message contains commands.
  - Update condition under which FlowPolicy triggers `pattern_internal_error` for slots with custom slot mappings.
- Remove invalid warnings during collect step.
-
- Fixed issue where messages with invalid intent triggers ('/<intent>') were not handled correctly. These messages now trigger `pattern_cannot_handle`.
- Introduced a new reason `cannot_handle_invalid_intent` for use in the `pattern_cannot_handle` switch mechanism to improve error handling.
- Validates that a collect step in a flow either has an action or an utterance defined in the domain to avoid the bot being silent.
- Slots that are set via response buttons should not trigger `pattern_cannot_handle` regardless of the slots' mapping type.
- Coerce "None", "null" or "undefined" slot values set via response buttons to be of type `NoneType` instead of `str`.
- Avoid raising a `UserWarning` during validation of response buttons which contain double curly braces.
- Do not run NLUCommandAdapter during message parsing when receiving a `/SetSlots` button payload. This is because the NLUCommandAdapter run during message parsing (when the graph is skipped) is meant to handle intent button payloads only.
- Exclude slots that are not collected in any flow from being set by the NLUCommandAdapter in a coexistence assistant.
- Default action `action_extract_slots` should not run custom actions specified in custom slot mappings for slots that are set by custom actions in the flows/CALM system of a coexistence assistant.
- Fix pattern flows being unavailable during input preparation and template rendering in `MultiStepLLMCommandGenerator`.
- Skip command cleaning when no commands are present in NLUCommandAdapter. Fix get active flows to return the correct active flows, including all the nested parent flows if present.
- If FlowPolicy tries to collect a slot with a custom slot mapping without the `action` key or `action_ask` specified in the domain, it will trigger `pattern_cancel_flow` first, then `pattern_internal_error`.
- Cancel user flow in progress and invoke `pattern_internal_error` if the flow reached a collect step which does not have an associated utter_ask response or action_ask action defined in the domain.
- IntentlessPolicy abstains from making a prediction during coexistence when it's the turn of the NLU-based system.
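The coercion of "None", "null" or "undefined" button values listed among the bugfixes above can be sketched as a small helper. The function name `coerce_slot_value` and the exact-match comparison are assumptions for illustration; the entry does not say whether Rasa matches these strings case-insensitively.

```python
from typing import Any

# The three string values named in the bugfix entry above.
NULL_LIKE_VALUES = {"None", "null", "undefined"}


def coerce_slot_value(value: Any) -> Any:
    """Map null-like strings to None and leave every other value unchanged.

    Hypothetical helper; exact string matching is an assumption.
    """
    if isinstance(value, str) and value in NULL_LIKE_VALUES:
        return None
    return value
```

Returning `None` rather than the string `"None"` matters for flow conditions, since a non-empty string would otherwise be treated as a filled slot.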
Miscellaneous internal changes
Miscellaneous internal changes.
[3.8.17] - 2024-10-18
Rasa Pro 3.8.17 (2024-10-18)
Improvements
- Change default response of `utter_free_chitchat_response` from `"placeholder_this_utterance_needs_the_rephraser"` to `"Sorry, I'm not able to answer that right now."`.
[3.8.16] - 2024-10-02
Rasa Pro 3.8.16 (2024-10-02)
No significant changes.
[3.8.15] - 2024-10-01
Rasa Pro 3.8.15 (2024-10-01)
Bugfixes
- Fixes an issue where the `CountVectorsFeaturizer` and `LogisticRegressionClassifier` would throw an error during inference when no NLU training data is provided.
[3.8.14] - 2024-09-20
Rasa Pro 3.8.14 (2024-09-20)
Deprecations and Removals
- Dropped support for Python 3.8 ahead of Python 3.8 End of Life in October 2024. In Rasa Pro versions 3.10.0, 3.9.11 and 3.8.13, we needed to pin the TensorFlow library version to 2.13.0rc1 in order to remove critical vulnerabilities; this resulted in poor user experience when installing these versions of Rasa Pro with `uv pip`. Removing support for Python 3.8 will make it possible to upgrade to a more stable version of TensorFlow.
Improvements
- Update Keras and TensorFlow to version 2.14. This will eliminate the need to use the `--prerelease allow` flag when installing Rasa Pro using the `uv pip` tool.
Bugfixes
- Fixed UnexpecTEDIntentlessPolicy training errors that resulted from a change to batching behavior. Changed the batching behavior back to the original for all components. Made the changed batching behavior accessible in DietClassifier using `drop_small_last_batch: True`.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.8.13] - 2024-09-12
Rasa Pro 3.8.13 (2024-09-12)
Bugfixes
- Fixed telemetry collection for the components Rephraser, LLM Intent Classifier, Intentless Policy and Enterprise Search Policy to ensure that the telemetry data is only collected when it is enabled.
- Update Keras to 2.13.1 and Tensorflow to 2.13.0rc0 to fix critical vulnerability (CVE-2024-3660).
[3.8.12] - 2024-08-12
Rasa Pro 3.8.12 (2024-08-12)
Bugfixes
- Fix `TypeError: InformationRetrieval.search() got an unexpected keyword argument` when tracing is enabled with `EnterpriseSearchPolicy`.
- Fix extraction of tracing context from the request headers and injection into the Rasa server tracing context.
- Replace the usage of `time.process_time_ns` with `time.perf_counter_ns` to fix the inconsistencies between duration metrics and trace spans duration.
- `YamlValidationException` will correctly return line number of the element where the error occurred when line number of that element is not returned by `ruamel.yaml` (for elements of primitive types, e.g. `str`, `int`, etc.), instead of returning the line number of the parent element.
- Updated `setuptools` to fix security vulnerability.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.8.11] - 2024-07-04
Rasa Pro 3.8.11 (2024-07-04)
Improvements
- Improve debug logging for predicate evaluation.
Bugfixes
- Raise an error if action_reset_routing is used without the defined ROUTE_TO_CALM_SLOT / router.
- Remove invalid warnings during collect step.
-
- Fixed issue where messages with invalid intent triggers ("/intent_name") were not handled correctly. These messages now trigger `pattern_cannot_handle`.
- Introduced a new reason `cannot_handle_invalid_intent` for use in the `pattern_cannot_handle` switch mechanism to improve error handling.
- Validates that a collect step in a flow either has an action or an utterance defined in the domain to avoid the bot being silent.
- Skip command cleaning when no commands are present in NLUCommandAdapter. Fix get active flows to return the correct active flows, including all the nested parent flows if present.
- Update the handling of incorrect use of slash syntax. Messages with undefined intents do not automatically trigger `pattern_cannot_handle`; instead, they are sanitized (prepended slash(es) are removed) and passed through the graph.
- Modify the validation to throw an error for a missing associated action/utterance in a collect step only if the slot does not have a defined initial value.
[3.8.10] - 2024-06-19
Rasa Pro 3.8.10 (2024-06-19)
Improvements
- Added NLG validation to the rasa model training process.
Bugfixes
- Fixes Clarify command being dropped by command processor due to presence of coexistence slot `route_session_to_calm`.
- Fix validation for LLMBasedRouter to check only for `calm_entry.sticky`.
[3.8.9] - 2024-06-14
Rasa Pro 3.8.9 (2024-06-14)
Improvements
- Add validations for flow ID to allow only alphanumeric characters, underscores, and hyphens except for the first character.
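The flow ID rule above can be expressed as a regular expression. The sketch below assumes one particular reading of the entry, namely that IDs may contain alphanumeric characters, underscores and hyphens, but may not start with a hyphen. Both the pattern and the helper name `is_valid_flow_id` are illustrative and may differ from the pattern Rasa actually uses.

```python
import re

# Assumed reading of the rule: first character is alphanumeric or an
# underscore; later characters may also be hyphens.
FLOW_ID_PATTERN = re.compile(r"[a-zA-Z0-9_][a-zA-Z0-9_-]*")


def is_valid_flow_id(flow_id: str) -> bool:
    """Return True if the whole ID matches the assumed pattern."""
    return FLOW_ID_PATTERN.fullmatch(flow_id) is not None


for candidate in ["transfer_money", "transfer-money", "-transfer", "transfer money"]:
    print(candidate, is_valid_flow_id(candidate))
```

Using `fullmatch` rather than `match` ensures that trailing characters such as spaces or newlines are rejected as well.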
[3.8.8] - 2024-06-07
Rasa Pro 3.8.8 (2024-06-07)
Bugfixes
- Add wrappers around openai clients that can set the self-signed certs via the `REQUESTS_CA_BUNDLE` env variable.
[3.8.7] - 2024-05-29
Rasa Pro 3.8.7 (2024-05-29)
Bugfixes
- Add support for domain entities in CALM import
- Download NLU data when running `rasa studio download` for a modern assistant with NLU triggers. Previously, this data was not downloaded, leading to a partial assistant.
[3.8.6] - 2024-05-27
Rasa Pro 3.8.6 (2024-05-27)
Improvements
- Adds `tracker_state` attribute to `OutputChannel`. It simplifies the access of tracker state for custom channel connector with `CollectingOutputChannel.tracker_state`.
Bugfixes
- If a button in a response does not have a payload, socketio channel will use the title as payload by default rather than throwing an exception.
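The fallback described above, using the button title when no payload is given, amounts to a default lookup. The following is a minimal sketch with a hypothetical helper name, `button_payload`; it assumes buttons are plain dictionaries with `title` and optional `payload` keys and is not the socketio channel's actual code.

```python
from typing import Any, Dict


def button_payload(button: Dict[str, Any]) -> str:
    """Return the button payload, falling back to the title if it is missing.

    Treating an empty or None payload like a missing one is an assumption.
    """
    return button.get("payload") or button["title"]


buttons = [
    {"title": "Yes", "payload": "/affirm"},
    {"title": "No"},  # no payload: the title is sent instead
]
payloads = [button_payload(button) for button in buttons]
```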
[3.8.5] - 2024-05-03
Rasa Pro 3.8.5 (2024-05-03)
Bugfixes
- Trigger `pattern_internal_error` if collection does not exist in a Qdrant vector store.
[3.8.4] - 2024-04-30
Rasa Pro 3.8.4 (2024-04-30)
Improvements
- Added support for NLU Triggers by supporting uploading the NLU files for CALM Assistants
[3.8.3] - 2024-04-26
Rasa Pro 3.8.3 (2024-04-26)
Improvements
-
- Throw validation error and exit when duplicate responses are found across domains. This is a breaking change, as it will cause training to fail if duplicate responses are found. If you have duplicate responses in your training data, you will need to remove them before training.
- Update domain importing to ignore the warnings about duplicates when merging with the default flow domain
Bugfixes
- Use AzureChatOpenAI class instead of AzureOpenAI class to instantiate openai models deployed in Azure. This fixes the usage of gpt-3.5-turbo model in Azure.
- Fixes validation to catch empty placeholders in response that dumps entire context.
- Fix security vulnerabilities by updating poetry environment:
  - fonttools, CVE-2023-45139, from 4.40.0 to 4.43.0
  - aiohttp, CVE-2024-27306, from 3.9.3 to 3.9.4
  - dnspython, CVE-2023-29483, from 2.3.0 to 2.6.1
  - pymongo, CVE-2024-21506, from 4.3.3 to 4.6.3
- Numbers that are part of the body of the LLM answer in EnterpriseSearch should not be matched as citation references in the postprocessing method.
- Errors from the Flow Retrieval API are now both logged and thrown. When such errors occur, an ErrorCommand is emitted by the Command Generator.
[3.8.2] - 2024-04-25
Rasa Pro 3.8.2 (2024-04-25)
Bugfixes
- Add the currently active flow as well as the called flow (if present) to the list of available flows for the `LLMCommandGenerator`.
- Fix custom prompt not read from the model resource path for LLMCommandGenerator.
[3.8.1] - 2024-04-17
Rasa Pro 3.8.1 (2024-04-17)
Improvements
- Adjusted chat widget behavior to remain open when clicking outside the chat box area.
- Improve debug logs to include information about evaluation of `if-else` conditions in flows at runtime.
- Remove the `ContextualResponseRephraser` from the tutorial template to keep it simple as it is not needed anymore.
- Update poetry package manager version to `1.8.2`. Check the migration guide for instructions on how to update your environment.
Bugfixes
- Introduced support for numbered Markdown lists.
- Added support for uploading assistants with default domain directory.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.8.0] - 2024-04-03
Rasa Pro 3.8.0 (2024-04-03)
Features
-
Introduces semantic retrieval of flows at runtime to reduce the size of the prompt sent to the LLM by utilizing similarity between vector embeddings. It enables the assistant to scale to a large number of flows.
Flow retrieval is enabled by default. To configure it, you can modify the settings under the `flow_retrieval` property of the `LLMCommandGenerator` component. For detailed configuration options, refer to our documentation.
Introduces the `always_include_in_prompt` field to the flow definition. If the field is set to `true` and the flow guard defined in the `if` field evaluates to `true`, the flow will be included in the prompt.
-
Introduction of coexistence between CALM and NLU-based assistants. Coexistence allows you to use policies from both CALM and NLU-based assistants in a single assistant. This allows migrating from NLU-based paradigm to CALM in an iterative fashion.
-
Introduction of the `call` step. You can use a `call` step to embed another flow. When the execution reaches a `call` step, Rasa starts the called flow. Once the called flow is complete, the execution continues with the calling flow.
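The control flow of a `call` step can be pictured with a small stack-based toy model. This is only a sketch of the semantics described above, not Rasa's flow executor: the `run_flow` function, the tuple encoding of a call step, and the flow and action names are all hypothetical.

```python
from typing import Dict, List, Tuple, Union

# Toy encoding (an assumption for this sketch): a step is either an action
# name or a ("call", flow_id) pair.
Step = Union[str, Tuple[str, str]]


def run_flow(flow_id: str, flows: Dict[str, List[Step]]) -> List[str]:
    """Return the actions executed when running flow_id, following call steps."""
    executed: List[str] = []
    # Each stack frame is (flow id, index of the next step to run).
    stack = [(flow_id, 0)]
    while stack:
        current, index = stack.pop()
        steps = flows[current]
        if index >= len(steps):
            # Flow complete: control returns to the calling flow's frame.
            continue
        step = steps[index]
        # Remember where to resume in the current flow.
        stack.append((current, index + 1))
        if isinstance(step, tuple) and step[0] == "call":
            # Start the called flow; the caller resumes once it finishes.
            stack.append((step[1], 0))
        else:
            executed.append(step)
    return executed


# Hypothetical flows: transfer_money embeds authenticate via a call step.
flows: Dict[str, List[Step]] = {
    "transfer_money": ["ask_recipient", ("call", "authenticate"), "confirm_transfer"],
    "authenticate": ["ask_pin", "verify_pin"],
}
print(run_flow("transfer_money", flows))
```

The steps of the called flow run in between the caller's steps, and the caller picks up at the step right after the call.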
Improvements
-
Instrument the `command_processor` module, in particular the following functions:
  - `execute_commands`
  - `clean_up_commands`
  - `validate_state_of_commands`
  - `remove_duplicated_set_slots`
-
Improve the instrumentation of `LLMCommandGenerator`:
  - extract more LLM configuration parameters, e.g. `type`, `temperature`, `request-timeout`, `engine` and `deployment` (the latter 2 being only for the Azure OpenAI service).
  - instrument the private method `_check_commands_against_startable_flows` to track the commands with which the LLM responded, as well as the startable flow ids.
-
Instrument the `flow_executor.py` module, in particular these functions:
  - `advance_flows()`: extract the `available_actions` tracing tag
  - `advance_flows_until_next_action()`: extract action name and score, metadata and prediction events as tracing tags from the returned prediction value
  - `run_step()`: extract step custom id, description and current flow id.
-
Instrument the `Policy._prediction()` method for each of the policy subclasses.
-
Instrument `IntentlessPolicy` methods such as:
  - `find_closest_response`: extract the `response` and `score` from the returned tuple;
  - `select_response_examples`: extract the `ai_response_examples` from returned value;
  - `select_few_shot_conversations`: extract the `conversation_samples` from returned value;
  - `extract_ai_responses`: extract the `ai_responses` from returned value;
  - `generate_answer`: extract the `llm_response` from returned value.
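As a rough picture of what "instrumenting a method and extracting attributes from its returned value" means, here is a toy decorator. It does not use Rasa's tracing code or any tracing library: the `traced` decorator, the `RECORDED_SPANS` list, and the stand-in `find_closest_response` function (which only borrows the method name from the list above) are assumptions made for illustration.

```python
import functools
from typing import Any, Callable, Dict, List

# Stand-in for a tracing backend: each entry plays the role of one span.
RECORDED_SPANS: List[Dict[str, Any]] = []


def traced(extract: Callable[[Any], Dict[str, Any]]) -> Callable:
    """Wrap a function and record attributes derived from its return value."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            result = func(*args, **kwargs)
            # Record the span name together with the extracted attributes.
            RECORDED_SPANS.append(
                {"name": func.__name__, "attributes": extract(result)}
            )
            return result

        return wrapper

    return decorator


@traced(lambda result: {"response": result[0], "score": result[1]})
def find_closest_response(query: str) -> tuple:
    # Hypothetical stand-in: always returns the same (response, score) tuple.
    return ("utter_greet", 0.92)


find_closest_response("hello")
print(RECORDED_SPANS)
```

The wrapped function keeps its behaviour and return value; the attributes are collected as a side effect.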
-
  - Instrument the `InformationRetrieval.search` method for supported vector stores: extract query and document metadata tracing attributes.
  - Instrument the `EnterpriseSearchPolicy._generate_llm_answer` method: extract LLM config tracing attributes.
  - Extract dialogue stack current context in the following functions:
    - `rasa.dialogue_understanding.processor.command_processor.clean_up_commands`
    - `rasa.core.policies.flows.flow_executor.advance_flows`
    - `rasa.core.policies.flows.flow_executor.run_step`
-
  - Instrument the `NLUCommandAdapter.predict_commands` method and extract the `commands` from the returned value, as well as the user message `intent`.
  - Improve LLM config tracing attribute extraction for `ContextualResponseRephraser`.
-
Add new config boolean property `trace_prompt_tokens` that would enable the tracing of the length of the prompt tokens for the following components:
  - `LLMCommandGenerator`
  - `EnterpriseSearchPolicy`
  - `IntentlessPolicy`
  - `ContextualResponseRephraser`
-
Enable execution of single E2E tests by including the test case name in the path to test cases, like so: `path/to/test_cases.yml::test_case_name` or `path/to/folder_containing_test_cases::test_case_name`.
-
Implement the `MetricInstrumentProvider` interface whose role is to:
  - register instruments during metrics configuration
  - retrieve the appropriate instrument to record measurements in the relevant instrumentation code section
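The two roles listed above (register once, look up when recording) can be sketched as follows. This is not the actual `MetricInstrumentProvider` interface: the class names, method names, and metric name below are hypothetical and chosen only to illustrate the pattern.

```python
from typing import Dict, List


class Histogram:
    """Minimal stand-in for a metrics instrument that records measurements."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.values: List[float] = []

    def record(self, value: float) -> None:
        self.values.append(value)


class ToyInstrumentProvider:
    """Hypothetical provider: registers instruments once, hands them out by name."""

    def __init__(self) -> None:
        self._instruments: Dict[str, Histogram] = {}

    def register_instruments(self, names: List[str]) -> None:
        # Role 1: called during metrics configuration.
        for name in names:
            if name not in self._instruments:
                self._instruments[name] = Histogram(name)

    def get_instrument(self, name: str) -> Histogram:
        # Role 2: called from instrumentation code that records a measurement.
        return self._instruments[name]


provider = ToyInstrumentProvider()
provider.register_instruments(["llm_command_generator_cpu_usage"])
provider.get_instrument("llm_command_generator_cpu_usage").record(12.5)
```

Keeping registration and lookup behind one object means the instrumentation code only needs the instrument's name, not the details of how metrics were configured.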
-
Enabled the setting of a minimum similarity score threshold for retrieved documents in Enterprise Search's `vector_store` with the addition of the `threshold` property. If no documents are retrieved, it triggers Pattern Cannot Handle. This feature is supported in Milvus and Qdrant vector stores.
-
Record measurements for the following metrics in the instrumentation code:
  - CPU usage of the `LLMCommandGenerator`
  - memory usage of `LLMCommandGenerator`
  - prompt token usage of `LLMCommandGenerator`
  - method call duration for LLM specific calls (in `LLMCommandGenerator`, `EnterpriseSearchPolicy`, `IntentlessPolicy`, `ContextualResponseRephraser`)
  - rasa client request duration
  - rasa client request body size
Instrument the `EndpointConfig.request()` method call in order to measure the client request metrics.
-
Improvements around default behaviour of `ChitChatAnswerCommand()`:
  - The command processor will issue `CannotHandleCommand()` instead of the `ChitChatCommand()` when `pattern_chitchat` uses an action step `action_trigger_chitchat` without the `IntentlessPolicy` being configured. During training a warning is raised.
  - Changed the default pattern_chitchat to:

        pattern_chitchat:
          description: handle interactions with the user that are not task-oriented
          name: pattern chitchat
          steps:
            - action: action_trigger_chitchat

  - Default rasa init template for CALM comes with `IntentlessPolicy` added to pipeline.
-
Add support for OTLP Collector as metrics receiver which can forward metrics to the chosen metrics backend, e.g. Prometheus.
-
Enable document source citation for Enterprise Search knowledge answers by setting the boolean `citation_enabled: true` property in the `config.yml` file:

    policies:
      - name: EnterpriseSearchPolicy
        citation_enabled: true

-
Add telemetry events for flow retrieval and call step
-
Tighten python dependency constraints in `pyproject.toml`, hence reducing the installation time to around 20 minutes with `pip` (and no caching enabled).
-
Improved tracing clarity of the Contextual Response Rephraser by adding the `_create_history` method span, including its LLM configuration attributes.
-
Users now have enhanced control over the debugging process of LLM-driven components. This update introduces a fine-grained, customizable logging that can be controlled through specific environment variables.
For example, set the `LOG_LEVEL_LLM` environment variable to enable detailed logging at the desired level for all the LLM components or specify the component you are debugging:
Example configuration

    export LOG_LEVEL_LLM=DEBUG
    export LOG_LEVEL_LLM_COMMAND_GENERATOR=INFO
    export LOG_LEVEL_LLM_ENTERPRISE_SEARCH=INFO
    export LOG_LEVEL_LLM_INTENTLESS_POLICY=DEBUG
    export LOG_LEVEL_LLM_REPHRASER=DEBUG

-
If the user wants to chat with the assistant at the end of `rasa init`, we are now calling `rasa inspect` instead of `rasa shell`.
-
A slot can now be collected via an action `action_ask_<slot-name>` instead of the utterance `utter_ask_<slot-name>` in a collect step. You can either define an utterance or an action for the collect step in your flow. Make sure to add your custom action `action_ask_<slot-name>` to the domain file.
-
Validate the configuration of the coexistence router before the actual training starts.
-
Improved error handling in Enterprise Search Policy, changed the prompt to improve formatting of documents and ensured empty slots are not added to the prompt.
-
Implement asynchronous graph execution. CALM assistants rely on a lot of I/O calls (e.g. to an LLM service), which impaired performance. With this change, we've improved the response time performance by 10x. All policies and components now support async calling.
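Why async calling helps an I/O-heavy workload can be seen in a generic `asyncio` sketch. It says nothing about how Rasa's graph runner is implemented and does not reproduce the 10x figure; `fake_llm_call` and the other helper names are hypothetical stand-ins for independent I/O-bound requests.

```python
import asyncio
import time
from typing import Any, Coroutine, List, Tuple


async def fake_llm_call(name: str, delay: float) -> str:
    # Stand-in for an I/O-bound request such as a call to an LLM service.
    await asyncio.sleep(delay)
    return f"{name}: done"


async def run_sequentially(names: List[str], delay: float) -> List[str]:
    # Each call waits for the previous one to finish.
    return [await fake_llm_call(name, delay) for name in names]


async def run_concurrently(names: List[str], delay: float) -> List[str]:
    # gather() overlaps the waiting time and preserves the input order.
    return list(await asyncio.gather(*(fake_llm_call(n, delay) for n in names)))


def timed(coro: Coroutine[Any, Any, List[str]]) -> Tuple[List[str], float]:
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    names = ["request_a", "request_b", "request_c"]
    _, sequential_time = timed(run_sequentially(names, 0.1))
    _, concurrent_time = timed(run_concurrently(names, 0.1))
    print(f"sequential: {sequential_time:.2f}s, concurrent: {concurrent_time:.2f}s")
```

Both variants return the same results; the concurrent one spends its waiting time in parallel, so the actual gain depends on how much of the workload is I/O.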
-
Merge `rasa` and `rasa-plus` packages into one. As a result, we renamed the Python package to `rasa-pro` and the Docker image to `rasa-pro`. Please head over to the migration guide here for installation, and here for the necessary configuration updates.