Assertions Fundamentals
Learn about the fundamentals of end-to-end testing with assertions in Rasa Pro.
Assertion Types
In software engineering, assertions are checks that verify that a system or unit behaves as expected. This section describes each assertion type that you can use in your Rasa Pro end-to-end tests.
Flow Started Assertion
The flow_started assertion checks if the flow with the provided id was started.
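As a minimal sketch, a test step using this assertion might look like the following. The test case name, the user message, and the flow id book_flight are hypothetical placeholders, and the surrounding test_cases and steps layout is assumed from the usual end-to-end test file structure.

```yaml
test_cases:
  - test_case: user_starts_flight_booking  # hypothetical test case name
    steps:
      - user: "I want to book a flight"
        assertions:
          # book_flight is a hypothetical flow id
          - flow_started: "book_flight"
```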
Flow Completed Assertion
The flow_completed assertion checks if the flow with the provided id was completed.
Additionally, you can specify the expected final step id in the assertion.
To use this property, the flow step must have a custom id set in the flow definition.
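A possible shape for this assertion is sketched below. The flow id book_flight and the step id confirm_booking are hypothetical, and the step id only works if it is set explicitly in the flow definition.

```yaml
test_cases:
  - test_case: user_completes_flight_booking  # hypothetical test case name
    steps:
      - user: "Yes, please confirm the booking"
        assertions:
          - flow_completed:
              flow_id: "book_flight"           # hypothetical flow id
              flow_step_id: "confirm_booking"  # optional; hypothetical custom step id
```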
Flow Cancelled Assertion
The flow_cancelled assertion checks if the flow with the provided id was cancelled.
Additionally, you can specify the expected final step id in the assertion.
To use this property, the flow step must have a custom id set in the flow definition.
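The sketch below shows how a cancellation might be asserted. The flow id book_flight and the step id ask_destination are hypothetical names chosen for illustration.

```yaml
test_cases:
  - test_case: user_cancels_flight_booking  # hypothetical test case name
    steps:
      - user: "Actually, forget it"
        assertions:
          - flow_cancelled:
              flow_id: "book_flight"           # hypothetical flow id
              flow_step_id: "ask_destination"  # optional; hypothetical custom step id
```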
Pattern Clarification Contains Assertion
The pattern_clarification_contains assertion checks if the clarification repair pattern was triggered and returned the expected flow names. This assertion must list all flow names that are expected to be returned as part of the clarification repair pattern.
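For illustration, an ambiguous user message that could match two flows might be tested as follows. The flow names "list bookings" and "book flight" are hypothetical, and the list is assumed to name every flow the clarification is expected to offer.

```yaml
test_cases:
  - test_case: ambiguous_booking_request  # hypothetical test case name
    steps:
      - user: "booking"
        assertions:
          # list every flow name expected in the clarification (hypothetical names)
          - pattern_clarification_contains:
              - "list bookings"
              - "book flight"
```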
Slot Was Set Assertion
The slot_was_set assertion checks if the slot(s) with the provided name were filled with the provided value.
The value type should match the slot type defined in the domain file.
For example, boolean, float or integer values should not be quoted.
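The quoting rule above can be sketched as follows. The slot names destination, passenger_count, and is_round_trip are hypothetical and are assumed to be defined in the domain as text, float, and boolean slots respectively.

```yaml
test_cases:
  - test_case: slots_filled_from_one_message  # hypothetical test case name
    steps:
      - user: "Book a round trip to New York for 2 people"
        assertions:
          - slot_was_set:
              - name: "destination"      # text slot: quoted string
                value: "New York"
              - name: "passenger_count"  # numeric slot: unquoted number
                value: 2
              - name: "is_round_trip"    # boolean slot: unquoted boolean
                value: true
```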
Slot Was Not Set Assertion
The slot_was_not_set assertion checks if the slot with the provided name was not filled.
If a value is provided, the assertion checks if the slot was not filled with that specific value.
Note that when only name is provided, the assertion checks that the slot was not filled with any value other than None, on the assumption that None is the default initial value for most slots.
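Both forms of the assertion are sketched below, using the hypothetical slots origin and destination. The first entry expects the slot to remain unset, while the second only rules out one specific value.

```yaml
test_cases:
  - test_case: slots_not_filled  # hypothetical test case name
    steps:
      - user: "I want to fly to San Francisco"
        assertions:
          - slot_was_not_set:
              # name only: origin is expected to hold no value other than None
              - name: "origin"
              # name and value: destination must not be "New York"
              - name: "destination"
                value: "New York"
```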
Action Executed Assertion
The action_executed assertion checks if the action with the provided name was executed.
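A minimal sketch of this assertion, assuming a hypothetical custom action named action_book_flight:

```yaml
test_cases:
  - test_case: booking_action_runs  # hypothetical test case name
    steps:
      - user: "Yes, book it"
        assertions:
          # action_book_flight is a hypothetical custom action name
          - action_executed: "action_book_flight"
```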
Bot Uttered Assertion
The bot_uttered assertion checks if the bot utterance matches the provided pattern, buttons and/or domain response name.
The text_matches key is used to check if the bot utterance matches the provided pattern, which can be a string or a regular expression.
Note that when asserting buttons, they must be listed in the same order as they were defined in the domain file or custom action code.
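The three checks can be combined in one assertion, as in the sketch below. The response name utter_ask_destination, the text, and the button titles and payloads are hypothetical, and the buttons are assumed to be defined in this order in the domain.

```yaml
test_cases:
  - test_case: bot_asks_for_destination  # hypothetical test case name
    steps:
      - user: "I want to book a flight"
        assertions:
          - bot_uttered:
              utter_name: utter_ask_destination  # hypothetical domain response
              text_matches: "Where would you like to fly to?"
              buttons:  # same order as in the domain or custom action
                - title: "New York"
                  payload: "/SetSlots(destination=New York)"
                - title: "San Francisco"
                  payload: "/SetSlots(destination=San Francisco)"
```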
Bot Did Not Utter Assertion
The bot_did_not_utter assertion checks that the bot's utterance does not match the provided pattern, buttons, and/or domain response name.
The text_matches key is used to verify that the bot's utterance does not contain the provided pattern, which can be a string or a regular expression.
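As a sketch, the negative check might be written like this. The response name utter_booking_failed and the pattern are hypothetical examples.

```yaml
test_cases:
  - test_case: bot_does_not_report_failure  # hypothetical test case name
    steps:
      - user: "Yes, book it"
        assertions:
          - bot_did_not_utter:
              utter_name: utter_booking_failed  # hypothetical domain response
              text_matches: "could not complete your booking"
```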
Generative Response Is Relevant Assertion
The generative_response_is_relevant assertion checks if the generative response is relevant to the provided user input.
Setting a threshold value between 0 and 1 is required to determine the relevance of the generative response.
The LLM Judge model will score the generative response output on a scale from 1 to 5, where 1 is the least relevant and 5 is the most relevant. The score is then mapped to a float value between 0 and 1 that can be compared to the threshold. The mapping is as follows: 1 -> 0.2, 2 -> 0.4, 3 -> 0.6, 4 -> 0.8, 5 -> 1.0.
You can also provide the domain response name in the utter_name key to check whether the rephrased response is relevant to the provided user input.
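To make the threshold concrete, here is a hedged sketch. The response name utter_free_chitchat_response is hypothetical. Assuming the assertion passes when the mapped score meets or exceeds the threshold, a threshold of 0.7 would require a judge score of at least 4, since 4 maps to 0.8 and 3 maps to only 0.6.

```yaml
test_cases:
  - test_case: chitchat_answer_is_relevant  # hypothetical test case name
    steps:
      - user: "How is the weather on Mars?"
        assertions:
          - generative_response_is_relevant:
              threshold: 0.7  # judge score 4 (0.8) or 5 (1.0) would clear this, under the assumption above
              utter_name: utter_free_chitchat_response  # optional; hypothetical rephrased response
```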
Generative Response Is Grounded Assertion
The generative_response_is_grounded assertion checks if the generative response is factually accurate in relation to the ground truth. Setting a threshold value between 0 and 1 is required to determine the factual accuracy of the generative response. The LLM Judge model will score the generative response output on a scale from 1 to 5, where 1 is the least factually accurate and 5 is the most factually accurate. The score is then mapped to a float value between 0 and 1 that can be compared to the threshold. The mapping is as follows: 1 -> 0.2, 2 -> 0.4, 3 -> 0.6, 4 -> 0.8, 5 -> 1.0.
The ground truth can either be provided directly in the test or be extracted by the test runner from the metadata of the actual bot utterance event, which stores the vector store search results (in the case of Enterprise Search) or the initial domain response (in the case of rephrased answers).
You can also provide the domain response name in the utter_name key, which is used to filter for the correct bot utterance event.
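The sketch below provides the ground truth directly in the test. The ground_truth text is a made-up example; omitting that key would leave the test runner to extract the ground truth from the bot utterance event metadata, as described above.

```yaml
test_cases:
  - test_case: baggage_answer_is_grounded  # hypothetical test case name
    steps:
      - user: "What is the checked baggage allowance?"
        assertions:
          - generative_response_is_grounded:
              threshold: 0.9  # compared against the mapped judge score (0.2 to 1.0)
              # hypothetical ground truth supplied directly in the test
              ground_truth: "Each passenger may check one bag of up to 23 kg."
```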