Assertions Fundamentals

Learn about the fundamentals of end-to-end testing with assertions in Rasa Pro.

Assertion Types

In software engineering, assertions are checks that verify that a system or unit behaves as expected. This section provides an example for each assertion type that you can use in your Rasa Pro end-to-end tests.

Flow Started Assertion

The flow_started assertion checks if the flow with the provided id was started.

test_cases:
  - test_case: flight_booking
    steps:
      - user: "I want to book a flight"
        assertions:
          - flow_started: "flight_booking"

Flow Completed Assertion

The flow_completed assertion checks if the flow with the provided id was completed. Optionally, you can also specify the expected final step id in the assertion. Note that to use this property, you must set a custom id for the flow step in the flow definition.

test_cases:
  - test_case: flight_booking
    steps:
      - user: "What is the average cost of a flight from New York to San Francisco?"
        assertions:
          - flow_completed:
              flow_id: "pattern_search"
              flow_step_id: "action_trigger_search"

Flow Cancelled Assertion

The flow_cancelled assertion checks if the flow with the provided id was cancelled. Optionally, you can also specify the expected final step id in the assertion. Note that to use this property, you must set a custom id for the flow step in the flow definition.

test_cases:
  - test_case: flight_booking
    steps:
      ... # other user steps
      - user: "Wait, I changed my mind, I don't want to book a flight."
        assertions:
          - flow_cancelled:
              flow_id: "flight_booking"
              flow_step_id: "make_payment"
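
The custom step id referenced by flow_step_id is set on the step itself in the flow definition. The following is a minimal sketch of where that id goes, not a complete flow: the description, the collect steps, and the action name action_make_payment are hypothetical and only serve to illustrate the make_payment step.

```yaml
flows:
  flight_booking:
    description: Book a flight for the user.  # hypothetical description
    steps:
      - collect: origin
      - collect: destination
      # Custom step id; this is the value that flow_step_id refers to.
      - id: make_payment
        action: action_make_payment  # hypothetical custom action
```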

Pattern Clarification Contains Assertion

The pattern_clarification_contains assertion checks if the clarification repair pattern was triggered and returned the expected flow names. This assertion must list all flow names that are expected to be returned as part of the clarification repair pattern.

test_cases:
  - test_case: flight_booking
    steps:
      - user: "make booking"
        assertions:
          - pattern_clarification_contains:
              - "flight booking"
              - "hotel booking"

Slot Was Set Assertion

The slot_was_set assertion checks if the slot(s) with the provided name were filled with the provided value. The value type should match the slot type defined in the domain file. For example, boolean, float or integer values should not be quoted.

test_cases:
  - test_case: flight_booking
    steps:
      - user: "I want to book a flight from New York to San Francisco"
        assertions:
          - slot_was_set:
              - name: "origin"
                value: "New York"
              - name: "destination"
                value: "San Francisco"
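
To illustrate the typing rule for non-text values, here is a sketch that asserts boolean and numeric slots. The test case name, the user message, and the slot names round_trip, passenger_count and max_price are hypothetical; the sketch assumes they are defined in the domain with matching boolean and numeric types.

```yaml
test_cases:
  - test_case: flight_booking_typed_slots  # hypothetical test case
    steps:
      - user: "I need a round trip for 2 passengers, at most 450.50 dollars"
        assertions:
          - slot_was_set:
              - name: "round_trip"       # hypothetical boolean slot
                value: true              # unquoted, so YAML reads a boolean
              - name: "passenger_count"  # hypothetical numeric slot
                value: 2                 # unquoted, so YAML reads an integer
              - name: "max_price"        # hypothetical float slot
                value: 450.5             # unquoted, so YAML reads a float
```

Quoting any of these values (for example "true" or "2") would make YAML read them as strings, which would no longer match the slot type.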

Slot Was Not Set Assertion

The slot_was_not_set assertion checks if the slot with the provided name was not filled. If the value is provided, the assertion checks if the slot was not filled with that specific value.

test_cases:
  - test_case: flight_booking
    steps:
      - user: "I want to book a flight to San Francisco."
        assertions:
          - slot_was_not_set:
              - name: "origin"
          - slot_was_not_set:
              - name: "destination"
                value: "New York"

Note that when only name is provided, the assertion checks that the slot was not filled with any value other than None, on the assumption that None is the default initial value for most slots.

Action Executed Assertion

The action_executed assertion checks if the action with the provided name was executed.

test_cases:
  - test_case: flight_booking
    steps:
      - user: "Book me a flight from New York to San Francisco tomorrow first thing in the morning."
        assertions:
          - action_executed: "action_book_flight"

Bot Uttered Assertion

The bot_uttered assertion checks if the bot utterance matches the provided pattern, buttons and/or domain response name. The text_matches key is used to check if the bot utterance matches the provided pattern, which can be a string or a regular expression.

test_cases:
  - test_case: flight_booking
    steps:
      - user: "I want to book a flight"
        assertions:
          - bot_uttered:
              utter_name: utter_ask_destination
              text_matches: "Where would you like to fly to?"
              buttons:
                - title: "New York"
                  payload: "/SetSlots(destination=New York)"
                - title: "San Francisco"
                  payload: "/SetSlots(destination=San Francisco)"

Note that when asserting buttons, they must be listed in the same order as they were defined in the domain file or custom action code.
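
Since text_matches also accepts a regular expression, a single assertion can tolerate small wording variations. The sketch below is illustrative only: the test case name and the alternative wording are hypothetical, and the source does not state whether the pattern must match the whole utterance or only part of it, so treat the exact matching behavior as an assumption to verify.

```yaml
test_cases:
  - test_case: flight_booking_regex  # hypothetical test case
    steps:
      - user: "I want to book a flight"
        assertions:
          - bot_uttered:
              # Single quotes keep the backslash literal in YAML;
              # \? matches a literal question mark in the utterance.
              text_matches: 'Where would you like to (fly|travel) to\?'
```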

Bot Did Not Utter Assertion

The bot_did_not_utter assertion checks that the bot's utterance does not match the provided pattern, buttons, and/or domain response name. The text_matches key is used to verify that the bot's utterance does not contain the provided pattern, which can be a string or a regular expression.

test_cases:
  - test_case: flight_booking
    steps:
      - user: "I want to book a flight"
        assertions:
          - bot_did_not_utter:
              utter_name: utter_ask_payment
              text_matches: "How would you like to pay?"
              buttons:
                - title: "Credit Card"
                  payload: "/set_payment_method{'method': 'credit_card'}"
                - title: "PayPal"
                  payload: "/set_payment_method{'method': 'paypal'}"

Generative Response Is Relevant Assertion

The generative_response_is_relevant assertion checks if the generative response is relevant to the provided user input. Setting a threshold value between 0 and 1 is required to determine the relevance of the generative response. The LLM Judge model will score the generative response output on a scale from 1 to 5 where 1 is the least relevant and 5 is the most relevant. The score is then mapped to a float value between 0 and 1 that can be compared to the threshold. The mapping is as follows: 1 -> 0.2, 2 -> 0.4, 3 -> 0.6, 4 -> 0.8, 5 -> 1.0.

test_cases:
  - test_case: flight_booking
    steps:
      - user: "What times are the flights from New York to San Francisco tomorrow?"
        assertions:
          - generative_response_is_relevant:
              threshold: 0.90
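
Because the mapped score can only take five values, the threshold in effect selects a minimum judge score. With a threshold of 0.90, only a score of 5 (mapped to 1.0) lies above the threshold, since a score of 4 maps to 0.8. The sketch below uses a looser, hypothetical threshold of 0.70 and assumes the assertion passes when the mapped score is above the threshold; the source does not say how a mapped score exactly equal to the threshold is treated, so the value is chosen to avoid that case.

```yaml
test_cases:
  - test_case: flight_times_relevance  # hypothetical test case
    steps:
      - user: "What times are the flights from New York to San Francisco tomorrow?"
        assertions:
          - generative_response_is_relevant:
              # Judge score -> mapped value, compared to the threshold:
              #   1 -> 0.2, 2 -> 0.4, 3 -> 0.6   below 0.70
              #   4 -> 0.8, 5 -> 1.0             above 0.70
              threshold: 0.70
```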

You can also set the utter_name key to a domain response name if you want to check whether the rephrased response is relevant to the provided user input.

test_cases:
  - test_case: flight_booking
    steps:
      ... # other user steps
      - user: "Actually, I want to amend flight date to next week."
        assertions:
          - generative_response_is_relevant:
              threshold: 0.90
              utter_name: utter_ask_correction_confirmation

Generative Response Is Grounded Assertion

The generative_response_is_grounded assertion checks if the generative response is factually accurate in relation to the ground truth. Setting a threshold value between 0 and 1 is required to determine the factual accuracy of the generative response. The LLM Judge model will score the generative response output on a scale from 1 to 5 where 1 is the least factually accurate and 5 is the most factually accurate. The score is then mapped to a float value between 0 and 1 that can be compared to the threshold. The mapping is as follows: 1 -> 0.2, 2 -> 0.4, 3 -> 0.6, 4 -> 0.8, 5 -> 1.0.

The ground truth input can be either provided directly in the test or can be extracted by the test runner from the actual bot utterance event metadata, where the vector store search results (in the case of Enterprise Search) or the initial domain response (in the case of rephrased answers) are stored.

test_cases:
  - test_case: flight_booking
    steps:
      - user: "What is the average cost of a flight from New York to San Francisco?"
        assertions:
          - generative_response_is_grounded:
              threshold: 0.90
              ground_truth: "The average cost of a flight from New York to San Francisco is $500."

You can also set the utter_name key to a domain response name, which is used to filter for the correct bot utterance event.
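
As a sketch of that variant, the test below omits ground_truth so that the test runner extracts it from the bot utterance event metadata, and uses utter_name to pick the utterance to check. The test case name and the response name utter_average_flight_cost are hypothetical; the sketch assumes such a response exists in the domain and is rephrased.

```yaml
test_cases:
  - test_case: flight_cost_grounded  # hypothetical test case
    steps:
      - user: "What is the average cost of a flight from New York to San Francisco?"
        assertions:
          - generative_response_is_grounded:
              threshold: 0.90
              # No ground_truth key: the ground truth is taken from the
              # metadata of the matching bot utterance event.
              utter_name: utter_average_flight_cost  # hypothetical response
```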