How to Use End-to-End Testing with Assertions

Learn how to use end-to-end testing with assertions to test your assistant's behaviour.

Test Format Overview

Assertions can be defined in the existing test case format as part of the user step. You can define multiple assertions for each user step. Here is an example of a test case with assertions:

test_cases:
  - test_case: flight_booking
    steps:
      - user: "I want to book a flight"
        assertions:
          - bot_uttered:
              utter_name: "utter_ask_destination"
      - user: "New York"
        assertions:
          - slot_was_set:
              - name: "destination"
                value: "New York"
          - bot_uttered:
              utter_name: "utter_ask_origin"
      - user: "San Francisco"
        assertions:
          - slot_was_set:
              - name: "origin"
                value: "San Francisco"
          - bot_uttered:
              text_matches: "When would you like to travel?"

Warning

A test case can be run either with the pre-existing step types, such as bot or utter, or with assertions, but not with both.

Once a test case contains an assertion, the test runner ignores the pre-existing step types.

By default, assertions are verified only against the subset of actual Rasa events generated at each user turn. If you also want to verify the order of your assertions, set the assertion_order_enabled: true key in the user step. The assertions are then verified in the order in which they are defined in the test case. For example:

test_cases:
  - test_case: flight_booking
    steps:
      - user: "I want to book a flight"
        assertions:
          - bot_uttered:
              utter_name: "utter_ask_destination"
      - user: "New York"
        assertion_order_enabled: true
        assertions:
          - slot_was_set:
              - name: "destination"
                value: "New York"
          - bot_uttered:
              utter_name: "utter_ask_origin"
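
The difference between the default check and the ordered check can be illustrated with a minimal Python sketch. This is not the test runner's implementation: the event dictionaries, the matches helper, and both check functions are hypothetical and only model the idea that the default mode looks for each assertion anywhere in the turn's events, while the ordered mode requires the matches to appear in the listed order.

```python
from typing import Dict, List

Event = Dict[str, str]


def matches(assertion: Event, event: Event) -> bool:
    """An event satisfies an assertion when every asserted key/value is present."""
    return all(event.get(key) == value for key, value in assertion.items())


def check_unordered(assertions: List[Event], events: List[Event]) -> bool:
    """Default mode (sketch): each assertion must match some event in the turn."""
    return all(any(matches(a, e) for e in events) for a in assertions)


def check_ordered(assertions: List[Event], events: List[Event]) -> bool:
    """Ordered mode (sketch): matches must occur in the order the assertions are listed."""
    position = 0
    for assertion in assertions:
        # Search only the events that come after the previous match.
        while position < len(events) and not matches(assertion, events[position]):
            position += 1
        if position == len(events):
            return False
        position += 1
    return True


# Hypothetical events recorded for the "New York" turn, in the order they occurred.
turn_events = [
    {"event": "slot", "name": "destination", "value": "New York"},
    {"event": "bot", "utter_name": "utter_ask_origin"},
]

# The same two assertions, listed in two different orders.
slot_first = [
    {"event": "slot", "name": "destination", "value": "New York"},
    {"event": "bot", "utter_name": "utter_ask_origin"},
]
bot_first = list(reversed(slot_first))

print(check_unordered(bot_first, turn_events))  # True: order is not checked
print(check_ordered(slot_first, turn_events))   # True: order matches the events
print(check_ordered(bot_first, turn_events))    # False: the slot was set before the bot spoke
```

In this model, swapping the two assertions only causes a failure when ordering is enforced, which is the behaviour the assertion_order_enabled key is described as switching on.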

How To Handle Duplicate User Text Messages In The Same Test Case

In some cases, you may want to send the same user text message multiple times in the same test case. In order to retrieve the actual events for the correct user turn, you must define and provide metadata to each of the user steps. The metadata must be unique for each user step, even if the user text message is the same.

For example:

metadata:
  - duplicate_message_1:
      turn_idx: 1
  - duplicate_message_2:
      turn_idx: 2
test_cases:
  - test_case: flight_booking
    steps:
      ... # other user steps
      - user: "yes"
        metadata: duplicate_message_1
        assertions:
          - bot_uttered:
              utter_name: "utter_ask_confirmation_booking"
      - user: "yes"
        metadata: duplicate_message_2
        assertions:
          ... # other assertions
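
Why the metadata must be unique can be seen in a small Python sketch. It assumes, purely for illustration, that a user turn is identified by the pair of its text and its metadata name. The ambiguous_steps helper and the step dictionaries are hypothetical and are not part of Rasa.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Step = Dict[str, str]
StepKey = Tuple[str, Optional[str]]


def ambiguous_steps(steps: List[Step]) -> List[StepKey]:
    """Return the (user text, metadata name) pairs that occur more than once."""
    keys = [(step["user"], step.get("metadata")) for step in steps]
    counts = Counter(keys)
    return [key for key, occurrences in counts.items() if occurrences > 1]


# Two identical messages without metadata cannot be told apart.
without_metadata = [{"user": "yes"}, {"user": "yes"}]

# The same messages with unique metadata names, as in the example above.
with_metadata = [
    {"user": "yes", "metadata": "duplicate_message_1"},
    {"user": "yes", "metadata": "duplicate_message_2"},
]

print(ambiguous_steps(without_metadata))  # [('yes', None)]
print(ambiguous_steps(with_metadata))     # []
```

A check like this could be run over a test case before execution to catch repeated user messages that lack distinguishing metadata.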

Test Results Analysis

When you run test cases with assertions, the test runner provides summary statistics of the accuracy per assertion type for that test run. The accuracy is calculated as the number of successful assertions divided by the sum of successful and failed assertions. Assertions that could not be run because a prior assertion in the test case failed are not included in the accuracy calculation.
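
As a worked example of this calculation, the following Python sketch computes the accuracy per assertion type from a list of outcomes. The outcome data, the status labels ("passed", "failed", "skipped"), and the accuracy_by_type function are assumptions made for illustration and do not reflect the test runner's internal data structures.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical outcomes as (assertion type, status) pairs. "skipped" marks an
# assertion that was not run because an earlier assertion in the test case failed.
results: List[Tuple[str, str]] = [
    ("bot_uttered", "passed"),
    ("bot_uttered", "passed"),
    ("bot_uttered", "failed"),
    ("slot_was_set", "passed"),
    ("slot_was_set", "skipped"),
]


def accuracy_by_type(outcomes: List[Tuple[str, str]]) -> Dict[str, float]:
    """Accuracy = passed / (passed + failed); skipped assertions are excluded."""
    counts = Counter(outcomes)
    accuracy: Dict[str, float] = {}
    for assertion_type in sorted({name for name, _ in outcomes}):
        passed = counts[(assertion_type, "passed")]
        failed = counts[(assertion_type, "failed")]
        if passed + failed:  # Leave out types with no executed assertions.
            accuracy[assertion_type] = passed / (passed + failed)
    return accuracy


print(accuracy_by_type(results))
```

With these sample outcomes, bot_uttered has two successes out of three executed assertions (accuracy 2/3), and slot_was_set has one success out of one executed assertion (accuracy 1.0), because the skipped assertion does not count towards the denominator.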

Figure: A visual representation of the accuracy by assertion type in the test result shell output.

In addition, the test runner will provide a detailed report of the failed assertions, including the assertion itself, the error message, the line number in the test case file, and the transcript of the string representation of the actual events that were recorded in the dialogue up until the failed assertion.