End-to-End Testing with Assertions
Rasa Pro provides a way to test your assistant end-to-end with assertions.
New Beta Feature in 3.10
To help you gain confidence in the CALM system that powers Rasa Pro assistants before deploying them to production, Rasa Pro 3.10 introduces a new way to test your assistant end-to-end with multiple assertions on every user turn. This feature is a beta (experimental) release and may change in future Rasa Pro versions.
Overview
End-to-end testing is a critical part of the development process for any conversational AI assistant. Rasa Pro lets you test your assistant end-to-end with assertions that validate its behaviour at each step of the conversation.
Among the assertions you can use are the following (an example test case is shown below):
- Flow Started Assertion: Check if the flow with the provided id was started.
- Flow Completed Assertion: Check if the flow with the provided id was completed.
- Flow Cancelled Assertion: Check if the flow with the provided id was cancelled.
- Pattern Clarification Assertion: Check if the pattern clarification was triggered with the provided flow names.
- Slot Was Set Assertion: Check if the slot with the provided name was filled with the provided value.
- Slot Was Not Set Assertion: Check if the slot with the provided name was not filled.
- Action Executed Assertion: Check if the action with the provided name was executed.
- Bot Uttered Assertion: Check if the bot utterance matches the provided pattern, buttons and/or domain response name.
- Generative Response Is Relevant Assertion: Check if the generative response is relevant to the provided input.
- Generative Response Is Grounded Assertion: Check if the generative response is factually accurate in relation to the provided or extracted ground truth text input.
These assertions are checked against the actual events that Rasa Pro emits while running the test cases with a Rasa Pro assistant.
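For illustration, a test case that combines several of these assertions might look like the following minimal sketch. The flow, slot, and response names (transfer_money, transfer_recipient, utter_ask_transfer_amount) are hypothetical, and the exact assertion keys follow the beta syntax described in the how-to guide, so please verify them there before use.

```yaml
test_cases:
  - test_case: user_transfers_money
    steps:
      - user: "I want to send some money to John"
        assertions:
          # the money transfer flow should have been started
          - flow_started: "transfer_money"
          # the recipient slot should have been filled from the message
          - slot_was_set:
              - name: "transfer_recipient"
                value: "John"
          # the bot should ask for the transfer amount next
          - bot_uttered:
              utter_name: "utter_ask_transfer_amount"
      - user: "Actually, forget it"
        assertions:
          # the flow should have been cancelled at the user's request
          - flow_cancelled:
              flow_id: "transfer_money"
```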
To get started with end-to-end testing with assertions, please head over to the Installation Prerequisites and How to Use End-to-End Testing with Assertions guides.
The new feature is released as a beta version, and we would love to hear your feedback, particularly on its usability and the value it brings to your testing workflow. We also have a list of questions we would like feedback on (see Beta Feedback Questions below). Please reach out to us through the Customer Support team to share your feedback.
Benefits of End-to-End Testing with Assertions
- Increased Confidence: End-to-end testing with assertions helps you gain confidence in the behaviour of CALM as it handles natural language understanding, flow logic advancement, and enterprise search.
- Faster Feedback Loop: You can quickly identify issues in your assistant and fix them early in the development process.
- Improved Quality: By testing your assistant end-to-end with assertions, you can ensure that your assistant behaves as expected in different scenarios.
- Reduced Manual Testing: End-to-end testing with assertions helps you reduce the need for manual testing, allowing you to focus on more complex scenarios.
Beta Feedback Questions
Among the questions we would like to get feedback on are:
- How valuable did you find the assertions in your testing workflow?
- Do you see any risks with running tests in two exclusive modes, either with assertions or with the prior test case format?
- How did the assertions feature help you identify issues in your assistant?
- How can we improve the usability of the assertions feature? Is the syntax clear and easy to use?
- What challenges did you face when using the assertions feature?
- What additional assertions would you like to see in the future?
- Which of your most commonly tested assistant scenarios are currently not covered by the assertions?
- Would asserting commands help you in your testing workflow?
- How often did you have to update your assertions when the assistant's flow implementation changed, compared to when using the initial e2e test format?
- Is the assertion order configurability useful for your testing workflow?
- If you define slots with an initial value in the domain file, have you found the `slot_was_not_set` assumption that the initial value is `None` to be a limitation in your testing workflow? (A brief example follows this list.)
- Is the current solution for generative response evaluation sufficient to decide whether the response ‘passes’ the test?
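To make the `slot_was_not_set` question above concrete, here is a minimal sketch of how the assertion is typically written (the slot name and user message are hypothetical). The assertion assumes that an unset slot still holds the value `None`, which is where the limitation arises for slots that declare a different initial value in the domain.

```yaml
test_cases:
  - test_case: user_asks_for_balance
    steps:
      - user: "What's my current balance?"
        assertions:
          # this turn should not have filled the transfer_amount slot;
          # the assertion treats "not set" as "still None", which may not hold
          # for slots that define a non-None initial value in the domain
          - slot_was_not_set:
              - name: "transfer_amount"
```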