This is documentation for Rasa Open Source v2.0.x, which is no longer actively maintained.

rasa.core.test

WrongPredictionException Objects

class WrongPredictionException(RasaException, ValueError)

Raised if a wrong prediction is encountered.

EvaluationStore Objects

class EvaluationStore()

Class storing action, intent and entity predictions and targets.

add_to_store

| add_to_store(action_predictions: Optional[List[Text]] = None, action_targets: Optional[List[Text]] = None, intent_predictions: Optional[List[Text]] = None, intent_targets: Optional[List[Text]] = None, entity_predictions: Optional[List[Dict[Text, Any]]] = None, entity_targets: Optional[List[Dict[Text, Any]]] = None) -> None

Add items or lists of items to the store.

merge_store

| merge_store(other: "EvaluationStore") -> None

Add the contents of other to self.

serialise

| serialise() -> Tuple[List[Text], List[Text]]

Turn targets and predictions into lists of equal size for sklearn.
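The store-then-serialise flow above can be sketched as follows. This is a simplified stand-in, not the actual Rasa class: it tracks only action predictions and targets, and pads the shorter list with a "None" sentinel so sklearn metrics can compare the two element-wise.

```python
from typing import List, Optional, Tuple

class EvaluationStoreSketch:
    """Illustrative sketch of the EvaluationStore pattern (actions only)."""

    def __init__(self) -> None:
        self.action_predictions: List[str] = []
        self.action_targets: List[str] = []

    def add_to_store(
        self,
        action_predictions: Optional[List[str]] = None,
        action_targets: Optional[List[str]] = None,
    ) -> None:
        # Append any provided items to the running lists.
        self.action_predictions += action_predictions or []
        self.action_targets += action_targets or []

    def merge_store(self, other: "EvaluationStoreSketch") -> None:
        # Add the contents of another store to this one.
        self.add_to_store(other.action_predictions, other.action_targets)

    def serialise(self) -> Tuple[List[str], List[str]]:
        # Pad with a sentinel so both lists have equal length for sklearn.
        size = max(len(self.action_targets), len(self.action_predictions))

        def pad(items: List[str]) -> List[str]:
            return items + ["None"] * (size - len(items))

        return pad(self.action_targets), pad(self.action_predictions)

store = EvaluationStoreSketch()
store.add_to_store(action_predictions=["action_listen"], action_targets=["utter_greet"])
other = EvaluationStoreSketch()
other.add_to_store(action_predictions=["utter_bye"])
store.merge_store(other)
targets, predictions = store.serialise()
print(targets, predictions)  # ['utter_greet', 'None'] ['action_listen', 'utter_bye']
```

The equal-length guarantee matters because sklearn's classification metrics require `y_true` and `y_pred` to have the same number of elements.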

WronglyPredictedAction Objects

class WronglyPredictedAction(ActionExecuted)

The model predicted the wrong action.

Mostly used to mark wrong predictions and be able to dump them as stories.

inline_comment

| inline_comment() -> Text

A comment attached to this event. Used during dumping.
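The idea behind this event can be sketched as below. The attribute and class names here are illustrative, not Rasa's actual internals: the event remembers both the expected and the predicted action, and inline_comment() renders the annotation written next to the event when failed stories are dumped.

```python
class WronglyPredictedActionSketch:
    """Illustrative stand-in: pairs a target action with the wrong prediction."""

    def __init__(self, action_name_target: str, action_name_prediction: str) -> None:
        self.action_name = action_name_target
        self.action_name_prediction = action_name_prediction

    def inline_comment(self) -> str:
        # Rendered beside the event when the failed story is dumped.
        return f"predicted: {self.action_name_prediction}"

event = WronglyPredictedActionSketch("utter_goodbye", "action_listen")
print(event.inline_comment())  # predicted: action_listen
```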

EndToEndUserUtterance Objects

class EndToEndUserUtterance(UserUttered)

End-to-end user utterance.

Mostly used to print the full end-to-end user message in the failed_test_stories.yml output file.

WronglyClassifiedUserUtterance Objects

class WronglyClassifiedUserUtterance(UserUttered)

The NLU model predicted the wrong user utterance.

Mostly used to mark wrong predictions and be able to dump them as stories.

inline_comment

| inline_comment() -> Text

A comment attached to this event. Used during dumping.

emulate_loop_rejection

emulate_loop_rejection(partial_tracker: DialogueStateTracker) -> None

Add ActionExecutionRejected event to the tracker.

During evaluation, the action server is not run; therefore, to correctly test the unhappy paths of loops, loop rejection has to be emulated.

Arguments:

  • partial_tracker - a rasa.core.trackers.DialogueStateTracker instance
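What emulating loop rejection amounts to can be shown with minimal stand-ins (not Rasa's actual classes): an ActionExecutionRejected-style event is appended for the tracker's active loop, so the policy can take the loop's unhappy path during evaluation.

```python
from typing import List

class ActionExecutionRejectedStub:
    """Illustrative stand-in for an ActionExecutionRejected event."""

    def __init__(self, action_name: str) -> None:
        self.action_name = action_name

class TrackerStub:
    """Illustrative stand-in for a DialogueStateTracker."""

    def __init__(self) -> None:
        self.events: List[object] = []
        self.active_loop_name = "restaurant_form"  # illustrative loop name

    def update(self, event: object) -> None:
        self.events.append(event)

def emulate_loop_rejection(partial_tracker: TrackerStub) -> None:
    # Append a rejection event for the currently active loop.
    partial_tracker.update(
        ActionExecutionRejectedStub(partial_tracker.active_loop_name)
    )

tracker = TrackerStub()
emulate_loop_rejection(tracker)
print(tracker.events[-1].action_name)  # restaurant_form
```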

test

async test(stories: Text, agent: "Agent", max_stories: Optional[int] = None, out_directory: Optional[Text] = None, fail_on_prediction_errors: bool = False, e2e: bool = False, disable_plotting: bool = False, successes: bool = False, errors: bool = True) -> Dict[Text, Any]

Run the evaluation of the stories, optionally plot the results.

Arguments:

  • stories - the stories to evaluate on
  • agent - the agent
  • max_stories - maximum number of stories to consider
  • out_directory - path to the directory to write results to
  • fail_on_prediction_errors - whether to fail when a prediction error is encountered
  • e2e - whether to run end-to-end evaluation
  • disable_plotting - whether to disable plotting
  • successes - whether to write down successful predictions
  • errors - whether to write down incorrect predictions

Returns:

Evaluation summary.

compare_models_in_dir

async compare_models_in_dir(model_dir: Text, stories_file: Text, output: Text) -> None

Evaluate multiple trained models in a directory on a test set.

Arguments:

  • model_dir - path to directory that contains the models to evaluate
  • stories_file - path to the story file
  • output - output directory to store results to
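The gathering step can be sketched as follows. This assumes, as is the case for packaged Rasa models, that each model in the directory is a .tar.gz archive; the real function then evaluates every collected model on the story file.

```python
import tempfile
from pathlib import Path
from typing import List

def list_models_in_dir(model_dir: str) -> List[str]:
    # Collect packaged model archives in a stable order.
    return sorted(str(p) for p in Path(model_dir).glob("*.tar.gz"))

# Demonstrate on a throwaway directory with two empty placeholder archives.
with tempfile.TemporaryDirectory() as model_dir:
    (Path(model_dir) / "core-a.tar.gz").touch()
    (Path(model_dir) / "core-b.tar.gz").touch()
    models = list_models_in_dir(model_dir)
    print([Path(m).name for m in models])  # ['core-a.tar.gz', 'core-b.tar.gz']
```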

compare_models

async compare_models(models: List[Text], stories_file: Text, output: Text) -> None

Evaluate provided trained models on a test set.

Arguments:

  • models - list of trained model paths
  • stories_file - path to the story file
  • output - output directory to store results to