This is documentation for Rasa Documentation v2.x, which is no longer actively maintained.
For up-to-date documentation, see the latest version (3.x).

Version: 2.x

rasa.nlu.training_data.training_data

TrainingData Objects

class TrainingData()

Holds loaded intent and entity training data.

merge

 | merge(*others: "TrainingData") -> "TrainingData"

Return merged instance of this data with other training data.

filter_training_examples

 | filter_training_examples(condition: Callable[[Message], bool]) -> "TrainingData"

Filter training examples.

Arguments:

condition - A function that will be applied to filter training examples.

Returns:

TrainingData - A TrainingData with filtered training examples.
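
As a rough illustration of the kind of callable that `condition` expects, the sketch below applies a predicate to plain dictionaries standing in for `Message` objects. It does not import Rasa; `keep_if`, `examples`, and the dictionary keys are hypothetical names used only for this illustration.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a Message: a plain dict with "text" and "intent".
Example = Dict[str, str]


def keep_if(
    examples: List[Example], condition: Callable[[Example], bool]
) -> List[Example]:
    """Return a new list holding only the examples for which condition is true."""
    return [ex for ex in examples if condition(ex)]


examples = [
    {"text": "hello", "intent": "greet"},
    {"text": "bye", "intent": "goodbye"},
    {"text": "hi there", "intent": "greet"},
]

# Keep only the examples labelled with the "greet" intent.
greetings = keep_if(examples, lambda ex: ex["intent"] == "greet")
```

The input list is left untouched, which mirrors the documented behaviour of returning a new `TrainingData` rather than modifying the existing one.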

sanitize_examples

 | @staticmethod
 | sanitize_examples(examples: List[Message]) -> List[Message]

Makes sure the training data is clean.

Removes trailing whitespace from intent and response annotations and drops duplicate examples.
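
The cleanup described above can be sketched without Rasa as follows. This is only an approximation under stated assumptions: examples are modelled as plain dictionaries, `sanitize` is a hypothetical helper, and two examples count as duplicates when all of their fields match after whitespace has been removed.

```python
from typing import Dict, List

# Hypothetical stand-in for a Message.
Example = Dict[str, str]


def sanitize(examples: List[Example]) -> List[Example]:
    """Strip trailing whitespace from labels and drop exact duplicates."""
    seen = set()
    cleaned = []
    for ex in examples:
        ex = dict(ex)  # copy so the caller's data is not modified
        for key in ("intent", "response"):
            if key in ex:
                ex[key] = ex[key].rstrip()
        # A hashable fingerprint of the example, used to detect duplicates.
        fingerprint = tuple(sorted(ex.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(ex)
    return cleaned


raw = [
    {"text": "hello", "intent": "greet "},
    {"text": "hello", "intent": "greet"},
    {"text": "bye", "intent": "goodbye"},
]
cleaned = sanitize(raw)
```

The first two entries differ only by a trailing space in the intent label, so they collapse into a single example once the label is cleaned.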

intents

 | @lazy_property
 | intents() -> Set[Text]

Returns the set of intents in the training data.

retrieval_intents

 | @lazy_property
 | retrieval_intents() -> Set[Text]

Returns the set of retrieval intents in the training data.

number_of_examples_per_intent

 | @lazy_property
 | number_of_examples_per_intent() -> Dict[Text, int]

Calculates the number of examples per intent.
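
A per-intent count of this shape can be illustrated with `collections.Counter`. The sketch assumes examples are plain dictionaries with an `"intent"` key, which is a simplification of the real `Message` objects.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical examples; real training data holds Message objects.
examples: List[Dict[str, str]] = [
    {"text": "hello", "intent": "greet"},
    {"text": "hi", "intent": "greet"},
    {"text": "bye", "intent": "goodbye"},
]

# Map each intent name to the number of examples labelled with it.
examples_per_intent: Dict[str, int] = dict(
    Counter(ex["intent"] for ex in examples)
)
```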

number_of_examples_per_response

 | @lazy_property
 | number_of_examples_per_response() -> Dict[Text, int]

Calculates the number of examples per response.

entities

 | @lazy_property
 | entities() -> Set[Text]

Returns the set of entity types in the training data.

entity_roles

 | @lazy_property
 | entity_roles() -> Set[Text]

Returns the set of entity roles in the training data.

entity_groups

 | @lazy_property
 | entity_groups() -> Set[Text]

Returns the set of entity groups in the training data.

number_of_examples_per_entity

 | @lazy_property
 | number_of_examples_per_entity() -> Dict[Text, int]

Calculates the number of examples per entity.

sort_regex_features

 | sort_regex_features() -> None

Sorts regex features lexicographically by name+pattern.
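
One way to read "name+pattern" is as a sort key built by joining the two fields with a `+`. The sketch below assumes that reading and assumes each regex feature is a dictionary with `"name"` and `"pattern"` keys; neither assumption is confirmed by this page.

```python
# Hypothetical regex features; the dictionary layout is an assumption.
regex_features = [
    {"name": "zipcode", "pattern": r"[0-9]{5}"},
    {"name": "greet", "pattern": r"hey[^\s]*"},
    {"name": "greet", "pattern": r"hello"},
]

# Sort in place on the combined "name+pattern" string, so features with
# the same name are ordered by their pattern.
regex_features.sort(key=lambda f: "{}+{}".format(f["name"], f["pattern"]))
```

Sorting in place matches the documented `None` return type.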

nlu_as_json

 | nlu_as_json(**kwargs: Any) -> Text

Represents this set of training examples as JSON.

nlg_as_markdown

 | nlg_as_markdown() -> Text

Generates the Markdown representation of the response phrases (NLG) of TrainingData.

nlg_as_yaml

 | nlg_as_yaml() -> Text

Generates the YAML representation of the response phrases (NLG) of TrainingData.

Returns:

The responses in YAML format, as a string.

nlu_as_markdown

 | nlu_as_markdown() -> Text

Generates the Markdown representation of the NLU part of TrainingData.

persist

 | persist(dir_name: Text, filename: Text = DEFAULT_TRAINING_DATA_OUTPUT_PATH) -> Dict[Text, Any]

Persists this training data to disk and returns necessary information to load it again.

sorted_entities

 | sorted_entities() -> List[Any]

Extracts all entities from the examples and sorts them by entity type.

sorted_intent_examples

 | sorted_intent_examples() -> List[Message]

Sorts the intent examples by the name of the intent and then by the response.
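
A two-level ordering like this is commonly expressed with a tuple sort key. The sketch below is an illustration only: it models examples as dictionaries and treats a missing `"response"` key as an empty string, both of which are assumptions.

```python
from typing import Dict, List

# Hypothetical examples; "response" is only present for some of them.
examples: List[Dict[str, str]] = [
    {"text": "what is rasa?", "intent": "faq", "response": "what_is_rasa"},
    {"text": "hello", "intent": "greet"},
    {"text": "is it free?", "intent": "faq", "response": "pricing"},
]

# Sort by intent first; ties are broken by the response name.
ordered = sorted(
    examples, key=lambda ex: (ex["intent"], ex.get("response", ""))
)
```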

validate

 | validate() -> None

Ensures that the loaded training data is valid.

Checks that the data contains a minimum number of training examples.

train_test_split

 | train_test_split(train_frac: float = 0.8, random_seed: Optional[int] = None) -> Tuple["TrainingData", "TrainingData"]

Split into a training and test dataset, preserving the fraction of examples per intent.
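
To show what "preserving the fraction of examples per intent" means in practice, here is a minimal stratified split written without Rasa. `stratified_split` and the dictionary-based examples are hypothetical; the library's own shuffling and rounding rules may differ.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for a Message.
Example = Dict[str, str]


def stratified_split(
    examples: List[Example],
    train_frac: float = 0.8,
    random_seed: Optional[int] = None,
) -> Tuple[List[Example], List[Example]]:
    """Split examples so each intent keeps roughly train_frac in the train set."""
    rng = random.Random(random_seed)

    # Group the examples by intent so each group can be split on its own.
    by_intent = defaultdict(list)
    for ex in examples:
        by_intent[ex["intent"]].append(ex)

    train, test = [], []
    for intent in sorted(by_intent):
        group = list(by_intent[intent])
        rng.shuffle(group)
        cut = int(len(group) * train_frac)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


data = [{"text": f"hello {i}", "intent": "greet"} for i in range(10)]
data += [{"text": f"bye {i}", "intent": "goodbye"} for i in range(5)]

train, test = stratified_split(data, train_frac=0.8, random_seed=42)
```

Splitting each intent separately keeps rare intents represented in both sets, which a single shuffle over the whole dataset does not guarantee.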

split_nlu_examples

 | split_nlu_examples(train_frac: float, random_seed: Optional[int] = None) -> Tuple[list, list]

Split the training data into a train and test set.

Arguments:

train_frac - The fraction of examples to add to the training set (for example, 0.8).
random_seed - The seed for the random number generator used when splitting.

Returns:

Test and training examples.

is_empty

 | is_empty() -> bool

Checks if any training data was loaded.

without_empty_e2e_examples

 | without_empty_e2e_examples() -> "TrainingData"

Removes training data examples from intent labels and action names which were added for end-to-end training.

Returns:

Itself but without training examples which don't have a text or intent.
