rasa.shared.nlu.training_data.training_data
TrainingData Objects
Holds loaded intent and entity training data.
fingerprint
Fingerprint the training data.
Returns:
hex string as a fingerprint of the training data.
label_fingerprint
Fingerprints the labels in the training data.
Returns:
hex string as a fingerprint of the training data labels.
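A fingerprint of this kind is usually a hash over a stable serialization of the data. The sketch below illustrates the idea only and is not Rasa's implementation: `fingerprint_examples`, `fingerprint_labels`, the dictionary layout, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import json
from typing import Dict, List


def fingerprint_examples(examples: List[Dict[str, str]]) -> str:
    """Hypothetical helper: hex digest over all example content."""
    # sort_keys gives a stable serialization, so equal data hashes equally.
    payload = json.dumps(examples, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def fingerprint_labels(examples: List[Dict[str, str]]) -> str:
    """Hypothetical helper: hex digest over the set of labels only."""
    labels = sorted({ex["intent"] for ex in examples})
    return hashlib.sha256(json.dumps(labels).encode("utf-8")).hexdigest()


examples = [{"text": "hello", "intent": "greet"}]
reworded = [{"text": "hi there", "intent": "greet"}]

data_fp = fingerprint_examples(examples)
label_fp = fingerprint_labels(examples)
```

In this sketch, rewording an example changes the data fingerprint but leaves the label fingerprint untouched, because the set of labels is the same.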
merge
Return merged instance of this data with other training data.
Arguments:
others
- other training data instances to merge this one with
Returns:
Merged training data object. Merging is not done in place; the result is a new instance.
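The "not in place" behaviour can be illustrated with a toy container. `ToyTrainingData` below is a hypothetical stand-in, not the Rasa class; it only shows that the inputs stay unchanged and a new object is returned.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyTrainingData:
    """Hypothetical stand-in for a training data container."""

    examples: List[str] = field(default_factory=list)

    def merge(self, *others: "ToyTrainingData") -> "ToyTrainingData":
        # Copy first so neither this instance nor the others are mutated.
        merged = list(self.examples)
        for other in others:
            merged.extend(other.examples)
        return ToyTrainingData(merged)


a = ToyTrainingData(["hi"])
b = ToyTrainingData(["bye"])
c = a.merge(b)
```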
filter_training_examples
Filter training examples.
Arguments:
condition
- A function that will be applied to filter training examples.
Returns:
TrainingData
- A TrainingData with filtered training examples.
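As a rough illustration of what a `condition` function looks like, the sketch below filters plain dictionaries. The helper `filter_examples` and the dictionary layout are assumptions for the example; the real method operates on a TrainingData instance and returns a TrainingData.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]


def filter_examples(
    examples: List[Example], condition: Callable[[Example], bool]
) -> List[Example]:
    """Hypothetical helper: keep the examples for which condition is true."""
    return [ex for ex in examples if condition(ex)]


examples = [
    {"text": "hello", "intent": "greet"},
    {"text": "see you", "intent": "goodbye"},
    {"text": "hi", "intent": "greet"},
]

# The condition receives one example and returns True to keep it.
greetings = filter_examples(examples, lambda ex: ex.get("intent") == "greet")
```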
__hash__
Calculate hash for the training data object.
Returns:
Hash of the training data object.
sanitize_examples
Makes sure the training data is clean.
Removes trailing whitespace from intent and response annotations and drops duplicate examples.
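The two cleaning steps can be sketched on plain dictionaries. This is a minimal illustration under assumed names (`sanitize`, the `intent` and `response` keys), not the library's code.

```python
from typing import Dict, List


def sanitize(examples: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Hypothetical helper: trim label whitespace, then drop duplicates."""
    seen = set()
    cleaned = []
    for ex in examples:
        ex = dict(ex)  # work on a copy so the input is not modified
        for key in ("intent", "response"):
            if key in ex:
                ex[key] = ex[key].rstrip()
        # Two examples are duplicates if all of their fields match.
        identity = tuple(sorted(ex.items()))
        if identity not in seen:
            seen.add(identity)
            cleaned.append(ex)
    return cleaned


raw = [
    {"text": "hi", "intent": "greet  "},
    {"text": "hi", "intent": "greet"},
    {"text": "bye", "intent": "goodbye"},
]
cleaned = sanitize(raw)
```

Trimming happens before the duplicate check, so examples that differ only by trailing whitespace in a label collapse into one.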
nlu_examples
Return examples which have come from NLU training data.
For instance, an example that came from a story or the domain is not included.
Returns:
List of NLU training examples.
intent_examples
Returns the list of examples that have an intent.
response_examples
Returns the list of examples that have a response.
entity_examples
Returns the list of examples that have entities.
intents
Returns the set of intents in the training data.
action_names
Returns the set of action names in the training data.
retrieval_intents
Returns the set of retrieval intents in the training data.
number_of_examples_per_intent
Calculates the number of examples per intent.
number_of_examples_per_response
Calculates the number of examples per response.
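Per-label counts of this kind reduce to a frequency table. The sketch below uses `collections.Counter` on plain dictionaries; the data layout is an assumption for illustration and the real properties work on the loaded training examples.

```python
from collections import Counter

examples = [
    {"text": "hello", "intent": "greet"},
    {"text": "hi", "intent": "greet"},
    {"text": "see you", "intent": "goodbye"},
    {"text": "no label here"},
]

# Only examples that actually carry an intent are counted.
per_intent = Counter(ex["intent"] for ex in examples if "intent" in ex)
```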
entities
Returns the set of entity types in the training data.
entity_roles
Returns the set of entity roles in the training data.
entity_groups
Returns the set of entity groups in the training data.
entity_roles_groups_used
Checks if any entity roles or groups are used in the training data.
number_of_examples_per_entity
Calculates the number of examples per entity.
sort_regex_features
Sorts regex features lexicographically by name+pattern.
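One way to read "name+pattern" is as a sort key built by joining the two fields with a plus sign. The sketch below assumes that reading and a `name`/`pattern` dictionary layout; both are assumptions, not confirmed details of the implementation.

```python
regex_features = [
    {"name": "zipcode", "pattern": "[0-9]{5}"},
    {"name": "greet", "pattern": "hey[^\\s]*"},
    {"name": "greet", "pattern": "hello"},
]

# Build a single string key per feature and sort on it lexicographically.
sorted_features = sorted(
    regex_features, key=lambda f: "{}+{}".format(f["name"], f["pattern"])
)
```

A deterministic order like this keeps downstream results stable regardless of the order in which features were loaded.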
nlu_as_json
Represents this set of training examples as JSON.
nlg_as_yaml
Generates a YAML representation of the response phrases (NLG) of TrainingData.
Returns:
Responses in YAML format as a string.
nlu_as_yaml
Generates YAML representation of NLU of TrainingData.
Returns:
data in YAML format as a string
persist_nlu
Saves NLU to a file.
persist_nlg
Saves NLG to a file.
get_nlg_persist_filename
Returns the full filename to persist NLG data.
persist
Persists this training data to disk and returns necessary information to load it again.
sorted_entities
Extracts all entities from examples and sorts them by entity type.
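The extract-then-sort step might look like the following on plain dictionaries. The `entities` and `entity` keys are assumed for the example, and the code is a sketch rather than the library's implementation.

```python
examples = [
    {
        "text": "fly to Berlin tomorrow",
        "entities": [
            {"entity": "city", "value": "Berlin"},
            {"entity": "date", "value": "tomorrow"},
        ],
    },
    {
        "text": "pizza in Austin",
        "entities": [
            {"entity": "food", "value": "pizza"},
            {"entity": "city", "value": "Austin"},
        ],
    },
    {"text": "hello"},  # no entities annotated
]

# Flatten the entities of every example, then sort by entity type.
all_entities = [e for ex in examples for e in ex.get("entities", [])]
entities_by_type = sorted(all_entities, key=lambda e: e["entity"])
```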
validate
Ensures that the loaded training data is valid.
Checks that the data contains a minimum number of certain kinds of training examples.
train_test_split
Split into a training and test dataset, preserving the fraction of examples per intent.
split_nlu_examples
Split the training data into a train and test set.
Arguments:
train_frac
- percentage of examples to add to the training set.
random_seed
- random seed used to shuffle examples.
Returns:
Test and training examples.
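Preserving the fraction of examples per intent amounts to a stratified split: shuffle and cut each intent's examples separately. The sketch below shows that idea with a hypothetical `stratified_split` helper; its return order (train, then test) and its default values are choices made for this example only.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Example = Dict[str, str]


def stratified_split(
    examples: List[Example],
    train_frac: float = 0.8,
    random_seed: Optional[int] = None,
) -> Tuple[List[Example], List[Example]]:
    """Hypothetical helper: split each intent's examples by train_frac."""
    rng = random.Random(random_seed)  # a seed makes the shuffle repeatable
    by_intent = defaultdict(list)
    for ex in examples:
        by_intent[ex["intent"]].append(ex)

    train, test = [], []
    for intent_examples in by_intent.values():
        shuffled = list(intent_examples)
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_frac)
        train.extend(shuffled[:cut])
        test.extend(shuffled[cut:])
    return train, test


examples = [{"text": f"greet {i}", "intent": "greet"} for i in range(10)]
examples += [{"text": f"bye {i}", "intent": "goodbye"} for i in range(5)]

train, test = stratified_split(examples, train_frac=0.8, random_seed=42)
```

Because the cut is applied per intent, a rare intent keeps roughly the same share in both sets instead of landing entirely on one side.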
is_empty
Checks if any training data was loaded.
contains_no_pure_nlu_data
Checks if any NLU training data was loaded.
has_e2e_examples
Checks if there are any training examples from e2e stories.
list_to_str
Converts list to a string.
Arguments:
lst
- The list to convert.
delim
- The delimiter that is used to separate list inputs.
quote
- The quote that is used to wrap list inputs.
Returns:
The string.
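The roles of `delim` and `quote` are easiest to see in a small example. `join_quoted` below is a hypothetical helper written for illustration, and its default values are assumptions rather than documented defaults.

```python
from typing import List


def join_quoted(lst: List[str], delim: str = ", ", quote: str = "'") -> str:
    """Hypothetical helper: wrap each item in quote, then join with delim."""
    return delim.join(quote + item + quote for item in lst)


joined = join_quoted(["greet", "goodbye"])
piped = join_quoted(["a", "b", "c"], delim=" | ", quote='"')
```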