This is documentation for Rasa Open Source Documentation v2.3.x, which is no longer actively maintained.
For up-to-date documentation, see the latest version (2.5.x).

rasa.utils.tensorflow.model_data

FeatureArray Objects

class FeatureArray(np.ndarray)

Stores any kind of features ready to be used by a RasaModel.

In addition to the input numpy array of features, it also receives the number of dimensions of the features. As our features can have 1 to 4 dimensions, we might have different numbers of numpy arrays stacked. The number of dimensions helps us figure out how to handle this particular feature array. Whether the feature array is sparse is determined automatically, as is its number of units.

Subclassing np.array: https://numpy.org/doc/stable/user/basics.subclassing.html
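
FeatureArray follows the standard recipe for subclassing np.ndarray described in the guide linked above. As a rough sketch, using a hypothetical simplified class (not Rasa's actual implementation):

```python
import numpy as np

class SimpleFeatureArray(np.ndarray):
    """Hypothetical, simplified illustration of the FeatureArray pattern."""

    def __new__(cls, input_array: np.ndarray, number_of_dimensions: int) -> "SimpleFeatureArray":
        # view() re-interprets the existing data as our subclass without copying
        obj = np.asarray(input_array).view(cls)
        obj.number_of_dimensions = number_of_dimensions
        return obj

    def __array_finalize__(self, obj) -> None:
        # Called on explicit construction, view casting, and slicing;
        # propagate the extra attribute in the latter two cases.
        if obj is None:
            return
        self.number_of_dimensions = getattr(obj, "number_of_dimensions", None)

arr = SimpleFeatureArray(np.array([[1.0, 2.0], [3.0, 4.0]]), number_of_dimensions=2)
row = arr[0]  # slicing preserves the extra attribute via __array_finalize__
```

Because `__new__` is overridden, the extra `number_of_dimensions` keyword is tolerated even though `ndarray` defines no matching `__init__`.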

__new__

| __new__(cls, input_array: np.ndarray, number_of_dimensions: int) -> "FeatureArray"

Create and return a new object. See help(type) for accurate signature.

__init__

| __init__(input_array: Any, number_of_dimensions: int, **kwargs)

Initializes FeatureArray.

Needed in order to avoid 'Invalid keyword argument number_of_dimensions to function FeatureArray.__init__'.

Arguments:

  • input_array - the array that contains features
  • number_of_dimensions - number of dimensions in input_array

__array_finalize__

| __array_finalize__(obj: Any) -> None

This method is called whenever the system internally allocates a new array from obj.

Arguments:

  • obj - A subclass (subtype) of ndarray.

__array_ufunc__

| __array_ufunc__(ufunc: Any, method: Text, *inputs, **kwargs) -> Any

Overrides this method as we are subclassing numpy's ndarray.

Arguments:

  • ufunc - The ufunc object that was called.
  • method - A string indicating which ufunc method was called (one of "__call__", "reduce", "reduceat", "accumulate", "outer", "at").
  • *inputs - A tuple of the input arguments to the ufunc.
  • **kwargs - Any additional arguments

Returns:

The result of the operation.

__reduce__

| __reduce__() -> Tuple[Any, Any, Any]

Needed in order to pickle this object.

Returns:

A tuple of the callable, its arguments, and the object's state, as required by pickle.

__setstate__

| __setstate__(state, **kwargs) -> None

Sets the state.

Arguments:

  • state - The state argument must be a sequence that contains the following elements: version, shape, dtype, isFortran, rawdata.
  • **kwargs - Any additional parameter
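
The __reduce__/__setstate__ pair implements the standard pattern for pickling ndarray subclasses that carry extra state. A sketch using a hypothetical simplified class (the real FeatureArray may store more attributes):

```python
import pickle
import numpy as np

class PicklableFeatureArray(np.ndarray):
    """Hypothetical sketch: pickling an ndarray subclass with extra state."""

    def __new__(cls, input_array, number_of_dimensions):
        obj = np.asarray(input_array).view(cls)
        obj.number_of_dimensions = number_of_dimensions
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.number_of_dimensions = getattr(obj, "number_of_dimensions", None)

    def __reduce__(self):
        # Append our extra attribute to ndarray's pickle state.
        pickled_state = super().__reduce__()
        new_state = pickled_state[2] + (self.number_of_dimensions,)
        return pickled_state[0], pickled_state[1], new_state

    def __setstate__(self, state):
        # Pop our attribute, then let ndarray restore the rest.
        self.number_of_dimensions = state[-1]
        super().__setstate__(state[:-1])

arr = PicklableFeatureArray(np.arange(6).reshape(2, 3), number_of_dimensions=2)
restored = pickle.loads(pickle.dumps(arr))
```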

get_shape_type_info

| get_shape_type_info() -> Tuple[
| List[
| Union[
| int,
| Tuple[None],
| Tuple[None, int],
| Tuple[None, None, int],
| Tuple[None, None, None, int],
| ]
| ],
| List[int],
| ]

Returns shapes and types needed to convert this feature array into tensors.

Returns:

A tuple containing a list of shapes and a list of types.
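
For illustration, the shape tuples in the signature above could be produced by a helper like the following (a hypothetical sketch, not the actual implementation), where None marks a dimension of variable length:

```python
from typing import Optional, Tuple

def shape_for(number_of_dimensions: int, units: int) -> Tuple[Optional[int], ...]:
    """Map a dimension count to a shape signature with variable-length axes."""
    if number_of_dimensions == 1:
        return (None,)
    # All leading axes are variable-length; the last axis holds the units.
    return tuple([None] * (number_of_dimensions - 1) + [units])

# shape_for(3, 128) == (None, None, 128)
```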

FeatureSignature Objects

class FeatureSignature(NamedTuple)

Signature of feature arrays.

Stores the number of units, the type (sparse vs dense), and the number of dimensions of features.

RasaModelData Objects

class RasaModelData()

Data object used for all RasaModels.

It contains all features needed to train the models. 'data' is a mapping of attribute name, e.g. TEXT, INTENT, etc., and feature name, e.g. SENTENCE, SEQUENCE, etc., to a list of feature arrays representing the actual features. 'label_key' and 'label_sub_key' point to the labels inside 'data'. For example, if your intent labels are stored under INTENT -> IDS, 'label_key' would be "INTENT" and 'label_sub_key' would be "IDS".
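
The layout of 'data' can be pictured as a nested mapping. In this sketch, plain numpy arrays stand in for FeatureArray instances and all values are made up:

```python
import numpy as np

# Hypothetical illustration of the 'data' layout described above.
data = {
    "TEXT": {
        "SENTENCE": [np.zeros((5, 10))],     # 5 examples, 10 units each
        "SEQUENCE": [np.zeros((5, 7, 10))],  # 5 examples, 7 tokens, 10 units
    },
    "INTENT": {
        "IDS": [np.array([0, 2, 1, 2, 0])],  # one label id per example
    },
}

# 'label_key' and 'label_sub_key' point at the labels inside 'data':
label_key, label_sub_key = "INTENT", "IDS"
labels = data[label_key][label_sub_key][0]
```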

__init__

| __init__(label_key: Optional[Text] = None, label_sub_key: Optional[Text] = None, data: Optional[Data] = None) -> None

Initializes the RasaModelData object.

Arguments:

  • label_key - the key of a label used for balancing, etc.
  • label_sub_key - the sub key of a label used for balancing, etc.
  • data - the data holding the features

get

| get(key: Text, sub_key: Optional[Text] = None) -> Union[Dict[Text, List[FeatureArray]], List[FeatureArray]]

Get the data under the given keys.

Arguments:

  • key - The key.
  • sub_key - The optional sub key.

Returns:

The requested data.

items

| items() -> ItemsView

Return the items of the data attribute.

Returns:

The items of data.

values

| values() -> Any

Return the values of the data attribute.

Returns:

The values of data.

keys

| keys(key: Optional[Text] = None) -> List[Text]

Return the keys of the data attribute.

Arguments:

  • key - The optional key.

Returns:

The keys of the data.

sort

| sort()

Sorts data according to its keys.

first_data_example

| first_data_example() -> Data

Return the data with just one feature example per key, sub-key.

Returns:

The simplified data.

does_feature_exist

| does_feature_exist(key: Text, sub_key: Optional[Text] = None) -> bool

Check if feature key (and sub-key) is present and features are available.

Arguments:

  • key - The key.
  • sub_key - The optional sub-key.

Returns:

False, if no features exist for the given keys; True otherwise.

does_feature_not_exist

| does_feature_not_exist(key: Text, sub_key: Optional[Text] = None) -> bool

Check if no features are present for the given key (and sub-key).

Arguments:

  • key - The key.
  • sub_key - The optional sub-key.

Returns:

True, if no features exist for the given keys; False otherwise.

is_empty

| is_empty() -> bool

Checks if data is set.

number_of_examples

| number_of_examples(data: Optional[Data] = None) -> int

Obtain number of examples in data.

Arguments:

  • data - The data.

Raises:

A ValueError if the number of examples differs for different features.

Returns:

The number of examples in data.

number_of_units

| number_of_units(key: Text, sub_key: Text) -> int

Get the number of units of the given key.

Arguments:

  • key - The key.
  • sub_key - The sub-key.

Returns:

The number of units.

add_data

| add_data(data: Data, key_prefix: Optional[Text] = None) -> None

Add incoming data to data.

Arguments:

  • data - The data to add.
  • key_prefix - Optional key prefix to use in front of the key value.

update_key

| update_key(from_key: Text, from_sub_key: Text, to_key: Text, to_sub_key: Text) -> None

Copies the features under the given keys to the new keys and deletes the old keys.

Arguments:

  • from_key - current feature key
  • from_sub_key - current feature sub-key
  • to_key - new key for feature
  • to_sub_key - new sub-key for feature

add_features

| add_features(key: Text, sub_key: Text, features: Optional[List[FeatureArray]]) -> None

Add list of features to data under specified key.

Updates the number of examples accordingly.

Arguments:

  • key - The key
  • sub_key - The sub-key
  • features - The features to add.

add_lengths

| add_lengths(key: Text, sub_key: Text, from_key: Text, from_sub_key: Text) -> None

Adds a feature array of lengths of sequences to data under given key.

Arguments:

  • key - The key to add the lengths to
  • sub_key - The sub-key to add the lengths to
  • from_key - The key to take the lengths from
  • from_sub_key - The sub-key to take the lengths from

split

| split(number_of_test_examples: int, random_seed: int) -> Tuple["RasaModelData", "RasaModelData"]

Creates a random hold-out test set using a stratified split.

Arguments:

  • number_of_test_examples - Number of test examples.
  • random_seed - Random seed.

Returns:

A tuple of train and test RasaModelData.
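
The idea of a stratified hold-out split can be sketched with plain numpy (an illustration of the concept only, not Rasa's implementation): test examples are drawn proportionally from each label class, so rare labels remain represented in both sets.

```python
import numpy as np

# Made-up labels: three classes with four examples each.
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
number_of_test_examples = 3
rng = np.random.default_rng(42)  # fixed random seed for reproducibility

test_idx = []
classes, counts = np.unique(labels, return_counts=True)
# Draw from each class proportionally to its frequency, at least one example.
per_class = np.maximum(1, (counts / counts.sum() * number_of_test_examples).astype(int))
for cls, n in zip(classes, per_class):
    candidates = np.flatnonzero(labels == cls)
    test_idx.extend(rng.choice(candidates, size=n, replace=False))

train_idx = np.setdiff1d(np.arange(len(labels)), test_idx)
```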

get_signature

| get_signature(data: Optional[Data] = None) -> Dict[Text, Dict[Text, List[FeatureSignature]]]

Get signature of RasaModelData.

Signature stores the shape and whether features are sparse or not for every key.

Returns:

A dictionary of key and sub-key to a list of feature signatures (same structure as the data attribute).

as_tf_dataset

| as_tf_dataset(batch_size: int, batch_strategy: Text = SEQUENCE, shuffle: bool = False) -> tf.data.Dataset

Creates a tf.data.Dataset from the model data.

Arguments:

  • batch_size - The batch size to use.
  • batch_strategy - The batch strategy to use.
  • shuffle - Boolean indicating whether the data should be shuffled or not.

Returns:

The tf.data.Dataset.

prepare_batch

| prepare_batch(data: Optional[Data] = None, start: Optional[int] = None, end: Optional[int] = None, tuple_sizes: Optional[Dict[Text, int]] = None) -> Tuple[Optional[np.ndarray]]

Slices model data into a batch using the given start and end values.

Arguments:

  • data - The data to prepare.
  • start - The start index of the batch
  • end - The end index of the batch
  • tuple_sizes - In case a feature is not present, the batch is padded with None values. tuple_sizes specifies how many None values to add for each kind of feature.

Returns:

The features of the batch.
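
Conceptually, batch preparation slices every feature array between start and end and flattens the result into one tuple. A simplified sketch (tuple_sizes padding is not modeled here, and a plain numpy array stands in for a FeatureArray):

```python
import numpy as np

# Hypothetical data with a single feature: 10 examples, 2 units each.
data = {"TEXT": {"SENTENCE": [np.arange(20).reshape(10, 2)]}}
start, end = 2, 5

# Slice each feature array to the batch window and flatten into a tuple.
batch = tuple(
    feature[start:end]
    for sub_keys in data.values()
    for features in sub_keys.values()
    for feature in features
)
```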