rasa.nlu.featurizers.dense_featurizer.lm_featurizer
LanguageModelFeaturizer Objects
A featurizer that uses transformer-based language models.
This component loads a pre-trained language model from the Transformers library (https://github.com/huggingface/transformers). Supported architectures include BERT, GPT, GPT-2, XLNet, DistilBERT, and RoBERTa. It also tokenizes and featurizes the dense featurizable attributes of each message.
required_components
Components that should be included in the pipeline before this component.
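The ordering requirement can be illustrated with a small check over a list of component names. This is a minimal sketch, not Rasa's own validation logic: the REQUIRED_BEFORE mapping, the choice of WhitespaceTokenizer as the prerequisite, and the missing_prerequisites helper are all hypothetical, and the sketch compares plain names for simplicity.

```python
from typing import Dict, List

# Hypothetical mapping from a component name to the names it needs earlier
# in the pipeline. The single entry below is an assumption for illustration.
REQUIRED_BEFORE: Dict[str, List[str]] = {
    "LanguageModelFeaturizer": ["WhitespaceTokenizer"],
}


def missing_prerequisites(pipeline: List[str]) -> Dict[str, List[str]]:
    """Return, per component, the prerequisites not found earlier in the list."""
    problems: Dict[str, List[str]] = {}
    seen: List[str] = []
    for name in pipeline:
        # A prerequisite only counts if it appears before the current component.
        missing = [req for req in REQUIRED_BEFORE.get(name, []) if req not in seen]
        if missing:
            problems[name] = missing
        seen.append(name)
    return problems
```

An empty result means every listed prerequisite appears before the component that needs it. Placing the tokenizer after the featurizer would be reported as a problem.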
__init__
Initializes the featurizer with the model in the config.
get_default_config
Returns LanguageModelFeaturizer's default config.
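A default config is typically overlaid with the values a user sets for the component. The sketch below shows that pattern with plain dictionaries. The keys and values in DEFAULTS are assumptions chosen for illustration, and resolve_config is a hypothetical helper. The authoritative defaults are whatever get_default_config returns in the installed version.

```python
from typing import Any, Dict

# Assumed defaults for illustration only; consult get_default_config
# for the real keys and values.
DEFAULTS: Dict[str, Any] = {
    "model_name": "bert",
    "model_weights": None,
    "cache_dir": None,
}


def resolve_config(user_config: Dict[str, Any]) -> Dict[str, Any]:
    """Overlay user-supplied values on the defaults without mutating either."""
    return {**DEFAULTS, **user_config}
```

Keys that the user leaves out keep their default value, so a config that sets only the weights still carries a model name.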
validate_config
Validates the configuration.
create
Creates a LanguageModelFeaturizer.
Loads the model specified in the config.
required_packages
Returns the extra python dependencies required.
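A list of required package names can be checked against the current environment before a component is built. The following is a minimal sketch using only the standard library. The missing_packages helper is hypothetical and is not how Rasa performs the check. It assumes top-level package names without dots.

```python
import importlib.util
from typing import List


def missing_packages(required: List[str]) -> List[str]:
    """Return the names from `required` that cannot be found for import."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in required if importlib.util.find_spec(name) is None]
```

For example, calling missing_packages(["transformers"]) returns an empty list when the Transformers library is installed and a list containing "transformers" when it is not.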
process_training_data
Computes tokens and dense features for each message in training data.
Arguments:
training_data - NLU training data to be tokenized and featurized.
config - NLU pipeline config consisting of all components.
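The per-message loop described for process_training_data can be sketched with toy stand-ins. Everything here is an assumption made for illustration: the attribute names in DENSE_ATTRIBUTES, the whitespace tokenizer, the toy_embed function that replaces a real language model, and the use of a mean over token vectors as the sentence-level feature. None of it reflects the actual model output or Rasa's data structures.

```python
from typing import Dict, List

# Assumed attribute names; the real set of dense featurizable attributes
# is defined inside Rasa.
DENSE_ATTRIBUTES = ["text", "response"]


def toy_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def toy_embed(token: str, dim: int = 4) -> List[float]:
    """Stand-in for a language model: a deterministic vector per token."""
    seed = sum(ord(char) for char in token)
    return [((seed * (i + 1)) % 97) / 97.0 for i in range(dim)]


def featurize_messages(messages: List[Dict[str, str]]) -> List[Dict[str, dict]]:
    """Compute tokens and dense features for each attribute of each message."""
    results = []
    for message in messages:
        per_attribute = {}
        for attribute in DENSE_ATTRIBUTES:
            text = message.get(attribute)
            if not text:
                continue  # skip attributes that are absent or empty
            tokens = toy_tokenize(text)
            sequence = [toy_embed(token) for token in tokens]
            # Mean over tokens as a simple sentence-level stand-in.
            sentence = [sum(column) / len(sequence) for column in zip(*sequence)]
            per_attribute[attribute] = {
                "tokens": tokens,
                "sequence": sequence,
                "sentence": sentence,
            }
        results.append(per_attribute)
    return results
```

Each message yields one entry per non-empty attribute, holding its tokens, one vector per token, and a single sentence-level vector.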
process
Processes messages by computing tokens and dense features.