rasa.nlu.featurizers.dense_featurizer.lm_featurizer
LanguageModelFeaturizer Objects
Featurizer using transformer-based language models.
The transformers (https://github.com/huggingface/transformers) library is used to load pre-trained language models like BERT, GPT-2, etc. The component also tokenizes and featurizes the dense featurizable attributes of each message.
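To show where this featurizer typically sits, here is a minimal sketch of an NLU pipeline expressed as Python data. The surrounding components and the model_name/model_weights values are illustrative assumptions, not a recommended configuration:

```python
# Minimal sketch of a pipeline containing the LanguageModelFeaturizer.
# A tokenizer runs first (see required_components below); the featurizer then
# loads a pre-trained transformer and adds dense features for downstream
# components such as an intent classifier.
pipeline = [
    {"name": "WhitespaceTokenizer"},
    {
        "name": "LanguageModelFeaturizer",
        "model_name": "bert",                  # transformer architecture family
        "model_weights": "bert-base-uncased",  # assumed pre-trained checkpoint
    },
    {"name": "DIETClassifier"},
]
```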
required_components
Components that should be included in the pipeline before this component.
__init__
Initializes LanguageModelFeaturizer with the specified model.
Arguments:
- component_config - Configuration for the component.
- skip_model_load - Skip loading the model for pytests.
- hf_transformers_loaded - Skip loading of model and metadata, use HFTransformers output instead.
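As a hedged illustration of these arguments (in normal use the trainer constructs the component from the pipeline configuration rather than user code, and the model weights shown are an assumed example):

```python
from rasa.nlu.featurizers.dense_featurizer.lm_featurizer import LanguageModelFeaturizer

# Direct instantiation for illustration only.
featurizer = LanguageModelFeaturizer(
    component_config={
        "model_name": "bert",                  # architecture family to load
        "model_weights": "bert-base-uncased",  # assumed pre-trained checkpoint
    },
    # skip_model_load=True is meant for tests: it avoids downloading the
    # weights, so the featurizer cannot actually produce features.
    skip_model_load=False,
)
```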
load
Load this component from file.
After a component has been trained, it will be persisted by calling persist. When the pipeline gets loaded again, this component needs to be able to restore itself. Components can rely on any context attributes that are created by components.Component.create calls to components previous to this one.
This method differs from the parent method only in that it calls create rather than the constructor if the component is not found. This is done to trigger the check for HFTransformersNLP, and the method can be removed once HFTransformersNLP is removed.
Arguments:
- meta - Any configuration parameter related to the model.
- model_dir - The directory to load the component from.
- model_metadata - The model's rasa.nlu.model.Metadata.
- cached_component - The cached component.
Returns:
The loaded component.
cache_key
Defines the key used to cache the component for future use.
Arguments:
- component_meta - Configuration for the component.
- model_metadata - Configuration for the whole pipeline.
Returns:
Key of the cache for future retrievals.
required_packages
Packages needed to be installed.
train
Compute tokens and dense features for each message in training data.
Arguments:
- training_data - NLU training data to be tokenized and featurized.
- config - NLU pipeline config consisting of all components.
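An illustrative training call, assuming the transformers package is installed and a tokenizer has already produced tokens for the training examples; the data path and model weights are placeholders:

```python
from rasa.nlu.featurizers.dense_featurizer.lm_featurizer import LanguageModelFeaturizer
from rasa.nlu.tokenizers.whitespace_tokenizer import WhitespaceTokenizer
from rasa.shared.nlu.training_data.loading import load_data

# Load NLU training data (placeholder file path).
training_data = load_data("data/nlu.yml")

# A tokenizer must run before this featurizer, mirroring the pipeline order.
WhitespaceTokenizer().train(training_data)

featurizer = LanguageModelFeaturizer(
    {"model_name": "bert", "model_weights": "bert-base-uncased"}  # assumed weights
)
featurizer.train(training_data)
```

During a normal rasa train run the trainer performs these calls for you.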
process
Process an incoming message by computing its tokens and dense features.
Arguments:
- message - Incoming message object.
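An illustrative inference-time call, again assuming a tokenizer has processed the message first; the example text and model weights are made up:

```python
from rasa.nlu.featurizers.dense_featurizer.lm_featurizer import LanguageModelFeaturizer
from rasa.nlu.tokenizers.whitespace_tokenizer import WhitespaceTokenizer
from rasa.shared.nlu.training_data.message import Message
from rasa.shared.nlu.constants import TEXT

featurizer = LanguageModelFeaturizer(
    {"model_name": "bert", "model_weights": "bert-base-uncased"}  # assumed weights
)

# Tokenize first, mirroring the pipeline order, then compute dense features.
message = Message(data={TEXT: "book a table for two"})
WhitespaceTokenizer().process(message)
featurizer.process(message)

# The dense features are attached to the message for downstream components.
print(message.features)
```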