Version: Master/Unreleased


LanguageModelFeaturizerGraphComponent Objects

class LanguageModelFeaturizerGraphComponent(DenseFeaturizer2, GraphComponent)

A featurizer that uses transformer-based language models.

This component loads a pre-trained language model from the Hugging Face Transformers library. Supported architectures include BERT, GPT, GPT-2, XLNet, DistilBERT, and RoBERTa. It also tokenizes and featurizes the featurizable dense attributes of each message.


def required_components(cls) -> List[Type]

Components that should be included in the pipeline before this component.
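As a rough illustration of what this ordering requirement means, the sketch below checks that every required component type appears earlier in a pipeline than the component that declares it. The classes `Tokenizer`, `WhitespaceTokenizer`, and `Featurizer` and the helper `missing_requirements` are hypothetical stand-ins, not Rasa's actual classes or validation code.

```python
from typing import List, Type


class Tokenizer:
    """Hypothetical stand-in for a tokenizer base class."""


class WhitespaceTokenizer(Tokenizer):
    """Hypothetical concrete tokenizer."""


class Featurizer:
    """Hypothetical stand-in for a featurizer that needs tokens."""

    @classmethod
    def required_components(cls) -> List[Type]:
        # Assumption: the featurizer needs some tokenizer to run before it.
        return [Tokenizer]


def missing_requirements(pipeline: List[Type], component: Type) -> List[Type]:
    """Returns required types with no matching class before `component`."""
    earlier = pipeline[: pipeline.index(component)]
    return [
        required
        for required in component.required_components()
        if not any(issubclass(candidate, required) for candidate in earlier)
    ]


ok_pipeline = [WhitespaceTokenizer, Featurizer]
bad_pipeline = [Featurizer, WhitespaceTokenizer]
print(missing_requirements(ok_pipeline, Featurizer))   # nothing is missing
print(missing_requirements(bad_pipeline, Featurizer))  # Tokenizer is missing
```

Matching with `issubclass` lets any concrete tokenizer satisfy a requirement stated in terms of a base class, which is one plausible way to interpret a list of required types.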


def __init__(config: Dict[Text, Any], execution_context: ExecutionContext) -> None

Initializes the featurizer with the model in the config.


def get_default_config() -> Dict[Text, Any]

Returns LanguageModelFeaturizer's default config.


def validate_config(cls, config: Dict[Text, Any]) -> None

Validates the configuration.
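The page does not say which checks are performed, so the following is only a minimal sketch of what validating such a config could look like. The keys `model_name` and `model_weights`, the default values, and the lowercase model identifiers are assumptions made for illustration; the identifiers simply mirror the architectures named in the class description.

```python
from typing import Any, Dict

# Assumption: lowercase identifiers for the architectures named above.
SUPPORTED_MODELS = {"bert", "gpt", "gpt2", "xlnet", "distilbert", "roberta"}

# Assumption: illustrative defaults; the real defaults may differ.
DEFAULT_CONFIG: Dict[str, Any] = {"model_name": "bert", "model_weights": None}


def validate_config(config: Dict[str, Any]) -> None:
    """Raises ValueError if the merged config names an unsupported model."""
    # User-supplied values override the defaults.
    merged = {**DEFAULT_CONFIG, **config}
    if merged["model_name"] not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unsupported model_name '{merged['model_name']}'. "
            f"Expected one of {sorted(SUPPORTED_MODELS)}."
        )


validate_config({"model_name": "roberta"})  # passes silently

try:
    validate_config({"model_name": "word2vec"})
except ValueError as error:
    print(error)
```

Failing early on an unknown model name avoids a slower and less readable error later, when the model weights would be loaded.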


def create(cls, config: Dict[Text, Any], model_storage: ModelStorage, resource: Resource, execution_context: ExecutionContext) -> LanguageModelFeaturizerGraphComponent

Creates a LanguageModelFeaturizer.

Loads the model specified in the config.


def required_packages() -> List[Text]

Returns the extra Python dependencies required.
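A list of package names like this can be used to check the environment before a component is built. The sketch below shows one way to do that with the standard library; the return value `["transformers"]` is an assumption based on the class description, and `find_missing` is a hypothetical helper, not part of Rasa.

```python
import importlib.util
from typing import List


def required_packages() -> List[str]:
    # Assumption: the featurizer depends on the `transformers` package.
    return ["transformers"]


def find_missing(packages: List[str]) -> List[str]:
    """Returns the top-level packages that cannot be located."""
    # find_spec locates a module without importing it.
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = find_missing(required_packages())
if missing:
    print(f"Install these packages first: {missing}")
```

Using `importlib.util.find_spec` keeps the check cheap, because a large dependency is only located, not imported.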


def process_training_data(training_data: TrainingData, config: Optional[RasaNLUModelConfig] = None, **kwargs: Any) -> TrainingData

Computes tokens and dense features for each message in the training data.


Arguments:

  • training_data - NLU training data to be tokenized and featurized.
  • config - NLU pipeline config consisting of all components.


def process(messages: List[Message]) -> List[Message]

Processes messages by computing tokens and dense features.
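To make the shape of this step concrete, here is a toy sketch of a `process` function that attaches tokens and one dense vector per token to each message. The `Message` dataclass and `toy_embed` are hypothetical stand-ins: the real component works on Rasa's own message type and obtains features from a transformer model, whereas this example splits on whitespace and uses two hand-made numbers per token.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    """Hypothetical stand-in for an NLU message."""

    text: str
    tokens: List[str] = field(default_factory=list)
    features: List[List[float]] = field(default_factory=list)


def toy_embed(token: str) -> List[float]:
    # Placeholder for a language model forward pass: token length and
    # vowel count stand in for a real embedding.
    vowels = sum(char in "aeiou" for char in token.lower())
    return [float(len(token)), float(vowels)]


def process(messages: List[Message]) -> List[Message]:
    """Adds tokens and one dense vector per token to each message."""
    for message in messages:
        message.tokens = message.text.split()
        message.features = [toy_embed(token) for token in message.tokens]
    return messages


processed = process([Message("hello world")])
print(processed[0].tokens)    # ['hello', 'world']
print(processed[0].features)  # [[5.0, 2.0], [5.0, 1.0]]
```

The messages are updated in place and the same list is returned, which matches the `List[Message] -> List[Message]` signature shown above.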