Version: Main/Unreleased

rasa.nlu.featurizers.dense_featurizer.lm_featurizer

LanguageModelFeaturizer Objects

@DefaultV1Recipe.register(
DefaultV1Recipe.ComponentType.MESSAGE_FEATURIZER, is_trainable=False
)
class LanguageModelFeaturizer(DenseFeaturizer, GraphComponent)

A featurizer that uses transformer-based language models.

This component loads a pre-trained language model from the Transformers library (https://github.com/huggingface/transformers). Supported architectures include BERT, GPT, GPT-2, XLNet, DistilBERT, and RoBERTa. It also tokenizes and featurizes the dense featurizable attributes of each message.

required_components

| @classmethod
| required_components(cls) -> List[Type]

Components that should be included in the pipeline before this component.

__init__

| __init__(config: Dict[Text, Any], execution_context: ExecutionContext) -> None

Initializes the featurizer with the model in the config.

get_default_config

| @staticmethod
| get_default_config() -> Dict[Text, Any]

Returns LanguageModelFeaturizer's default config.
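
As a rough illustration of how a default config is usually combined with user-supplied pipeline settings, the sketch below merges two plain dictionaries. The keys `model_name`, `model_weights`, and `cache_dir`, their default values, and the helper `merge_config` are assumptions made for this example and are not read from Rasa; inspect the return value of `get_default_config()` in your installed version for the actual keys.

```python
from typing import Any, Dict

# Assumed defaults, for illustration only; not taken from Rasa itself.
ASSUMED_DEFAULTS: Dict[str, Any] = {
    "model_name": "bert",
    "model_weights": None,
    "cache_dir": None,
}


def merge_config(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which user overrides take precedence over defaults."""
    return {**defaults, **overrides}


# Only the overridden key changes; every other key keeps its default value.
config = merge_config(ASSUMED_DEFAULTS, {"model_name": "roberta"})
```

Building a new dictionary instead of updating the defaults in place keeps the defaults reusable across components.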

validate_config

| @classmethod
| validate_config(cls, config: Dict[Text, Any]) -> None

Validates the configuration.
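
The exact checks are not spelled out here. A minimal sketch of what such a validation could look like follows, assuming a `model_name` key and lowercase identifiers for the architectures named in the class description. Both the key and the identifier set are assumptions of this example, and `validate_model_name` is a hypothetical helper rather than part of the component.

```python
from typing import Any, Dict

# Hypothetical identifiers for the architectures named in the class description.
SUPPORTED_MODELS = {"bert", "gpt", "gpt2", "xlnet", "distilbert", "roberta"}


def validate_model_name(config: Dict[str, Any]) -> None:
    """Raise ValueError if the configured model is missing or unsupported."""
    name = config.get("model_name")
    if name not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unsupported model_name {name!r}; "
            f"expected one of {sorted(SUPPORTED_MODELS)}"
        )
```

Failing early with a message that lists the accepted values tends to be easier to debug than an error raised later while the model is loading.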

create

| @classmethod
| create(cls, config: Dict[Text, Any], model_storage: ModelStorage, resource: Resource, execution_context: ExecutionContext) -> LanguageModelFeaturizer

Creates a LanguageModelFeaturizer.

Loads the model specified in the config.

required_packages

| @staticmethod
| required_packages() -> List[Text]

Returns the extra Python dependencies that this component requires.

process_training_data

| process_training_data(training_data: TrainingData) -> TrainingData

Computes tokens and dense features for each message in training data.

Arguments:

  • training_data - NLU training data to be tokenized and featurized.

process

| process(messages: List[Message]) -> List[Message]

Processes messages by computing tokens and dense features.
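
To make the "tokens plus dense features" idea concrete, here is a toy sketch that uses only the standard library. `ToyMessage`, `toy_embed`, `toy_process`, the whitespace tokenizer, the mean pooling, and the dictionary keys are all hypothetical stand-ins chosen for this example. The real component relies on a transformer model and Rasa's own `Message` class, so treat this as an illustration of the data flow only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyMessage:
    """Hypothetical stand-in for a message; not Rasa's Message class."""
    text: str
    data: Dict[str, object] = field(default_factory=dict)


def toy_embed(token: str, dim: int = 4) -> List[float]:
    """Deterministic fake embedding; a real model would return learned vectors."""
    return [float((len(token) + i) % 5) for i in range(dim)]


def toy_process(messages: List[ToyMessage]) -> List[ToyMessage]:
    """Tokenize each message and attach per-token and whole-text vectors."""
    for message in messages:
        tokens = message.text.split()  # whitespace tokenization, for brevity
        token_vectors = [toy_embed(t) for t in tokens]
        # Mean-pool the token vectors into one vector for the whole text.
        if token_vectors:
            pooled = [sum(col) / len(token_vectors) for col in zip(*token_vectors)]
        else:
            pooled = []
        message.data["tokens"] = tokens
        message.data["sequence_features"] = token_vectors
        message.data["sentence_features"] = pooled
    return messages


processed = toy_process([ToyMessage("hello there")])
```

The function returns the same message objects it received, now enriched with features, which mirrors the `List[Message] -> List[Message]` shape of the signature above.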