rasa.nlu.tokenizers.tokenizer

TokenizerGraphComponent Objects

class TokenizerGraphComponent(GraphComponent, abc.ABC)

Base class for tokenizers.

__init__

def __init__(config: Dict[Text, Any]) -> None

Construct a new tokenizer.
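
A concrete tokenizer usually just forwards its configuration to this constructor and reads whatever options it needs from the dict. A minimal sketch, assuming a hypothetical MyTokenizer with an invented lowercase option (neither is part of Rasa's API):

from typing import Any, Dict, Text

from rasa.nlu.tokenizers.tokenizer import TokenizerGraphComponent


class MyTokenizer(TokenizerGraphComponent):
    """Illustrative subclass; it still has to implement `tokenize` (see below)."""

    def __init__(self, config: Dict[Text, Any]) -> None:
        # Let the base class keep the shared tokenizer settings, then read
        # options specific to this hypothetical tokenizer.
        super().__init__(config)
        self.lowercase = config.get("lowercase", False)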

create

@classmethod
def create(cls, config: Dict[Text, Any], model_storage: ModelStorage, resource: Resource, execution_context: ExecutionContext) -> GraphComponent

Creates a new component (see parent class for full docstring).

tokenize

@abc.abstractmethod
def tokenize(message: Message, attribute: Text) -> List[Token]

Tokenizes the text of the provided attribute of the incoming message.
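
tokenize is the only method a concrete subclass has to provide; process and process_training_data call it for the relevant attributes. A minimal sketch of a whitespace-splitting implementation that records character offsets (the class name is illustrative, and Rasa's built-in WhitespaceTokenizer does more cleanup than this):

from typing import List, Text

from rasa.nlu.tokenizers.tokenizer import Token, TokenizerGraphComponent
from rasa.shared.nlu.training_data.message import Message


class WhitespaceLikeTokenizer(TokenizerGraphComponent):
    """Illustrative tokenizer that splits the attribute's text on whitespace."""

    def tokenize(self, message: Message, attribute: Text) -> List[Token]:
        text = message.get(attribute)

        tokens = []
        offset = 0
        for word in text.split():
            # Search from the running offset so repeated words get the
            # correct character positions.
            start = text.index(word, offset)
            tokens.append(Token(word, start))
            offset = start + len(word)

        return tokens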

process_training_data

def process_training_data(training_data: TrainingData) -> TrainingData

Tokenize all training data.
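
This runs once during training so that every training example carries tokens before featurization. A hedged usage sketch with the WhitespaceLikeTokenizer sketched above; the file path is a placeholder and the config keys are assumptions about the shared tokenizer options:

from rasa.shared.nlu.training_data.loading import load_data

# Assumed config keys; the exact options expected by the base class may differ.
tokenizer = WhitespaceLikeTokenizer(
    {"intent_tokenization_flag": False, "intent_split_symbol": "_"}
)

training_data = load_data("data/nlu.yml")  # placeholder path
training_data = tokenizer.process_training_data(training_data)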

process

def process(messages: List[Message]) -> List[Message]

Tokenize the incoming messages.
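
At inference time the graph passes the messages of the current turn through this method, and the resulting tokens are attached to each message. A small usage sketch with the illustrative subclass from above; the "text_tokens" key is an assumption based on Rasa's TOKENS_NAMES convention:

from rasa.shared.nlu.constants import TEXT
from rasa.shared.nlu.training_data.message import Message

# Assumed config keys, as above.
tokenizer = WhitespaceLikeTokenizer(
    {"intent_tokenization_flag": False, "intent_split_symbol": "_"}
)

message = Message(data={TEXT: "book a table for two"})
tokenizer.process([message])

# Tokens are stored on the message itself ("text_tokens" for the TEXT attribute).
for token in message.get("text_tokens"):
    print(token.text, token.start, token.end)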