Version: Main/Unreleased

rasa.nlu.tokenizers.jieba_tokenizer

JiebaTokenizer Objects

@DefaultV1Recipe.register(
DefaultV1Recipe.ComponentType.MESSAGE_TOKENIZER, is_trainable=True
)
class JiebaTokenizer(Tokenizer)

This tokenizer is a wrapper for Jieba (https://github.com/fxsjy/jieba), a Chinese word segmentation library.

supported_languages

| @staticmethod
| supported_languages() -> Optional[List[Text]]

Supported languages (see parent class for full docstring).

get_default_config

| @staticmethod
| get_default_config() -> Dict[Text, Any]

Returns default config (see parent class for full docstring).

__init__

| __init__(config: Dict[Text, Any], model_storage: ModelStorage, resource: Resource) -> None

Initialize the tokenizer.

create

| @classmethod
| create(cls, config: Dict[Text, Any], model_storage: ModelStorage, resource: Resource, execution_context: ExecutionContext) -> JiebaTokenizer

Creates a new component (see parent class for full docstring).

required_packages

| @classmethod
| required_packages(cls) -> List[Text]

Lists any extra Python dependencies required for this component to run.
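As an illustration of how a caller might use such a list, the sketch below checks which package names cannot be imported in the current environment. The helper `find_missing_packages` is hypothetical and not part of Rasa; for this tokenizer the list would presumably contain `jieba`, but the demo uses a standard-library module and a made-up name so that it runs anywhere.

```python
import importlib.util
from typing import List


def find_missing_packages(package_names: List[str]) -> List[str]:
    """Return the names that cannot be located as importable top-level modules.

    Hypothetical helper for illustration; Rasa performs its own validation.
    """
    return [
        name
        for name in package_names
        if importlib.util.find_spec(name) is None
    ]


# "json" ships with Python; the second name is deliberately made up.
missing = find_missing_packages(["json", "surely_not_installed_pkg_12345"])
print(missing)
```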

train

| train(training_data: TrainingData) -> Resource

Copies the dictionary to the model storage.
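The idea of copying dictionaries into storage can be sketched with the standard library alone. This is a simplified stand-in, not Rasa's implementation: plain directories replace `ModelStorage` and `Resource`, and `copy_dictionaries` is a hypothetical helper. The sample line follows the `word frequency tag` layout used by Jieba user dictionaries.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def copy_dictionaries(dictionary_path: Path, target_dir: Path) -> List[str]:
    """Copy every regular file in dictionary_path into target_dir.

    Hypothetical helper: target_dir stands in for the directory that
    model storage would provide for this component's resource.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in sorted(dictionary_path.iterdir()):
        if source.is_file():
            shutil.copy2(source, target_dir / source.name)
            copied.append(source.name)
    return copied


# Demo with throwaway directories for the user's dictionaries and the storage.
with tempfile.TemporaryDirectory() as tmp:
    user_dicts = Path(tmp) / "user_dicts"
    user_dicts.mkdir()
    (user_dicts / "custom.txt").write_text("创新办 3 i\n", encoding="utf-8")

    storage_dir = Path(tmp) / "model_storage" / "JiebaTokenizer"
    copied_names = copy_dictionaries(user_dicts, storage_dir)
    stored_text = (storage_dir / "custom.txt").read_text(encoding="utf-8")

print(copied_names)
```

Copying the files at training time means a loaded model does not depend on the original dictionary path still existing.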

tokenize

| tokenize(message: Message, attribute: Text) -> List[Token]

Tokenizes the text of the provided attribute of the incoming message.
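A minimal sketch of the shape of this step, assuming the segmenter yields `(word, start, end)` triples (with Jieba installed, `jieba.tokenize(text)` yields triples of that form). To stay runnable without Jieba or Rasa, the segmentation here is written by hand, `SimpleToken` is a stand-in for Rasa's `Token`, and both helper functions are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass
class SimpleToken:
    """Stand-in for a token: surface text plus character offsets."""

    text: str
    start: int
    end: int


def words_to_triples(words: Iterable[str]) -> Iterator[Tuple[str, int, int]]:
    """Yield (word, start, end) triples for words that appear back to back."""
    offset = 0
    for word in words:
        yield word, offset, offset + len(word)
        offset += len(word)


def triples_to_tokens(
    triples: Iterable[Tuple[str, int, int]]
) -> List[SimpleToken]:
    """Turn segmenter output into token objects."""
    return [SimpleToken(word, start, end) for word, start, end in triples]


text = "我来到北京清华大学"
# Hand-written segmentation for illustration, not actual Jieba output.
words = ["我", "来到", "北京", "清华大学"]
tokens = triples_to_tokens(words_to_triples(words))

for token in tokens:
    print(token.text, token.start, token.end)
```

Keeping the offsets alongside the text lets later components map each token back to its position in the original message.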

load

| @classmethod
| load(cls, config: Dict[Text, Any], model_storage: ModelStorage, resource: Resource, execution_context: ExecutionContext, **kwargs: Any) -> JiebaTokenizer

Loads a custom dictionary from model storage.

persist

| persist() -> None

Persist the custom dictionaries.