Version: 3.x
rasa.nlu.tokenizers.jieba_tokenizer
JiebaTokenizer Objects
@DefaultV1Recipe.register(
DefaultV1Recipe.ComponentType.MESSAGE_TOKENIZER, is_trainable=True
)
class JiebaTokenizer(Tokenizer)
This tokenizer is a wrapper for Jieba (https://github.com/fxsjy/jieba).
supported_languages
| @staticmethod
| supported_languages() -> Optional[List[Text]]
Supported languages (see parent class for full docstring).
get_default_config
| @staticmethod
| get_default_config() -> Dict[Text, Any]
Returns default config (see parent class for full docstring).
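The sketch below illustrates the shape of the dictionary this method returns. The keys and default values are an assumption based on the Rasa 3.x source and should be checked against the installed version. `ASSUMED_DEFAULTS` and `merged_config` are hypothetical names used only for illustration; they are not part of Rasa.

```python
from typing import Any, Dict

# Assumed defaults for JiebaTokenizer in Rasa 3.x; verify against your installed version.
ASSUMED_DEFAULTS: Dict[str, Any] = {
    "dictionary_path": None,            # no custom dictionary is loaded by default
    "intent_tokenization_flag": False,  # whether intent labels are split into tokens
    "intent_split_symbol": "_",         # symbol on which intent labels are split
    "token_pattern": None,              # optional regular expression used to detect tokens
    "prefix_separator_symbol": None,    # symbol on which a prefix is split off
}


def merged_config(overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: layer user-supplied values over the assumed defaults."""
    return {**ASSUMED_DEFAULTS, **overrides}


# Override only the dictionary location; every other key keeps its assumed default.
config = merged_config({"dictionary_path": "path/to/custom/dictionary/dir"})
print(config["dictionary_path"])
```

Under these assumptions, a pipeline entry only needs to set the keys it wants to change, such as `dictionary_path`.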
__init__
| __init__(config: Dict[Text, Any], model_storage: ModelStorage, resource: Resource) -> None
Initializes the tokenizer.
create
| @classmethod
| create(cls, config: Dict[Text, Any], model_storage: ModelStorage, resource: Resource, execution_context: ExecutionContext) -> JiebaTokenizer
Creates a new component (see parent class for full docstring).
required_packages
| @classmethod
| required_packages(cls) -> List[Text]
Lists any extra Python dependencies required for this component to run.
train
| train(training_data: TrainingData) -> Resource
Copies the custom dictionary files to the model storage.
tokenize
| tokenize(message: Message, attribute: Text) -> List[Token]
Tokenizes the text of the provided attribute of the incoming message.
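Jieba's `jieba.tokenize` yields `(word, start, end)` triples, which map naturally onto tokens with character offsets. The following is a minimal sketch of that conversion, not Rasa's actual implementation. `SimpleToken` and `triples_to_tokens` are hypothetical stand-ins, and the triples are hard-coded so the example runs without Rasa or Jieba installed.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class SimpleToken:
    """Hypothetical stand-in for Rasa's Token: the text plus its character span."""

    text: str
    start: int
    end: int


def triples_to_tokens(triples: Iterable[Tuple[str, int, int]]) -> List[SimpleToken]:
    """Convert (word, start, end) triples into token objects."""
    return [SimpleToken(word, start, end) for word, start, end in triples]


text = "我想去北京"
# Hard-coded segmentation for illustration; a real segmenter may split differently.
triples = [("我", 0, 1), ("想", 1, 2), ("去", 2, 3), ("北京", 3, 5)]
tokens = triples_to_tokens(triples)

# Each token's span should index back into the original text.
for token in tokens:
    assert text[token.start:token.end] == token.text

print([token.text for token in tokens])
```

Keeping the offsets alongside the text lets later components map tokens back to positions in the original message.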
load
| @classmethod
| load(cls, config: Dict[Text, Any], model_storage: ModelStorage, resource: Resource, execution_context: ExecutionContext, **kwargs: Any) -> JiebaTokenizer
Loads a custom dictionary from model storage.
persist
| persist() -> None
Persists the custom dictionaries.
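Persisting custom dictionaries essentially means copying the dictionary files into a directory that belongs to the model. The sketch below shows that idea using only the standard library. It is an illustration under stated assumptions, not Rasa's implementation: `copy_dictionary_files` is a hypothetical helper, and temporary directories stand in for the configured dictionary path and the model storage.

```python
import shutil
import tempfile
from pathlib import Path


def copy_dictionary_files(source_dir: Path, target_dir: Path) -> int:
    """Copy every regular file in source_dir into target_dir; return the count."""
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = 0
    for path in sorted(source_dir.iterdir()):
        if path.is_file():
            shutil.copy2(path, target_dir / path.name)
            copied += 1
    return copied


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "dictionaries"   # stands in for the configured dictionary path
    target = Path(tmp) / "model_storage"  # stands in for the model's storage directory
    source.mkdir()
    # Jieba user dictionaries are plain text, one entry per line: word [freq] [tag].
    (source / "userdict.txt").write_text("云计算 5\n", encoding="utf-8")
    copied_count = copy_dictionary_files(source, target)
    persisted_text = (target / "userdict.txt").read_text(encoding="utf-8")

print(copied_count)
```

Copying the files with the model keeps a trained model self-contained, so loading does not depend on the original dictionary path still existing.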