With Rasa Open Source 1.9, we added support for TensorBoard 2. TensorBoard provides visualizations and tooling for machine learning experiments. In Rasa Open Source 1.9 we use TensorBoard to visualize the training metrics of our in-house machine learning models, which are built on top of TensorFlow 2. Visualizing training metrics helps you understand whether your model has trained properly. You can, for example, see if your model needs more epochs or if the validation set behaves differently from the training set.
How to enable TensorBoard?
To enable TensorBoard, you can now add a setting for a model in your config.yml
file. You can set this option for the EmbeddingIntentClassifier, DIETClassifier, ResponseSelector, or TEDPolicy. If a valid directory is provided, the training metrics will be written to that directory during training. By default we write the training metrics after every epoch, but this is a setting you can tune. You can also specify a number of data points to be held out separately for evaluation during training.
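As a minimal sketch, this is what the TensorBoard-related options look like on a single component (the parameter names match the full config below; the directory path is just an example):

pipeline:
- name: DIETClassifier
  epochs: 200
  # metrics are written here during training (only enabled if a valid directory is given)
  tensorboard_log_directory: "./tensorboard"
  # write metrics after every epoch (the default) or after every minibatch
  tensorboard_log_level: "epoch"
  # number of examples held out from the training data as a validation set
  evaluate_on_number_of_examples: 0
  # how often (in epochs) to evaluate on the held-out examples
  evaluate_every_number_of_epochs: 5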
Let's take a look at an example
The following config was used to train Sara.
language: en
pipeline:
- name: WhitespaceTokenizer
- name: CRFEntityExtractor
- name: CountVectorsFeaturizer
  OOV_token: "oov"
  token_pattern: (?u)\b\w+\b
- name: CountVectorsFeaturizer
  analyzer: "char_wb"
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
  epochs: 200
  evaluate_on_number_of_examples: 0
  evaluate_every_number_of_epochs: 5
  tensorboard_log_directory: "./tensorboard"
  tensorboard_log_level: "epoch"
policies:
- name: TEDPolicy
  max_history: 10
  epochs: 200
  evaluate_on_number_of_examples: 0
  evaluate_every_number_of_epochs: 5
  tensorboard_log_directory: "./tensorboard"
  tensorboard_log_level: "epoch"
- name: AugmentedMemoizationPolicy
  max_history: 6
- name: TwoStageFallbackPolicy
  core_threshold: 0.3
  nlu_threshold: 0.8
- name: FormPolicy
- name: MappingPolicy
Let's note a few things here.
- All the TensorBoard settings are part of the DIETClassifier and the TEDPolicy components. In particular, note the setting for the TensorBoard log directory.
- The DIETClassifier still has settings for the algorithm itself, independent of TensorBoard. In particular, we will train for 200 epochs. The same goes for the TEDPolicy.
- We specified that we want to evaluate our model every 5 epochs on a validation dataset that has 0 examples. The evaluate_on_number_of_examples parameter allows you to remove a random subset of examples from the training data to be used as a validation dataset. This parameter should be used with care: a larger validation set means we have less data for training, which can hurt performance. This parameter is meant for playing around, not for production, which is why we've set it to zero here.
- We can change the log level to minibatch if we want to zoom in; a sketch of that change follows this list.
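For example, switching the DIETClassifier to per-minibatch logging would only change the tensorboard_log_level value (a minimal sketch; the same change applies to the TEDPolicy):

- name: DIETClassifier
  epochs: 200
  tensorboard_log_directory: "./tensorboard"
  tensorboard_log_level: "minibatch"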
After you have trained your model, for example via rasa train
, all metrics are written to the provided directory. The directory will contain a subdirectory with the model name and another subdirectory with a timestamp. This allows you to reuse the same directory for multiple models and runs. To start TensorBoard, execute the following command:
tensorboard --logdir <path-to-directory>
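For the config above, where both components write to "./tensorboard", that would simply be:

tensorboard --logdir ./tensorboard

Since the DIETClassifier and the TEDPolicy each get their own model subdirectory (with a timestamped folder per run), both should show up in the same TensorBoard instance.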
You can now open a browser at http://localhost:6006/ to see the training metrics. It should look something like this:
The orange curve corresponds to the held-out validation dataset and the blue curve shows the metrics for the training data (see the legend on the left). If you use, for example, the DIETClassifier, you will see plots for the following metrics: i_loss, i_acc, e_loss, e_f1, and t_loss. i is short for intent, e for entities, and t_loss shows the total loss. If you want to know more about what this means in terms of our DIET model, check out our YouTube channel.
Let us know
TensorBoard support is a new feature in Rasa and we might extend it further. Until then, we would love to hear your feedback and ideas about what else we could add to TensorBoard. You can leave feedback on our forum.