Let's Talk about TensorFlow 2.6

Security

You might wonder how a computational library such as TensorFlow can have security vulnerabilities at all. After all, TensorFlow doesn’t communicate with the outside world over HTTP, nor does it interact directly with the operating system. And yet a long list of vulnerabilities exists.

Some of these are Denial of Service vulnerabilities. In these scenarios, an attacker sends crafted input that causes the system to take a disproportionate amount of time to process it, or even to crash. One reported issue involved calling .shape on a specific type of tensor, which could cause a segfault.

Other vulnerabilities allow Arbitrary Code Execution: an unprivileged user can run commands or code of their choice on a target machine. This came up in a recent issue, in which TensorFlow and Keras could be tricked into executing arbitrary code when deserializing a Keras model from YAML format. The original implementation used yaml.unsafe_load, which can execute arbitrary code embedded in its input.
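To make the difference between safe and unsafe YAML loading concrete, here is a minimal sketch. It assumes the PyYAML package is installed; the `UNTRUSTED` payload and the helper names `load_untrusted` and `is_rejected` are hypothetical and only serve the illustration. The payload is deliberately harmless (it would merely compute an absolute value), but the same tag mechanism can reference any importable function. This is not the TensorFlow patch itself, only an illustration of why a safe loader matters.

```python
import yaml  # PyYAML, assumed to be installed

# Hypothetical untrusted input: a YAML tag that asks the loader to call a
# Python function while parsing. Harmless here, but the mechanism is general.
UNTRUSTED = "!!python/object/apply:builtins.abs [-1]"


def load_untrusted(text):
    """Parse YAML with safe_load, which only constructs plain Python types."""
    try:
        return yaml.safe_load(text)
    except yaml.YAMLError as err:
        # safe_load has no constructor for python/* tags, so it refuses them.
        raise ValueError(f"rejected unsafe YAML: {err}") from err


def is_rejected(text):
    """Return True if the safe loader refuses to parse the given text."""
    try:
        load_untrusted(text)
    except ValueError:
        return True
    return False
```

With this sketch, ordinary configuration such as `layers: [dense, dropout]` still loads into a plain dictionary, while the tagged payload is refused before any function is called.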

It deserves highlighting that none of these security issues directly concern the Rasa community, because they affect parts of the TensorFlow codebase that Rasa does not use. However, since TensorFlow is a dependency of Rasa, many CI pipelines can break if the security check doesn’t pass.

Thankfully, the TensorFlow team is aware of these issues and has made patches available to the community in newer versions. This is the main reason why we have moved to TensorFlow 2.6 for future releases: it addresses many of the security issues that are blockers for enterprise users.

Performance

Whenever we upgrade to a newer version of TensorFlow, we run benchmarks to ensure that the performance of the models does not deteriorate. Since we’re upgrading to TensorFlow 2.6, we ran our benchmarks again and compared the results to those obtained with TensorFlow 2.3.

The accuracy of our benchmark models did not decline, but training times and memory usage did increase. We ran our benchmarks on a Debian Linux VM with 4 virtual CPUs, 15 GB of memory, and an NVIDIA Tesla K80 GPU. The benchmarks show the results from training our main intent and entity prediction model, called DIET. The relevant results are shown in the table below.

|                       | CPU Training Time Increase | GPU Training Time Increase |
|-----------------------|----------------------------|----------------------------|
| DIET without Entities |                         4% |                         0% |
| DIET with Entities    |                        28% |                        48% |

From our initial results, the increase in training time is most pronounced when you use DIET to learn intents and entities jointly. If you’re only using DIET for intents, you may not notice it as much.
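If you want to run a similar before-and-after comparison on your own data, the relative increase can be computed as sketched below. This is not the benchmark harness used for the table above; the helper names `time_call` and `percent_increase` are hypothetical, and the function you time is whatever training routine you choose to wrap.

```python
import time


def time_call(fn, repeats=3):
    """Return the best wall-clock time in seconds over several runs of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def percent_increase(baseline, candidate):
    """Relative increase of candidate over baseline, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) * 100.0 / baseline
```

For example, with made-up timings of 100 seconds on the old version and 128 seconds on the new one, `percent_increase(100.0, 128.0)` yields 28.0. Taking the best of several runs is one common way to reduce noise from other processes on the machine; averaging is an equally reasonable choice.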

Besides longer training times, there was also an increase in memory usage during training.

|                       | CPU Memory Increase | GPU Memory Increase |
|-----------------------|---------------------|---------------------|
| DIET without Entities |                  0% |                  0% |
| DIET with Entities    |                 12% |                  8% |

We want to be upfront: the training times of your DIET models may increase when you upgrade Rasa this time around. The inference time may also increase, especially when the LanguageModelFeaturizer is used. This component uses pre-trained language models from the Hugging Face ecosystem, which also depend on TensorFlow. We are in contact with the TensorFlow team and we hope that the performance issues will be resolved soon.

Conclusion

Rasa will upgrade from TensorFlow 2.3 to TensorFlow 2.6 starting with Rasa 2.8.9. We’ve made this change to address security concerns, but as a side effect your training times may increase if you use DIET for entity extraction.
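If you are unsure which side of this change an environment is on, a small check along the following lines may help. It is a sketch using only the Python standard library; the helper names are hypothetical, the package names `rasa` and `tensorflow` are assumed to match what is installed, and the version parsing is deliberately simple (it ignores suffixes such as release-candidate markers).

```python
import re
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def parse_version(text):
    """Turn '2.8.9' into (2, 8, 9); a missing patch number counts as 0."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) if part else 0 for part in match.groups())


def ships_tf26(rasa_version):
    """True if this Rasa version is at or after 2.8.9, the first with TF 2.6."""
    return parse_version(rasa_version) >= (2, 8, 9)
```

Calling `installed_version("rasa")` and `installed_version("tensorflow")` in the target environment then shows what is actually installed, which is the more reliable signal if versions have been pinned manually.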

If you have any feedback on the training times or your experience with TensorFlow 2.6, we would appreciate it if you shared it with us. We’ve opened up a thread on our forum where you can submit any findings that you may have and we’ll keep you updated on any future changes.