This guide was written for Rasa version 2.6.2 with spaCy version 3.0.6. However, it should still be compatible with Rasa 3.x and spaCy 3.x.
Let's say that you're a financial organisation interested in building a virtual assistant. The virtual assistant will likely need to be able to detect certain entities: dates, bank accounts, addresses, as well as financial jargon related to mortgages. Odds are though, it's not just your virtual assistant that needs to detect these entities. This capability will also be relevant for other use cases in your organisation, like contract parsing, automated email forwarding in customer service and detecting fraud. It therefore makes sense to explore making a general model that can be re-used by many teams in your organisation.
SpaCy is a popular tool for such pipelines. In this guide, we're going to bootstrap a spaCy project and show you how you can integrate it with Rasa. If you're unfamiliar with spaCy, feel free to check out the spaCy online course or the spaCy introductory YouTube series. In particular, we're going to use the projects feature introduced in spaCy 3.0, which is explained in detail in this YouTube video.
Setup
Let's start by setting up a new Rasa project with `rasa init`. This will be the project that we will configure to run with spaCy.
We won't make any changes to the Rasa project just yet. Instead, we will start a new spaCy project that will contain our custom entity detection model. We will use the ner demo starterpack from spaCy so that we can get started quickly.
The `ner_demo_replace` package contains a spaCy project that will train an entity detection model. It only has a few examples to train on because it's merely meant for demonstration purposes, but the project structure is ready to run for more serious problems. Here's what the project folder contains.
/Development/spacy-3-rasa-integration/ner_demo_replace
├── assets/
├── configs/
├── corpus/
├── packages/
├── scripts/
├── training/
├── project.yml (3.9 kB)
├── README.md (2.1 kB)
└── requirements.txt (58 bytes)
This folder represents a self-contained spaCy project. It contains all the scripts and commands that we need to train a new model. To get started with our model, we'll first need to download the required training data. The files belong in the `assets` directory, and the project is configured to fetch them automatically when we run `python -m spacy project assets` from inside the project folder.
The training data is minimal. It mainly contains JSON data with a few examples where the term "horse" needs to be detected as an ANIMAL entity. The project can be extended with a more meaningful dataset, but this one is enough to show how to create a custom spaCy model.
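To make the annotation idea concrete, here is a minimal Python sketch of what one such training example could look like. The layout (a text paired with `[start, end, label]` character offsets) and the sample sentence are assumptions for illustration, so check the files in `assets` for the exact format your project uses.

```python
import json

# Hypothetical training example: a text plus character-offset entity spans.
# The exact layout of the files in `assets` is an assumption here.
RAW = '[["Horses are too tall", {"entities": [[0, 6, "ANIMAL"]]}]]'


def entity_spans(raw_json):
    """Return (surface text, label) pairs for every annotated entity."""
    spans = []
    for text, annotations in json.loads(raw_json):
        for start, end, label in annotations["entities"]:
            # Offsets are character positions into the text.
            spans.append((text[start:end], label))
    return spans


print(entity_spans(RAW))
```

Slicing the text with the offsets is a quick sanity check that each annotation really points at the word you meant to label.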
With the data downloaded, we can now run the required commands to train a model. The commands are configured in the `project.yml` file. You can preview the available commands by running `python -m spacy project run` without naming a command.
In our case, we're mainly interested in creating a new model and then packaging it up. So we'll run the `all` command first. This will convert the training data to a binary format, then configure, train and evaluate the model. Once that is done, we'll package it with the `package` command.
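If you'd rather script these steps than type them one by one, the sketch below chains them from Python. It assumes the project lives in a folder called `ner_demo_replace` relative to where you run it; the helper names `project_commands` and `run_workflow` are made up for this example and are not part of spaCy.

```python
import subprocess
import sys


def project_commands(project_dir="ner_demo_replace"):
    """Build the spaCy CLI calls for fetching assets, training and packaging."""
    base = [sys.executable, "-m", "spacy", "project"]
    return [
        base + ["assets", project_dir],          # download the training data
        base + ["run", "all", project_dir],      # convert, train and evaluate
        base + ["run", "package", project_dir],  # build the installable package
    ]


def run_workflow(project_dir="ner_demo_replace"):
    """Run each step in order, stopping at the first failure."""
    for command in project_commands(project_dir):
        subprocess.run(command, check=True)
```

Calling `run_workflow()` requires spaCy to be installed in the active environment. Using `sys.executable` keeps the calls inside the same virtual environment as the script itself.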
This gives us a trained spaCy model. You can inspect it in the `packages` folder inside the spaCy project.
ner_demo_replace/packages/en_ner_demo_replace-0.0.0
├── dist/
├── en_ner_demo_replace/
├── en_ner_demo_replace.egg-info/
├── MANIFEST.in (33 bytes)
├── meta.json (2.6 kB)
└── setup.py (1.8 kB)
You'll notice that we've exported a setup.py file here as well. That means that we can install the trained spaCy model via pip to have it available in our virtual environment.
Now that the model is installed, we can access it from spaCy within Python too. It's loadable via the `spacy.load` function.
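As a rough sketch, loading the packaged model and listing the entities it finds could look like this. The helper `extract_entities` is a name invented for this example, and the code assumes that the `en_ner_demo_replace` package has been installed into the current environment.

```python
MODEL_NAME = "en_ner_demo_replace"


def extract_entities(text, model_name=MODEL_NAME):
    """Load the packaged pipeline and return (text, label) pairs it detects."""
    # Imported here so this file can be imported even where spaCy is absent.
    import spacy

    nlp = spacy.load(model_name)
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]
```

Calling `extract_entities("I saw a horse today")` returns whatever spans the model tags. Given how little data the demo model was trained on, don't expect its predictions to be reliable.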
Now that our custom model is loadable by spaCy, it is also loadable from Rasa.
Back to Rasa
Let's make a change to the NLU pipeline in our `config.yml` file so that we're able to use our own spaCy model.
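A pipeline along the following lines would do it. The YAML is held in a Python string so that a small check can run against it; only the four spaCy components are taken from this guide, and whatever else your pipeline contains (an intent classifier, for instance) is left out. The helper names are made up for this sketch.

```python
# Hypothetical excerpt of config.yml; the rest of the pipeline is omitted.
CONFIG_YML = """\
language: en
pipeline:
  - name: SpacyNLP
    model: en_ner_demo_replace
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: SpacyEntityExtractor
"""


def component_names(config_text):
    """Collect component names in pipeline order from the YAML text."""
    prefix = "- name:"
    return [
        line.strip()[len(prefix):].strip()
        for line in config_text.splitlines()
        if line.strip().startswith(prefix)
    ]


def spacy_nlp_comes_first(config_text):
    """Check that SpacyNLP precedes every other spaCy component."""
    names = component_names(config_text)
    dependents = [n for n in names if n.startswith("Spacy") and n != "SpacyNLP"]
    return "SpacyNLP" in names and all(
        names.index("SpacyNLP") < names.index(n) for n in dependents
    )


print(spacy_nlp_comes_first(CONFIG_YML))
```

The ordering check matters because the other spaCy components rely on the model that `SpacyNLP` loads, so it has to appear before them in the pipeline.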
The important part here is that we're configuring a `SpacyNLP` component in our pipeline that refers to our own `en_ner_demo_replace` model. Once this model is loaded, it will be used by the `SpacyTokenizer`, `SpacyFeaturizer` and `SpacyEntityExtractor`. We can confirm this by training a new Rasa model and asking it to check for entities.
Let's start by training the Rasa pipeline with `rasa train`.
Once it is trained, we can run the NLU model from the shell with `rasa shell nlu`.
And there you have it! We've trained our own spaCy model and loaded it in Rasa.
Final Details
The example that we've used in this guide is not representative of a real-life problem. We've chosen this project as a demonstration because it's fast to get started with. The spaCy project can easily be extended, however, simply by replacing the JSON data that is used to train and validate the spaCy model. If you're interested in playing around with a larger model, you might enjoy this project that detects fashion brands or this project that detects food ingredients.
If you're interested in doing a deep dive into spaCy models, we recommend reading more about the config system. In our case we've kept things simple, but we could have customised it further. We could, for example, have configured our custom model to use the word vectors from the large English model. We could do that by changing the `components.tok2vec` property in the `configs/config.cfg` file.
We could even go a step further and change the settings of the internal spaCy model from this config file. By default it uses a convolutional model, whose hyperparameters are set there too.
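To give a feel for what such a block looks like, the sketch below reads a hypothetical encoder section using Python's standard `configparser`. The section name and values are assumptions based on common spaCy defaults and may not match your `configs/config.cfg`. spaCy loads this file through its own config system, so `configparser` is only used here to peek at the settings.

```python
import configparser

# Assumed excerpt of configs/config.cfg; the values mirror common spaCy
# defaults and may differ from the ones in your project.
CONFIG_EXCERPT = """
[components.tok2vec.model.encode]
@architectures = "spacy.MaxoutWindowEncoder.v2"
width = 96
depth = 4
window_size = 1
maxout_pieces = 3
"""


def encoder_settings(config_text):
    """Return the encoder block as a plain dict of strings."""
    # Interpolation is switched off so values are returned exactly as written.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    section = parser["components.tok2vec.model.encode"]
    return {key: value.strip('"') for key, value in section.items()}


print(encoder_settings(CONFIG_EXCERPT))
```

Settings such as `width` and `depth` control the size of the encoder, which makes them the natural place to start if you want to trade speed for capacity.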
SpaCy allows you to try out all sorts of architectures here. It's good to be aware that you can customise your spaCy model to fit your use case. But! One key thing to remember is that while spaCy is highly configurable, your main concern should be to collect high quality training data to start with. As we often emphasize, you want to make sure that your training data reflects real user language and behavior, so that your trained model will too.
The fact that you're able to re-use your spaCy model for many use cases in your organization is great news though. It might enable lessons learned from your customer service desk to be re-used in your virtual assistant and vice versa.
Happy hacking!