At a fundamental level, natural language understanding (NLU) does two things: it identifies the goal or meaning of the text and extracts key pieces of information from it. In other words, NLU does intent identification and entity recognition. In this blog post, we'll talk about a new research feature called Entity Roles and Groups that we introduced with Rasa 1.10. This feature allows you to add extra labels to your training data to define certain concepts and make your AI assistant more accurate.
What are Entity Roles?
Let's look at this conversation between a traveler and carbon bot, an assistant that determines your carbon footprint and directs you to a site where you can purchase carbon offset credits.
Carbon bot starts by asking if the traveler is interested in buying carbon offsets. Once the traveler confirms they'd like to proceed (that is, once the assistant identifies the intent as affirmative), carbon bot asks a few questions about their travel, like the cities they'll be traveling to and from. The cities in this example, Berlin and London, are entities we want the assistant to recognize. Named entity recognition is the task of extracting named entities from text. Named entities are things like locations, organizations, or personal names.
In order for our assistant to recognize Berlin and London as cities, we annotate them in our training data.
But sometimes this information isn't enough. Let's assume carbon bot, in addition to letting you purchase carbon offsets, can also book flights. It would need to know more, namely which city is the departure city and which is the destination. That is, it would need to know the role 'Berlin' and 'London' play in our example. Since these two cities play different roles--Berlin is the departure city and London is the destination city--we would need to assign a role label in addition to the entity label. Entity Roles and Groups lets you do just this by giving you the option to assign a second label to a token.
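As a rough illustration, here is how the two cities might look once extracted, written as plain Python dictionaries. The entity, value, and role keys follow the shape of Rasa's entity output, but the example message and the helper name cities_by_role are assumptions made for this sketch.

```python
from typing import Dict, List

# Hypothetical extraction result for "I want to fly from Berlin to London".
# Both tokens share the entity label "city"; only the role tells them apart.
entities: List[Dict[str, str]] = [
    {"entity": "city", "value": "Berlin", "role": "departure"},
    {"entity": "city", "value": "London", "role": "destination"},
]


def cities_by_role(extracted: List[Dict[str, str]]) -> Dict[str, str]:
    """Map each role to the city that plays it (illustrative helper)."""
    return {
        e["role"]: e["value"]
        for e in extracted
        if e["entity"] == "city" and "role" in e
    }


trip = cities_by_role(entities)
print(trip)
```

Without the role key, the two dictionaries would differ only in their value, and the assistant would have no way to tell where the trip starts and where it ends.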
What are Entity Groups?
As the name suggests, Entity Roles and Groups also lets you group different entities by specifying a group label in addition to the entity label. The group label can be used, for example, to define an order. Let's assume we want carbon bot to order us a couple of pizzas as soon as we land in London.
In order for the assistant to distinguish between the two pizzas we ordered--a small mushroom, and a large pepperoni--we want to use a group label in addition to the entity label to define that relationship. In this example, group 1 references the small mushroom and group 2 references the large pepperoni.
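To make the grouping concrete, the sketch below collects entities that share a group label into one order item. The entity names size and topping, the group values "1" and "2", and the helper pizzas_by_group are assumptions for illustration, not part of Rasa.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical entities for "a small mushroom and a large pepperoni".
# The entity names "size" and "topping" are assumptions for this sketch.
entities: List[Dict[str, str]] = [
    {"entity": "size", "value": "small", "group": "1"},
    {"entity": "topping", "value": "mushroom", "group": "1"},
    {"entity": "size", "value": "large", "group": "2"},
    {"entity": "topping", "value": "pepperoni", "group": "2"},
]


def pizzas_by_group(extracted: List[Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Collect the entities that share a group label into one order item."""
    orders: Dict[str, Dict[str, str]] = defaultdict(dict)
    for e in extracted:
        orders[e["group"]][e["entity"]] = e["value"]
    return dict(orders)


orders = pizzas_by_group(entities)
print(orders)
```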
Where are the receipts?
Entity Roles and Groups is a useful feature that allows you to further define concepts within your training data to make your AI assistant perform better. To validate this feature, our research team ran several experiments; one of the most important first steps is to find the right training dataset. In choosing the right dataset, the team looks at public datasets and ones created by Rasa, and also looks for:
- Suitability: is the dataset suitable for the task we want to evaluate?
- Adaptability: can we adapt an existing dataset to fit our needs?
- Feasibility: if an existing dataset won't do, is it feasible to create a new dataset?
To perform our experiments, we used SNIPS and conversations from carbon bot. We played around with a few ideas: one where we concatenated entity and role labels, one where we used a sigmoid function with cross entropy instead of a conditional random field (CRF), and finally one where we used one CRF for entity labels and one CRF for role and group labels. In that last setup, the logits of the first CRF were added as features to the second CRF.
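The first of these ideas, concatenating entity and role labels, can be pictured with a small sketch. This is not the research team's code: the "." separator, the "O" tag for tokens outside any entity, and the helper name concat_label are all assumptions.

```python
from typing import List, Optional, Tuple


def concat_label(entity: Optional[str], role: Optional[str]) -> str:
    """Fuse an entity label and an optional role label into a single tag."""
    if entity is None:
        return "O"  # assumed tag for tokens that are not part of an entity
    return f"{entity}.{role}" if role else entity


tokens: List[str] = ["from", "Berlin", "to", "London"]
annotations: List[Tuple[Optional[str], Optional[str]]] = [
    (None, None),
    ("city", "departure"),
    (None, None),
    ("city", "destination"),
]

tags = [concat_label(entity, role) for entity, role in annotations]
print(list(zip(tokens, tags)))
```

With fused labels, a single tagger predicts one tag per token, but every entity and role combination becomes its own class. The two-CRF setup keeps the entity and role label sets separate instead.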
We decided to not proceed with releasing the feature if:
- The model couldn't distinguish between entities' roles and groups
- The training time increased significantly
However, the results were very promising. Using one CRF for entity labels and one for role labels worked best on both datasets and resulted in an F1 score that was a point higher compared to the other approaches. The results showed that entities' roles and groups were being detected quite well. Training time increased only slightly, and there was no difference in inference time. One caveat: for role labels, the model may in some cases overfit, tying specific entities to a specific role. You could try masking and excluding features from the second CRF to see if that brings some improvement.
How to Use Entity Roles and Groups
First, upgrade to Rasa 1.10.0 or later and update your training data to include entities with role and group labels. Your pipeline should specify either the CRFEntityExtractor or the DIETClassifier, as these are the only entity extractors that can detect role and group labels. As you may have noticed, the training data format looks slightly different. Take a look at our documentation for instructions on how to define entity roles and groups in your training data.
If you want to see this feature in action, you'll want to fill a certain slot with an entity that has a specific role and/or group. You could use either forms, which we recommend, or custom actions for this.
Forms
Define a custom slot mapping to fill a slot with an entity that has a specific role and/or group label. Let's revisit our flight booking assistant. In this example, we want the assistant to fill two slots, destination_city and departure_city. To do this, let's use the helper method from_entity in your slot mapping:
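    from typing import Dict, List, Text, Union

    # Method of your form action class: maps each slot to the entity and role that fill it.
    def slot_mappings(self) -> Dict[Text, Union[Dict, List[Dict]]]:
        return {
            "departure_city": [
                self.from_entity(entity="city", role="departure"),
            ],
            "destination_city": [
                self.from_entity(entity="city", role="destination"),
            ],
        }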
These helper methods tell Rasa how to fill specific slots from entities. When the city entity that has the role departure is identified, the corresponding slot departure_city is filled. Similarly, when the city entity with the role destination is identified, the corresponding slot destination_city is filled. Once all the required slots are filled, the form's submit method is called. You can use the slot values--available in the submit method--to execute a custom action.
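The matching step can be pictured with a framework-free sketch. This is a stand-in written for illustration, not Rasa's implementation: the function fill_slots and the plain-dictionary mapping format are assumptions.

```python
from typing import Any, Dict, List, Optional


def fill_slots(
    mappings: Dict[str, Dict[str, str]],
    entities: List[Dict[str, Any]],
) -> Dict[str, Optional[str]]:
    """Fill each slot with the first entity whose type and role both match."""
    slots: Dict[str, Optional[str]] = {}
    for slot, wanted in mappings.items():
        slots[slot] = next(
            (
                e["value"]
                for e in entities
                if e["entity"] == wanted["entity"] and e.get("role") == wanted["role"]
            ),
            None,  # the slot stays empty when nothing matches
        )
    return slots


# Hypothetical mapping and extraction result for the flight booking example.
mappings = {
    "departure_city": {"entity": "city", "role": "departure"},
    "destination_city": {"entity": "city", "role": "destination"},
}
entities = [
    {"entity": "city", "value": "Berlin", "role": "departure"},
    {"entity": "city", "value": "London", "role": "destination"},
]

slots = fill_slots(mappings, entities)
print(slots)
```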
Custom Actions
The current entity values can be obtained directly from the tracker's get_latest_entity_values method. You can optionally pass the role label and/or group label to that method along with the entity type. For example:
tracker.get_latest_entity_values(entity_type="city", entity_role="destination")
When the city entity has the role destination, its current value is returned.
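To show how the optional role and group filters narrow the result, here is a standalone stand-in that does not import rasa_sdk. The function latest_entity_values is hypothetical. It assumes that leaving a filter as None means "do not filter on it" and that matching values are produced lazily, so the caller takes the first one with next.

```python
from typing import Any, Dict, Iterator, List, Optional


def latest_entity_values(
    entities: List[Dict[str, Any]],
    entity_type: str,
    entity_role: Optional[str] = None,
    entity_group: Optional[str] = None,
) -> Iterator[Any]:
    """Yield values of entities matching the type and any role/group given."""
    return (
        e.get("value")
        for e in entities
        if e.get("entity") == entity_type
        and (entity_role is None or e.get("role") == entity_role)
        and (entity_group is None or e.get("group") == entity_group)
    )


# Hypothetical entities from the latest user message.
entities = [
    {"entity": "city", "value": "Berlin", "role": "departure"},
    {"entity": "city", "value": "London", "role": "destination"},
]

destination = next(
    latest_entity_values(entities, "city", entity_role="destination"), None
)
all_cities = list(latest_entity_values(entities, "city"))
print(destination, all_cities)
```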
Next Steps
Entity Roles and Groups is a recently shipped yet experimental research feature that started as a GitHub pull request from our community. One of our community members posted that they had implemented a similar feature as a custom component within Rasa and wanted to see native support for entity roles and groups. A year later, this feature has been implemented, validated, and released into the product. Community members often post ideas on how to improve the product on the forum. We encourage you to try the feature out and give us feedback. Better yet, send us some pull requests.
Happy coding!