Extractive Search
Rasa now supports using EnterpriseSearchPolicy
without an additional call to an LLM for response generation.
This feature is in beta (experimental) and may change in future Rasa Pro versions.
Extractive Search allows you to disable LLM response generation and answer user queries directly from a predefined question-and-answer dataset. This is useful when you want to serve curated answers without using an LLM to generate chat responses. Such a dataset could look as follows:
Q: Who is Finley?
A: Finley is your smart assistant for the FinX App. You can add him to your favorite
messenger and tell him what you need help with.
Q: How does Finley work?
A: Finley is powered by the latest chatbot technology leveraging a unique interplay of
large language models and secure logic.
As the dataset already contains the answers, LLM response generation is no longer needed and can be turned off. Extractive Search retrieves the most similar question-and-answer pair from the dataset and returns its answer.
How Extractive Search Works
Extractive Search requires documents to be ingested in a specific format so that answers can be reliably extracted by Rasa. The questions should be added to the vector store as follows:
[
  {
    "metadata": {
      "title": "who_finley",
      "type": "faq",
      "answer": "Finley is your smart assistant for the FinX App. You can add him to your favorite messenger and tell him what you need help with."
    },
    "page_content": "Who is Finley?"
  },
  {
    "metadata": {
      "title": "how_finley_work",
      "type": "faq",
      "answer": "Finley is powered by the latest chatbot technology leveraging a unique interplay of large language models and secure logic."
    },
    "page_content": "How does Finley work?"
  }
]
This format ensures that when a user asks “Who is Finley?”, retrieval returns who_finley as the first result and Rasa can reliably extract the answer from the metadata.answer key.
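The mechanics can be sketched in plain Python. This is not Rasa's implementation: the bag-of-words cosine similarity below is a stand-in for a real embedding model, and the helper names are hypothetical. It only illustrates that the query is compared against page_content, while the response comes from metadata.answer:

```python
import math
from collections import Counter

# Toy documents in the format shown above.
DOCS = [
    {
        "metadata": {
            "title": "who_finley",
            "type": "faq",
            "answer": (
                "Finley is your smart assistant for the FinX App. You can add him "
                "to your favorite messenger and tell him what you need help with."
            ),
        },
        "page_content": "Who is Finley?",
    },
    {
        "metadata": {
            "title": "how_finley_work",
            "type": "faq",
            "answer": (
                "Finley is powered by the latest chatbot technology leveraging "
                "a unique interplay of large language models and secure logic."
            ),
        },
        "page_content": "How does Finley work?",
    },
]

def _vector(text: str) -> Counter:
    # Stand-in for a real embedding model: bag-of-words token counts.
    return Counter(text.lower().replace("?", "").split())

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def extractive_answer(query: str) -> str:
    # Only page_content is "vectorised"; the response is read from metadata.answer.
    q = _vector(query)
    best = max(DOCS, key=lambda d: _cosine(q, _vector(d["page_content"])))
    return best["metadata"]["answer"]

print(extractive_answer("who is finley?"))
```

Note that the answer text itself plays no part in the similarity search; a poorly phrased page_content question will hurt retrieval even if the answer is perfect.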
Explanations for all keys:
- metadata: a mandatory field required by Enterprise Search.
- metadata.title: optional; useful as an ID field to refer to the QnA pair.
- metadata.answer: contains the text or markdown used to create the response shown to the user.
- metadata.type: optional; useful to filter relevant documents if the knowledge base contains other content too.
- page_content: contains the question text from the QnA pair. Only this field is vectorised by the embedding model; any search query q is compared for similarity with this field.
Extractive Search should be used together with vector_store.threshold
so that only high-confidence search results are used to respond to the user.
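As a sketch, both settings can sit in the policy configuration. The vector store type and threshold value below are illustrative, and field placement may vary across Rasa Pro versions, so check the configuration reference for your release:

```yaml
policies:
  - name: EnterpriseSearchPolicy
    use_generative_llm: false
    vector_store:
      type: faiss      # illustrative; use your configured store
      threshold: 0.75  # illustrative value; tune on your own data
```

A higher threshold trades coverage for precision: fewer queries get an extractive answer, but the ones that do are more likely to be correct matches.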
Extractive Search Configuration
To configure EnterpriseSearchPolicy
to use Extractive Search,
set use_generative_llm
to false
in the assistant's config.yml:
policies:
  ...
  - name: EnterpriseSearchPolicy
    use_generative_llm: false
With this configuration, EnterpriseSearchPolicy
returns the first search result to the chat without generating an answer with an LLM.
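Conceptually, the flag toggles between two response paths. This hypothetical sketch is not Rasa's code; it only shows the decision the policy makes after retrieval:

```python
def respond(query, retrieve, use_generative_llm, generate=None):
    """Illustrative decision logic, not Rasa's actual implementation.

    retrieve() returns search results ordered by similarity (best first),
    each in the document format shown earlier.
    """
    results = retrieve(query)
    if not results:
        return None  # no confident match, e.g. trigger a fallback
    if use_generative_llm:
        # Default Enterprise Search: ground an LLM-generated answer in the results.
        return generate(query, results)
    # Extractive Search: return the first result's answer verbatim.
    return results[0]["metadata"]["answer"]
```

Because the answer is returned verbatim, whatever markdown or text is stored in metadata.answer is exactly what the user sees.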
You can also connect to different search services using Custom Information Retrievers while using Extractive Search.