
Full-Text Search with Rasa

Out of the box, Rasa Enterprise Search supports Retrieval-Augmented Generation (RAG) solutions with vector stores like Faiss, Milvus, and Qdrant. It also supports customizing your information retrieval.

This example extends Rasa Enterprise Search to use PostgreSQL Full Text Search.

Set up PostgreSQL database

Create a PostgreSQL database using the default user and database. Use the password "rasa" for this example.

docker run -d -e POSTGRES_PASSWORD=rasa -p 5432:5432 postgres:16

Connect to your database from the command line, using the PostgreSQL CLI (psql):

psql -h 127.0.0.1 -U postgres -d postgres

The following script creates a table in the database and loads a dataset of GitHub issues from Hugging Face.

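Run python ingest-data.py load to create the table and load the data into your running PostgreSQL instance. Run python ingest-data.py search to try a sample query against the loaded data.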

ingest-data.py
from datasets import load_dataset
from psycopg import Connection

# autocommit=True applies every statement immediately, so no explicit commit is needed.
conn = Connection.connect(
    "postgresql://postgres:rasa@127.0.0.1/postgres", autocommit=True
)


def setup():
    """Recreate the issues table with generated tsvector columns."""
    with conn.cursor() as cur:
        # IF EXISTS keeps the first run from failing when the table is not there yet.
        cur.execute("DROP TABLE IF EXISTS issues")
        cur.execute("""
            CREATE TABLE IF NOT EXISTS issues (
                id SERIAL PRIMARY KEY,
                title TEXT NOT NULL,
                body TEXT,
                title_search TSVECTOR GENERATED ALWAYS AS (to_tsvector('english', title)) STORED,
                body_search TSVECTOR GENERATED ALWAYS AS (to_tsvector('english', body)) STORED
            );
        """)


def load():
    """Insert every issue from the dataset's train split."""
    ds = load_dataset("JonathanSum/github-issues")
    sql = "INSERT INTO issues (title, body) VALUES(%s, %s);"
    with conn.cursor() as cur:
        for row in ds["train"]:
            cur.execute(sql, (row["title"], row["body"]))


def search():
    """Run a sample full-text query and print the five best matches."""
    with conn.cursor() as cur:
        # DESC puts the highest-ranked rows first; the default ascending
        # order would return the weakest matches.
        sql = """
            SELECT title, body, ts_rank_cd(body_search, to_tsquery('english', %s)) AS score
            FROM issues
            WHERE body_search @@ to_tsquery('english', %s)
            ORDER BY score DESC LIMIT 5
        """
        query = "linux"
        cur.execute(sql, (query, query))
        for title, body, score in cur.fetchall():
            print(f"{score:.3f}  {title}")


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1 and "load" in sys.argv[1]:
        setup()
        load()
    elif len(sys.argv) > 1 and "search" in sys.argv[1]:
        search()
    else:
        print(f"usage: {sys.argv[0]} [load|search]")
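
The search function above passes the single word "linux" to to_tsquery. PostgreSQL's to_tsquery expects its terms to be joined by operators such as & or |, so a multi-word input like "linux kernel" is rejected with a syntax error. The sketch below shows one way to turn free text into a string that to_tsquery can parse. The helper name build_tsquery is hypothetical and not part of the original script; using plainto_tsquery or phraseto_tsquery on the database side is an alternative that needs no preprocessing.

```python
import re


def build_tsquery(text: str, operator: str = "&") -> str:
    """Join the words of `text` with a tsquery operator (hypothetical helper)."""
    # Keep only word characters so that tsquery syntax in the input
    # (&, |, !, :, parentheses, quotes) cannot leak into the query string.
    tokens = re.findall(r"\w+", text.lower())
    return f" {operator} ".join(tokens)


if __name__ == "__main__":
    print(build_tsquery("Linux kernel panic"))  # linux & kernel & panic
```

The resulting string would still be passed as a bound parameter (the %s placeholder), never interpolated into the SQL text.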

Create Rasa custom retrieval component

The following Python code allows Rasa to query the data loaded in the previous step. Create a directory named addons in your assistant project and save this Python code as addons/faq.py:

addons/faq.py
import logging
from typing import Any, Dict, List

from psycopg import Connection
from rasa.utils.endpoints import EndpointConfig
from rasa.core.information_retrieval import (
    SearchResultList,
    SearchResult,
    InformationRetrieval,
)

logger = logging.getLogger(__name__)


class PostgresIR(InformationRetrieval):
    """
    Custom InformationRetrieval class using postgres fulltext search.
    """

    def connect(self, config: EndpointConfig) -> None:
        """Parameter `pguri` is defined in file `endpoints.yml`."""
        self.conn = Connection.connect(config.kwargs["pguri"])
        logger.info("Got database connection: %s", self.conn)

    async def search(
        self, query: str, tracker_state: Dict[str, Any], threshold: float = 0.0
    ) -> SearchResultList:
        """Search the database for relevant results. Order by score."""
        sr = []
        logger.info("Search query %s, threshold %s", query, threshold)
        with self.conn.cursor() as cur:
            # DESC returns the best-ranked rows first.
            sql = """
                SELECT title, body, ts_rank_cd(body_search, phraseto_tsquery('english', %s)) AS score
                FROM issues
                WHERE body_search @@ phraseto_tsquery('english', %s)
                ORDER BY score DESC LIMIT 3
            """
            cur.execute(sql, (query, query))
            for title, body, score in cur.fetchall():
                logger.info("Search result: %s, %s, %s", title, body, score)
                sr.append(SearchResult(text=body, score=score, metadata={}))
        return SearchResultList(results=sr, metadata={})
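
The search method above accepts a threshold argument but does not use it. The following is a minimal sketch of how rows could be filtered by that threshold before they are returned. It assumes the rows are (title, body, score) tuples as produced by the query above. The Hit dataclass and the filter_hits helper are hypothetical stand-ins so the example runs without Rasa or a database; they are not part of Rasa's API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Hit:
    """Hypothetical stand-in for a search result (text, score, metadata)."""
    text: str
    score: float
    metadata: Dict[str, Any] = field(default_factory=dict)


def filter_hits(
    rows: List[Tuple[str, str, float]], threshold: float = 0.0, limit: int = 3
) -> List[Hit]:
    """Keep rows scoring at least `threshold`, best first, at most `limit`."""
    kept = [
        Hit(text=body, score=score, metadata={"title": title})
        for title, body, score in rows
        if score >= threshold
    ]
    # Sort here as well, so the helper does not rely on the SQL ordering.
    kept.sort(key=lambda hit: hit.score, reverse=True)
    return kept[:limit]


if __name__ == "__main__":
    sample = [("t1", "first body", 0.1), ("t2", "second body", 0.5)]
    for hit in filter_hits(sample, threshold=0.2):
        print(hit.score, hit.metadata["title"])
```

Storing the title in metadata is a design choice of this sketch; the original component passes an empty metadata dictionary.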

Customize the EnterpriseSearchPolicy to use the new Python class:

config.yml
policies:
  - name: FlowPolicy
  - name: IntentlessPolicy
  - name: EnterpriseSearchPolicy
    vector_store:
      type: "addons.faq.PostgresIR"
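
The type value "addons.faq.PostgresIR" is a dotted Python path: the module addons.faq followed by the class name PostgresIR. This guide does not show how Rasa loads the class internally, so the sketch below only illustrates the generic Python mechanism for resolving such a path. The helper class_from_path is hypothetical. The practical consequence is that the addons directory presumably has to be importable from where you run Rasa.

```python
import importlib
from typing import Any


def class_from_path(dotted_path: str) -> Any:
    """Resolve 'package.module.ClassName' to the class object (hypothetical helper)."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'module.ClassName', got {dotted_path!r}")
    # import_module raises ModuleNotFoundError if the module is not importable.
    module = importlib.import_module(module_path)
    # getattr raises AttributeError if the module has no such class.
    return getattr(module, class_name)


if __name__ == "__main__":
    # A standard-library class is used here so the example runs anywhere.
    print(class_from_path("collections.OrderedDict"))
```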

Add the PostgreSQL credentials to endpoints.yml:

endpoints.yml
vector_store:
  pguri: "postgresql://postgres:rasa@127.0.0.1/postgres"
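
The pguri value is a standard PostgreSQL connection URI of the form postgresql://user:password@host/database. If a password contains characters such as @ or /, they need to be percent-encoded for the URI to parse correctly. The sketch below assembles such a URI from its parts. The helper build_pguri is hypothetical, and reading the values from the PGUSER and PGPASSWORD environment variables is an assumption of this example, not something the guide prescribes.

```python
import os
from urllib.parse import quote


def build_pguri(
    user: str,
    password: str,
    host: str = "127.0.0.1",
    port: int = 5432,
    dbname: str = "postgres",
) -> str:
    """Assemble a PostgreSQL connection URI, percent-encoding user-supplied parts."""
    # safe="" makes quote() encode reserved characters such as "/" and "@".
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}"
    )


if __name__ == "__main__":
    # Falls back to the example credentials used in this guide.
    user = os.environ.get("PGUSER", "postgres")
    password = os.environ.get("PGPASSWORD", "rasa")
    print(build_pguri(user, password))
```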

Enable enterprise search by overriding the default pattern_search flow:

data/flows/search.yml
flows:
  pattern_search:
    description: handle a knowledge-based question or request
    name: pattern search
    steps:
      - action: action_trigger_search

Train and run your assistant.

rasa train --domain domain
rasa inspect --debug

A sample chat session looks like this:

[Image: chat session showing custom information retrieval responses]