PII Management Prerequisites

Prerequisites

To use the PII management capability, you need:

  • Rasa Pro version 3.13.0 or later installed.
  • A privacy YAML configuration defined in your Rasa Pro project.

GLiNER Requirements

To use GLiNER-based PII identification, you need:

  • Rasa Pro version 3.13.0 or later installed with the pii optional extra, for example:
    uv pip install "rasa-pro[pii]"
    poetry add rasa-pro -E pii
  • The GLiNER PII model downloaded and available in your Rasa Pro project.

GLiNER Model Download

To download the GLiNER PII model before starting the Rasa Pro assistant, run the following script in your Rasa Pro virtual environment:

download_model.py
from pathlib import Path

from gliner import GLiNER


def download_model(model_path: Path, model_name: str) -> None:
    """Download a GLiNER model to the specified directory."""
    # Create the target directory if it does not exist yet
    model_path.mkdir(parents=True, exist_ok=True)
    model = GLiNER.from_pretrained(model_name)
    model.save_pretrained(model_path)


if __name__ == "__main__":
    # You can modify this path to your desired location
    local_model_path = Path("./gliner_model").resolve()
    download_model(
        model_path=local_model_path,
        model_name="urchade/gliner_multi_pii-v1",
    )

This script downloads the GLiNER PII model and saves it to the specified directory. After it finishes, set the GLINER_MODEL_PATH environment variable to the path where the model is saved.
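
Before starting the assistant, it can help to confirm that GLINER_MODEL_PATH points at a usable directory. The following is a minimal sketch, not part of Rasa Pro: the helper name resolve_gliner_model_path is hypothetical, and it only checks that the directory exists and is not empty. It does not validate the model files themselves.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_gliner_model_path(env: Mapping[str, str] = os.environ) -> Path:
    """Return the directory named by GLINER_MODEL_PATH, or raise if unusable.

    Hypothetical helper for illustration; Rasa Pro performs its own loading.
    """
    raw = env.get("GLINER_MODEL_PATH")
    if not raw:
        raise RuntimeError("GLINER_MODEL_PATH is not set")
    path = Path(raw).expanduser().resolve()
    if not path.is_dir():
        raise RuntimeError(f"GLINER_MODEL_PATH is not a directory: {path}")
    # An empty directory usually means the download script never ran
    if not any(path.iterdir()):
        raise RuntimeError(f"GLINER_MODEL_PATH is empty: {path}")
    return path
```

Calling resolve_gliner_model_path() with no arguments reads the current process environment. Passing a plain dict makes the check easy to exercise in isolation.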

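To create a custom Docker image for the assistant, you can modify the Dockerfile in the root of your Rasa Pro project to include the model download script. The Dockerfile below expects the script to be saved as gliner_model_download_script.py next to it: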

Dockerfile
FROM rasa/rasa-pro:3.13.0 AS builder
USER root

# Install dependencies
RUN python3 -m venv /opt/venv && \
    . /opt/venv/bin/activate && \
    pip install --no-cache-dir -U "pip==24.*" && \
    pip install --no-cache-dir "rasa-pro[pii]===3.13.0" && \
    # pin transformers to avoid connection errors with their \
    # OpenTelemetry metrics collector occurring from >=4.53.0
    pip install --no-cache-dir "transformers==4.52.4"

# Download HF model
COPY ./gliner_model_download_script.py gliner_model_download_script.py
RUN python3 -m gliner_model_download_script

FROM builder
WORKDIR /app
COPY --from=builder /app/gliner_model /app/gliner_model
RUN --mount=type=bind,target=/app/gliner_model

# Update permissions
RUN chown -R 1001:0 /app/gliner_model && \
    chmod -R g=u /app/gliner_model && \
    chmod o+wr /app/gliner_model
USER 1001
# Check if the model files are downloaded correctly
RUN ls -l /app/gliner_model

Build the Docker image with docker build . -t pii-assistant:latest. You can then run the assistant, with the GLiNER PII model available, using the following Docker Compose file. It requires the BOT_PATH environment variable to be set to the path of your Rasa Pro project:

docker-compose.yml
x-license: &license
  RASA_PRO_LICENSE: ${RASA_PRO_LICENSE}
  OPENAI_API_KEY: ${OPENAI_API_KEY}
x-pii-config: &pii-env-vars
  USER_CHAT_INACTIVITY_IN_MINUTES: 15
  GLINER_MODEL_PATH: "/app/gliner_model"

services:
  pii_assistant:
    image: pii-assistant:latest
    container_name: pii_assistant
    volumes:
      - "${BOT_PATH}:/app/bot"
    working_dir: /app/bot
    entrypoint: ""
    command: rasa run -p 5005 --enable-api
    ports:
      - "5005:5005"
    networks:
      - default
    environment:
      <<: [ *license, *pii-env-vars ]
    healthcheck:
      test: curl localhost:5005 || exit 1
      interval: 10s
      retries: 10
      start_period: 15s
      timeout: 10s
    user: "rasa"
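
If you want to wait for the assistant from a script rather than rely on the container healthcheck, the same idea can be sketched in Python. This is an illustrative approximation only: the function names is_up and wait_until_up are hypothetical, the URL http://localhost:5005 assumes the port mapping shown above, and the retry loop does not reproduce Docker's start_period semantics.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def is_up(url: str, timeout: float = 10.0) -> bool:
    """Return True if the server answers an HTTP GET at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, even if with an error status;
        # plain curl (without -f) also treats this as success
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout
        return False


def wait_until_up(
    url: str,
    retries: int = 10,
    interval: float = 10.0,
    probe: Callable[[str], bool] = is_up,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe url up to `retries` times, pausing `interval` seconds between tries."""
    for attempt in range(retries):
        if probe(url):
            return True
        if attempt < retries - 1:
            sleep(interval)
    return False


if __name__ == "__main__":
    print(wait_until_up("http://localhost:5005"))
```

The probe and sleep parameters exist only so the loop can be tested without a running server or real delays.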

Machine Spec Requirements with GLiNER

The Rasa Pro image with a pre-downloaded GLiNER model is larger than the standard Rasa Pro image because of the size of the model. If you run the Rasa Pro assistant with the GLiNER model in a Docker container, ensure that your machine has sufficient resources to run the model.

In addition, if you run your Rasa Pro assistant with multiple Sanic workers, each worker loads its own copy of the GLiNER model, which requires an additional 1 to 2 GB of memory per worker. Choose your assistant's number of Sanic workers according to the memory available on your machine. For example, a machine with 32 GB of RAM can run up to 10 Sanic workers with the GLiNER model loaded in memory.
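
The sizing rule above can be sketched as a small calculation. This is a rough estimate under stated assumptions, not an official formula: the function max_sanic_workers and its parameters are hypothetical, per_worker_gb stands for the total memory you budget per worker, and reserved_gb is memory you set aside for the operating system and other processes. The documentation does not say how much headroom its 32 GB example assumes, so the 12 GB used below is purely illustrative.

```python
import math


def max_sanic_workers(
    total_ram_gb: float,
    per_worker_gb: float = 2.0,
    reserved_gb: float = 0.0,
) -> int:
    """Estimate how many workers fit in memory (illustrative only)."""
    if per_worker_gb <= 0:
        raise ValueError("per_worker_gb must be positive")
    usable_gb = total_ram_gb - reserved_gb
    # Never report a negative worker count
    return max(0, math.floor(usable_gb / per_worker_gb))


if __name__ == "__main__":
    # 32 GB machine, 2 GB budgeted per worker, 12 GB held back (assumed)
    print(max_sanic_workers(32, per_worker_gb=2.0, reserved_gb=12.0))  # → 10
```

Measure actual memory use on your own deployment before settling on a worker count, since the per-worker footprint depends on the rest of your assistant.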

Environment Variables

You can configure the PII management capability using the following environment variables:

  • USER_CHAT_INACTIVITY_IN_MINUTES: The number of minutes after which the user chat session is considered inactive. Default is 30 minutes. This is used to determine if the session should be processed for PII anonymization or deletion in the tracker store.
  • GLINER_MODEL_PATH: The path to the downloaded GLiNER PII model in your Rasa Pro project.
  • HUGGINGFACE_HUB_CACHE_DIR: The path to the Hugging Face Hub cache directory, if one was set when downloading the GLiNER model.
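
To make the defaults concrete, here is a minimal sketch of how a helper script might read these three variables. It is not how Rasa Pro itself parses them: the PiiEnvSettings class and load_pii_env function are hypothetical names, and only the 30-minute default comes from the description above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PiiEnvSettings:
    """Hypothetical container for the PII-related environment variables."""

    inactivity_minutes: int
    gliner_model_path: Optional[str]
    hf_cache_dir: Optional[str]


def load_pii_env(env: Mapping[str, str] = os.environ) -> PiiEnvSettings:
    """Read the variables, falling back to 30 minutes of inactivity."""
    return PiiEnvSettings(
        inactivity_minutes=int(env.get("USER_CHAT_INACTIVITY_IN_MINUTES", "30")),
        gliner_model_path=env.get("GLINER_MODEL_PATH"),
        hf_cache_dir=env.get("HUGGINGFACE_HUB_CACHE_DIR"),
    )


if __name__ == "__main__":
    print(load_pii_env())
```

Unset path variables come back as None here, so a caller can decide whether a missing GLINER_MODEL_PATH is an error in its own context.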