PII Management Prerequisites
Prerequisites
To use the PII management capability, you need to have the following prerequisites in place:
- Rasa Pro version 3.13.0 or later installed.
- A privacy YAML config defined in your Rasa Pro project.
GLiNER Requirements
To use the GLiNER PII identification, you need to have the following prerequisites in place:
- Rasa Pro version 3.13.0 or later installed with the pii optional extra, for example:
  uv pip install "rasa-pro[pii]"
  poetry add rasa-pro -E pii
- GLiNER PII model downloaded and available in your Rasa Pro project.
GLiNER Model Download
To download the GLiNER PII model prior to starting the Rasa Pro assistant, run the following script in your Rasa Pro venv:
from pathlib import Path
from gliner import GLiNER
import os


def download_model(model_path: Path, model_name: str) -> None:
    """Download a GLiNER model to the specified directory."""
    # Check if the directory already exists
    if not os.path.exists(model_path):
        # Create the directory
        os.makedirs(model_path)

    model = GLiNER.from_pretrained(model_name)
    model.save_pretrained(model_path)


if __name__ == "__main__":
    # You can modify this path to your desired location
    local_model_path = Path("./gliner_model").resolve()
    download_model(
        model_path=local_model_path,
        model_name="urchade/gliner_multi_pii-v1",
    )
This script downloads the GLiNER PII model and saves it to the specified directory.
Then set the GLINER_MODEL_PATH environment variable to the path where the model is saved.
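Before starting the assistant, it can help to confirm that GLINER_MODEL_PATH points at a usable directory. The following is a minimal sketch and not part of Rasa Pro: the helper name check_model_dir is hypothetical, and it only checks that the directory exists and is non-empty. It does not validate the model files themselves.

```python
import os
from pathlib import Path
from typing import List, Optional


def check_model_dir(raw_path: Optional[str]) -> List[str]:
    """Return a list of problems found with the configured model path.

    Hypothetical helper for illustration; Rasa Pro does not ship it.
    """
    if not raw_path:
        return ["GLINER_MODEL_PATH is not set"]
    path = Path(raw_path)
    if not path.is_dir():
        return [f"{path} is not a directory"]
    if not any(path.iterdir()):
        return [f"{path} is empty"]
    return []


if __name__ == "__main__":
    problems = check_model_dir(os.environ.get("GLINER_MODEL_PATH"))
    print("OK" if not problems else "; ".join(problems))
```

An empty result means the path is set and the directory contains at least one entry.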
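To create a custom Docker image for the assistant, you can modify the Dockerfile in the root of your Rasa Pro project to include the model download script. The example below expects the script to be saved as gliner_model_download_script.py next to the Dockerfile: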
FROM rasa/rasa-pro:3.13.0 AS builder

USER root

# Install dependencies
RUN python3 -m venv /opt/venv && \
    . /opt/venv/bin/activate && \
    pip install --no-cache-dir -U "pip==24.*" && \
    pip install --no-cache-dir "rasa-pro[pii]===3.13.0" && \
    # pin transformers to avoid connection errors with their \
    # OpenTelemetry metrics collector occurring from >=4.53.0
    pip install --no-cache-dir "transformers==4.52.4"

# Download HF model
COPY ./gliner_model_download_script.py gliner_model_download_script.py
RUN python3 -m gliner_model_download_script

FROM builder

WORKDIR /app

COPY --from=builder /app/gliner_model /app/gliner_model
RUN --mount=type=bind,target=/app/gliner_model

# Update permissions
RUN chown -R 1001:0 /app/gliner_model && \
    chmod -R g=u /app/gliner_model && \
    chmod o+wr /app/gliner_model

USER 1001

# Check if the model files are downloaded correctly
RUN ls -l /app/gliner_model
After building the Docker image with docker build . -t pii-assistant:latest, you can run the assistant with the GLiNER PII model available in your Rasa Pro project using the following Docker Compose file. It also requires the BOT_PATH environment variable to be set to the path of your Rasa Pro project:
x-license: &license
  RASA_PRO_LICENSE: ${RASA_PRO_LICENSE}
  OPENAI_API_KEY: ${OPENAI_API_KEY}

x-pii-config: &pii-env-vars
  USER_CHAT_INACTIVITY_IN_MINUTES: 15
  GLINER_MODEL_PATH: "/app/gliner_model"

services:
  pii_assistant:
    image: pii-assistant:latest
    container_name: pii_assistant
    volumes:
      - "${BOT_PATH}:/app/bot"
    working_dir: /app/bot
    entrypoint: ""
    command: rasa run -p 5005 --enable-api
    ports:
      - "5005:5005"
    networks:
      - default
    environment:
      <<: [ *license, *pii-env-vars ]
    healthcheck:
      test: curl localhost:5005 || exit 1
      interval: 10s
      retries: 10
      start_period: 15s
      timeout: 10s
    user: "rasa"
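
The healthcheck in the Compose file polls port 5005 until the server answers. If you want to wait for the assistant from a script instead, a rough Python equivalent could look like the sketch below. The function names probe_http and wait_until_healthy are hypothetical, and the retry count and interval are simply copied from the Compose healthcheck. Unlike plain curl, this probe treats HTTP error statuses as unhealthy.

```python
import http.client
import time
import urllib.request
from typing import Callable


def probe_http(url: str = "http://localhost:5005", timeout: float = 10.0) -> bool:
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or malformed response.
        return False


def wait_until_healthy(
    probe: Callable[[], bool], retries: int = 10, interval: float = 10.0
) -> bool:
    """Call probe up to `retries` times, sleeping `interval` seconds between attempts."""
    for attempt in range(retries):
        if probe():
            return True
        if attempt < retries - 1:
            time.sleep(interval)
    return False


# Example (blocks for up to roughly retries * interval seconds):
# ready = wait_until_healthy(probe_http)
```

Passing the probe as a callable keeps the retry loop independent of the network call, so it can be exercised without a running assistant.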
Machine Spec Requirements with GLiNER
The Rasa Pro image with a pre-downloaded GLiNER model is larger than the standard Rasa Pro image because of the size of the model. If you run the Rasa Pro assistant with the GLiNER model in a Docker container, make sure the machine has sufficient resources to run the model.
In addition, if you run your Rasa Pro assistant with multiple Sanic workers, the GLiNER model is loaded into memory once per worker and requires an additional 1 to 2 GB of memory per worker. Choose the number of Sanic workers according to the memory available on your machine. For example, a machine with 32 GB of RAM can run up to 10 Sanic workers with the GLiNER model loaded in memory.
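The per-worker figure above turns into a simple multiplication. The sketch below is only a back-of-the-envelope estimate: the function name gliner_memory_range_gb is hypothetical, and it covers the extra memory for the GLiNER copies alone, not the baseline memory of the assistant or the operating system.

```python
from typing import Tuple


def gliner_memory_range_gb(
    workers: int, low_gb: float = 1.0, high_gb: float = 2.0
) -> Tuple[float, float]:
    """Estimate the extra memory (in GB) used by GLiNER copies across Sanic workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # One model copy per worker, each costing between low_gb and high_gb.
    return workers * low_gb, workers * high_gb


if __name__ == "__main__":
    low, high = gliner_memory_range_gb(10)
    print(f"10 workers: about {low:.0f} to {high:.0f} GB for GLiNER alone")
```

For 10 workers this gives 10 to 20 GB for the model copies, which leaves at least 12 GB of a 32 GB machine for everything else.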
Environment Variables
You can configure the PII management capability using the following environment variables:
- USER_CHAT_INACTIVITY_IN_MINUTES: The number of minutes after which the user chat session is considered inactive. Default is 30 minutes. This is used to determine if the session should be processed for PII anonymization or deletion in the tracker store.
- GLINER_MODEL_PATH: The path to the downloaded GLiNER PII model in your Rasa Pro project.
- HUGGINGFACE_HUB_CACHE_DIR: The path to the HuggingFace Hub cache directory if defined when downloading the GLiNER model.
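
To see which values would apply in a given shell or container, the variables can be read with their documented default. This is an illustrative sketch only: read_pii_env and the dictionary keys are hypothetical names, and the code does not reflect how Rasa Pro reads these variables internally.

```python
import os
from typing import Dict, Mapping, Optional, Union


def read_pii_env(env: Mapping[str, str]) -> Dict[str, Union[int, Optional[str]]]:
    """Collect the PII-related environment variables into a dict (illustrative only)."""
    return {
        # Documented default is 30 minutes when the variable is unset.
        "inactivity_minutes": int(env.get("USER_CHAT_INACTIVITY_IN_MINUTES", "30")),
        # No default is documented for the two paths, so they may be None.
        "gliner_model_path": env.get("GLINER_MODEL_PATH"),
        "hf_cache_dir": env.get("HUGGINGFACE_HUB_CACHE_DIR"),
    }


if __name__ == "__main__":
    print(read_pii_env(os.environ))
```

Taking a mapping rather than reading os.environ directly makes it easy to check a planned configuration before exporting it.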