notice
This is unreleased documentation for Rasa Documentation Main/Unreleased version.
For the latest released documentation, see the latest version (3.x).
Setting up LLMs
Instructions on how to set up and configure Large Language Models from OpenAI, Cohere, and other providers. Here you'll learn what you need to configure and how you can customize LLMs to work efficiently with your specific use case.
Rasa Labs access - New in 3.7.0b1
Rasa Labs features are experimental. We introduce experimental features to co-create with our customers. To find out more about how to participate in our Labs program visit our Rasa Labs page.
We are continuously improving Rasa Labs features based on customer feedback. To benefit from the latest bug fixes and feature improvements, please install the latest pre-release.
Overview
This guide will walk you through the process of configuring Rasa to use OpenAI LLMs, including deployments that rely on the Azure OpenAI service. Instructions for other LLM providers are further down the page.
Prerequisites
Before beginning, make sure that you have:
- Access to OpenAI's services
- Ability to generate API keys for OpenAI
Configuration
Configuring LLMs to work with OpenAI involves several steps. The following sub-sections outline each of these steps and what you need to do.
API Token
The API token is a key element that allows your Rasa instance to connect and communicate with OpenAI. This needs to be configured correctly to ensure seamless interaction between the two.
To configure the API token, follow these steps:
If you haven't already, sign up for an account on the OpenAI platform.
Navigate to the OpenAI Key Management page, and click on the "Create New Secret Key" button to initiate the process of obtaining your API key.
To set the API key as an environment variable, you can use the following command in a terminal or command prompt:

- Linux/MacOS:

  ```shell
  export OPENAI_API_KEY=<your-api-key>
  ```

- Windows:

  ```shell
  setx OPENAI_API_KEY <your-api-key>
  ```

Replace `<your-api-key>` with the actual API key you obtained from the OpenAI platform.
Model Configuration
Rasa allows you to use different models for different components. For example, you might use one model for intent classification and another for rephrasing.
To configure models per component, follow these steps described on the pages for each component:
- Instructions to configure models for intent classification
- Instructions to configure models for rephrasing
Additional Configuration for Azure OpenAI Service
For those using Azure OpenAI Service, there are additional parameters that need to be configured:
- `openai.api_type`: This should be set to "azure" to indicate the use of Azure OpenAI Service.
- `openai.api_base`: This should be the URL for your Azure OpenAI instance. An example might look like this: "https://docs-test-001.openai.azure.com/".

To configure `openai.api_type` as an environment variable:

- Linux/MacOS:

  ```shell
  export OPENAI_API_TYPE="azure"
  ```

- Windows:

  ```shell
  setx OPENAI_API_TYPE "azure"
  ```

To configure `openai.api_base` as an environment variable:

- Linux/MacOS:

  ```shell
  export OPENAI_API_BASE=<your-azure-openai-instance-url>
  ```

- Windows:

  ```shell
  setx OPENAI_API_BASE <your-azure-openai-instance-url>
  ```
Other LLMs & Embeddings
The LLM and embeddings provider can be configured separately for each component. All components default to using OpenAI.
important
If you switch to a different LLM / embedding provider, you need to go through additional installation and setup. Please note the additional requirements for each provider in its respective section below.
caution
We support configuring alternative LLM and embedding providers, but we have tested the functionality with OpenAI only. We are currently working on fully supporting other LLM providers.
Configuring an LLM provider
The LLM provider can be configured using the `llm` property of each component.
The `llm.type` property specifies the LLM provider to use.
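For example, a component's `llm` property in `config.yml` might be set like this (a sketch; the surrounding component parameters depend on your pipeline):

```yaml
pipeline:
  - name: LLMIntentClassifier
    llm:
      type: "cohere"
```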
The above configuration specifies that the LLMIntentClassifier should use the Cohere LLM provider rather than OpenAI.
The following LLM providers are supported:
OpenAI
Default LLM provider. Requires the `OPENAI_API_KEY` environment variable to be set.
The model can be configured as an optional parameter.
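A minimal sketch of an OpenAI configuration with an explicitly chosen model (the `model_name` value is illustrative):

```yaml
llm:
  type: "openai"
  model_name: "gpt-3.5-turbo"
```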
Cohere
Support for Cohere needs to be installed, e.g. using `pip install cohere`.
Additionally, the `COHERE_API_KEY` environment variable must be set.
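A minimal sketch of a Cohere LLM configuration (the `model` value is an illustrative assumption):

```yaml
llm:
  type: "cohere"
  model: "command"
```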
Vertex AI
To use Vertex AI, you need to install the required package using `pip install google-cloud-aiplatform`.
The credentials for Vertex AI can be configured as described in the
google auth documentation.
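A minimal sketch of a Vertex AI configuration (the `model_name` value is an illustrative assumption):

```yaml
llm:
  type: "vertexai"
  model_name: "text-bison"
```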
Hugging Face Hub
The Hugging Face Hub LLM uses models from Hugging Face.
It requires additional packages to be installed: `pip install huggingface_hub`.
The environment variable `HUGGINGFACEHUB_API_TOKEN` needs to be set to a
valid API token.
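A sketch of a Hugging Face Hub configuration; the `repo_id` and `task` values are illustrative assumptions and should be replaced with the model you want to use:

```yaml
llm:
  type: "huggingface_hub"
  repo_id: "HuggingFaceH4/zephyr-7b-beta"
  task: "text-generation"
```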
llama-cpp
To use the llama-cpp language model, you should install the required python library
using `pip install llama-cpp-python`. A path to the Llama model must be provided.
For more details, check out the llama-cpp project.
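A minimal sketch of a llama-cpp configuration; the `type` and `model_path` values shown are assumptions, and the path must point to a model file on your machine:

```yaml
llm:
  type: "llamacpp"
  model_path: "/path/to/model.bin"
```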
Other LLM providers
If you want to use a different LLM provider, you can specify the name of the
provider in the `llm.type` property according to this mapping.
Configuring an embeddings provider
The embeddings provider can be configured using the `embeddings` property of each
component. The `embeddings.type` property specifies the embeddings provider to use.
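For example, a component's `embeddings` property in `config.yml` might be set like this (a sketch; the surrounding component parameters depend on your pipeline):

```yaml
pipeline:
  - name: LLMIntentClassifier
    embeddings:
      type: "cohere"
```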
The above configuration specifies that the LLMIntentClassifier should use the Cohere embeddings provider rather than OpenAI.
Only Some Components need Embeddings
Not every component uses embeddings. For example, the
LLMResponseRephraser component does not use embeddings.
For these components, no `embeddings` property is needed.
The following embeddings providers are supported:
OpenAI
Default embeddings. Requires the `OPENAI_API_KEY` environment variable to be set.
The model can be configured as an optional parameter.
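A minimal sketch of an OpenAI embeddings configuration (the `model` value is illustrative):

```yaml
embeddings:
  type: "openai"
  model: "text-embedding-ada-002"
```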
Cohere
Embeddings from Cohere. Requires the python package
for cohere to be installed, e.g. using `pip install cohere`. The
`COHERE_API_KEY` environment variable must be set. The model
can be configured as an optional parameter.
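A minimal sketch of a Cohere embeddings configuration (the `model` value is an illustrative assumption):

```yaml
embeddings:
  type: "cohere"
  model: "embed-english-v2.0"
```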
spaCy
The spaCy embeddings provider uses the `en_core_web_sm` model to generate
embeddings. The model needs to be installed separately, e.g. using
`python -m spacy download en_core_web_sm`.
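A minimal sketch of a spaCy embeddings configuration:

```yaml
embeddings:
  type: "spacy"
```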
Vertex AI
To use Vertex AI, you need to install the required package using `pip install google-cloud-aiplatform`.
The credentials for Vertex AI can be configured as described in the
google auth documentation.
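A minimal sketch of a Vertex AI embeddings configuration (the `model_name` value is an illustrative assumption):

```yaml
embeddings:
  type: "vertexai"
  model_name: "textembedding-gecko"
```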
Hugging Face Instruct
The Hugging Face Instruct embeddings provider uses sentence transformers
and requires additional packages to be installed: `pip install sentence_transformers InstructorEmbedding`.
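A sketch of a Hugging Face Instruct embeddings configuration; the `model_name` value is an illustrative assumption:

```yaml
embeddings:
  type: "huggingface_instruct"
  model_name: "hkunlp/instructor-large"
```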
Hugging Face Hub
The Hugging Face Hub embeddings provider uses models from Hugging Face.
It requires additional packages to be installed: `pip install huggingface_hub`.
The environment variable `HUGGINGFACEHUB_API_TOKEN` needs to be set to a
valid API token.
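A sketch of a Hugging Face Hub embeddings configuration; the `repo_id` and `task` values are illustrative assumptions:

```yaml
embeddings:
  type: "huggingface_hub"
  repo_id: "sentence-transformers/all-MiniLM-L6-v2"
  task: "feature-extraction"
```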
llama-cpp
To use the llama-cpp embeddings, you should install the required python library
using `pip install llama-cpp-python`. A path to the Llama model must be provided.
For more details, check out the llama-cpp project.
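A minimal sketch of a llama-cpp embeddings configuration; the `type` and `model_path` values shown are assumptions, and the path must point to a model file on your machine:

```yaml
embeddings:
  type: "llamacpp"
  model_path: "/path/to/model.bin"
```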