Langfuse Integration
Langfuse is an open-source observability platform designed specifically for LLM applications. It provides comprehensive tracing, monitoring, and analytics capabilities that help you understand how your LLM-based components are performing in production.
Installation
Before configuring Langfuse, you must install the `langfuse` package. It is included as an optional dependency in the `monitoring` extras group.
To install Langfuse, use one of the following methods:
Using pip:

```bash
pip install rasa-pro[monitoring]
```

Using poetry:

```bash
poetry install --extras monitoring
```
If the `langfuse` package is not installed, you will see an error message indicating that Langfuse is not available, and the integration will not be configured.
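To verify that the package is available in your environment, you can try importing it (a quick sanity check):

```bash
# Exits silently if langfuse was installed; raises ModuleNotFoundError otherwise
python -c "import langfuse"
```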
Configuration
Langfuse is configured through the `endpoints.yml` file by adding a `langfuse` entry to the `tracing` section.
You can configure multiple tracing backends simultaneously (e.g., both Langfuse and Jaeger).
Here is a basic configuration example:
```yaml
tracing:
  - type: langfuse
    host: https://cloud.langfuse.com
    public_key: ${LANGFUSE_PUBLIC_KEY}
    private_key: ${LANGFUSE_PRIVATE_KEY}
```
Configuration Options
| Option | Required | Description | Example |
|---|---|---|---|
| `type` | Yes | Must be set to `langfuse` | `langfuse` |
| `host` | Yes | The Langfuse server URL | `https://cloud.langfuse.com` |
| `public_key` | Yes | Your Langfuse public API key (must use `${VAR}` syntax for environment variables) | `${LANGFUSE_PUBLIC_KEY}` |
| `private_key` | Yes | Your Langfuse private API key (must use `${VAR}` syntax for environment variables) | `${LANGFUSE_PRIVATE_KEY}` |
| `timeout` | No | Request timeout in seconds | `30` |
| `debug` | No | Enable debug logging | `true` or `false` |
| `environment` | No | Environment label for traces | `production`, `staging`, `development` |
| `release` | No | Release version identifier | `v1.2.3` |
| `media_upload_thread_count` | No | Number of threads for media uploads | `4` |
| `sample_rate` | No | Sampling rate for traces (0.0 to 1.0) | `1.0` |
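Combining the options above, a configuration that sets all optional fields might look like this (the values are illustrative):

```yaml
tracing:
  - type: langfuse
    host: https://cloud.langfuse.com
    public_key: ${LANGFUSE_PUBLIC_KEY}
    private_key: ${LANGFUSE_PRIVATE_KEY}
    timeout: 30
    debug: false
    environment: production
    release: v1.2.3
    media_upload_thread_count: 4
    sample_rate: 1.0
```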
`public_key` and `private_key` must be set as environment variables and referenced in your `endpoints.yml` file using the `${ENV_VAR_NAME}` syntax. For example:

1. Set the environment variables in your shell or deployment environment:

```bash
export LANGFUSE_PUBLIC_KEY="<your-public-key>"
export LANGFUSE_PRIVATE_KEY="<your-private-key>"
```

2. Reference these variables in your `endpoints.yml`:

```yaml
tracing:
  - type: langfuse
    host: https://cloud.langfuse.com
    public_key: ${LANGFUSE_PUBLIC_KEY}
    private_key: ${LANGFUSE_PRIVATE_KEY}
```

This ensures your secrets are not stored directly in configuration files but are injected at runtime from environment variables.
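In containerized deployments, the same variables can be passed to the process at startup. For example, with Docker (the image name below is a placeholder, not the actual Rasa Pro image reference):

```bash
# Inject the Langfuse credentials as environment variables at runtime;
# replace <rasa-pro-image> with your actual image reference.
docker run \
  -e LANGFUSE_PUBLIC_KEY="<your-public-key>" \
  -e LANGFUSE_PRIVATE_KEY="<your-private-key>" \
  <rasa-pro-image> run
```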
Multiple Tracing Backends
You can configure both Langfuse and other tracing backends (like OTLP) simultaneously:
```yaml
tracing:
  - type: otlp
    endpoint: my-otlp-host:4317
    insecure: false
    service_name: rasa
    root_certificates: ./tests/unit/tracing/fixtures/ca.pem
  - type: langfuse
    host: https://cloud.langfuse.com
    public_key: ${LANGFUSE_PUBLIC_KEY}
    private_key: ${LANGFUSE_PRIVATE_KEY}
```
Traced Components
When Langfuse is configured, the following components automatically send traces:
LLM-Based Components
- Command Generators: All LLM-based command generators that generate dialogue commands
- Contextual Response Rephraser: Components that rephrase responses using LLMs
- Enterprise Search Policy: Policy that uses LLMs to generate responses from search results
- ReAct Sub Agent: MCP-based sub agents that use LLMs for reasoning and tool execution
- LLM-Based Router: Components that route conversations using LLMs
Embedding Operations
- Flow Retrieval: Semantic search operations when retrieving relevant flows
- Enterprise Search Policy: Vector search operations when finding relevant documents
Trace Contents
Each trace sent to Langfuse includes the following information:
Standard Trace Data
- Timestamp: When the LLM or embedding call was made
- Input: The prompt or query sent to the LLM/embedding model
- Output: The response or embedding vector returned
- Latency: Time taken for the request to complete
- Token Usage: Number of prompt tokens, completion tokens, and total tokens used
- Cost: Calculated cost based on token usage and model pricing
Metadata
Each trace includes rich metadata to help you organize and filter traces:
- Session ID: The conversation session identifier
- Tags: Component name for easy filtering (e.g., `EnterpriseSearchPolicy`, `CompactLLMCommandGenerator`)
- Custom Metadata: A dictionary containing:
  - Component Name: The class name of the component making the call
  - Agent ID: The ID of the agent
  - Model ID: The ID of the trained model being used
  - ReAct Sub Agent Name: (for ReAct sub agents) the name of the sub-agent
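For illustration, the data attached to a single trace might look like the following (the field names and values here are assumed for the example, not the exact keys the integration emits):

```json
{
  "session_id": "a1b2c3d4-conversation",
  "tags": ["EnterpriseSearchPolicy"],
  "metadata": {
    "component_name": "EnterpriseSearchPolicy",
    "agent_id": "my-assistant",
    "model_id": "20250101-120000-model"
  }
}
```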
Customizing Metadata
Custom components can override the `get_llm_tracing_metadata()` method to customize the metadata sent with each trace.
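Below is a minimal sketch of such an override. Only the method name `get_llm_tracing_metadata()` comes from the integration itself; the base class, signature, and return type shown here are assumptions for illustration, so consult the Rasa Pro API reference for the exact contract.

```python
from typing import Any, Dict


class LLMComponentBase:
    """Stand-in for the real Rasa Pro base class (illustrative only)."""

    def get_llm_tracing_metadata(self) -> Dict[str, Any]:
        # The real implementation returns fields such as the component
        # name, agent ID, and model ID (see the metadata list above).
        return {"component_name": type(self).__name__}


class MyCommandGenerator(LLMComponentBase):
    """Hypothetical custom component that adds extra trace metadata."""

    def get_llm_tracing_metadata(self) -> Dict[str, Any]:
        # Start from the default metadata, then attach custom fields
        # that can be used for filtering traces in Langfuse.
        metadata = super().get_llm_tracing_metadata()
        metadata["team"] = "support-bot"
        metadata["prompt_version"] = "v2"
        return metadata


print(MyCommandGenerator().get_llm_tracing_metadata())
# {'component_name': 'MyCommandGenerator', 'team': 'support-bot', 'prompt_version': 'v2'}
```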