Metrics
New in 3.8
You can now measure the performance of your CALM assistant with OpenTelemetry-based metrics.
Metrics are measurements of a service captured at runtime that serve as indicators of availability and performance. They can be used to monitor the health of a service, to alert on outages, and to understand the impact of changes to a service. Unlike tracing, metrics are intended to provide aggregated statistical information, such as average response time or throughput, across multiple messages and conversations.
Configuring Metrics
To enable metrics gathering in Rasa Pro, you must use the OTEL Collector (OpenTelemetry Collector) to collect metrics and then send them to the backend of your choice.
To configure the metrics OTEL Collector, add a metrics entry to your endpoints, i.e. in your endpoints.yml file, or in the relevant section of your Helm values in a deployment.
To configure an OTEL Collector, specify the type as otlp.
Note that metrics must be used together with tracing to provide a complete view of your system.
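As a minimal sketch, the endpoints.yml entries described above might look like the following. Only the metrics key and the otlp type come from the text. The endpoint and service_name keys, the layout of the tracing block, and the host name otel-collector:4317 are assumptions for illustration and should be checked against the Rasa Pro reference for your version.

```yaml
# endpoints.yml (sketch; keys other than `metrics` and `type` are assumptions)

# Tracing is configured alongside metrics, since the two are meant to be used together.
tracing:
  type: otlp                      # send traces to an OTEL Collector
  endpoint: otel-collector:4317   # hypothetical collector address
  service_name: rasa              # hypothetical service label

metrics:
  type: otlp                      # required: selects the OTEL Collector
  endpoint: otel-collector:4317   # hypothetical collector address
  service_name: rasa              # hypothetical service label
```

In a Helm deployment the same entries would go into the endpoints section of your values file rather than a standalone endpoints.yml.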
Recorded Metrics
Supported custom metrics include:
- CPU and memory usage of the LLMCommandGenerator component at the time of making an LLM call
- LLMCommandGenerator prompt token usage (provided the trace_prompt_tokens config property is enabled)
- method call duration measurements for LLM-specific calls in components such as IntentlessPolicy, EnterpriseSearchPolicy, ContextualResponseRephraser, LLMCommandGenerator
- Rasa client HTTP request duration (e.g. to the action server or NLG server)
- Rasa client HTTP request size in bytes
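Prompt token usage is only recorded when trace_prompt_tokens is enabled. The text does not say where this property is set; the sketch below assumes it is set on the LLMCommandGenerator entry of the pipeline in config.yml.

```yaml
# config.yml (sketch; the placement of trace_prompt_tokens is an assumption)
pipeline:
  - name: LLMCommandGenerator
    trace_prompt_tokens: true   # enables the prompt token usage metric
```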