
Metrics

New in 3.8

You can now measure the performance of your CALM assistant with OpenTelemetry-based metrics.

Metrics are measurements of a service captured at runtime that serve as indicators of availability and performance. You can use metrics to monitor the health of a service, alert on outages, and understand the impact of changes to a service. Unlike tracing, which follows individual requests, metrics provide aggregated statistical information, such as average response time or throughput, across multiple messages and conversations.

Configuring Metrics

To enable metrics gathering in Rasa Pro, you must use the OpenTelemetry Collector (OTEL Collector) to collect the metrics and forward them to the backend of your choice.

To configure the connection to the OTEL Collector, add a metrics entry to your endpoints, either in your endpoints.yml file or in the relevant section of your Helm values in a deployment, and set its type to otlp:

metrics:
  type: otlp
  endpoint: my-otlp-host:4318
  insecure: false
  service_name: rasa
  root_certificates: ./tests/unit/tracing/fixtures/ca.pem
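
On the receiving side, the Collector needs an OTLP receiver and a metrics pipeline. The following is a minimal sketch of a Collector configuration file, not a configuration taken from the Rasa documentation. It assumes a recent OpenTelemetry Collector release that ships the debug exporter, and it enables both OTLP protocols because this page does not state which one Rasa Pro uses (4317 and 4318 are the conventional gRPC and HTTP ports). The debug exporter is only a stand-in; replace it with the exporter for your actual backend.

```yaml
# Hypothetical OTEL Collector config (e.g. otel-collector-config.yaml).
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # conventional OTLP/gRPC port
      http:
        endpoint: 0.0.0.0:4318   # conventional OTLP/HTTP port

processors:
  batch: {}                      # batch data points before exporting

exporters:
  debug: {}                      # placeholder: prints received telemetry to the Collector log

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
```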

Note that metrics must be used together with tracing to provide a complete view of your system.
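
As an illustration, an endpoints.yml that configures both could look like the sketch below. The tracing entry is an assumption that mirrors the keys of the metrics entry above; check the tracing documentation for the options that apply to your setup.

```yaml
# endpoints.yml: tracing and metrics sent to the same OTEL Collector.
# The tracing keys are assumed to mirror the metrics entry.
tracing:
  type: otlp
  endpoint: my-otlp-host:4318
  insecure: false
  service_name: rasa
  root_certificates: ./tests/unit/tracing/fixtures/ca.pem

metrics:
  type: otlp
  endpoint: my-otlp-host:4318
  insecure: false
  service_name: rasa
  root_certificates: ./tests/unit/tracing/fixtures/ca.pem
```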

Recorded Metrics

Supported custom metrics include:

CPU and memory usage of the LLMCommandGenerator component at the time of making an LLM call
LLMCommandGenerator prompt token usage (provided the trace_prompt_tokens config property is enabled)
Method call duration for LLM-specific calls in components such as IntentlessPolicy, EnterpriseSearchPolicy, ContextualResponseRephraser, and LLMCommandGenerator
Rasa client HTTP request duration (e.g. to the action server or NLG server)
Rasa client HTTP request size in bytes
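
The prompt token metric depends on the trace_prompt_tokens property. A minimal sketch of enabling it, assuming the property is set on the LLMCommandGenerator entry of the pipeline in your config.yml (the placement is an assumption; this page names only the property itself):

```yaml
# config.yml (sketch): the placement of trace_prompt_tokens is assumed.
pipeline:
  - name: LLMCommandGenerator
    trace_prompt_tokens: true   # record prompt token usage
```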
