Version: Latest

Tracing

Distributed tracing tracks requests as they flow through a distributed system (in this case: a Rasa assistant), sending data about the requests to a tracing backend which collects all trace data and enables inspecting it. Trace data helps you understand the flow of requests through both the components of a single service (Rasa itself), and across different distributed services, for example, your action server.

Supported Tracing Backends/Collectors

To trace requests in Rasa Pro, you can either use Jaeger as a backend, or use the OTEL Collector (OpenTelemetry Collector) to collect traces and then send them to the backend of your choice. See Configuring a Tracing Backend or Collector for instructions.

Rasa Channels

Trace context sent along with requests to the REST channel, following the W3C Trace Context Specification, is used to continue tracing in Rasa Pro.
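For illustration, a traceparent header as defined by the W3C Trace Context specification can be constructed and sent along with a REST channel request as sketched below. The webhook URL, sender id, and hex IDs are illustrative; the helper function is not part of Rasa itself:

```python
import json
import re
import urllib.request


def build_traceparent(trace_id: str, parent_id: str, sampled: bool = True) -> str:
    """Assemble a version-00 W3C Trace Context traceparent header."""
    # trace-id is 32 lowercase hex characters, parent-id (span id) is 16
    assert re.fullmatch(r"[0-9a-f]{32}", trace_id)
    assert re.fullmatch(r"[0-9a-f]{16}", parent_id)
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{parent_id}-{flags}"


if __name__ == "__main__":
    # send a message to the default REST channel webhook with trace context attached
    body = json.dumps({"sender": "user-123", "message": "Hello"}).encode()
    request = urllib.request.Request(
        "http://localhost:5005/webhooks/rest/webhook",  # default local Rasa REST endpoint
        data=body,
        headers={
            "Content-Type": "application/json",
            "traceparent": build_traceparent(
                "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7"
            ),
        },
    )
    urllib.request.urlopen(request)
```

With a tracing backend configured, the span started by the caller becomes the parent of the spans Rasa Pro records while handling the message.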

Rasa Inspector

If you have enabled tracing in Rasa Pro and are using the Rasa Inspector debugging tool to try out your assistant, note that in addition to the expected tracing span for the Agent.handle_message method call, the tracing backend will collect independent tracing spans for MessageProcessor.get_tracker method calls. This is expected behaviour: the Rasa Inspector tool uses the Rasa HTTP API endpoints to retrieve the conversation tracker required by the Inspector interface.

Action Server

The trace context from Rasa Pro is sent along with requests to the custom action server using the W3C Trace Context Specification and then used to continue tracing the request through the custom action server.

Tracing is continued in the action server by instrumenting the webhook that receives custom actions. See Action server attributes for the attributes captured as part of the trace context.

See traced events for details on what attributes are made available as part of the trace context in Rasa Pro.

Questions Tracing Can Help Answer

Tracing can help troubleshoot issues in development and production, by answering questions such as:

  • How does a user message request get processed across different components, i.e. dialogue understanding components (NLU, CommandGenerator, CommandProcessorComponent), policies, and the action server?
  • Why has my Rasa assistant decided to execute a certain action?
  • Why has my Rasa assistant been slow to respond?
  • Why have my custom actions been slow to execute?
  • What is my OpenAI prompt token usage?
  • What is the performance of my Rasa assistant across different flows?
  • What is the performance of my Rasa assistant across different LLM models?
  • What is the performance of my Rasa assistant across different vector stores?

Configuring a Tracing Backend or Collector

To configure a tracing backend or collector, add a tracing entry to your endpoints, i.e. in your endpoints.yml file, or in the relevant section of your Helm values in a deployment.

Jaeger

To configure a Jaeger tracing backend, specify the type as jaeger.

tracing:
  type: jaeger
  host: localhost
  port: 6831
  service_name: rasa
  sync_export: ~
Tip

If you come across the error "OSError: [Errno 40] Message too long", read the instructions here to resolve it.

OTEL Collector

Collectors are components that collect traces in a vendor-agnostic way and then forward them to various backends. For example, the OpenTelemetry Collector (OTEL) can collect traces from multiple different components and instrumentation libraries, and then export them to multiple different backends, e.g. Jaeger.

To configure an OTEL Collector, specify the type as otlp.

tracing:
  type: otlp
  endpoint: my-otlp-host:4318
  insecure: false
  service_name: rasa
  root_certificates: ./tests/unit/tracing/fixtures/ca.pem
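On the collector side, a minimal OpenTelemetry Collector configuration that receives OTLP traces and forwards them to a Jaeger backend might look like the sketch below. The endpoints and the `jaeger` hostname are illustrative, and exporter availability depends on your collector distribution and version:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

exporters:
  otlp/jaeger:
    # recent Jaeger versions accept OTLP natively
    endpoint: jaeger:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/jaeger]
```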

Traced Events

The following Rasa service areas are traceable:

Model Training

Tracing is enabled for model training by instrumenting Rasa GraphTrainer and GraphNode classes.

GraphTrainer Attributes

The following attributes can be inspected during training of GraphTrainer:

  • training_type of model configuration:
    • "NLU"
    • "CORE"
    • "BOTH"
    • "END-TO-END"
  • language of model configuration
  • recipe_name used in the config.yml file
  • output_filename: the location where the packaged model is saved
  • is_finetuning: boolean argument, if True enables incremental training

GraphNode Attributes

The following attributes are captured during the training (as well as prediction during message handling) of every graph node:

  • node_name
  • component_class
  • fn_name: method of component class that gets called

Message Handling

The following Rasa classes are instrumented to enable tracing during message handling:

In addition, the following Python modules were instrumented to enable tracing during message handling:

  • command processor module, i.e. utility functions leveraged by the CommandProcessorComponent to pre-process predicted commands
  • flow executor module, i.e. utility functions leveraged by FlowPolicy to advance flows

Namely, these operations are now traceable:

Tracing prompt token usage

New in 3.8

Tracing prompt token usage for OpenAI models is available starting with version 3.8.0.

Tracing prompt token usage is available for the following classes if you're using OpenAI models:

  • SingleStepLLMCommandGenerator class
  • MultiStepLLMCommandGenerator class
  • IntentlessPolicy class
  • EnterpriseSearchPolicy class
  • ContextualResponseRephraser class

The prompt token usage is captured as part of the trace context and can be used to monitor the usage of prompt tokens in the LLM answer generation process. It is only captured if one of the instrumented classes mentioned above is configured to capture the length of the prompt tokens. For example, the SingleStepLLMCommandGenerator can be configured to trace the length of the prompt tokens by setting the trace_prompt_tokens attribute to true in the config.yml file:

pipeline:
  - name: SingleStepLLMCommandGenerator
    trace_prompt_tokens: true

It is highly recommended to enable tracing of prompt tokens only in development and not in production, because it could increase assistant response latency.

Agent Attributes

Tracing the Agent instance handling a message captures the following attributes:

  • input_channel: the name of the channel connector
  • sender_id: the conversation id
  • model_id: a unique identifier for the model
  • model_name: the model name

MessageProcessor Attributes

The following MessageProcessor attributes are extracted during the tracing:

  • number_of_events: number of events in tracker
  • action_name: the name of the predicted and executed action
  • sender_id: the conversation id of the DialogueStateTracker object
  • message_id: the unique message id

The latter three attributes are also injected in the trace context that gets passed to the requests made to the custom action server.

TrackerStore & LockStore Attributes

Observable TrackerStore and LockStore attributes include:

  • number_of_streamed_events: number of new events to stream
  • broker_class: the EventBroker on which the new events are published
  • lock_store_class: Name of lock store used to lock conversations while messages are actively processed

SingleStepLLMCommandGenerator Attributes

New in 3.9

Tracing the described SingleStepLLMCommandGenerator attributes is available starting with version 3.9.0.

The following attributes are captured as part of the trace context of the SingleStepLLMCommandGenerator:

  • class_name: the name of the instrumented component class
  • llm_model: the name of the LLM used
  • llm_type: the type of LLM used
  • embeddings: the embeddings used
  • llm_temperature: the temperature used for LLM answer generation
  • request_timeout: the timeout for the LLM request
  • llm_engine: the engine used for LLM answer generation
  • len_prompt_tokens: the token length of the prompt (optional, only supported for OpenAI models). To enable this attribute, see instructions in the Tracing prompt token usage section.

MultiStepLLMCommandGenerator Attributes

New in 3.9

Tracing the described MultiStepLLMCommandGenerator attributes is available starting with version 3.9.0.

The following attributes are captured as part of the trace context of the MultiStepLLMCommandGenerator:

  • class_name: the name of the instrumented component class
  • llm_model: the name of the LLM used
  • llm_type: the type of LLM used
  • embeddings: the embeddings used
  • llm_temperature: the temperature used for LLM answer generation
  • request_timeout: the timeout for the LLM request
  • llm_engine: the engine used for LLM answer generation
  • len_prompt_tokens: the token length of the prompt (optional, only supported for OpenAI models). To enable this attribute, see instructions in the Tracing prompt token usage section.

NLUCommandAdapter Attributes

New in 3.8

Tracing the described NLUCommandAdapter attributes is available starting with version 3.8.0.

The following attributes are captured as part of the trace context of the NLUCommandAdapter:

  • commands: the predicted commands
  • intent: the predicted intent of the user message that the NLUCommandAdapter receives as input

Command Processor Module Attributes

New in 3.8

Tracing the described command processor module attributes is available starting with version 3.8.0.

The following attributes are captured as part of the trace context of the command processor module functions:

  1. execute_commands function:
    • number_of_events: the number of events in the tracker
    • sender_id: the conversation id of the DialogueStateTracker object
  2. validate_state_of_commands function:
    • cleaned_up_commands: list of cleaned up commands
  3. clean_up_commands function:
    • commands: list of originally parsed commands from the LLM answer
    • current_context: the current context of the dialogue stack
  4. remove_duplicated_set_slots function:
    • resulting_events: list of events prior to removing duplicated set slot events; note that slot values are removed to prevent PII leakage

Flow Executor Module Attributes

New in 3.8

Tracing the described flow executor module attributes is available starting with version 3.8.0.

The following attributes are captured as part of the trace context of the flow executor module functions:

  1. advance_flow function:
    • available_actions: list of available actions
    • current_context: the current context of the dialogue stack
  2. advance_flows_until_next_action function:
    • action_name: the name of the action to be executed
    • score: the score of the executed action
    • metadata: the prediction metadata
    • events: list of event names if available
  3. run_step function:
    • step_custom_id: the custom id of the step if available
    • step_description: the description of the step if available
    • current_flow_id: the id of the current flow
    • current_context: the current context of the dialogue stack

Policy subclasses attributes

New in 3.8

Tracing the described Policy subclasses' attributes is available starting with version 3.8.0.

The following attributes are captured as part of the trace context of subclasses of the Policy interface, e.g. FlowPolicy, IntentlessPolicy, EnterpriseSearchPolicy:

  • priority: the priority of the policy which made the prediction
  • events: a list of event names which are applied independent of whether the policy wins against other policies or not
  • optional_events: a list of optional event names if available else None - these events are applied if the policy wins against other policies
  • is_end_to_end_prediction: a boolean indicating if the prediction used the text of the user message instead of the intent
  • is_no_user_prediction: a boolean indicating if the prediction uses neither the text of the user message nor the intent
  • diagnostic_data: intermediate results or other information that is not necessary for Rasa to function, but intended for debugging and fine-tuning purposes
  • action_metadata: additional metadata that can be passed by policies

IntentlessPolicy Attributes

New in 3.8

Tracing the described IntentlessPolicy attributes is available starting with version 3.8.0.

Depending on the instrumented policy method, the following attributes are captured as part of the trace context of the IntentlessPolicy:

  • current_context: the context of the top dialogue stack frame, received as input by the IntentlessPolicy.find_closest_response method
  • ai_response_examples: the sample responses that fit the current conversation, returned by the IntentlessPolicy.select_response_examples method
  • conversation_samples: the conversation samples returned by the IntentlessPolicy.select_few_shot_conversations method
  • ai_responses: the AI responses extracted from the conversation samples by the IntentlessPolicy.extract_ai_responses method
  • llm_response: the response generated by the LLM model call, returned by the IntentlessPolicy.generate_answer method
  • action_name: the name of the action to be executed, received as input by the IntentlessPolicy._prediction_result method
  • score: the score of the executed action, received as input by the IntentlessPolicy._prediction_result method

In addition, the IntentlessPolicy._generate_llm_answer method captures the same attributes as the SingleStepLLMCommandGenerator class.

EnterpriseSearchPolicy Attributes

New in 3.8

Tracing the described EnterpriseSearchPolicy attributes is available starting with version 3.8.0.

The EnterpriseSearchPolicy._generate_llm_answer method captures the same attributes as the SingleStepLLMCommandGenerator class.

InformationRetrieval Attributes

New in 3.8

Tracing the described InformationRetrieval subclasses' attributes is available starting with version 3.8.0.

The following attributes are captured as part of the trace context of the InformationRetrieval subclasses, e.g. Milvus_Store, Qdrant_Store:

  • query: the query used to search the vector store
  • document_metadata: the metadata of the documents retrieved from the vector store

EndpointConfig Attributes

New in 3.8

Tracing the described EndpointConfig attributes is available starting with version 3.8.0.

The following attributes are captured as part of the trace context of the EndpointConfig:

  • url: the url of the endpoint
  • request_body_size_in_bytes: the size of the request body in bytes

Tracing in the Action Server

API Requests are traced as they flow through the action server by instrumenting the webhook that receives custom actions and other classes involved in the execution of custom actions.

New in 3.8

Additional classes are now instrumented to improve tracing in the action server.

The following classes are instrumented:

  • ValidationAction: the base class for custom actions extracting and validating slots that can be set or updated outside a form context.
  • FormValidationAction: the base class for custom actions extracting and validating slots that are set only within the context of a form.
  • ActionExecutor: the class that executes the custom actions.

Webhook Attributes

The following attributes are captured as part of the trace context of the webhook that receives custom actions:

  • http.method: the HTTP method used to make the request
  • http.route: the endpoint of the request
  • next_action: the name of the next action to be executed
  • version: the Rasa version used
  • sender_id: the id of the conversation
  • message_id: the unique message id

Action Executor Attributes

The following attributes are captured as part of the trace context of the action executor:

  • action_name: the name of the action to be executed
  • sender_id: the id of the conversation
  • events: a list of returned events
  • slots: a list of slots filled by the executed custom action
  • utters: a list of executed utterances

Slot Validation Action Attributes

The following attributes are captured as part of the trace context of slot validation actions:

  • class_name: the name of the instrumented component class
  • action_name: the name of the action to be executed
  • sender_id: the id of the conversation
  • events: a list of returned events
  • slots: a list of slots filled by the executed custom action
  • utters: a list of executed utterances
  • message_count: the number of messages
  • slots_to_validate: a list of recently filled slots to validate

Debugging custom actions performance

New in 3.8

You can now continue tracing the request further along your custom actions code.

It is now possible to debug the performance of your custom actions by tracing specific parts of your custom actions code. This can be achieved by creating spans to trace the execution of these parts.

In order to create more spans, you can retrieve the tracer object from the ActionExecutorTracerRegister component.

# import the ActionExecutorTracerRegister component
from rasa_sdk.tracing.tracer_register import ActionExecutorTracerRegister

To create a span corresponding to a specific part of your custom actions code, as documented in the OTEL documentation, you can embed the following code snippet in the run method of the custom action:

# retrieve the tracer object
tracer = ActionExecutorTracerRegister().get_tracer()

# create a span
with tracer.start_as_current_span("span_name") as span:
    # your code here
    span.set_attribute("attribute_name", "attribute_value")

For example, a complete custom action that implements a custom span is shown below:

from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.events import SlotSet
from rasa_sdk.executor import CollectingDispatcher
from rasa_sdk.tracing.tracer_register import ActionExecutorTracerRegister


class ActionCheckSufficientFunds(Action):
    def name(self) -> Text:
        return "action_check_sufficient_funds"

    def run(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[Dict[Text, Any]]:
        tracer = ActionExecutorTracerRegister().get_tracer()
        with tracer.start_as_current_span("span_name") as span:
            balance = 1000  # hardcoded balance for tutorial purposes
            transfer_amount = tracker.get_slot("amount")
            has_sufficient_funds = transfer_amount <= balance
            # set trace attributes
            span.set_attribute("has_sufficient_funds", has_sufficient_funds)
            return [SlotSet("has_sufficient_funds", has_sufficient_funds)]

Enabling and disabling tracing in the action server is also done in the same way as described below. The same Tracing Backends/Collectors listed above are also supported for the action server. See Configuring a Tracing Backend or Collector for further instructions.

Enabling / Disabling

Tracing is automatically enabled in Rasa Pro by configuring a supported tracing backend. No further action is required to enable tracing.

You can disable tracing by leaving the tracing: configuration key empty in your endpoints file.
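For example, an endpoints.yml entry that keeps tracing disabled would simply leave the key empty:

```yaml
tracing:
```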