Skip to main content

Rasa as A2A Agent Architecture

When a2a_server is configured in endpoints.yml, Rasa registers A2A JSON-RPC routes on the same Sanic port as REST and channel webhooks. An external orchestrator discovers capabilities via AgentCard, sends user turns, and receives task lifecycle updates mapped from Rasa's dialogue state.

This page describes the sub-agent (server) architecture. For Rasa as an orchestrator that calls external A2A agents, see Integrating External Agents via A2A.

For configuration and operations, see A2A Server and Exposing Rasa as an A2A Sub-Agent.

Deployment topology

A2A sub-agent deployment topology

Shaded regions show ownership: the orchestrator (A2A client) discovers skills, sends and receives turns over JSON-RPC, and optionally hosts a push callback URL. Rasa (A2A server) exposes public endpoints, runs flows, and POSTs task updates when push notifications are enabled (dashed arrow).

The orchestrator owns end-user routing and multi-agent coordination. Rasa owns flow execution, slot collection, conversation repair, and NLG within its advertised skills.

The governed agent contract

Rasa exposes a bounded contract to orchestrators. Internal dialogue machinery stays private.

What Rasa exposes

SurfaceDescription
AgentCardPublic capability advertisement. User flows become AgentSkill entries; pattern flows are included when include_conversation_repair: true.
A2A task statesworking, input_required, completed, failed, canceled, rejected, auth_required — mapped from dialogue stack and tracker state after each turn.
Structured DataPartTerminal and interactive-terminal payloads with state, active_flow, current slots, persisted slot values, and error/cancel metadata when applicable.
User-visible TextPartBot utterance in status.message so clients that only read text still receive NLG.
message/stream artifactsworking status updates and token/chunk deltas during streaming custom actions.
Session continuityOne Rasa conversation per orchestrator contextId; contexts stay resumable across turns and after inactivity (with start_session_after_expiry: false).
Orchestrator authOptional bearer JWT on JSON-RPC and AgentCard routes (validates the orchestrator, not the end user).

What Rasa keeps internal

InternalWhy it stays private
LLM prompts and command-generator internalsImplementation detail; orchestrator sends user text, receives commands' effects as task state.
Dialogue stack frames (pattern_*, flow stack)Mapped to A2A states and summarized slots — raw stack is not exported.
Full tracker event historyAvailable only via separate REST/tracker APIs if configured; not part of the A2A task contract.
Channel connectors and REST webhook surfaceParallel ingress paths; orchestrator uses A2A JSON-RPC only.
Sub-agent orchestration (sub_agents/)Separate concern; not part of the sub-agent server contract.

End-user identity and backend context belong in A2A message.metadata, DataPart slot pre-seeding, or orchestrator-side session management — not in the sub-agent JWT.

Context and task mapping

A2A context and task mapping

Blue: identifiers the orchestrator supplies on each A2A message. Purple: how Rasa maps and tracks them server-side.

A2A conceptRasa mappingBehaviour
contextIdsender_idOne persistent Rasa conversation per orchestrator context. Slots and flow progress carry over across turns.
messageIdDedup key (contextId, messageId)Retries replay the cached terminal task. Concurrent duplicates while in flight return HTTP 409.
task_idNew per orchestrator messageEach turn is a distinct A2A task. A completed task on a context does not block the next message — a new task starts.
input_requiredContext reservationThe context stays reserved until the orchestrator sends a follow-up or calls tasks/cancel.

ConversationInactive and tracker inactivity do not release an input_required context. With start_session_after_expiry: false (required for A2A), the next message on the same contextId resumes the flow without running action_session_start.

Task state lifecycle

After each message is processed, TaskStateMapper derives the A2A task state from the dialogue stack and latest action. Evaluation follows this priority order:

  1. auth_required — orchestrator bearer token missing or invalid
  2. rejectedCannotHandlePatternFlowStackFrame active (outside jurisdiction)
  3. failedInternalErrorPatternFlowStackFrame active
  4. canceledCancelPatternFlowStackFrame active or orchestrator tasks/cancel
  5. completed — user-facing flow emitted FlowCompleted this turn; no user flows remain on the stack; bot waits for user input (even when pattern_completed is active at action_listen)
  6. input_requiredCompletedPatternFlowStackFrame active; no user flows on stack; include_conversation_repair is true; bot waits for user input; no user flow completed this turn
  7. completedCompletedPatternFlowStackFrame active and no user flows on stack (including when conversation repair is disabled)
  8. input_required — last action waits for user input (collect step, greet, etc.)
  9. working — otherwise (processing or mid-turn streaming)

A2A task state lifecycle

State reference

StateMeaning for orchestratorTypical next action
workingTurn in progress; message/stream may emit intermediate updatesWait for terminal state
input_requiredFlow needs more user input; DataPart includes active_flow and current slotsShow TextPart to user; send follow-up on same contextId
completedBusiness flow finished; DataPart includes persisted_slotsRead structured output; may start a new task on same contextId
rejectedRequest outside advertised skills (reason in DataPart)Route to a different agent or handle in orchestrator
failedInternal error (error_type, error_info in DataPart)Retry, escalate, or fall back
canceledOrchestrator or timeout cancelled the taskClean up orchestrator-side state
auth_requiredMissing or invalid orchestrator JWTRefresh token and retry

When include_conversation_repair: false, conversation-repair stack frames map to completed instead of input_required.

Message processing path

A2A message processing path

Blue groups: orchestrator-side request, response, and optional push callback. Purple group: Rasa server processing from JSON-RPC ingress through CALM to task status and push transport.

On each turn:

  1. Optional bearer JWT is validated (orchestrator auth).
  2. Orchestrator slot pre-seeding is applied as SetSlotCommands before dialogue processing.
  3. Rasa executes flows via the standard CALM pipeline.
  4. A2AOutputChannel maps tracker state to A2A task updates, including structured DataPart and TextPart.
  5. Optional push notification POSTs fire on state transitions when enabled.

Scaling constraints (in v3.17)

Until persistent A2A task and message stores ship:

  • SANIC_WORKERS=1 per replica — idempotency, cancel, and max_contexts are per-worker.
  • Horizontal scaling — add replicas with load balancer sticky routing keyed on contextId so the same orchestrator context hits the same pod. A2A task caches, deduplication, in-flight queues, and push config are in-memory per replica until persistent stores ship. Derive the routing key from the JSON body contextId, the X-A2A-Context-Id header, or an a2a-context-id cookie set by the ingress. See Multi-replica load balancing.
  • Hot model reloadPUT /model does not refresh the AgentCard or re-wire the A2A executor; restart after model deploys that change advertised skills.
Not tested for 3.17

Running Rasa as an A2A sub-agent while also invoking external sub-agents (sub_agents/ with protocol: a2a) is not tested for v3.17.