Exposing Rasa as an A2A Sub-Agent
Rasa can expose your assistant as a native Agent-to-Agent (A2A) sub-agent.
Overview
In the orchestrator model, Rasa calls external A2A agents from flows. This guide covers the inverse role: your Rasa assistant runs as an A2A sub-agent that an external orchestrator discovers and invokes.
With an a2a_server block in endpoints.yml, rasa run exposes the A2A protocol on the same port as REST and channel webhooks.
The orchestrator fetches your AgentCard, sends user turns over JSON-RPC, and receives structured task lifecycle updates mapped from Rasa's dialogue state.
For full configuration reference, see A2A Server.
Prerequisites
Before enabling A2A server mode:
- Train a CALM assistant with the user-facing flows you want to expose as skills.
- Set session config in domain.yml: start_session_after_expiry must be false so that resumed orchestrator contexts do not trigger action_session_start and reset slots.
session_config:
  session_expiration_time: 60
  start_session_after_expiry: false  # required when a2a_server is enabled
- Plan for single-worker deployment: set SANIC_WORKERS=1 until Redis-backed A2A stores ship. Scale with additional replicas and sticky load balancing by contextId instead of multiple Sanic workers per pod.
Step 1 — Add a2a_server to endpoints.yml
Only description is required. Add a public url when orchestrators reach Rasa through a load balancer or ingress rather than localhost.
a2a_server:
  url: "http://localhost:5005"
  description: "Banking assistant for transfers and appointments"
  include_conversation_repair: true
If you use custom actions, include your action_endpoint in the same file as usual.
Step 2 — Start the server
Run Rasa with a single Sanic worker and your trained model:
SANIC_WORKERS=1 rasa run -m models/your-model.tar.gz --endpoints endpoints.yml
To serve HTTPS directly, pass the same rasa run SSL flags used for REST and channels — not an a2a_server.tls block in endpoints.yml:
SANIC_WORKERS=1 rasa run \
  --ssl-certificate /certs/server.pem \
  --ssl-keyfile /certs/server-key.pem \
  -m models/your-model.tar.gz \
  --endpoints endpoints.yml
When Rasa sits behind a reverse proxy or ingress, set a2a_server.url to the public https://... base URL orchestrators use. See TLS in the A2A server reference.
Multi-replica deployments
If you run more than one Rasa pod, configure your ingress or load balancer to route all A2A traffic for a given contextId to the same replica. Without this, follow-up turns, messageId deduplication, tasks/cancel, and push callbacks can break because A2A state is in-memory per pod.
Configure consistent hashing (or session affinity) on contextId from the JSON body, the X-A2A-Context-Id header, or an a2a-context-id cookie. For Istio, apply a gateway EnvoyFilter plus a DestinationRule — see Istio on Kubernetes in the A2A server reference.
Full platform guidance: Multi-replica load balancing.
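To make the sticky-routing requirement concrete, the sketch below maps a contextId to one replica deterministically. It is an illustration only: the replica names are hypothetical, and it uses a plain hash-modulo scheme rather than a true consistent-hash ring, so changing the replica count remaps most contexts. Real deployments would rely on the load balancer's own hashing or session affinity.

```python
import hashlib


def pick_replica(context_id: str, replicas: list) -> str:
    """Return the same replica for the same contextId on every call."""
    if not replicas:
        raise ValueError("at least one replica is required")
    # Hash the contextId so the choice does not depend on process state.
    digest = hashlib.sha256(context_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(replicas)
    return replicas[index]


replicas = ["rasa-0", "rasa-1", "rasa-2"]  # hypothetical pod names
first_turn = pick_replica("ctx-user-42", replicas)
follow_up = pick_replica("ctx-user-42", replicas)
```

Because both calls use the same contextId, the follow-up turn lands on the replica that holds the in-memory A2A state for that context.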
A2A routes are registered at the root of the server:
| Route | Purpose |
|---|---|
| POST / | A2A JSON-RPC |
| GET /.well-known/agent-card.json | Public AgentCard |
Step 3 — Verify the server is ready
Confirm the model is loaded and the AgentCard is available:
# Model loaded (no auth configured)
curl http://localhost:5005/status
# Model loaded (Rasa auth token configured)
curl -H "Authorization: Bearer <rasa-auth-token>" http://localhost:5005/status
# Capability discovery (no A2A bearer auth configured)
curl http://localhost:5005/.well-known/agent-card.json
# Capability discovery (A2A bearer auth configured via a2a_server.auth)
curl -H "Authorization: Bearer <jwt>" http://localhost:5005/.well-known/agent-card.json
The AgentCard lists your user flows as skills. After the model loads, the card is generated from flow definitions (or loaded from agent_card_path if you configured a static card).
The AgentCard is the primary signal an orchestrator uses when deciding which sub-agent to route a task to. Two descriptions feed into it:
- a2a_server.description in endpoints.yml: describes the agent as a whole. Be specific about domain, user population, and what the agent handles (for example: "Banking assistant for retail customers: account transfers, balance enquiries, and appointment booking."). A vague description like "A helpful bot" gives the orchestrator nothing to match against.
- Flow description fields in your YAML flows: each flow becomes a skill in the AgentCard. Write them as concise capability statements from the user's perspective (for example: "Transfer money between the user's accounts" rather than "transfer_money flow"). Orchestrators use skill descriptions to select and invoke the right capability.
Poor descriptions lead to missed routing or incorrect skill selection. Treat them as part of the public API contract.
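One way to catch weak descriptions before an orchestrator sees them is a small check over the fetched AgentCard. The sketch below assumes the card JSON has a top-level description and a skills list whose entries carry id and description keys; the word-count threshold is an arbitrary heuristic chosen for illustration, not a Rasa rule.

```python
def find_weak_descriptions(card: dict, min_words: int = 4) -> list:
    """Flag agent and skill descriptions that are too short to route on."""
    problems = []
    if len(str(card.get("description", "")).split()) < min_words:
        problems.append("agent description is too short")
    for skill in card.get("skills", []):
        name = skill.get("id") or skill.get("name") or "<unnamed>"
        if len(str(skill.get("description", "")).split()) < min_words:
            problems.append(f"skill {name}: description is too short")
    return problems


# Hypothetical card content, shaped after the examples in this guide.
sample_card = {
    "description": "A helpful bot",
    "skills": [
        {"id": "transfer_money", "description": "transfer_money flow"},
        {"id": "book_appointment",
         "description": "Book an appointment at a local branch"},
    ],
}
problems = find_weak_descriptions(sample_card)
```

Here the agent description and the transfer_money skill are flagged, while the appointment skill passes.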
Step 4 — Point your orchestrator at the endpoint
Configure your orchestrator to use the url from the AgentCard.
The orchestrator should:
- Fetch GET /.well-known/agent-card.json to discover skills and capabilities.
- Send user turns via message/send (blocking) or message/stream (SSE streaming).
- Reuse the same contextId across turns in a multi-step flow.
- Supply a new messageId per turn (retries with the same messageId replay the cached result).
Example message/send request
curl -X POST http://localhost:5005/ \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": "req-1",
    "method": "message/send",
    "params": {
      "message": {
        "role": "user",
        "parts": [{"kind": "text", "text": "I want to book an appointment"}],
        "messageId": "msg-1",
        "contextId": "ctx-user-42"
      }
    }
  }'
When a2a_server.auth is configured, add Authorization: Bearer <jwt> to every A2A request:
curl -X POST http://localhost:5005/ \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <jwt>" \
  -d '{
    "jsonrpc": "2.0",
    "id": "req-1",
    "method": "message/send",
    "params": {
      "message": {
        "role": "user",
        "parts": [{"kind": "text", "text": "I want to book an appointment"}],
        "messageId": "msg-1",
        "contextId": "ctx-user-42"
      }
    }
  }'
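An orchestrator would typically build these requests in code rather than by hand. The following is a minimal sketch, assuming the request shape shown in the curl examples; the helper names build_message_send and build_headers are hypothetical and not part of any Rasa or A2A SDK.

```python
import uuid
from typing import Optional


def build_message_send(text: str, context_id: str,
                       message_id: Optional[str] = None) -> dict:
    """Build a message/send JSON-RPC body for one user turn."""
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "message/send",
        "params": {
            "message": {
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
                # A fresh messageId per turn; pass one in only when retrying.
                "messageId": message_id or str(uuid.uuid4()),
                "contextId": context_id,
            }
        },
    }


def build_headers(jwt: Optional[str] = None) -> dict:
    """Add the bearer header only when a2a_server.auth is configured."""
    headers = {"Content-Type": "application/json"}
    if jwt:
        headers["Authorization"] = f"Bearer {jwt}"
    return headers


turn_1 = build_message_send("I want to book an appointment", "ctx-user-42")
turn_2 = build_message_send("Tomorrow at 10am", "ctx-user-42")
retry_1 = build_message_send(
    "I want to book an appointment", "ctx-user-42",
    message_id=turn_1["params"]["message"]["messageId"],
)
```

Both turns share a contextId but carry different messageId values, while the retry reuses the first messageId so the server can replay its cached result. The resulting dict can be serialized with json.dumps and sent as the POST body with any HTTP client.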
Step 5 — Read structured results from completed flows
Each orchestrator message creates a new A2A task with its own task_id.
Rasa maps dialogue state to A2A task states: working, input_required, completed, failed, canceled, rejected, and auth_required.
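On the orchestrator side, these states usually collapse into a few decisions. The sketch below groups them in one plausible way; the grouping and the returned action names are assumptions made for illustration, and the state strings are spelled as they appear in this guide.

```python
# Assumed grouping: these sets are not taken from the Rasa reference.
FINISHED = {"completed", "failed", "canceled", "rejected"}
WAITING_ON_CALLER = {"input_required", "auth_required"}


def next_step(state: str) -> str:
    """Translate a task state into a hypothetical orchestrator action."""
    if state in FINISHED:
        return "close_task"
    if state in WAITING_ON_CALLER:
        return "send_follow_up"
    if state == "working":
        return "keep_waiting"
    raise ValueError(f"unknown task state: {state!r}")


decisions = {s: next_step(s) for s in ("working", "input_required", "completed")}
```

Raising on an unknown state keeps the orchestrator from silently ignoring a state it does not understand.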
Terminal and interactive-terminal responses include:
- A TextPart with the user-visible bot utterance (for clients that only read status.message).
- A DataPart with structured state your orchestrator can parse programmatically.
input_required — flow needs more user input
The orchestrator should show the TextPart to the end user and send a follow-up message on the same contextId:
{
  "state": "input_required",
  "active_flow": "transfer_money",
  "slots": {
    "account_number": null,
    "amount": null
  }
}
completed — flow finished
Persisted slot values from flows that declare persisted_slots are returned:
{
  "state": "completed",
  "slots": {
    "appointment_id": "APT-12345"
  }
}
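To read both parts from a returned task, an orchestrator can walk the parts of the status message. This sketch assumes the parts live under status.message.parts and that a data part carries its payload under a data key, mirroring the kind discriminator used in the request examples; the sample task is hypothetical, so check the actual response shape from your server.

```python
def split_parts(task: dict) -> tuple:
    """Return (utterance, structured_state) from a task's status message."""
    message = (task.get("status") or {}).get("message") or {}
    texts, data = [], {}
    for part in message.get("parts", []):
        if part.get("kind") == "text":
            texts.append(part.get("text", ""))
        elif part.get("kind") == "data":
            data.update(part.get("data") or {})
    return "\n".join(texts), data


# Hypothetical task, shaped after the completed example in this guide.
sample_task = {
    "status": {
        "state": "completed",
        "message": {
            "role": "agent",
            "parts": [
                {"kind": "text", "text": "Your appointment is booked."},
                {"kind": "data", "data": {
                    "state": "completed",
                    "slots": {"appointment_id": "APT-12345"},
                }},
            ],
        },
    }
}
utterance, payload = split_parts(sample_task)
```

The utterance goes to the end user, and payload["slots"] gives the orchestrator the persisted slot values without parsing free text.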
Use message/stream when your orchestrator needs working status updates and artifact deltas during streaming custom actions.
Blocking message/send returns only the final task.
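With message/stream, the response arrives as server-sent events. The sketch below parses the data fields of an SSE stream into JSON objects, assuming each event carries one JSON-RPC response; the sample payloads are illustrative only and do not reproduce the exact event shape Rasa emits.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per SSE event."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # SSE drops a single leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            # A blank line ends the event.
            yield json.loads("\n".join(buffer))
            buffer = []
        # Comment lines (":") and other fields are ignored in this sketch.
    if buffer:
        yield json.loads("\n".join(buffer))


sample_stream = [
    'data: {"jsonrpc": "2.0", "id": "req-1", "result": {"status": {"state": "working"}}}',
    "",
    ": keep-alive",
    'data: {"jsonrpc": "2.0", "id": "req-1", "result": {"status": {"state": "completed"}}}',
    "",
]
states = [e["result"]["status"]["state"] for e in iter_sse_json(sample_stream)]
```

In a real client, lines would come from the streaming HTTP response body instead of a list.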
Optional — Pass slots from the orchestrator
On each turn, the orchestrator can pre-seed domain slots before Rasa processes the user message. This is useful when the orchestrator already knows context (user ID, account type, etc.) and should not rely on the LLM parsing slot values from free text.
Supply slots via message.metadata:
{
  "message": {
    "role": "user",
    "parts": [{"kind": "text", "text": "Transfer money to savings"}],
    "messageId": "msg-2",
    "contextId": "ctx-user-42",
    "metadata": {
      "slots": {
        "account_type": "checking"
      }
    }
  }
}
Other supported shapes and precedence rules are documented in Orchestrator slot pre-seeding.
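A small helper can attach known context to a request before it is sent. This is a sketch, assuming the message.metadata.slots shape shown above; with_slots is a hypothetical name, and the helper copies the request so the original stays reusable.

```python
import copy


def with_slots(request: dict, slots: dict) -> dict:
    """Return a copy of a JSON-RPC request with slots under message.metadata."""
    updated = copy.deepcopy(request)
    message = updated["params"]["message"]
    metadata = message.setdefault("metadata", {})
    metadata.setdefault("slots", {}).update(slots)
    return updated


base_request = {
    "jsonrpc": "2.0",
    "id": "req-2",
    "method": "message/send",
    "params": {
        "message": {
            "role": "user",
            "parts": [{"kind": "text", "text": "Transfer money to savings"}],
            "messageId": "msg-2",
            "contextId": "ctx-user-42",
        }
    },
}
seeded = with_slots(base_request, {"account_type": "checking"})
```

Only slots the orchestrator already trusts should be seeded this way, since they bypass extraction from the user's text.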
Optional — Secure the endpoint with JWT
For production, configure bearer JWT auth so only trusted orchestrators can call your sub-agent:
a2a_server:
  description: "Banking assistant"
  auth:
    type: bearer
    jwt:
      algorithm: RS256
      public_key_path: "/run/secrets/jwt_public_key.pem"
      issuer: "https://auth.example.com"
      audience: "rasa-a2a-agent"
See Authentication for full options.
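When requests are rejected, a common cause is a token whose iss or aud claim does not match the configured issuer and audience. The sketch below is a debugging aid for the orchestrator side only: it decodes the claims without verifying the signature, so it must never be used to decide whether a token is trustworthy. The helper names and the demo token are hypothetical.

```python
import base64
import json


def read_unverified_claims(token: str) -> dict:
    """Decode the JWT payload WITHOUT verifying the signature."""
    segments = token.split(".")
    if len(segments) != 3:
        raise ValueError("expected header.payload.signature")
    payload = segments[1]
    payload += "=" * (-len(payload) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def claims_match(token: str, issuer: str, audience: str) -> bool:
    """Compare iss and aud with the values configured for a2a_server."""
    claims = read_unverified_claims(token)
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    return claims.get("iss") == issuer and audience in audiences


def _segment(obj: dict) -> str:
    raw = json.dumps(obj).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Demo token with a placeholder signature, for illustration only.
demo_token = ".".join([
    _segment({"alg": "RS256", "typ": "JWT"}),
    _segment({"iss": "https://auth.example.com", "aud": "rasa-a2a-agent"}),
    "signature-placeholder",
])
matches = claims_match(demo_token, "https://auth.example.com", "rasa-a2a-agent")
```

Signature verification against the configured public key remains the server's job; this check only helps spot misconfigured claims early.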
Optional — Enable push notifications
Push notifications are disabled by default. Enable them only when your orchestrator needs HTTP callbacks for task state updates:
a2a_server:
  description: "Banking assistant"
  push_notifications_enabled: true
  push_notification_allowed_hosts:
    - "orchestrator.example.com"
The orchestrator registers a pushNotificationConfig.url on message/send, message/stream, or via tasks/pushNotificationConfig/set.
See Push notifications for SSRF safeguards and redirect behaviour.
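An orchestrator can pre-check its callback URL against the hosts it expects the sub-agent to allow before registering it. The sketch below illustrates the idea of a host allowlist; it is not Rasa's implementation, and the exact-match comparison and the http/https scheme restriction are assumptions.

```python
from urllib.parse import urlsplit


def callback_host_allowed(url: str, allowed_hosts: list) -> bool:
    """Return True when the URL's host exactly matches an allowed host."""
    parts = urlsplit(url)
    # Assumption: only http(s) callback URLs with a hostname are acceptable.
    if parts.scheme not in ("http", "https") or parts.hostname is None:
        return False
    return parts.hostname.lower() in {h.lower() for h in allowed_hosts}


allowed = ["orchestrator.example.com"]
ok = callback_host_allowed("https://orchestrator.example.com/a2a/callback", allowed)
lookalike = callback_host_allowed(
    "https://orchestrator.example.com.evil.net/cb", allowed
)
```

Comparing the parsed hostname rather than searching the raw URL string avoids accepting lookalike domains that merely contain an allowed host as a prefix.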
Next steps
| Topic | Documentation |
|---|---|
| All a2a_server keys, defaults, and guardrails | A2A Server reference |
| Architecture, context mapping, and task lifecycle | Rasa as A2A Agent Architecture |
| Rasa as orchestrator calling external agents | Integrating External Agents via A2A |
| Monitoring, log events, and troubleshooting | Monitoring and troubleshooting |
Out of scope
Running Rasa as an A2A sub-agent while also invoking external sub-agents (sub_agents/ with protocol: a2a) is not tested for 3.17.
Use Rasa in one role at a time: either orchestrator or sub-agent.
In sub-agent mode, we strongly recommend exposing all sub-agents directly to the A2A client/orchestrator rather than chaining them through the Rasa sub-agent.