Orchestrating A2A and MCP with Rasa

What is Orchestration, Concretely?

A good AI agent hides the complexity of the organization behind it, so everything feels seamless, and the customer never feels they are being handed off between different systems.

If you’re lucky, your use cases are all distinct and all you need is a router. You can build individual agents to handle each domain, and you just need something that sits in front and figures out where to send a user. Conversations never jump between domains. This case is trivial, zeroth-order orchestration.
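A minimal sketch of such a front-door router, assuming hypothetical domain names and keyword lists (a real router would more likely use a classifier or an LLM than substring matching):

```python
from typing import Dict, List

# Hypothetical domains and trigger keywords, for illustration only.
ROUTES: Dict[str, List[str]] = {
    "auto_loans": ["loan", "finance", "interest"],
    "car_search": ["dealer", "inventory", "nearby"],
}


def route(message: str, default: str = "fallback") -> str:
    """Return the first domain whose keywords appear in the message."""
    text = message.lower()
    for domain, keywords in ROUTES.items():
        if any(keyword in text for keyword in keywords):
            return domain
    return default
```

Each message is routed on its own, with no memory of earlier turns. That is fine when conversations stay in one domain, and it is exactly what stops working once they don't.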

The hard problems arise when you want to unify different domains into a single, fluent conversation.

In that case, your orchestrator has three roles:

Directing control: "what should happen next?"
Maintaining state and memory: "what do we know so far?"
Framing context: "what does this interaction mean right now?"

A Motivating Example

Imagine a customer-facing AI agent "Ace" for ACME bank. We’re going to give Ace the ability to help users find, buy, and finance a new car, all in one conversation.

ACME bank already offers auto loans and has a well-established process to qualify users and provide them with quotes. To cover the process end-to-end, Ace also needs to:

Research suitable models on the web, and
Locate inventory by talking to a third-party agent (e.g., cars.com) to find cars at nearby dealerships.

Most importantly, Ace has to stitch these skills into one fluent chat so the customer feels "I’m talking to my bank" instead of navigating your org chart.

We’ll use Rasa to build Ace and show how it handles orchestration.

About MCP and A2A

Model Context Protocol (MCP) and Agent2Agent (A2A) are new and still evolving quickly. Here’s a quick breakdown of what they are for:

MCP, introduced by Anthropic, offers a standard way for an AI agent to connect to APIs. Typically, to use an API, an agent developer has to write "glue" code to:

Format a request
Call the correct endpoint
Extract the required data from the response

Instead, an MCP server provides information about its API so that an LLM can automatically handle these steps. If you’re familiar with Rasa, you can think of MCP as a "universal" custom action that uses an LLM to adapt to different APIs.
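To make this concrete, here is a rough sketch of the two pieces involved: a tool description of the kind an MCP server advertises (a name, a description, and a JSON Schema for the inputs), and the JSON-RPC `tools/call` request a client sends to invoke it. The tool's fields (`budget`, `body_style`) and the helper `build_tool_call` are assumptions made up for this illustration. In practice the LLM picks the arguments from the tool description, and an MCP client library builds the request.

```python
from typing import Any, Dict

# Hypothetical tool description, shaped like an entry an MCP server
# returns when a client lists its tools.
RECOMMEND_CARS_TOOL: Dict[str, Any] = {
    "name": "recommend_cars",
    "description": "Recommend car models for a budget and body style.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "budget": {"type": "number"},
            "body_style": {"type": "string"},
        },
        "required": ["budget"],
    },
}


def build_tool_call(
    tool: Dict[str, Any], arguments: Dict[str, Any], request_id: int
) -> Dict[str, Any]:
    """Build a JSON-RPC 'tools/call' request after checking required arguments."""
    required = tool["inputSchema"].get("required", [])
    missing = [key for key in required if key not in arguments]
    if missing:
        raise ValueError(f"Missing required arguments: {missing}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool["name"], "arguments": arguments},
    }
```

The point is that everything needed to format the request lives in the tool description, so none of it has to be hand-written per API.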

A2A, introduced by Google, provides a standard way for one AI agent to delegate tasks to another running agent. In our example, we want to equip Ace, our banking agent, with a skill (finding cars at local dealerships) offered by a third-party agent. A2A allows us to integrate that skill into our agent while treating it as a black box.

This black box approach is necessary when:

The skill is owned by another company (e.g. cars.com) and you don’t control how it’s built.
The skill is implemented using a different technology or is no longer actively maintained.

If you need to integrate a sub-agent someone else built into your 'unified' agent, and you have to treat it as a black box because you have no control over it, then A2A is a reasonable way to do that.

But remember that A2A is solving an organizational problem, not a technical one. Treating sub-agents as black boxes is a very difficult problem, and you should avoid this situation if you can. When you say "this is an A2A problem," you should say it with the same excitement as "this is NP-hard."

If you have the option, model sub-skills as modules, not separate microservices. It’s simpler to run and change. Part 2 of this series discusses the tradeoffs in more detail.

Ace Architecture: Integrating MCP and A2A with Rasa

Here is a sketch of our agent. To help users find and finance a new car, Ace combines three skills:

Web research and recommendation (using MCP) to advise which car model makes the most sense for this user.
Searching for cars at local dealerships (using A2A to delegate this to a cars.com agent).
Qualifying users for loans and providing personalized offers, implemented using process calling (a CALM flow).

As a first prototype, we wired MCP and A2A through custom actions. In Part 3 of this series we’ll show the native support for MCP and A2A we added in Rasa 3.14, which is simpler to use and more powerful.

For now, we’re focused on using Rasa to orchestrate the conversation: letting users switch smoothly between these topics while each skill receives the relevant context. This shared context contains:

Structured data stored in slots
The transcript of the conversation so far
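As an illustration of these two ingredients, the sketch below models the shared context as a small container and hands each skill only the slots it asks for plus the most recent turns. `SharedContext` and `view_for_skill` are hypothetical names for this example; in Rasa the conversation tracker plays this role.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class SharedContext:
    """Hypothetical container for what the skills are allowed to see."""

    slots: Dict[str, Any] = field(default_factory=dict)
    # Each turn is stored as (speaker, text).
    transcript: List[Tuple[str, str]] = field(default_factory=list)

    def add_turn(self, speaker: str, text: str) -> None:
        self.transcript.append((speaker, text))

    def view_for_skill(self, slot_names: List[str], last_n_turns: int = 4) -> Dict[str, Any]:
        """Return only the requested slots plus the most recent turns."""
        return {
            "slots": {n: self.slots[n] for n in slot_names if n in self.slots},
            "transcript": self.transcript[-last_n_turns:],
        }
```

Scoping the view per skill is a design choice in this sketch: it keeps a sub-agent from seeing data it has no use for.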

Implementing an A2A integration via a custom action

We created a BaseA2AAction base class which handles the specifics of the A2A protocol and its concept of a Task.

It:

Relays user messages to and from the cars.com agent.
Passes structured data (slot names/values) to and from the cars.com agent.
Manages the Task lifecycle (start, resume, complete).
Hands control back to Rasa when the cars.com agent
- returns input_required (it needs to ask the user something), or
- reports task_completed.
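The Task lifecycle described above can be sketched roughly as follows. This is not the BaseA2AAction implementation: `run_a2a_turn`, the `send` callable standing in for the remote agent, the reply dict shape, and the `agent_reply` key are all assumptions for illustration. The state names and the tracking slot names follow the ones used in this post.

```python
from typing import Any, Callable, Dict, Optional

# Stand-in for the remote agent: takes (task_id, user_message), returns a reply dict
# with the assumed keys "task_id", "state" and "text".
RemoteAgent = Callable[[Optional[str], str], Dict[str, Any]]


def run_a2a_turn(slots: Dict[str, Any], user_message: str, send: RemoteAgent) -> Dict[str, Any]:
    """Relay one user message and return the slot updates for this turn."""
    # Resume the existing Task if one is active, otherwise start a new one.
    task_id = slots.get("car_search_task_id") if slots.get("car_search_active") else None
    reply = send(task_id, user_message)

    updates: Dict[str, Any] = {"agent_reply": reply["text"]}
    if reply["state"] == "task_completed":
        # Task finished: clear the tracking slots so the next request starts fresh.
        updates.update({"car_search_active": False, "car_search_task_id": None})
    else:
        # For example input_required: remember the Task so the next message resumes it.
        updates.update({"car_search_active": True, "car_search_task_id": reply["task_id"]})
    return updates
```

Either way the function returns after a single exchange, which is what lets the outer conversation decide what happens next.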

To create our ActionSearchCars action, you specify:

where to find the agent card for the cars.com agent,
which output fields map to which slots,
and the slots used to track Task state.

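import json
import logging
from typing import Any, Dict, List

from rasa_sdk.events import SlotSet

# BaseA2AAction is the base class described above; import it from
# wherever it is defined in your project.

logger = logging.getLogger(__name__)


class ActionSearchCars(BaseA2AAction):
    """Car search action using A2A agent integration."""

    def name(self) -> str:
        return "action_search_cars"

    def get_agent_card_path(self) -> str:
        """Get path to car search agent card file."""
        return "agent_cards/car_search_agent.json"

    def create_slots_from_response(self, response_text: str) -> List[Dict[str, Any]]:
        """Parse car response and create Rasa slot updates."""
        try:
            data = json.loads(response_text)
        except json.JSONDecodeError as e:
            # A non-JSON reply carries no structured data to store.
            logger.warning(f"Failed to parse JSON response: {e}")
            return []

        slots = []
        # SlotSet(...) returns an event dict, hence the return type above.
        if isinstance(data, dict) and data.get("has_recommendation"):
            price = data.get("price")
            # Only set the slot when a price is present, so a missing
            # field does not overwrite a known value with None.
            if price is not None:
                slots.append(SlotSet("car_price", price))
        return slots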

    def get_conversation_state_slots(self) -> List[str]:
        """Override to use car-specific slot names."""
        return [
            "car_search_active",
            "car_search_task_id",
            "car_search_context_id"
        ]
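To see what the slot mapping does with different replies, here is a standalone version of the same parsing logic that runs without rasa_sdk. It returns plain (slot name, value) pairs instead of SlotSet events. The function name and the sample payloads in the usage note are made up for this illustration; only the `has_recommendation` and `price` fields come from the action above.

```python
import json
from typing import Any, List, Tuple


def slots_from_car_response(response_text: str) -> List[Tuple[str, Any]]:
    """Standalone version of the mapping logic: returns (slot_name, value) pairs."""
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError:
        # Not JSON (for example a free-text question): nothing to store.
        return []
    if isinstance(data, dict) and data.get("has_recommendation"):
        return [("car_price", data.get("price"))]
    return []
```

With these assumed payloads, `'{"has_recommendation": true, "price": 27500}'` yields a `car_price` update, while a plain-text reply or a JSON reply without a recommendation yields no updates at all.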


Integrating Web Search via MCP

Our web-research skill runs as an MCP sub-agent. Instead of custom glue code, we configure a small contract and let the agent handle the inner loop.

To configure it, we specify:

a system prompt describing the task (research models for this user and justify the picks)
the MCP server(s) and tools to use
the output mapping from the agent’s result to slots
optionally: how completion is detected and any runtime limits

import json
from typing import Any, Dict, List

from rasa_sdk.events import SlotSet

# BaseMCPAgentAction comes from the prototype's own code; import it
# from wherever it is defined in your project.


class ActionCarResearch(BaseMCPAgentAction):
    """Car research agent using MCP tools for search and car recommendations."""

    def name(self) -> str:
        return "action_car_research"

    def get_system_prompt(self) -> str:
        """Load system prompt from file."""
        with open("prompts/car_research.jinja2", 'r') as f:
            return f.read().strip()

    def get_mcp_servers(self) -> List[Dict[str, str]]:
        """MCP servers to connect to for car research."""
        return [
            {
                "name": "car_research_server",
                "type": "http",
                "url": "http://localhost:8000"
            }
        ]

    def create_slots_from_response(self, response: Dict[str, Any]) -> List[Dict[str, Any]]:
        """Create slot updates from car research response."""
        slots = []

        tool_results = response.get("tool_results", {})

        if "recommend_cars" in tool_results:
            recommendations_data = json.loads(tool_results["recommend_cars"])
            recommendations = recommendations_data.get("recommendations", [])
            if recommendations:
                car_models = [rec["model"] for rec in recommendations]
                slots.append(SlotSet("recommended_car_models", car_models))

        return slots

    def get_active_slot_name(self) -> str:
        """Get slot name for tracking car research active state."""
        return "car_research_active"


Understanding the Control Flow

This architecture has two loops:

an outer conversation loop owned by Rasa (waits for user input, runs one or more actions, updates state),
and an inner skill loop that runs inside a single action.

MCP inner loop: Implemented in the MCPAgent base class. The action calls MCP tools until it has a final answer or needs more from the user. It can ask via a user_input tool, or stop when an MCP server returns an elicitation. Control then returns to the flow.
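A rough sketch of this kind of inner loop is shown below. It is not the code from the MCPAgent base class: `run_inner_loop`, the `decide` callable (a stand-in for the LLM choosing the next step), the step tuples, and the `max_steps` limit are assumptions for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

# ("tool", (name, args)) | ("ask_user", question) | ("final", answer)
Step = Tuple[str, Any]


def run_inner_loop(
    decide: Callable[[List[Dict[str, Any]]], Step],
    tools: Dict[str, Callable[..., Any]],
    max_steps: int = 5,
) -> Dict[str, Any]:
    """Call tools until there is a final answer, a question for the user, or the step limit."""
    history: List[Dict[str, Any]] = []
    for _ in range(max_steps):
        kind, payload = decide(history)
        if kind == "tool":
            name, args = payload
            history.append({"tool": name, "result": tools[name](**args)})
        elif kind == "ask_user":
            # Needs more from the user: stop and hand control back to the flow.
            return {"status": "input_required", "message": payload, "history": history}
        else:
            # Anything else is treated as the final answer.
            return {"status": "completed", "message": payload, "history": history}
    # Runtime limit reached without an answer.
    return {"status": "step_limit", "message": None, "history": history}
```

Every exit path returns a status, so the caller always knows whether to wait for the user, continue the flow, or handle a run that hit its limit.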

A2A inner loop: The same looping idea, but it runs inside the cars.com agent, which is implemented by the third party and treated as a black box. The A2A protocol revolves around Tasks; when the agent reports task_completed or input_required, we break the inner loop and hand control back to Rasa.

Research, shopping, and financing are each wrapped in a CALM flow. The car_research and car_shopping flows each call a single action, pause if the action needs user input, and resume when slots are filled. CALM orchestrates across flows using slots and conversation history, so the conversation stays continuous.

flows:
  car_research:
    description: Help the user choose a car to buy by searching the web,
      and answer questions about specific cars as well as generic car-related questions.
      Trigger this proactively if the user isn't clear which car they want.
    steps:
      - action: action_car_research

  car_shopping:
    description: Find a specific car at a local dealer, once the user knows which model they want.
    steps:
      - action: action_search_cars


Stress-Testing our Orchestrator

The best way to understand this architecture is to kick the tires on it. In Part 3, we’ll share a runnable Ace implemented in Rasa 3.14. For now, here’s what we tested:

Interruptions: Start searching for a car, then ask a tangential research question like "What's its safety rating?" Watch how the assistant answers the question, then returns to the search exactly where you left off.

Context Passing: Ask for a recommendation ("What's a good electric car?") and immediately follow up with "Okay, find one near me." Notice how context from the research skill is passed automatically to the local search skill. Also see how important it is to deliberately discard information when it’s no longer relevant. This is very much a domain design question.
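One way to make "deliberately discard" explicit is to declare which slots depend on which, and reset the dependents whenever the upstream value changes. The sketch below assumes a hypothetical dependency map and helper. The slot names are the ones used earlier in this post, but which slots should actually depend on which is the domain design decision mentioned above.

```python
from typing import Any, Dict, List

# Hypothetical dependency map: slots that stop being valid when a slot changes.
DEPENDENT_SLOTS: Dict[str, List[str]] = {
    "recommended_car_models": ["car_price", "car_search_task_id", "car_search_context_id"],
}


def apply_update(slots: Dict[str, Any], name: str, value: Any) -> Dict[str, Any]:
    """Set a slot and reset anything derived from its previous value."""
    updated = dict(slots)  # leave the caller's dict untouched
    if updated.get(name) != value:
        for stale in DEPENDENT_SLOTS.get(name, []):
            updated[stale] = None
    updated[name] = value
    return updated
```

Here a price found for one model is dropped as soon as the recommendation changes, so the financing skill never quotes against a car the user has moved on from.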

Next Steps

The Ace prototype in this post orchestrates a conversation while integrating two kinds of "external" sub-agents:

The MCP web-research sub-agent runs as a module inside the main Rasa agent.
The cars.com agent runs as a separate service; Rasa talks to it over HTTP via A2A.

In Part 2, we dig into the modules vs. microservices trade-offs.

In Part 3, we’ll publish the Rasa 3.14 example you can run and test.

Attend our upcoming online event

As enterprises scale their assistants, they often face a tough reality: many critical capabilities live in third-party agents or legacy systems that can’t be refactored into one monolithic assistant. This is where Agent-to-Agent (A2A) comes in. Join us for Part 2 of the Rasa Orchestration Series: Agents Talking to Agents on Thursday, October 23, 2025 at 11:00am ET. Click here to register.

Access the replay of our first online event of this series

Click here to watch the replay of our Context Engineering for MCP webinar, which ran on Thursday, September 25, 2025 and featured Rasa CTO & Co-founder Alan Nichol and Staff Product Manager Daksh Varshneya.