
Intermediate Agent Messages with Pydantic AI

July 31, 2025

2025 is shaping up to be the year of agentic AI.

We’re moving beyond simple, single-shot LLM queries and into a world where AI systems can access tools, interact with databases, and perform complex, multi-step tasks to answer our requests. While this is a huge leap forward in capabilities, it also introduces a new set of challenges - particularly when it comes to user experience.

Large Language Models have only been mainstream for a few years, yet users are quickly getting accustomed to their fast response times. But what happens when an AI agent needs to make a series of API calls, query a database, and then reason about the results?

The user is left waiting, possibly staring at a loading spinner, with no idea what’s happening behind the scenes. This “black box” experience can be frustrating and erode trust. To mitigate this, we need to provide a stream of intermediate messages, ensuring the user is updated on the agent’s progress.

This article explores a practical approach to implement this “intermediate-messages” functionality using Pydantic AI, allowing you to build more transparent and user-friendly AI agents.

The Rise of Multi-Agent Systems

Modern AI is increasingly moving towards multi-agent systems.

Instead of a single, monolithic agent trying to do everything, complex tasks are broken down and distributed to a team of specialized agents. For example, one agent might be an expert at searching the web, another at querying a SQL database, and a third at summarizing text. This collaborative approach allows for more sophisticated and robust problem-solving.

Let’s take the task of planning a trip as an example. An agentic system might orchestrate this process as follows:

  • Main Controller: Decomposes the user’s request, “Plan a weekend trip to Paris”, into sub-tasks.
  • Flight Agent: Is tasked with finding the best flight options. It reports back, “I am searching for flights to CDG airport…”
  • Hotel Agent: Concurrently searches for hotels near the city center. It might update with “I am filtering hotels based on your preference for a 4-star rating…”
  • Itinerary Agent: Waits for the outputs from the other agents to assemble a final travel plan.
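
The orchestration above can be sketched with plain asyncio (everything here is a hypothetical stand-in, not Pydantic AI code): the sub-agents run concurrently and push progress messages through a shared report callback while the controller waits for their results.

```python
import asyncio


async def flight_agent(report) -> str:
    # Hypothetical sub-agent: reports progress, then returns its result.
    await report("Searching for flights to CDG airport...")
    return "Round-trip flight to CDG, Sat 08:10"


async def hotel_agent(report) -> str:
    await report("Filtering hotels near the city center...")
    return "Hotel: 4-star, 1.2 km from the Louvre"


async def main_controller(request: str) -> str:
    updates: list[str] = []

    async def report(msg: str) -> None:
        # In a real system this would be pushed to the user immediately.
        updates.append(msg)

    # Run the specialized agents concurrently, then assemble the plan.
    flight, hotel = await asyncio.gather(
        flight_agent(report), hotel_agent(report))
    print(updates)
    return f"{request}: {flight}; {hotel}"


print(asyncio.run(main_controller("Weekend trip to Paris")))
```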

While this modularity is powerful, it underscores the importance of real-time feedback: users should be able to identify which part of the system is handling their request at any given time.

The User Experience Imperative

Providing status updates is not a new idea: it’s a recognized best practice for building complex AI assistants. All major providers have already implemented solutions for this, since transparency is key to managing user expectations.

When an agent is performing long-running tasks, these updates are essential: they transform a silent, anxious wait into an interactive, confidence-building experience.

Pydantic AI

Pydantic AI has gained significant traction in the AI development community, and for good reason: it brings the rigor of software engineering to the often-unpredictable world of LLMs.

By defining clear input and output schemas, we can ensure that our AI agents are not only powerful but also reliable and easy to maintain. It encourages a “schema-first” design, which means you define the data structures your agent will use before you even start prompting the LLM. This has several advantages:

  • Clarity: Everyone on the team knows exactly what data the agent expects and produces.
  • Testability: You can write unit tests for your agent’s components, just like any other piece of software.
  • Scalability: A well-structured codebase is much easier to expand and adapt over time.
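
As a minimal illustration of schema-first design (the WeatherReport model below is an invented example, not part of Pydantic AI, and assumes Pydantic v2): the schema is the contract, so valid data parses into a typed object while malformed output fails loudly.

```python
from pydantic import BaseModel, Field, ValidationError


# Define the agent's output contract before writing any prompts.
class WeatherReport(BaseModel):
    location: str
    condition: str = Field(description="e.g. 'sunny', 'rainy'")
    temperature_c: float


# Well-formed data parses into a typed, validated object...
report = WeatherReport.model_validate(
    {"location": "New York", "condition": "sunny", "temperature_c": 25.0}
)
assert report.temperature_c == 25.0

# ...while malformed LLM output raises instead of propagating silently.
try:
    WeatherReport.model_validate({"location": "New York"})
except ValidationError as exc:
    print("Rejected malformed output:", exc.error_count(), "errors")
```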

The challenge with Pydantic AI, despite its strengths in schema definition and output parsing, is the lack of a native, built-in mechanism for managing intermediate agent messages.

Implementing Intermediate Messages with Pydantic AI

While developing Connhex Copilot, we came up with a general solution for this problem. The key insight?

Treat status updates not as a special case, but just as another tool that the agent can choose to use.

The agent is already designed to call tools: we simply need to provide it with one specifically for communicating its progress. Let’s walk through the implementation step-by-step.

Step 1: Define a Schema for Status Updates

First, we define a Pydantic model for our status messages. This ensures that every update is structured and predictable.

from pydantic import BaseModel, Field


class CurrentStatusUpdate(BaseModel):
    status: str = Field(
        description="Short description of the current activity the agent is performing.")
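
A quick way to see why the description matters (assuming Pydantic v2): it is carried into the model's JSON schema, which is what ends up in the tool definition the LLM sees.

```python
from pydantic import BaseModel, Field


class CurrentStatusUpdate(BaseModel):
    status: str = Field(
        description="Short description of the current activity the agent is performing.")


# The Field description becomes part of the generated JSON schema.
schema = CurrentStatusUpdate.model_json_schema()
print(schema["properties"]["status"]["description"])
```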

Step 2: Create the Tools

Next, we define the functions our agent can use: two example utility tools (get_weather, get_temperature) and our special status update tool.

The emit_status_update function is simple: it takes the CurrentStatusUpdate object and returns the status string. This function becomes the tool that the agent calls to send a message back to the user before it has the final answer.

def get_weather(location: str) -> str:
    """
    Get the current weather in a given location.
    """

    return "The weather in " + location + " is sunny."


def get_temperature(location: str) -> str:
    """
    Get the current temperature in a given location.
    """

    return "The temperature in " + location + " is 25°C."


def emit_status_update(update: CurrentStatusUpdate) -> str:
    """
    Emit the current status update as a string.

    Args:
        update (CurrentStatusUpdate): The current status update object.

    Returns:
        str: The status description.
    """

    return update.status
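
Since these are plain Python functions, they can be sanity-checked without any LLM in the loop — a small payoff of the testability point made earlier (the model is re-declared here to keep the snippet self-contained):

```python
from pydantic import BaseModel, Field


class CurrentStatusUpdate(BaseModel):
    status: str = Field(
        description="Short description of the current activity the agent is performing.")


def get_weather(location: str) -> str:
    """Get the current weather in a given location."""
    return "The weather in " + location + " is sunny."


def emit_status_update(update: CurrentStatusUpdate) -> str:
    """Emit the current status update as a string."""
    return update.status


# Plain function calls - no agent, no model, no network.
assert get_weather("New York") == "The weather in New York is sunny."
assert emit_status_update(
    CurrentStatusUpdate(status="Fetching weather...")) == "Fetching weather..."
```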

Step 3: Create and Instruct the Agent

This is where everything comes together. We initialize our Agent, giving it the tools we created and, most importantly, a clear instruction to use the emit_status_update tool.

Then, we iterate over the agent’s graph using agent.iter(). This asynchronous iterator lets us inspect the agent’s process step-by-step.

import asyncio

from pydantic_ai import Agent
from pydantic_ai.messages import FunctionToolResultEvent


async def main():
    agent = Agent[None, str](
        model="openai:gpt-4.1-mini",
        instructions="You're a weather assistant. Use the `emit_status_update` tool to send the current status of the processing to the user.",
        tools=[get_weather, get_temperature, emit_status_update],
        output_type=str,
    )
    user_prompt = "I'd like to know what the weather and temperature are like in New York."
    status_updates: list[str] = []
    run_result = ""

    async with agent.iter(user_prompt) as run:
        async for node in run:
            if Agent.is_call_tools_node(node):
                async with node.stream(run.ctx) as handle_stream:
                    async for event in handle_stream:
                        if isinstance(event, FunctionToolResultEvent):
                            tool_name = event.result.tool_name
                            print("Model is calling tool:", tool_name)
                            if tool_name == "emit_status_update":
                                status_updates.append(event.result.content)
            if Agent.is_end_node(node):
                run_result = run.result.output

    print("Status updates:", status_updates)
    print("Model final response:", run_result)


if __name__ == "__main__":
    asyncio.run(main())

Putting It All Together

You can find the complete, runnable script in this Gist. When executing it, you get a clear separation between the intermediate progress updates and the final, complete answer.

Here is an example of the output:

Model is calling tool: emit_status_update
Model is calling tool: get_weather
Model is calling tool: get_temperature
Status updates: ['Fetching current weather and temperature for New York']
Model final response: The weather in New York is currently sunny, and the temperature is 25°C. If you need any more details or updates, feel free to ask!

Conclusion

The approach detailed in this article is a simple yet powerful pattern for solving this “intermediate-messages” problem within the Pydantic AI framework.

However, it’s important to view this as a foundational starting point. In our example, we simply collected the status updates into a Python list. In a real-world application, the crucial next step is to get these messages to the end-user in real time. When your application intercepts a call to emit_status_update, it should immediately push that message to the client - separated from the final answer. This can be implemented in multiple ways, for example using WebSockets or Server-Sent Events (SSE).
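
As a rough sketch of that decoupling (using an asyncio.Queue as a stand-in for an SSE or WebSocket channel; run_agent, stream_to_client, and the messages are invented for illustration):

```python
import asyncio

# Sentinel marking the end of the update stream.
DONE = object()


async def run_agent(queue: asyncio.Queue) -> str:
    # Stand-in for the agent run loop: whenever emit_status_update is
    # intercepted, push the message onto the queue immediately.
    await queue.put("Fetching current weather and temperature for New York")
    await asyncio.sleep(0)  # the real run would await model/tool calls here
    await queue.put(DONE)
    return "The weather in New York is sunny, and the temperature is 25°C."


async def stream_to_client(queue: asyncio.Queue) -> list[str]:
    # In production this coroutine would write SSE events or WebSocket
    # frames; here it just collects the messages in arrival order.
    sent: list[str] = []
    while (msg := await queue.get()) is not DONE:
        sent.append(msg)
    return sent


async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    # The client receives updates while the agent is still working.
    final, updates = await asyncio.gather(
        run_agent(queue), stream_to_client(queue))
    print("updates:", updates)
    print("final:", final)


asyncio.run(main())
```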

By decoupling the intermediate feedback from the final result, you can transform a frustrating wait into an engaging and transparent interaction.

👋 Hi there! We are Compiuta, an Italian software company focused on building products at the intersection between Industrial IoT and AI - our greatest hit so far is Connhex.

This is our blog, where we share updates on our products, things we've learnt along the way, and stories from the journey of building a company from scratch.
