First and foremost: Merry Christmas to you and your families!
It was a wild 2024 and I can only expect that 2025 won’t be that much different. But I hope we will be building better and more robust agents in 2025. For that, I believe multi-agent orchestration to be a key value driver.
Multi-agent orchestration
When orchestrating a workflow between an organization of multiple agents, ensuring a clear way to access and use data is critical to building a robust system. The problem occurs in many real-life (a.k.a., production) scenarios where context needs to be maintained over long-running, multi-agent, multi-turn interactions. It should be obvious to everyone that you can’t naïvely embed the whole database into the prompt context at every execution loop across several data sources.
“Model Context Protocol (MCP) is an open protocol that enables seamless integration between LLM applications and external data sources and tools. Whether you're building an AI-powered IDE, enhancing a chat interface, or creating custom AI workflows, MCP provides a standardized way to connect LLMs with the context they need.”
I still remember when we built systems that needed to be efficient, lightweight, and easy to debug. Anthropic’s MCP is a recently launched protocol that aims to provide a simple and structured way to define, track, and update the context in which a model operates, with the hope of ensuring that models can access relevant information without exceeding token limits or losing track of prior exchanges.
Here is how it would tie into a traditional ReAct agent framework:
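As a minimal sketch, assuming a hypothetical planner and an MCP-style client (llm.plan and mcp_client.call_tool are made-up stand-ins, not real APIs), the loop could look like this:

def react_loop(llm, mcp_client, task: str, max_steps: int = 10):
    """Toy ReAct loop: reason, act via MCP, observe, repeat."""
    scratchpad = []  # running Thought/Action/Observation trace
    for _ in range(max_steps):
        # Reason: the LLM proposes the next thought and action from the trace
        thought, action, args = llm.plan(task, scratchpad)
        if action == "finish":
            return args  # final answer
        # Act: resolve the tool or resource through the MCP layer instead of
        # re-embedding every data source into the prompt on each iteration
        observation = mcp_client.call_tool(action, args)
        scratchpad.append((thought, action, observation))
    return None  # no answer within the step budget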
In my opinion, this would be an efficient process that reduces redundant information across extended sessions, especially when considering multi-modal network or task reasoning, which requires at least an order of magnitude more interactions between the LLM and the rest of the agent, as I have explored in some depth throughout this publication.
When Ilya Sutskever mentioned that agentic workflows would be less predictable, and that pre-training would hit a natural limit because models trained on the same data converge to the same point while solving similar underlying representational challenges in language understanding and generation, he was right in my opinion. Agents will need to be able to pick up new knowledge on the fly and update their context based on it. I still vividly recall the moment when I asked my agent, Matt, which number was bigger (9.9 or 9.11): he replied “I will use the calculator tool”, passed 9.9 < 9.11 as parameters, and concluded the right answer correctly. Although I didn’t expect it at the time, he answered the prompt correctly, even though the way he chose to answer it was unexpected.
Therefore, I think that being able to use tools will substantially augment an agent’s capability to perform a certain task with intent and accuracy, thus reducing hallucinations. MCP aims to become the industry standard “Open Tools Protocol”.
Why are tools important for agents?
Even as model capabilities are improving, agents, not unlike us humans, will have the following problems:
1. If we provide several tools but no deterministic router, we cannot ensure that a specific query is answered through a specific tool, which limits scaling.
2. If we provide one tool per agent, we need one agent per data source, which limits scaling and generalization while increasing workflow integrations and cost.
3. With every additional data source, another tool implementation is needed, which compounds problems (1) and (2).
Transformer Tools
This is an overly simplistic example, but in the transformers framework, you would define a custom tool by importing the tool decorator from the transformers library, wrapping your logic in a function, and applying the decorator:
from transformers import tool

@tool
def add(a: int, b: int) -> int:
    """
    This is a tool that adds two numbers.

    Args:
        a: The first number to add.
        b: The second number to add.
    """
    return a + b
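For completeness, handing the tool to an agent looked roughly like this in the transformers agents API as of late 2024; class names have been shifting between releases, so treat this as a sketch rather than gospel:

from transformers.agents import HfApiEngine, ReactCodeAgent

# Sketch: hand the custom tool to a ReAct-style agent (API as of late 2024)
agent = ReactCodeAgent(tools=[add], llm_engine=HfApiEngine())
agent.run("What is 2 + 3?")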
It’s a straightforward, borderline trivial example, but it illustrates that you can do whatever you want within the constraints of the function definition.
The problem usually arises at three different spots within tool selection and routing.
Can the LLM recognize and call the new function at the right priority?
Can the LLM parse the response returned from the tool correctly?
Can the LLM use the tool's response as input to memory?
Tool selection and routing represent critical architectural challenges in developing effective agent systems. In my mind, the ideal agent framework requires a sophisticated mechanism that can dynamically and reliably select and invoke the most appropriate tool based on the nuanced requirements of each specific query or task.
Since this usually necessitates developing intelligent routing algorithms that can assess the contextual complexity of a problem, map it to the most suitable tool or combination of tools, and execute the task with high precision and efficiency, it introduces complexity into our agent architecture that again makes developing PoCs, testing, and scaling substantially more difficult. And that does not even take into account that this system might learn and adapt its tool selection strategies over time. Just consider which components of your agent architecture you would need to adjust going forward whenever the underlying data changes.
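To make that concrete, here is a toy deterministic router, a minimal sketch of the kind of mechanism described above; a production version would use embeddings or an LLM call rather than keyword matching, and the tool names are placeholders:

def route(query: str, registry: dict) -> str:
    """Map a query to a registered tool name by keyword, else fall back."""
    q = query.lower()
    for keyword, tool_name in registry.items():
        if keyword in q:
            return tool_name
    return "fallback_llm"  # no tool matched: let the model answer directly

registry = {"weather": "fetch_weather", "bmi": "calculate_bmi"}
print(route("What's the weather in Berlin?", registry))  # -> fetch_weather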
Why is access to this data so important?
Imagine you had stopped learning at 16 years old. You might be able to operate on a reasonably basic level, but you might not be able to correctly contextualize the current state of the world. Likewise, agents will only provide real added value if they can operate on the newest set of information without re-training the full LLM, whether that is the most recent accounting data for an accounting co-pilot, credit bureau data for a credit underwriting agent, or just recent news or weather information. Tool use provides an elegant way to source recent data and current events, or to answer simple queries such as a request for a recent weather forecast, all within the same interaction.
However, the problem of non-determinism for tool selection persists.
The Anthropic MCP Concept
Well, now there is, at least according to Anthropic, a solution. As shown in the image above, I understand MCP as an “Internet for Agents”. MCP is an open-source communication layer between data sources and data consumers whose target audience is practitioners like you and me.
The basic concept of MCP appears simple:
Resources expose data similar to an API “GET” endpoint.
Tools execute procedures similar to an API “POST” request.
Prompts define interaction patterns in the form of reusable templates.
Maybe, in a way, MCP is comparable to the Transmission Control Protocol (TCP) that drives our IP-based communication, if we break MCP down like this:
Purpose: Facilitates communication between Large Language Models (LLMs) and external tools/data.
Communication model: Client-server model using JSON-RPC for structured requests and responses.
Scope: Designed for augmenting LLMs with external data or computational tools.
Transport layer: Operates at the application layer, abstracting tool access for LLMs.
Reliability: Relies on HTTP or stdio for transport but focuses on structured, AI-specific interactions.
Flexibility: Flexible and extensible, allowing integration with various tools, databases, and APIs.
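To make the JSON-RPC communication model tangible: a tool invocation on the wire looks roughly like this, expressed here as a Python dict (the method and field names follow my reading of the current spec and may drift):

# Roughly what a tool invocation looks like on the wire (JSON-RPC 2.0)
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",  # invoke a tool exposed by an MCP server
    "params": {"name": "add", "arguments": {"a": 2, "b": 3}},
}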
The main difference to traditional tool implementations is that we can integrate against a standard protocol instead of providing a unique connector for each tool.
This is especially important as the MCP is aimed at maintaining context between the different tools and datasets throughout a work task.
Open Source MCP
One of the best things Anthropic has done is to create a dev-friendly open-source repository of MCP support material like tutorials, documentation, and a community that helps us implement MCP-based solutions. For example, the repo also includes pre-built MCP servers for popular enterprise systems like Google Drive, Slack, GitHub, Git, Postgres, and Puppeteer.
So far so good? How can we implement it and what are the components?
Standard I/O (stdio)
While MCP uses JSON-RPC 2.0 as its standard messaging format, it also implements basic stdio functionality that serves as a foundational interface for managing inputs, outputs, and communication workflows between models and external systems. Through stdin, MCP can receive model prompts, user inputs, or contextual instructions in a standardized stream, ensuring consistency and flexibility in interactions. Outputs, including generated completions and model responses, are managed via stdout, enabling easy integration with downstream applications or monitoring tools. Error messages or system alerts are routed through stderr, ensuring robust debugging and diagnostics.
This is particularly useful for local integrations and command-line tools.
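For illustration, here is roughly how a client would spawn a server as a subprocess and talk to it over stdio, based on my reading of the Python SDK’s client example (server.py is the demo server we will define in the Servers section below):

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the demo server as a subprocess and communicate over stdin/stdout
server_params = StdioServerParameters(command="python", args=["server.py"])

async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()  # JSON-RPC handshake
            result = await session.call_tool("add", arguments={"a": 2, "b": 3})
            print(result)

asyncio.run(main())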
Servers
As shown above, MCP servers act as intermediaries that expose data resources; typical examples are flat files, databases, or APIs. Servers accept client requests and provide access to resources through JSON-RPC, thus ensuring secure and efficient communication.
I am wondering whether push notifications will be possible at some point in time, so that the server can inform the agent of changes in logs or database records; this would significantly enhance the responsiveness of AI architectures.
As per the default template, you would set up an MCP server like this:
# server.py
from mcp.server.fastmcp import FastMCP

# Create an MCP server
mcp = FastMCP("Demo")

# Add an addition tool
@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

# Add a dynamic greeting resource
@mcp.resource("greeting://{name}")
def get_greeting(name: str) -> str:
    """Get a personalized greeting"""
    return f"Hello, {name}!"
Please note that the tool definition is similar to the setup for a transformers-based tool. If we remember that a tool is similar to a “POST” and a resource is similar to a “GET”, it immediately makes sense.
Resources
Now, resources are how you expose your data to the agent. As we assume them to be similar to GET endpoints in a REST API, we can follow a similar paradigm when defining them.
@mcp.resource("config://app")
def get_config() -> str:
"""Static configuration data"""
return "App configuration here"
@mcp.resource("users://{user_id}/profile")
def get_user_profile(user_id: str) -> str:
"""Dynamic user data"""
return f"Profile data for user {user_id}"
However, we need to ensure that they only provide data and don’t perform significant computation or have side effects.
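On the client side, reading such a resource should then be a single session call; a sketch, assuming the initialized session from the stdio example earlier and the hypothetical profile resource above:

# Inside an async function with an initialized ClientSession `session`
content = await session.read_resource("users://42/profile")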
Tools
Within agent definitions, we usually provide the available tools to the agent as an array. The agent can then decide which tool to choose and which action to take. The difference in the context of MCP, though, is that tools are expected to perform computation and to have side effects.
We define tools like this:
import httpx  # needed for the async API call below

@mcp.tool()
def calculate_bmi(weight_kg: float, height_m: float) -> float:
    """Calculate BMI given weight in kg and height in meters"""
    return weight_kg / (height_m ** 2)

@mcp.tool()
async def fetch_weather(city: str) -> str:
    """Fetch current weather for a city"""
    async with httpx.AsyncClient() as client:
        response = await client.get(f"https://api.weather.com/{city}")
        return response.text
The second one especially, an API call using the popular httpx library, adds the complexity of an unavailable network or rate limits to the tool definition.
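A slightly more defensive variant makes those failure modes explicit; a minimal sketch using the same placeholder endpoint (the tool name fetch_weather_safe is mine):

@mcp.tool()
async def fetch_weather_safe(city: str) -> str:
    """Fetch current weather for a city, with a timeout and error handling"""
    try:
        async with httpx.AsyncClient(timeout=5.0) as client:
            response = await client.get(f"https://api.weather.com/{city}")
            response.raise_for_status()  # surfaces 4xx/5xx, e.g., rate limits
            return response.text
    except httpx.TimeoutException:
        return "Weather service timed out, please try again later."
    except httpx.HTTPStatusError as exc:
        return f"Weather service returned an error: {exc.response.status_code}"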
Prompts
Prompts are defined with the @mcp.prompt() decorator and allow for reusable templates that guide your agent’s LLM brain to interact with your server effectively:
# The message classes ship with the Python SDK
from mcp.server.fastmcp.prompts.base import Message, UserMessage, AssistantMessage

@mcp.prompt()
def review_code(code: str) -> str:
    return f"Please review this code:\n\n{code}"

@mcp.prompt()
def debug_error(error: str) -> list[Message]:
    return [
        UserMessage("I'm seeing this error:"),
        UserMessage(error),
        AssistantMessage("I'll help debug that. What have you tried so far?")
    ]
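Fetching the rendered prompt from a client session should then look like this, as far as I can tell from the SDK (again assuming the session from the stdio example):

# Inside an async function with an initialized ClientSession `session`
prompt = await session.get_prompt(
    "debug_error", arguments={"error": "KeyError: 'user_id'"}
)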
Through that, it becomes apparent that MCP has a clean and straightforward service definition paradigm. It’s an interesting thought exercise to think of it as an Internet for Agents and to consider its ramifications.
In Closing
What I found particularly interesting is that, compared to Transformers, MCP is designed with a specific focus on external integration and modularity rather than model architecture or training, even if it feels a bit heavyweight to implement. While Transformers provide a basic foundational framework for processing tools, MCP extends this by addressing how these models interact with external tools, resources, and data sources in a structured and, hopefully, scalable way.
I think the predefined MCP cookbooks, specifically, will help a lot. Of course, if MCP only works with Anthropic’s LLMs, it’s not that useful, but how other LLMs interact with MCP remains to be evaluated. I will keep you posted on this.
Please enjoy the rest of your Christmas break to the max!
“Your user can't help you now, my little program!”