How AI Coding Agents Communicate: MCP, Tool Calling, and the 97M Download Standard
MCP hit 97M monthly SDK downloads. 10,000+ active servers. OpenAI, Google, Microsoft all adopted it. Here's how agent communication actually works in 2026.
The Model Context Protocol (MCP) went from Anthropic's internal standard to 97M+ monthly SDK downloads in 14 months. 10,000+ active public servers. OpenAI, Google, Microsoft, and AWS all adopted it. In November 2025, Anthropic donated MCP to the Linux Foundation.
This is the infrastructure layer for AI agent communication in 2026. Here's how it works and why it matters.
The Communication Problem
A coding agent needs to:
- Read files from your codebase
- Search across thousands of files
- Execute shell commands
- Edit code in place
- Coordinate with other agents
- Report status to users
Each requires a different communication pattern. Get it wrong, and your agent either can't function or burns through its context window and token budget.
MCP by the Numbers
The Model Context Protocol reached critical mass in 2025:
| Metric | November 2024 | November 2025 | Growth |
|---|---|---|---|
| SDK Downloads (monthly) | ~100K | 97M+ | 970x |
| MCP Servers | ~10 | 10,000+ | 1,000x |
| Registry Entries | 0 | ~2,000 | — |
| Discord Contributors | 0 | 2,900+ | — |
| New Contributors/Week | 0 | 100+ | — |
Who Adopted MCP
- OpenAI: ChatGPT desktop app, March 2025
- Google DeepMind: Gemini integration
- Microsoft: Copilot, VS Code extensions
- AWS, Google Cloud, Cloudflare: Enterprise deployment support
- Cursor, Claude Desktop, JetBrains: Native IDE support
In November 2025, Anthropic donated MCP to the Agentic AI Foundation (AAIF), a Linux Foundation project it co-founded with Block and OpenAI.
Pattern 1: Tool Calling (Function Calling)
The foundational pattern. The LLM decides to use a tool and structures its request according to a schema.
LLM → "Call read_file with path=/src/auth.ts" → Tool → Result → LLM
Berkeley Function Calling Leaderboard (BFCL)
The de facto standard for measuring tool calling accuracy:
| Model | BFCL Score | Notes |
|---|---|---|
| GLM-4.5 (FC) | 70.85% | Top performer |
| Claude Opus 4.1 | 70.36% | Close second |
| Claude Sonnet 4 | 70.29% | Best cost/performance |
| GPT-5 | 59.22% | Struggles on BFCL |
| Qwen-3-Coder | ~65% | Best open-weight |
Tool Calling Latency: The Docker Study
Docker tested 21 models across 3,570 test cases. The tradeoffs are clear:
| Model | F1 Score | Avg Latency |
|---|---|---|
| GPT-4 | 0.974 | ~5 seconds |
| Claude 3 Haiku | 0.933 | 3.56 seconds |
| Qwen 3 14B (local) | 0.971 | 142 seconds |
| Qwen 3 8B (local) | 0.933 | 84 seconds |
The takeaway: accuracy costs latency, and local inference costs far more of it. Claude 3 Haiku offers the best balance for latency-sensitive applications: 0.933 F1 in 3.56 seconds.
Implementation
Anthropic Tool Use
from anthropic import Anthropic
client = Anthropic()
tools = [{
"name": "read_file",
"description": "Read contents of a file",
"input_schema": {
"type": "object",
"properties": {
"path": {"type": "string", "description": "File path to read"}
},
"required": ["path"]
}
}]
response = client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=4096,
tools=tools,
messages=[{"role": "user", "content": "Read the auth.ts file"}]
)
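When Claude decides to call the tool, the response contains a tool_use content block; the agent runs the tool and sends the output back as a tool_result. A minimal sketch of that round trip, where read_local_file is a placeholder for your own file-reading helper:
for block in response.content:
    if block.type == "tool_use" and block.name == "read_file":
        file_text = read_local_file(block.input["path"])  # placeholder file reader
        followup = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            tools=tools,
            messages=[
                {"role": "user", "content": "Read the auth.ts file"},
                {"role": "assistant", "content": response.content},
                {"role": "user", "content": [{
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": file_text,
                }]},
            ],
        )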
OpenAI Function Calling
from openai import OpenAI
client = OpenAI()
tools = [{
"type": "function",
"function": {
"name": "read_file",
"parameters": {
"type": "object",
"properties": {
"path": {"type": "string"}
},
"required": ["path"]
}
}
}]
response = client.chat.completions.create(
model="gpt-4o",
tools=tools,
messages=[{"role": "user", "content": "Read the auth.ts file"}]
)
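OpenAI returns the structured call in message.tool_calls, with arguments as a JSON string. A minimal sketch of executing the call and sending the result back (read_local_file is again a placeholder):
import json

message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)
    file_text = read_local_file(args["path"])  # placeholder file reader
    followup = client.chat.completions.create(
        model="gpt-4o",
        tools=tools,
        messages=[
            {"role": "user", "content": "Read the auth.ts file"},
            message,  # the assistant turn containing the tool call
            {"role": "tool", "tool_call_id": call.id, "content": file_text},
        ],
    )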
Pattern 2: Model Context Protocol (MCP)
MCP is the universal standard for agent-to-tool communication. Instead of each tool having custom integration, MCP provides a protocol that works across all agents.
The Problem MCP Solves
Without MCP:
Cursor → Custom GitHub integration
Windsurf → Custom GitHub integration
Copilot → Custom GitHub integration
Claude → Custom GitHub integration
With MCP:
Any Agent → MCP → GitHub MCP Server
One integration serves all agents.
MCP Architecture
MCP defines four primitives:
- Resources: Things agents can read (files, databases, APIs)
- Tools: Actions agents can take
- Prompts: Templates for common interactions
- Sampling: Requesting LLM completions
Agent ←→ MCP Client ←→ MCP Server ←→ Resources/Tools
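Under the hood, every MCP exchange is a JSON-RPC 2.0 message carried over a transport such as stdio or HTTP. A tools/call request, sketched here as a Python dict for illustration (the tool name and arguments are made up):
# Rough shape of an MCP tools/call request (JSON-RPC 2.0)
tool_call_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_files",
        "arguments": {"pattern": "*.ts"},
    },
}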
MCP Server Example
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { ListToolsRequestSchema, ReadResourceRequestSchema } from "@modelcontextprotocol/sdk/types.js";
import { promises as fs } from "node:fs";
const server = new Server({
name: "filesystem",
version: "1.0.0"
}, {
capabilities: {
resources: {},
tools: {}
}
});
// Expose file reading
server.setRequestHandler("resources/read", async (request) => {
const content = await fs.readFile(request.params.uri, "utf-8");
return {
contents: [{
uri: request.params.uri,
mimeType: "text/plain",
text: content
}]
};
});
// Expose file search tool
server.setRequestHandler("tools/list", async () => {
return {
tools: [{
name: "search_files",
description: "Search for files matching a pattern",
inputSchema: {
type: "object",
properties: {
pattern: { type: "string" }
}
}
}]
};
});
const transport = new StdioServerTransport();
await server.connect(transport);
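On the client side, an agent launches the server, discovers its tools, and calls them over the same protocol. A rough sketch using the Python MCP SDK; the node launch command and the filesystem-server.js filename are assumptions for illustration:
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Assumed command for launching the filesystem server defined above
server_params = StdioServerParameters(command="node", args=["filesystem-server.js"])

async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()  # discovers search_files
            result = await session.call_tool("search_files", {"pattern": "*.ts"})
            print(tools, result)

asyncio.run(main())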
Enterprise MCP Deployment
Major cloud providers now support MCP deployment:
- AWS: Lambda-based MCP servers
- Google Cloud: Cloud Run MCP hosting
- Cloudflare: Workers-based MCP servers
- Azure: Container Apps integration
Pattern 3: Inter-Agent Messaging
When multiple agents collaborate, they need to share information. Three main approaches:
Shared State
All agents read and write to a common state object:
from dataclasses import dataclass, field
from typing import Dict, List, Any
@dataclass
class SharedState:
files_modified: List[str] = field(default_factory=list)
current_task: str = ""
context: Dict[str, Any] = field(default_factory=dict)
errors: List[str] = field(default_factory=list)
state = SharedState()
# Agent A writes
state.files_modified.append("auth.ts")
state.context["auth_module"] = "validated"
# Agent B reads
if "auth.ts" in state.files_modified:
run_tests("auth")
Best for: Tightly coupled agents with clear handoffs. LangGraph uses this pattern.
Message Passing
Agents send discrete messages to each other:
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

@dataclass
class Message:
    sender: str
    type: str
    payload: Any

class MessageBus:
    def __init__(self):
        self.handlers: Dict[str, List[Callable]] = {}
    def subscribe(self, message_type: str, handler: Callable):
        self.handlers.setdefault(message_type, []).append(handler)
    def publish(self, message: Message):
        for handler in self.handlers.get(message.type, []):
            handler(message)

bus = MessageBus()

# Agent B subscribes before messages start flowing
def handle_changes(message: Message):
    run_tests(message.payload)

bus.subscribe("files_changed", handle_changes)

# Agent A sends
bus.publish(Message(
    sender="coding_agent",
    type="files_changed",
    payload=["auth.ts", "user.ts"]
))
Best for: Loosely coupled agents, event-driven workflows. AutoGen uses this pattern.
Blackboard Pattern
A shared "blackboard" where agents post and read information:
from typing import Any, Dict, List

class Blackboard:
    def __init__(self):
        self.entries: Dict[str, List[Any]] = {}
    def post(self, category: str, content: Any):
        self.entries.setdefault(category, []).append(content)
    def read_all(self, category: str) -> List[Any]:
        return self.entries.get(category, [])
    def read_latest(self, category: str) -> Any:
        entries = self.entries.get(category, [])
        return entries[-1] if entries else None
blackboard = Blackboard()
# Any agent can write
blackboard.post("hypothesis", "The bug is in the auth module")
blackboard.post("evidence", "Stack trace points to line 42")
# Any agent can read
hypotheses = blackboard.read_all("hypothesis")
latest_evidence = blackboard.read_latest("evidence")
Best for: Agents that don't know about each other but contribute to shared goals.
| Pattern | Coupling | Complexity | Best For |
|---|---|---|---|
| Shared State | Tight | Low | Sequential workflows (LangGraph) |
| Message Passing | Loose | Medium | Event-driven (AutoGen) |
| Blackboard | None | High | Emergent collaboration |
Pattern 4: Human-in-the-Loop
Agents need to communicate with humans at decision points. Google's architecture guidance recommends this for high-stakes operations.
Approval Requests
async def request_approval(action: str, context: dict) -> bool:
message = f"""
Action: {action}
Files affected: {context.get('files', [])}
Changes: {context.get('summary', 'N/A')}
Approve? [y/n]
"""
response = await user_input(message)
return response.lower() in ['y', 'yes']
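A minimal sketch of wiring this gate into tool execution, pausing only for destructive operations (execute_tool is a placeholder for your dispatch function):
async def run_tool_with_gate(action: str, context: dict):
    # Destructive operations wait for human approval before running
    if action in {"edit_file", "delete_file", "run_shell"}:
        if not await request_approval(action, context):
            return {"status": "rejected_by_user"}
    return await execute_tool(action, context)  # placeholder dispatcher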
Progress Updates
def report_progress(status: str, completed: int, total: int):
emit_event("progress", {
"status": status,
"completed": completed,
"total": total,
"percentage": (completed / total) * 100
})
Error Escalation
async def escalate_error(error: Exception, context: dict) -> str:
message = f"""
Error: {error}
Context: {context.get('task', 'Unknown task')}
Options:
1. Retry with different approach
2. Skip this step
3. Abort task
Choice?
"""
return await user_input(message)
Optimizing Communication
Batching Tool Calls
Instead of sequential calls:
read_file(a.ts) → wait → read_file(b.ts) → wait → read_file(c.ts)
Batch them:
import asyncio
async def batch_read(paths: list[str]) -> list[str]:
tasks = [read_file(path) for path in paths]
return await asyncio.gather(*tasks)
# Single wait for all results
results = await batch_read(["a.ts", "b.ts", "c.ts"])
Speculative Execution
Predict likely next actions and pre-fetch:
def speculative_prefetch(context: dict):
# User is editing auth.ts
likely_next = predict_next_tools(context)
# ["read_file:user.ts", "search:validateToken"]
# Pre-fetch in background
for prediction in likely_next:
asyncio.create_task(prefetch(prediction))
# When user actually requests, results are cached
Context Compression
Summarize large tool results before sending to LLM:
def compress_search_results(results: list[dict], max_results: int = 10):
if len(results) <= max_results:
return results
return {
"total_matches": len(results),
"top_results": results[:max_results],
"summary": generate_summary(results),
"truncated": True
}
The 2026 Stack
What's Winning
- MCP for integrations: One protocol, all agents
- Native tool calling: Claude, GPT, Gemini all support structured function calls
- LangGraph for orchestration: State-based workflows dominate
- Langfuse for observability: Tracing across agent communication
What's Coming
The A2A (Agent-to-Agent) standard, introduced by Google and backed by partners including Salesforce, hints at the next evolution: agents communicating across organizational boundaries.
- Public MCP servers for common services
- Agent marketplaces where specialized agents offer services
- Federated protocols for cross-organization agent collaboration
The Morphcode Approach
We've optimized communication patterns specifically for code editing:
Direct Tool Integration
No protocol overhead for core operations:
# Not this (generic tool call with MCP overhead)
await mcp_client.call_tool("edit_file", {"path": p, "content": c})
# This (direct integration)
fast_apply(path, content) # Optimized path, no protocol layers
Parallel by Default
Every file operation runs in parallel unless there's a dependency:
# Automatically parallelized
edits = [
edit("auth.ts", change1),
edit("user.ts", change2),
edit("api.ts", change3)
]
await asyncio.gather(*edits) # Single round trip
Minimal Context Transfer
We don't send entire files when diffs suffice:
# Not this
send_to_llm(entire_file_content) # Wastes tokens
# This
send_to_llm(relevant_snippet + diff_context) # Minimal tokens
This is how we achieve 10,500 tok/s—efficient communication at every layer.
Communication Optimized for Speed
Morphcode's communication layer is built for speed. No protocol overhead. 10,500 tok/s code editing.
Conclusion
Effective agent communication in 2026 means:
- MCP for integrations: 97M+ monthly downloads, industry-wide adoption
- Tool calling with benchmarks: Know your model's F1 score and latency
- Right pattern for coupling: Shared state, message passing, or blackboard
- Human-in-the-loop for decisions: Never hide failures
The protocols solidifying now—MCP, standardized tool calling, inter-agent messaging—are the infrastructure layer for the next decade of AI development.
Sources: MCP Anniversary Report (November 2025), Berkeley Function Calling Leaderboard, Docker Tool Calling Study (21 models, 3,570 test cases), Anthropic MCP Documentation.