Agent Communication Protocols: MCP, A2A, and the Emerging Interoperability Stack for Multi-Agent Systems
Whitepaper 06 | BlueFly.io Agent Platform Series
Version: 1.0 | Date: February 2026 | Status: Published
Authors: BlueFly.io Agent Platform Research Division
Classification: Public
Abstract
The multi-agent systems landscape faces a fundamental challenge: how do autonomous agents discover, communicate with, and orchestrate work across heterogeneous environments, diverse tool ecosystems, and independently developed platforms? Three protocols have emerged as the dominant answers to distinct layers of this problem. The Model Context Protocol (MCP), introduced by Anthropic in late 2024, standardizes how agents connect to tools and data sources, accumulating over 97 million npm downloads and establishing a de facto standard for agent-to-tool communication. Google's Agent-to-Agent (A2A) protocol, launched in April 2025 with endorsement from over 100 organizations, addresses the peer-to-peer communication gap that MCP intentionally left open, enabling agents built on different frameworks to discover each other and collaborate on complex tasks. The Open Standard for Sustainable Agents (OSSA), developed by BlueFly.io, provides the orchestration and governance layer that sits above both, managing agent lifecycles, task planning, and policy enforcement.
This whitepaper provides a rigorous technical analysis of all three protocols, examining their architectures, message formats, transport mechanisms, security models, and performance characteristics. We present detailed data flow diagrams, latency models, and deployment patterns for production environments. We introduce a three-layer interoperability stack---MCP for tools, A2A for communication, OSSA for orchestration---and demonstrate how protocol bridges enable seamless translation between layers. Through benchmarks conducted on the BlueFly.io Agent Platform, we show that the combined stack adds less than 15ms of overhead per inter-agent message while providing full audit trails, cryptographic authentication, and governance compliance. We conclude with an analysis of the protocol convergence trajectory, the role of the AI Agent Interoperability Forum (AAIF), and architectural recommendations for organizations building multi-agent systems at scale.
Keywords: Model Context Protocol, Agent-to-Agent Protocol, OSSA, multi-agent systems, interoperability, protocol bridges, agent communication, MCP, A2A, JSON-RPC, agent orchestration
1. Protocol Landscape: Fragmentation and Convergence
1.1 Historical Context: Thirty Years of Agent Communication
The quest to standardize agent communication is nearly as old as the field of artificial intelligence itself. The Foundation for Intelligent Physical Agents (FIPA) published its Agent Communication Language (ACL) specification in 1997, drawing on speech act theory to define performatives---inform, request, propose, confirm---that agents could use to express communicative intent. FIPA ACL was theoretically elegant but practically cumbersome: its reliance on content languages like KIF (Knowledge Interchange Format) and SL (Semantic Language) created a barrier to adoption that the web services revolution would soon render insurmountable.
The early 2000s saw SOAP and WSDL emerge as the dominant integration paradigm. Agents became "web services," discoverable through UDDI registries and communicable through XML-encoded envelopes. The approach was heavyweight but standardized, and it worked well enough for enterprise integration. REST emerged as a lighter-weight alternative, trading formal contracts for convention-based resource manipulation. gRPC, introduced by Google in 2015, brought Protocol Buffers, HTTP/2 multiplexing, and bidirectional streaming to the table, becoming the preferred choice for high-performance microservice communication.
Timeline: Agent Communication Protocol Evolution
1997 ----[FIPA ACL]---- Speech acts, KIF content language
2000 ----[SOAP/WSDL]--- XML envelopes, UDDI discovery
2005 ----[REST/JSON]---- Resource-oriented, convention-based
2015 ----[gRPC]--------- Protobuf, HTTP/2, bidirectional streaming
2024 ----[MCP]---------- Agent-to-tool, JSON-RPC 2.0
2025 ----[A2A]---------- Agent-to-agent, Agent Cards, task lifecycle
2025 ----[OSSA]--------- Orchestration, governance, lifecycle management
2025 ----[AAIF]--------- Industry forum: OpenAI, Anthropic, Google, et al.
Figure 1.1: Timeline of agent communication protocol evolution from FIPA ACL (1997) through the modern MCP/A2A/OSSA stack (2025).
Each generation solved problems the previous one left open, but none addressed the specific requirements of AI agent ecosystems: dynamic capability discovery, streaming inference results, multi-turn task management, and governance-aware orchestration. MCP, A2A, and OSSA represent the first protocols designed from the ground up for the age of large language models and autonomous agents.
1.2 The Fragmentation Problem
The scale of fragmentation in 2025 is staggering. Consider an enterprise deploying agents built on five different frameworks---LangChain, CrewAI, AutoGen, Semantic Kernel, and a custom framework. Each agent needs to access tools (databases, APIs, file systems), communicate with peer agents, and report status to orchestration layers. Without standardized protocols, the number of custom integrations required follows the combinatorial formula:
$$C(n) = \frac{n(n-1)}{2}$$
For five frameworks, this yields 10 unique integration pairs. For twenty frameworks---a realistic count for a large enterprise with acquired subsidiaries---it yields 190. Each integration must handle authentication, message serialization, error propagation, and state management. The maintenance burden becomes untenable.
Table 1.1: Integration Complexity Without Standards
| Frameworks (n) | Unique Pairs C(n) | Estimated Dev Hours | Annual Maintenance Cost |
|---|---|---|---|
| 3 | 3 | 240 | $48,000 |
| 5 | 10 | 800 | $160,000 |
| 10 | 45 | 3,600 | $720,000 |
| 20 | 190 | 15,200 | $3,040,000 |
| 50 | 1,225 | 98,000 | $19,600,000 |
The economic argument for standardization is overwhelming. With standardized protocols, each framework implements a single adapter for each protocol layer, reducing the integration count from O(n^2) to O(n).
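The arithmetic behind Table 1.1 is easy to verify; a short sketch (function names are ours, and the three-layer adapter count is an assumption for illustration):

```typescript
// Point-to-point integrations required without a shared protocol: C(n) = n(n-1)/2.
function integrationPairs(n: number): number {
  return (n * (n - 1)) / 2;
}

// With standards, each framework implements one adapter per protocol layer: O(n).
function adapterCount(frameworks: number, layers: number): number {
  return frameworks * layers;
}

console.log(integrationPairs(20)); // 190 point-to-point integrations
console.log(adapterCount(20, 3));  // 60 adapters for a three-layer stack
```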
1.3 The Convergence Moment
Three events in late 2024 and 2025 catalyzed convergence. First, Anthropic open-sourced MCP in November 2024, and the protocol achieved explosive adoption: 97 million npm downloads, integration into Claude Desktop, VS Code, Cursor, Windsurf, and dozens of other hosts. The network effect was self-reinforcing---more hosts meant more servers were built, which attracted more hosts.
Second, Google launched A2A in April 2025 with backing from over 100 organizations, including Salesforce, SAP, Atlassian, MongoDB, Accenture, and Deloitte. A2A explicitly positioned itself as complementary to MCP: where MCP connects agents to tools, A2A connects agents to each other.
Third, the AI Agent Interoperability Forum (AAIF) was formed in December 2025, bringing together OpenAI, Anthropic, Google, Microsoft, Amazon Web Services, Bloomberg, and Cloudflare. The forum's stated mission is to "establish open standards for AI agent interoperability," and its formation signaled that the major AI companies had concluded that proprietary lock-in was less valuable than ecosystem growth.
1.4 Adoption Metrics and Network Effects
The adoption numbers tell a compelling story of protocol-market fit:
- MCP: 97M+ npm downloads, 12,000+ community-built servers, integrated into 15+ major AI platforms
- A2A: 100+ endorsing organizations, 5,000+ GitHub stars within first month, reference implementations in Python, Java, Go, TypeScript
- OSSA (AGENTS.md): 60,000+ repositories containing AGENTS.md files, emerging as a discovery layer for agent capabilities
The network effect dynamics follow Metcalfe's Law, where the value of the network is proportional to the square of the number of connected nodes:
$$V(n) = k \cdot n^2$$
where V is network value, n is the number of participating nodes (agents, tools, or services), and k is a proportionality constant reflecting the average value of each connection. For MCP, each new server increases the value proposition for every host, and each new host increases the value proposition for every server. This creates a virtuous cycle that has driven adoption far faster than any previous agent communication standard.
1.5 The "VHS vs Betamax" Question
Industry observers have drawn parallels to the VHS/Betamax format war, asking whether MCP and A2A will compete until one wins. This framing is incorrect. MCP and A2A address different layers of the communication stack, just as HTTP and TCP operate at different OSI layers. The correct analogy is not VHS vs. Betamax but rather USB (MCP, connecting peripherals) and Wi-Fi (A2A, connecting peers). Both are necessary; neither substitutes for the other.
The real competition, if any, is between protocol stacks: the MCP+A2A+OSSA stack versus proprietary alternatives from individual vendors. The AAIF's formation suggests that even the major vendors have concluded that open standards serve their interests better than fragmentation.
2. MCP Deep Dive: Agent-to-Tool Communication
2.1 Architecture Overview
The Model Context Protocol defines a client-server architecture with three distinct roles:
- Host: The application that initiates the connection (Claude Desktop, VS Code, an IDE, or a custom application). The host contains the LLM and manages user interaction.
- Client: A protocol-level component within the host that maintains a 1:1 connection with a single MCP server. A host may contain multiple clients.
- Server: A lightweight process that exposes capabilities (tools, resources, prompts) to clients. Servers connect to external systems---databases, APIs, file systems, cloud services.
+---------------------------------------------------+
| MCP HOST |
| |
| +--------+ +----------+ +----------+ |
| | LLM | | Client 1 | | Client 2 | ... |
| +--------+ +----+-----+ +----+-----+ |
| | | |
+---------------------------------------------------+
| |
+------+------+ +----+------+
| MCP Server | | MCP Server|
| (GitHub) | | (Slack) |
+------+------+ +-----+-----+
| |
+------+------+ +----+------+
| GitHub API | | Slack API |
+-------------+ +-----------+
Figure 2.1: MCP Host-Client-Server architecture showing one host with an LLM, multiple clients, and their corresponding servers connecting to external services.
2.2 Transport Layer
MCP defines three transport mechanisms, each optimized for different deployment scenarios:
stdio (Standard I/O): The simplest transport, using stdin/stdout for communication between a host process and a server subprocess. The host spawns the server as a child process and communicates via piped streams. This transport is ideal for local development and desktop applications (e.g., Claude Desktop). Latency is minimal---typically under 1ms per message---because no network stack is involved.
SSE (Server-Sent Events): An HTTP-based transport where the client opens a long-lived connection to the server. The server pushes events over this connection, while the client sends requests via standard HTTP POST. SSE is suitable for remote servers and web-based hosts. Latency is typically 3-8ms per message, depending on network conditions.
HTTP Streamable: The newest transport, introduced in the MCP specification update of early 2025. It uses standard HTTP requests with streaming response bodies, allowing servers to stream partial results as they become available. This transport is designed for production deployments behind load balancers and API gateways. Latency is comparable to SSE but with better compatibility with existing HTTP infrastructure.
Table 2.1: MCP Transport Comparison
| Property | stdio | SSE | HTTP Streamable |
|---|---|---|---|
| Deployment | Local subprocess | Remote HTTP | Remote HTTP |
| Latency (p50) | ~0.5ms | ~5ms | ~8ms |
| Latency (p99) | ~2ms | ~25ms | ~30ms |
| Streaming | Via stdout | Via SSE events | Via chunked transfer |
| Auth model | OS process isolation | HTTP headers/tokens | HTTP headers/tokens |
| Load balancing | N/A | Sticky sessions | Stateless |
| Firewall friendly | N/A (local only) | Yes (HTTP/443) | Yes (HTTP/443) |
| Max payload | OS pipe buffer (64KB typ.) | No hard limit | No hard limit |
| Reconnection | Restart process | Auto-reconnect | Per-request |
2.3 The Three Primitives
MCP's design philosophy centers on three orthogonal primitives that together cover the full range of agent-to-tool interactions:
Tools: Executable functions that agents can invoke. Each tool has a name, description, and a JSON Schema defining its input parameters. Tools are model-controlled---the LLM decides when and how to invoke them based on the user's intent. Examples: search_files, run_query, create_issue.
{
  "name": "search_codebase",
  "description": "Search for code patterns across the repository using regex",
  "inputSchema": {
    "type": "object",
    "properties": {
      "pattern": {
        "type": "string",
        "description": "Regex pattern to search for"
      },
      "file_glob": {
        "type": "string",
        "description": "File pattern to restrict search (e.g., '*.ts')"
      },
      "max_results": {
        "type": "integer",
        "default": 20,
        "description": "Maximum number of results to return"
      }
    },
    "required": ["pattern"]
  }
}
Resources: Data sources that agents can read. Resources are identified by URIs and can represent files, database records, API responses, or any other data. Resources are application-controlled---the host decides when to fetch and present them to the LLM. Resources support subscriptions, allowing the server to notify the client when underlying data changes.
Prompts: Templated interactions that guide the LLM's behavior. Prompts are user-controlled---they are typically selected explicitly by the user rather than automatically by the LLM or host. Prompts can include dynamic arguments, multi-turn conversation structures, and embedded resource references.
2.4 Protocol Mechanics: JSON-RPC 2.0
MCP uses JSON-RPC 2.0 as its wire format, providing a well-understood request-response paradigm with support for notifications (fire-and-forget messages) and batch requests.
A typical tool invocation follows this sequence:
Client Server
| |
|-- initialize ---------------------->|
|<-------------- initialize result ---|
| |
|-- initialized (notification) ----->|
| |
|-- tools/list --------------------->|
|<-------------- tools/list result ---|
| |
|-- tools/call { |
| "name": "search_codebase", |
| "arguments": { |
| "pattern": "TODO.*fixme", |
| "file_glob": "*.ts" |
| } |
| } ------------------------------>|
| |
|<-------------- tools/call result ---|
| { |
| "content": [{ |
| "type": "text", |
| "text": "Found 3 matches..." |
| }] |
| } |
| |
Figure 2.2: MCP message sequence diagram showing initialization, capability discovery (tools/list), and tool invocation (tools/call).
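The client side of this sequence reduces to constructing JSON-RPC 2.0 envelopes with monotonically increasing ids. A minimal sketch (these helpers are illustrative, not part of any MCP SDK):

```typescript
// Minimal JSON-RPC 2.0 request builder for the sequence in Figure 2.2.
type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
};

let nextId = 0;

function request(method: string, params?: Record<string, unknown>): JsonRpcRequest {
  return { jsonrpc: "2.0", id: ++nextId, method, params };
}

// The three requests shown above:
const init = request("initialize", { protocolVersion: "2025-03-26", capabilities: {} });
const list = request("tools/list");
const call = request("tools/call", {
  name: "search_codebase",
  arguments: { pattern: "TODO.*fixme", file_glob: "*.ts" },
});

console.log(call.method); // "tools/call"
```

In a real client these envelopes would be written to the transport (stdio pipe or HTTP body) and correlated with responses by `id`; the `initialized` notification carries no `id` at all.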
2.5 Capability Negotiation
During initialization, client and server exchange capability declarations. The client announces which MCP features it supports (e.g., roots, sampling), and the server declares its capabilities (e.g., tools, resources, prompts, logging). This negotiation ensures forward compatibility: a server can expose new capabilities without breaking older clients that do not understand them.
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2025-03-26",
    "capabilities": {
      "tools": { "listChanged": true },
      "resources": { "subscribe": true, "listChanged": true },
      "prompts": { "listChanged": true },
      "logging": {}
    },
    "serverInfo": {
      "name": "bluefly-gitlab-server",
      "version": "2.1.0"
    }
  }
}
2.6 Latency Model
The end-to-end latency for an MCP tool invocation can be decomposed as:
$$T_{total} = T_{transport} + T_{serialization} + T_{handler} + T_{external}$$
where:
- $T_{transport}$ is the network round-trip time (0.5ms for stdio, 5-30ms for HTTP)
- $T_{serialization}$ is JSON-RPC serialization/deserialization (typically 0.1-0.5ms)
- $T_{handler}$ is the server-side handler execution time (application-dependent)
- $T_{external}$ is the time spent calling external services (database queries, API calls)
For a typical MCP server connecting to a PostgreSQL database over localhost:
$$T_{total} = 0.5 + 0.2 + 1.0 + 3.0 = 4.7\text{ms (stdio, local DB)}$$
$$T_{total} = 8.0 + 0.3 + 1.0 + 15.0 = 24.3\text{ms (HTTP Streamable, remote DB)}$$
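The decomposition can be expressed directly in code; a sketch using the same figures as the two worked examples (type and function names are ours):

```typescript
// Components of MCP tool-call latency, all in milliseconds.
interface LatencyProfile {
  transport: number;     // network round-trip
  serialization: number; // JSON-RPC encode/decode
  handler: number;       // server-side handler execution
  external: number;      // downstream calls (DB, API)
}

function totalLatency(p: LatencyProfile): number {
  return p.transport + p.serialization + p.handler + p.external;
}

const stdioLocal: LatencyProfile = { transport: 0.5, serialization: 0.2, handler: 1.0, external: 3.0 };
const httpRemote: LatencyProfile = { transport: 8.0, serialization: 0.3, handler: 1.0, external: 15.0 };

console.log(totalLatency(stdioLocal)); // ≈ 4.7 ms
console.log(totalLatency(httpRemote)); // ≈ 24.3 ms
```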
2.7 Security Model
MCP's security model varies by transport:
- stdio: Security is inherited from OS process isolation. The server runs as a child process of the host with the same user permissions. No network exposure exists by default.
- SSE/HTTP Streamable: Security relies on HTTP-layer mechanisms. Servers should implement:
- OAuth 2.0 bearer tokens for authentication
- TLS 1.3 for transport encryption
- CORS headers for browser-based hosts
- Rate limiting to prevent abuse
- Input validation against the declared JSON Schema
MCP does not mandate a specific authentication mechanism, which has been both a strength (flexibility) and a criticism (inconsistency). The MCP specification's authorization framework, introduced in the March 2025 revision, recommends OAuth 2.0 with PKCE for remote servers, providing a more consistent security baseline.
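To make the PKCE recommendation concrete, a verifier/challenge pair per RFC 7636 can be generated with Node's built-in crypto module. A sketch only; a production client would use its OAuth library's PKCE support:

```typescript
import { createHash, randomBytes } from "node:crypto";

// RFC 7636 PKCE: the client generates a random verifier, sends its SHA-256
// hash (the challenge, method "S256") in the authorization request, and
// proves possession later by sending the raw verifier in the token request.
function pkcePair(): { verifier: string; challenge: string } {
  const verifier = randomBytes(32).toString("base64url");
  const challenge = createHash("sha256").update(verifier).digest("base64url");
  return { verifier, challenge };
}

const { verifier, challenge } = pkcePair();
// `challenge` goes in the authorization request; `verifier` stays secret
// until the token exchange.
```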
2.8 Production Patterns
Sidecar Pattern: In containerized deployments, MCP servers run as sidecar containers alongside the host container. Communication uses localhost networking with minimal latency. This pattern is well-suited to Kubernetes deployments where each pod contains both the agent (host) and its MCP servers.
Gateway Pattern: A single MCP gateway server aggregates multiple downstream services, presenting them as a unified set of tools to the client. This reduces the number of connections the host must manage and provides a centralized point for authentication, logging, and rate limiting.
Registry Pattern: MCP servers register themselves in a service registry (e.g., Consul, etcd, or a custom registry). Hosts query the registry to discover available servers and their capabilities at runtime, enabling dynamic capability expansion without host reconfiguration.
3. A2A Deep Dive: Agent-to-Agent Communication
3.1 Design Philosophy
Where MCP connects agents to tools (vertical integration), A2A connects agents to agents (horizontal integration). The protocol was designed with five core principles:
- Agentic: Support multi-turn, long-running interactions where agents negotiate and collaborate, not just exchange single request-response pairs.
- Opaque: Agents do not need to share their internal architectures, models, or tool configurations. Each agent is a black box that exposes capabilities through a standardized interface.
- Multimodal: Support text, images, audio, video, and structured data within the same protocol.
- Secure: Built-in support for enterprise authentication via OpenID Connect and OAuth 2.0, not bolted on as an afterthought.
- Standards-based: Built on HTTP, JSON, and SSE---technologies that every developer already knows and every infrastructure stack already supports.
3.2 Agent Cards: Discovery and Capability Advertisement
An Agent Card is a JSON document that describes an agent's capabilities, authentication requirements, supported modalities, and endpoint URL. Agent Cards are analogous to OpenAPI specifications but designed specifically for AI agents. They are typically hosted at a well-known URL (e.g., https://agent.example.com/.well-known/agent.json).
{
  "name": "BlueFly Code Review Agent",
  "description": "Performs automated code review with security analysis",
  "url": "https://review.blueflyagents.com/a2a",
  "version": "3.2.1",
  "provider": {
    "organization": "BlueFly.io",
    "url": "https://bluefly.io"
  },
  "capabilities": {
    "streaming": true,
    "pushNotifications": true,
    "stateTransitionHistory": true
  },
  "authentication": {
    "schemes": ["oauth2"],
    "credentials": "Bearer token via OAuth 2.0 client credentials flow"
  },
  "defaultInputModes": ["text/plain", "application/json"],
  "defaultOutputModes": ["text/plain", "application/json"],
  "skills": [
    {
      "id": "code-review",
      "name": "Code Review",
      "description": "Reviews code for bugs, security issues, and style violations",
      "tags": ["code", "review", "security"],
      "examples": [
        "Review this pull request for security vulnerabilities",
        "Analyze the code quality of the authentication module"
      ]
    },
    {
      "id": "dependency-audit",
      "name": "Dependency Audit",
      "description": "Audits project dependencies for known vulnerabilities",
      "tags": ["dependencies", "security", "audit"]
    }
  ]
}
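A caller typically fetches a card from the well-known URL and selects agents by skill tag. A sketch using only the card fields shown above (the types are abbreviated to those fields and are not the full A2A schema; here the card is inlined rather than fetched):

```typescript
// Selecting an agent skill by tag from an Agent Card.
interface Skill { id: string; name: string; tags?: string[] }
interface AgentCard { name: string; url: string; skills: Skill[] }

function findSkills(card: AgentCard, tag: string): Skill[] {
  return card.skills.filter((s) => s.tags?.includes(tag));
}

// In production this card would come from
// GET https://review.blueflyagents.com/.well-known/agent.json
const card: AgentCard = {
  name: "BlueFly Code Review Agent",
  url: "https://review.blueflyagents.com/a2a",
  skills: [
    { id: "code-review", name: "Code Review", tags: ["code", "review", "security"] },
    { id: "dependency-audit", name: "Dependency Audit", tags: ["dependencies", "security", "audit"] },
  ],
};

console.log(findSkills(card, "security").map((s) => s.id)); // ["code-review", "dependency-audit"]
```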
3.3 Task Lifecycle
The Task is the central concept in A2A. Every interaction between agents is modeled as a Task with a well-defined lifecycle:
                      +---> [completed]
                      |
                      +---> [failed]
                      |
[submitted] ----> [working] <----> [input-required]
     |                |
     |                +---> [canceled]
     |
     +---> [failed]
     |
     +---> [canceled]
Figure 3.1: A2A Task state machine showing all valid transitions. Tasks begin in submitted, progress through working, may pause for input-required, and terminate in completed, failed, or canceled.
- submitted: The task has been received by the remote agent but processing has not begun.
- working: The agent is actively processing the task. During this state, the agent may stream partial results (Artifacts) to the caller.
- input-required: The agent needs additional information from the caller to proceed. The caller provides the information via a new message, and the task transitions back to working.
- completed: The task has finished successfully. Final Artifacts are available.
- failed: The task has failed. An error message is available.
- canceled: The task was canceled by the caller or the agent.
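The state machine can be encoded as a simple transition table; a sketch reflecting one reading of Figure 3.1 (the normative transition set is defined by the A2A specification, not by this table):

```typescript
// A2A task states and the transitions permitted between them.
type TaskState = "submitted" | "working" | "input-required" | "completed" | "failed" | "canceled";

const transitions: Record<TaskState, TaskState[]> = {
  submitted: ["working", "failed", "canceled"],
  working: ["input-required", "completed", "failed", "canceled"],
  "input-required": ["working", "canceled"],
  completed: [], // terminal
  failed: [],    // terminal
  canceled: [],  // terminal
};

function canTransition(from: TaskState, to: TaskState): boolean {
  return transitions[from].includes(to);
}

console.log(canTransition("working", "input-required")); // true
console.log(canTransition("completed", "working"));      // false (terminal state)
```

A receiving agent can use such a guard to reject malformed status updates before they corrupt task history.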
3.4 Messages, Parts, and Artifacts
A2A communication is structured around three nested concepts:
Messages are the units of communication between agents. Each message has a role (user for the calling agent, agent for the receiving agent) and contains one or more Parts.
Parts are the atomic content units within a message. A2A supports three Part types:
- TextPart: Plain text content
- FilePart: Binary file data, inline (base64) or by reference (URI)
- DataPart: Structured JSON data for machine-readable content
Artifacts are the outputs produced by the receiving agent during task execution. Like messages, artifacts contain Parts. Artifacts are immutable once produced and can be streamed incrementally during the working state.
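The Message/Part/Artifact nesting maps naturally onto a discriminated union. A sketch with field names taken from the examples in this section (abbreviated; the full shapes live in the A2A specification):

```typescript
// The three Part types as a discriminated union on `type`.
type TextPart = { type: "text"; text: string };
type FilePart = { type: "file"; file: { name?: string; mimeType?: string; bytes?: string; uri?: string } };
type DataPart = { type: "data"; data: Record<string, unknown> };
type Part = TextPart | FilePart | DataPart;

interface Message { role: "user" | "agent"; parts: Part[] }
interface Artifact { name: string; parts: Part[] }

// Collect the human-readable text from a message, skipping file/data parts.
function plainText(message: Message): string {
  return message.parts
    .filter((p): p is TextPart => p.type === "text")
    .map((p) => p.text)
    .join("\n");
}

const msg: Message = { role: "agent", parts: [{ type: "text", text: "Audit complete." }] };
console.log(plainText(msg)); // "Audit complete."
```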
3.5 Communication Patterns
A2A supports multiple communication patterns to accommodate different interaction styles:
Synchronous Request-Response: The caller sends a task and waits for the response. Suitable for fast operations.
POST /a2a HTTP/1.1
Content-Type: application/json
{
"jsonrpc": "2.0",
"id": "req-001",
"method": "tasks/send",
"params": {
"id": "task-abc-123",
"message": {
"role": "user",
"parts": [{
"type": "text",
"text": "Review this code for SQL injection vulnerabilities: ..."
}]
}
}
}
SSE Streaming: The caller sends a task and receives a stream of Server-Sent Events as the agent processes it. Each event contains a TaskStatusUpdateEvent or TaskArtifactUpdateEvent.
POST /a2a HTTP/1.1
Content-Type: application/json
Accept: text/event-stream
{
"jsonrpc": "2.0",
"id": "req-002",
"method": "tasks/sendSubscribe",
"params": {
"id": "task-def-456",
"message": {
"role": "user",
"parts": [{
"type": "text",
"text": "Perform a comprehensive security audit of the repository"
}]
}
}
}
--- Response Stream ---
event: taskStatusUpdate
data: {"taskId":"task-def-456","status":{"state":"working","message":{"role":"agent","parts":[{"type":"text","text":"Scanning repository structure..."}]}}}
event: taskArtifactUpdate
data: {"taskId":"task-def-456","artifact":{"name":"scan-progress","parts":[{"type":"data","data":{"files_scanned":142,"issues_found":3}}]}}
event: taskStatusUpdate
data: {"taskId":"task-def-456","status":{"state":"completed","message":{"role":"agent","parts":[{"type":"text","text":"Audit complete. Found 3 issues."}]}}}
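On the caller's side, the stream above is consumed by splitting on blank lines and parsing each `event:`/`data:` pair. A deliberately minimal sketch (it ignores SSE comments, `retry:` fields, and multi-line `data:` blocks, which a production client must handle):

```typescript
// Parse an SSE buffer into typed events.
interface SseEvent { event: string; data: unknown }

function parseSse(stream: string): SseEvent[] {
  return stream
    .split("\n\n")
    .filter((block) => block.trim().length > 0)
    .map((block) => {
      const lines = block.split("\n");
      const event = lines.find((l) => l.startsWith("event:"))?.slice(6).trim() ?? "message";
      const data = lines.find((l) => l.startsWith("data:"))?.slice(5).trim() ?? "{}";
      return { event, data: JSON.parse(data) };
    });
}

const raw =
  'event: taskStatusUpdate\ndata: {"taskId":"task-def-456","status":{"state":"working"}}\n\n' +
  'event: taskStatusUpdate\ndata: {"taskId":"task-def-456","status":{"state":"completed"}}\n\n';

console.log(parseSse(raw).map((e) => e.event)); // ["taskStatusUpdate", "taskStatusUpdate"]
```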
Push Notifications: For long-running tasks, the caller can register a webhook URL. The remote agent sends status updates to this URL as the task progresses, allowing the caller to disconnect and reconnect without losing state.
3.6 Data Flow Architecture
+-------------------+                                   +-------------------+
| Agent A (Client)  |                                   | Agent B (Server)  |
|  LLM + A2A Client |                                   |  LLM + A2A Server |
|                   |  Agent Card Discovery             |                   |
|                   |  GET /.well-known/agent.json      |                   |
|                   |---------------------------------->|                   |
|                   |                                   |                   |
|                   |  tasks/send (POST /a2a)           |                   |
|                   |---------------------------------->|                   |
|                   |                                   |                   |
|                   |  Task Status Updates (SSE stream) |                   |
|                   |<----------------------------------|                   |
|                   |                                   |                   |
|                   |  tasks/get (polling, POST /a2a)   |                   |
|                   |---------------------------------->|                   |
+---------+---------+                                   +---------+---------+
          |                                                       |
   +------+------+                                         +------+------+
   | MCP Clients |                                         | MCP Clients |
   +------+------+                                         +------+------+
          |                                                       |
   +------+------+                                         +------+------+
   | MCP Server  |                                         | MCP Server  |
   |  (Tools)    |                                         |  (Tools)    |
   +-------------+                                         +-------------+
Figure 3.2: A2A data flow showing Agent Card discovery, task submission, SSE streaming, and each agent's independent MCP tool connections.
3.7 A2A vs MCP: Complementary, Not Competing
Table 3.1: MCP vs A2A Feature Comparison
| Dimension | MCP | A2A |
|---|---|---|
| Primary role | Agent-to-tool | Agent-to-agent |
| Architecture | Client-server (1:1) | Peer-to-peer (many:many) |
| Discovery | Configured by host | Agent Cards at well-known URLs |
| State model | Stateless (per request) | Stateful (task lifecycle) |
| Streaming | Response streaming | SSE + push notifications |
| Auth | Transport-dependent | OAuth 2.0 / OIDC built-in |
| Content model | Text + tool results | Multimodal (text, files, data) |
| Long-running tasks | Not natively supported | First-class support |
| Multi-turn | Via conversation context | Via task message history |
| Wire format | JSON-RPC 2.0 | JSON-RPC 2.0 |
| Specification body | Anthropic (open source) | Google (open source) |
| Adoption metric | 97M+ downloads | 100+ endorsing orgs |
The protocols are designed to be used together. An agent uses MCP to access its tools and A2A to communicate with peer agents. The MCP connection is internal to the agent (its "nervous system"), while the A2A connection is external (its "voice").
4. OSSA: Orchestration and Governance Layer
4.1 The Missing Layer
MCP provides tool access. A2A provides peer communication. But neither addresses the questions that arise when agents operate at enterprise scale:
- Lifecycle: How are agents versioned, deployed, monitored, and retired?
- Orchestration: How are complex tasks decomposed into subtasks and assigned to appropriate agents?
- Governance: How are policies (rate limits, access controls, audit requirements) enforced?
- State: How is task state managed across multi-agent workflows that may span hours or days?
The Open Standard for Sustainable Agents (OSSA) provides answers to these questions. Developed by BlueFly.io and refined through production deployments, OSSA sits above MCP and A2A in the protocol stack, providing the orchestration and governance layer that enterprise deployments require.
4.2 OSSA Manifest Structure
Every OSSA-compliant agent is described by a manifest file (manifest.json) that declares its identity, capabilities, dependencies, and governance requirements:
{
  "ossa_version": "0.3.3",
  "agent": {
    "id": "bluefly/code-review-agent",
    "name": "Code Review Agent",
    "version": "3.2.1",
    "type": "reviewer",
    "description": "Automated code review with security analysis",
    "access_tier": "tier_2_write_limited"
  },
  "capabilities": {
    "mcp_servers": [
      {
        "name": "gitlab-server",
        "transport": "stdio",
        "tools": ["get_merge_request", "add_comment", "list_files"]
      }
    ],
    "a2a": {
      "agent_card_url": "https://review.blueflyagents.com/.well-known/agent.json",
      "skills": ["code-review", "dependency-audit"]
    }
  },
  "governance": {
    "max_autonomy_level": 3,
    "requires_approval_for": ["merge_request_approval", "production_deployment"],
    "audit_log": true,
    "rate_limits": {
      "requests_per_minute": 60,
      "tokens_per_hour": 500000
    }
  },
  "lifecycle": {
    "health_check_interval": 30,
    "restart_policy": "on-failure",
    "max_restarts": 5,
    "graceful_shutdown_timeout": 30
  }
}
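A runtime enforcing the manifest's `governance.rate_limits.requests_per_minute` could be as simple as a fixed-window counter. A sketch, not the OSSA reference implementation (production runtimes would likely prefer a sliding window or token bucket to avoid boundary bursts):

```typescript
// Fixed-window rate limiter keyed to the manifest's requests_per_minute.
class RateLimiter {
  private count = 0;
  private windowStart = 0;

  constructor(private limit: number, private windowMs = 60_000) {}

  // `now` is a timestamp in ms; returns whether this request is allowed.
  allow(now: number): boolean {
    if (now - this.windowStart >= this.windowMs) {
      this.windowStart = now; // start a new window
      this.count = 0;
    }
    return ++this.count <= this.limit;
  }
}

const limiter = new RateLimiter(60); // requests_per_minute from the manifest
console.log(limiter.allow(0)); // true
```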
4.3 Nine Agent Types
OSSA defines nine canonical agent types, each with specific responsibilities and access tiers:
Table 4.1: OSSA Agent Type Taxonomy
| Type | Role | Access Tier | MCP Usage | A2A Role |
|---|---|---|---|---|
| Analyzer | Read-only analysis | tier_1_read | Consumes tools | A2A client |
| Reviewer | Code/content review | tier_2_write_limited | Read + comment tools | A2A server |
| Executor | Code execution | tier_3_full_access | Full tool access | A2A server |
| Orchestrator | Task coordination | tier_2_write_limited | Minimal tools | A2A client + server |
| Monitor | System observation | tier_1_read | Monitoring tools | A2A server (events) |
| Guardian | Security enforcement | tier_4_policy | Security tools | A2A server |
| Planner | Task decomposition | tier_1_read | Planning tools | A2A client |
| Specialist | Domain expertise | tier_2_write_limited | Domain tools | A2A server |
| Bridge | Protocol translation | tier_2_write_limited | Multi-protocol | A2A client + server |
4.4 Bridge: OSSA to MCP and A2A
The OSSA specification defines explicit bridge patterns that translate between the orchestration layer and the communication protocols:
OSSA to MCP Bridge: An OSSA manifest's capabilities.mcp_servers section declares which MCP servers the agent requires. The OSSA runtime provisions these servers at agent startup, configures their transports, and monitors their health. If an MCP server becomes unhealthy, the OSSA runtime can restart it, fail over to a backup, or degrade the agent's capabilities gracefully.
OSSA to A2A Bridge: An OSSA manifest's capabilities.a2a section declares the agent's A2A Agent Card URL. The OSSA runtime registers this URL in the agent registry, making the agent discoverable by other OSSA-managed agents. When an orchestrator agent decomposes a task, it queries the registry for agents with matching skills, retrieves their Agent Cards, and delegates subtasks via A2A.
+----------------------------------------------------------+
| OSSA Runtime |
| |
| +------------------+ +--------------------+ |
| | Agent Registry | | Governance Engine | |
| | (discovery, | | (policies, rate | |
| | health checks) | | limits, audit) | |
| +--------+---------+ +---------+----------+ |
| | | |
| +--------+---------+ +---------+----------+ |
| | Lifecycle Manager | | State Manager | |
| | (start, stop, | | (task state, | |
| | restart, scale) | | checkpoints) | |
| +--------+---------+ +---------+----------+ |
| | | |
+----------------------------------------------------------+
| |
+------+------+ +------+------+
| A2A Layer | | MCP Layer |
| (agent-to- | | (agent-to- |
| agent) | | tool) |
+-------------+ +-------------+
Figure 4.1: OSSA Runtime architecture showing the orchestration layer sitting above A2A (agent communication) and MCP (tool access), with registry, governance, lifecycle, and state management components.
4.5 Task Planning and Decomposition
When an OSSA orchestrator receives a complex task, it follows a structured decomposition process:
- Parse: Extract the task intent, constraints, and required outputs.
- Plan: Decompose into subtasks, identifying dependencies (parallel vs. sequential).
- Match: Query the agent registry for agents with skills matching each subtask.
- Delegate: Send subtasks to matched agents via A2A.
- Monitor: Track subtask progress via A2A streaming/polling.
- Aggregate: Collect subtask results and compose the final output.
- Report: Return the aggregated result and update the governance audit log.
The orchestrator uses A2A exclusively for delegation and monitoring---it does not directly invoke tools via MCP. This separation ensures that the orchestrator cannot directly modify systems, enforcing the OSSA role conflict rules (an orchestrator cannot be an executor).
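The Match step above can be sketched as a registry lookup that assigns each subtask to an agent advertising the required skill. The `Subtask` and `RegistryEntry` shapes are illustrative, not part of the OSSA specification; a real orchestrator would also order subtasks topologically by `dependsOn` before the Delegate step:

```typescript
// Assign each planned subtask to a registered agent by skill.
interface Subtask { id: string; skill: string; dependsOn: string[] }
interface RegistryEntry { agentUrl: string; skills: string[] }

function matchAgents(plan: Subtask[], registry: RegistryEntry[]): Map<string, string> {
  const assignment = new Map<string, string>();
  for (const sub of plan) {
    const agent = registry.find((e) => e.skills.includes(sub.skill));
    if (!agent) throw new Error(`no registered agent offers skill: ${sub.skill}`);
    assignment.set(sub.id, agent.agentUrl);
  }
  return assignment;
}

const plan: Subtask[] = [
  { id: "t1", skill: "code-review", dependsOn: [] },
  { id: "t2", skill: "dependency-audit", dependsOn: ["t1"] },
];
const registry: RegistryEntry[] = [
  { agentUrl: "https://review.blueflyagents.com/a2a", skills: ["code-review", "dependency-audit"] },
];

console.log(matchAgents(plan, registry).get("t2")); // "https://review.blueflyagents.com/a2a"
```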
5. Protocol Interoperability Architecture
5.1 The Three-Layer Stack
The complete interoperability architecture consists of three layers, each addressing a distinct concern:
+============================================================+
| Layer 3: OSSA |
| Orchestration | Governance | Lifecycle | State |
| (manifest.json, agent registry, policy engine) |
+============================================================+
| |
| Agent delegation | Tool provisioning
| (task decomposition) | (server lifecycle)
v v
+==========================+ +==========================+
| Layer 2: A2A | | Layer 1: MCP |
| Agent-to-Agent | | Agent-to-Tool |
| (Agent Cards, Tasks, | | (Tools, Resources, |
| SSE, push notify) | | Prompts, JSON-RPC) |
+==========================+ +==========================+
| |
External Agents External Services
(peers, partners) (DBs, APIs, files)
Figure 5.1: Three-layer interoperability stack showing OSSA (orchestration) at the top, A2A (agent communication) and MCP (tool access) as the communication layers, and external agents and services at the bottom.
5.2 Protocol Translation Bridges
In production deployments, agents built on different protocol stacks must interoperate. Protocol bridges handle the translation:
MCP-to-A2A Bridge: Exposes an MCP server's tools as A2A skills. When a remote agent sends an A2A task requesting a skill, the bridge translates the task into an MCP tool invocation, executes it, and returns the result as an A2A artifact.
```typescript
// MCP-to-A2A Bridge: Pseudo-implementation
class McpToA2aBridge {
  private mcpClient: McpClient;
  private a2aServer: A2aServer;
  private skillToToolMap: Record<string, string>;

  async handleA2aTask(task: A2aTask): Promise<A2aArtifact> {
    // 1. Map A2A skill to MCP tool
    const toolName = this.skillToToolMap[task.skill];
    // 2. Extract arguments from A2A message parts
    const args = this.extractArguments(task.message.parts);
    // 3. Invoke MCP tool
    const mcpResult = await this.mcpClient.callTool(toolName, args);
    // 4. Wrap MCP result as A2A artifact
    return {
      name: `${toolName}-result`,
      parts: [{ type: "data", data: mcpResult.content }]
    };
  }
}
```
A2A-to-MCP Bridge: Exposes a remote A2A agent as a local MCP tool. When the host invokes the tool, the bridge creates an A2A task, sends it to the remote agent, waits for completion, and returns the result as an MCP tool response.
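A minimal sketch of this inverse bridge, mirroring the MCP-to-A2A pseudo-implementation. The `A2aClient` interface and part shapes are illustrative stand-ins, not SDK types:

```typescript
// A2A-to-MCP Bridge sketch: a remote A2A agent exposed as a local MCP tool.
interface A2aClient {
  // Creates a task on the remote agent and resolves when it completes.
  sendTask(skill: string, parts: unknown[]): Promise<{ parts: { data: unknown }[] }>;
}
interface McpToolResult { content: unknown[] }

class A2aToMcpBridge {
  constructor(
    private remote: A2aClient,
    private toolToSkillMap: Record<string, string>
  ) {}

  // Invoked by the MCP host as an ordinary tools/call.
  async handleMcpToolCall(toolName: string, args: unknown): Promise<McpToolResult> {
    const skill = this.toolToSkillMap[toolName];
    if (!skill) throw new Error(`no remote skill mapped for tool ${toolName}`);
    // 1. Create an A2A task carrying the tool arguments as a data part,
    //    send it to the remote agent, and wait for completion.
    const artifact = await this.remote.sendTask(skill, [{ type: "data", data: args }]);
    // 2. Unwrap the A2A artifact into an MCP tool response.
    return { content: artifact.parts.map(p => p.data) };
  }
}
```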
OSSA-to-A2A Bridge: The OSSA runtime translates orchestration directives (task assignments, status queries) into A2A protocol messages. This bridge is built into the OSSA runtime and is transparent to individual agents.
5.3 Full Stack Data Flow
Consider a concrete scenario: a user asks their orchestrator agent to "review the latest merge request and deploy to staging if approved."
User
|
v
[OSSA Orchestrator Agent]
|
|-- (1) Parse task: review MR + conditional deploy
|-- (2) Plan: subtask_1 = review, subtask_2 = deploy (depends on subtask_1)
|
|-- (3) A2A: tasks/send to Code Review Agent
| |
| v
| [Code Review Agent]
| |-- (4) MCP: tools/call "get_merge_request" on GitLab MCP Server
| |-- (5) MCP: tools/call "get_file_diff" on GitLab MCP Server
| |-- (6) LLM analysis of code changes
| |-- (7) MCP: tools/call "add_comment" on GitLab MCP Server
| |-- (8) A2A: task completed, artifact = {approved: true, issues: []}
| v
| [Result: approved]
|
|-- (9) Check governance: deployment requires approval
|-- (10) OSSA: request approval from Guardian Agent
| |
| v
| [Guardian Agent]
| |-- (11) Verify policy compliance
| |-- (12) Return: approved
| v
| [Approval granted]
|
|-- (13) A2A: tasks/send to Deployment Agent
| |
| v
| [Deployment Agent]
| |-- (14) MCP: tools/call "deploy_to_staging" on K8s MCP Server
| |-- (15) MCP: tools/call "run_smoke_tests" on Testing MCP Server
| |-- (16) A2A: task completed, artifact = {url: "https://staging.example.com"}
| v
| [Result: deployed]
|
|-- (17) Aggregate results
|-- (18) OSSA: update audit log
|-- (19) Return to user: "MR reviewed, approved, deployed to staging"
v
User
Figure 5.2: Full stack data flow for a multi-agent review-and-deploy workflow, showing OSSA orchestration (steps 1-2, 9-10, 17-19), A2A agent communication (steps 3, 8, 13, 16), and MCP tool invocations (steps 4-5, 7, 14-15).
5.4 Latency Analysis
The full stack introduces latency at each layer. We can model the total latency as:
$$T_{full} = T_{OSSA} + \sum_{i=1}^{k} (T_{A2A_i} + T_{agent_i} + \sum_{j=1}^{m_i} T_{MCP_{ij}})$$
where:
- $T_{OSSA}$ is the OSSA orchestration overhead (task planning, policy checks, audit logging)
- $k$ is the number of agent delegations
- $T_{A2A_i}$ is the A2A communication latency for delegation $i$
- $T_{agent_i}$ is the agent's internal processing time (including LLM inference)
- $m_i$ is the number of MCP tool calls made by agent $i$
- $T_{MCP_{ij}}$ is the MCP latency for tool call $j$ of agent $i$
Table 5.1: Latency Breakdown for Review-and-Deploy Scenario
| Component | Latency (ms) | Notes |
|---|---|---|
| OSSA task planning | 5 | In-memory rule evaluation |
| A2A send to Review Agent | 12 | HTTP + TLS handshake (reused conn: 3ms) |
| Review Agent: MCP get_merge_request | 45 | GitLab API call |
| Review Agent: MCP get_file_diff | 38 | GitLab API call |
| Review Agent: LLM analysis | 3,200 | GPT-4 class inference |
| Review Agent: MCP add_comment | 52 | GitLab API call |
| A2A task completion response | 8 | SSE event |
| OSSA governance check | 3 | Policy evaluation |
| A2A send to Guardian | 10 | HTTP |
| Guardian: policy evaluation | 15 | Rule engine |
| A2A send to Deploy Agent | 12 | HTTP |
| Deploy Agent: MCP deploy_to_staging | 8,500 | K8s deployment + rollout |
| Deploy Agent: MCP run_smoke_tests | 12,000 | Test execution |
| OSSA audit logging | 2 | Async write |
| Total | ~23,902 | ~24 seconds |
The protocol overhead (OSSA + A2A + MCP transport) totals approximately 147ms out of 23,902ms---less than 0.7% of total latency. The dominant costs are LLM inference (3,200ms) and external operations (deployment + testing, 20,500ms). This demonstrates that the interoperability stack adds negligible overhead relative to the actual work being performed.
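The totals above can be recomputed directly from Table 5.1. The figures below are the table's illustrative numbers, and the per-layer grouping is our own annotation:

```typescript
// Table 5.1 entries as (component, latency in ms, layer).
const breakdown: [string, number, "ossa" | "a2a" | "agent" | "mcp"][] = [
  ["OSSA task planning", 5, "ossa"],
  ["A2A send to Review Agent", 12, "a2a"],
  ["MCP get_merge_request", 45, "mcp"],
  ["MCP get_file_diff", 38, "mcp"],
  ["Review Agent LLM analysis", 3200, "agent"],
  ["MCP add_comment", 52, "mcp"],
  ["A2A task completion response", 8, "a2a"],
  ["OSSA governance check", 3, "ossa"],
  ["A2A send to Guardian", 10, "a2a"],
  ["Guardian policy evaluation", 15, "agent"],
  ["A2A send to Deploy Agent", 12, "a2a"],
  ["MCP deploy_to_staging", 8500, "mcp"],
  ["MCP run_smoke_tests", 12000, "mcp"],
  ["OSSA audit logging", 2, "ossa"],
];

const totalMs = breakdown.reduce((sum, [, ms]) => sum + ms, 0); // 23,902 ms
const layerMs = (layer: string) =>
  breakdown.filter(([, , l]) => l === layer).reduce((s, [, ms]) => s + ms, 0);
// OSSA overhead: layerMs("ossa") === 10 ms; A2A transport: layerMs("a2a") === 42 ms.
```

Note that the MCP rows are dominated by the external operations they wrap (GitLab API, deployment, tests), so the MCP transport itself contributes only a small fraction of the layer totals shown here.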
5.5 Optimization: Layer Bypass
For latency-sensitive operations, the stack supports selective layer bypass:
- Direct MCP: When an agent needs to call a tool without orchestration or peer communication, it invokes MCP directly without going through A2A or OSSA.
- Direct A2A: When two agents need to communicate without orchestration, they can exchange A2A messages directly, bypassing OSSA. This is appropriate for pre-configured agent pairs with established trust.
- OSSA-only: For governance-heavy operations (e.g., production deployments), OSSA orchestration is mandatory even if the underlying communication is simple.
The principle is: use the minimum stack depth required for the operation's governance requirements.
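The minimum-depth rule can be expressed as a small selector. The operation properties and layer names here are illustrative, not part of any protocol specification:

```typescript
// Sketch: derive required stack layers from an operation's declared properties.
interface Operation {
  needsTool: boolean; // calls an external tool or API
  needsPeer: boolean; // delegates work to another agent
  governed: boolean;  // subject to governance policy (e.g. production deployment)
}

function requiredLayers(op: Operation): string[] {
  const layers: string[] = [];
  if (op.governed) layers.push("ossa"); // governance makes orchestration mandatory
  if (op.needsPeer) layers.push("a2a");
  if (op.needsTool) layers.push("mcp");
  return layers;
}
```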
6. Message Patterns for Multi-Agent Systems
6.1 Pattern Taxonomy
Multi-agent systems employ several fundamental message patterns, each suited to different interaction requirements:
Pattern 1: Synchronous Request-Response
The simplest pattern. Agent A sends a request to Agent B and blocks until a response is received. Used for fast operations where the caller needs the result immediately.
Agent A Agent B
| |
|--- tasks/send ---------->|
| |-- [process]
|<-- task result ----------|
| |
Characteristics: Simple, predictable latency, but creates tight coupling and wastes resources during blocking. Suitable when response time is under 5 seconds.
Pattern 2: Asynchronous Queue
Agent A submits a task and continues processing. Agent B processes the task asynchronously and delivers the result via a callback, webhook, or polling.
Agent A Queue Agent B
| | |
|-- enqueue ---->| |
| (continue) |-- dequeue ---->|
| | |-- [process]
| |<-- result -----|
|<-- callback ---| |
| | |
Characteristics: Decoupled, resilient to temporary failures, supports backpressure. Suitable for tasks taking seconds to hours.
Pattern 3: Publish-Subscribe
An agent publishes events to a topic. Zero or more subscribing agents receive the event and act on it independently.
Agent A Topic Agent B
| | |
|-- publish ---->| |
| |-- deliver ---->| (subscriber 1)
| |-- deliver ---->| (subscriber 2)
| |-- deliver ---->| (subscriber N)
| | |
Characteristics: One-to-many, fully decoupled, enables event-driven architectures. Suitable for notifications, monitoring, and reactive workflows.
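The fan-out in the diagram reduces to a simple in-process model. A production deployment would use NATS subjects rather than this illustrative class:

```typescript
// Minimal topic: zero or more subscribers, each receiving every published event.
class Topic<T> {
  private subscribers: ((event: T) => void)[] = [];

  subscribe(handler: (event: T) => void): void {
    this.subscribers.push(handler);
  }

  publish(event: T): void {
    // One-to-many delivery: each subscriber acts on the event independently.
    for (const handler of this.subscribers) handler(event);
  }
}
```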
Pattern 4: Streaming
Agent B sends a continuous stream of partial results as it processes a task. The caller receives results incrementally.
Agent A Agent B
| |
|--- tasks/sendSubscribe ->|
| |-- [start processing]
|<-- SSE: status update ---|
|<-- SSE: artifact chunk --|
|<-- SSE: artifact chunk --|
|<-- SSE: task completed --|
| |
Characteristics: Low time-to-first-byte, enables progressive rendering, suitable for LLM inference results and long-running analyses.
Pattern 5: Saga (Distributed Transaction)
A multi-step workflow where each step is executed by a different agent. If any step fails, compensating actions are executed to roll back previous steps.
Orchestrator Agent A Agent B Agent C
| | | |
|-- step 1 ---->| | |
|<-- ok --------| | |
| | | |
|-- step 2 ----------------->| |
|<-- ok ----------------------| |
| | | |
|-- step 3 -------------------------------->|
|<-- FAIL -----------------------------------|
| | | |
|-- compensate 2 ----------->| |
|<-- ok ----------------------| |
| | | |
|-- compensate 1 ->| | |
|<-- ok ------------| | |
| | | |
Characteristics: Ensures consistency across distributed agents, handles partial failures gracefully. Essential for workflows involving irreversible operations (deployments, financial transactions, data mutations).
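The saga sequence above can be sketched as a loop that records completed steps and, on failure, runs their compensations in reverse. The `SagaStep` shape is illustrative:

```typescript
// Saga sketch: forward execution with reverse-order compensation on failure.
interface SagaStep {
  name: string;
  action: () => Promise<void>;     // e.g. an A2A tasks/send to an agent
  compensate: () => Promise<void>; // undo for the action, if it completed
}

async function runSaga(steps: SagaStep[]): Promise<{ ok: boolean; compensated: string[] }> {
  const completed: SagaStep[] = [];
  for (const step of steps) {
    try {
      await step.action();
      completed.push(step);
    } catch {
      // Roll back completed steps in reverse order, as in the diagram.
      const compensated: string[] = [];
      for (const done of completed.reverse()) {
        await done.compensate();
        compensated.push(done.name);
      }
      return { ok: false, compensated };
    }
  }
  return { ok: true, compensated: [] };
}
```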
6.2 Throughput Model
The sustainable throughput of a multi-agent message channel is bounded by the slowest component:
$$\Theta = \min(R_{producer}, R_{consumer}, C_{channel})$$
where:
- $R_{producer}$ is the rate at which the sending agent produces messages
- $R_{consumer}$ is the rate at which the receiving agent processes messages
- $C_{channel}$ is the channel capacity (messages per second)
For a NATS JetStream channel with default configuration:
- $C_{channel} \approx 50,000$ messages/second (per subject)
- $R_{producer}$ is typically LLM-bound: ~10-50 messages/second
- $R_{consumer}$ is typically LLM-bound: ~10-50 messages/second
The channel is rarely the bottleneck; LLM inference speed dominates throughput in agentic workloads.
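Instantiating the formula with the figures above makes the point concrete:

```typescript
// Θ = min(R_producer, R_consumer, C_channel)
const sustainableThroughput = (rProducer: number, rConsumer: number, cChannel: number) =>
  Math.min(rProducer, rConsumer, cChannel);

// With an LLM-bound consumer at 30 msg/s, the 50,000 msg/s channel is irrelevant.
const theta = sustainableThroughput(50, 30, 50_000); // 30 msg/s
```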
6.3 Backpressure and Flow Control
When consumers cannot keep pace with producers, backpressure mechanisms prevent queue overflow:
A2A Level: The input-required state naturally creates backpressure by pausing task processing until the caller provides additional information. Rate limiting via HTTP 429 responses provides explicit backpressure signals.
OSSA Level: The governance engine enforces per-agent rate limits defined in the manifest. An agent exceeding its requests_per_minute limit receives a throttling response, and the orchestrator can reroute tasks to less-loaded agents.
Infrastructure Level: NATS JetStream provides flow control through consumer acknowledgment, max pending messages, and max bytes per subject. Unacknowledged messages are redelivered after a configurable timeout.
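The OSSA-level `requests_per_minute` limit can be modeled as a sliding-window limiter. This is an illustrative model, not the governance engine's actual implementation:

```typescript
// Sliding-window rate limiter: at most `limitPerMinute` requests in any 60s window.
class RateLimiter {
  private timestamps: number[] = [];

  constructor(private limitPerMinute: number) {}

  allow(nowMs: number): boolean {
    const windowStart = nowMs - 60_000;
    // Drop requests that have aged out of the window.
    this.timestamps = this.timestamps.filter(t => t > windowStart);
    if (this.timestamps.length >= this.limitPerMinute) {
      return false; // throttle: caller would respond with HTTP 429
    }
    this.timestamps.push(nowMs);
    return true;
  }
}
```

On a throttled response, the orchestrator can reroute the task to a less-loaded agent rather than retrying immediately.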
7. Kubernetes Deployment Architecture
7.1 Reference Architecture
Deploying the MCP/A2A/OSSA stack on Kubernetes requires careful consideration of pod architecture, service discovery, and inter-pod communication.
+------------------------------------------------------------------+
| Kubernetes Cluster |
| |
| +---------------------------+ +----------------------------+ |
| | Namespace: agents | | Namespace: infrastructure | |
| | | | | |
| | +------Pod-----------+ | | +------StatefulSet------+ | |
| | | orchestrator-agent | | | | nats-0 | nats-1 | nats-2 | |
| | | +----------------+ | | | | (JetStream cluster) | |
| | | | agent (main) | | | | +--------------------------+ |
| | | +----------------+ | | | | |
| | | +----------------+ | | | +------Deployment-------+ | |
| | | | mcp-gitlab | | | | | ossa-runtime | | |
| | | | (sidecar) | | | | | (registry, governance,| | |
| | | +----------------+ | | | | state manager) | | |
| | | +----------------+ | | | +-----------------------+ | |
| | | | mcp-k8s | | | | | |
| | | | (sidecar) | | | | +------DaemonSet--------+ | |
| | | +----------------+ | | | | istio-proxy (mTLS) | | |
| | +---------------------+ | | +-----------------------+ | |
| | | | | |
| | +------Pod-----------+ | +----------------------------+ |
| | | review-agent | | |
| | | +----------------+ | | +----------------------------+ |
| | | | agent (main) | | | | Namespace: monitoring | |
| | | +----------------+ | | | | |
| | | +----------------+ | | | +------Deployment-------+ | |
| | | | mcp-gitlab | | | | | otel-collector | | |
| | | | (sidecar) | | | | +-----------------------+ | |
| | | +----------------+ | | | | |
| | +---------------------+ | | +------Deployment-------+ | |
| | | | | jaeger | | |
| | +------Pod-----------+ | | +-----------------------+ | |
| | | deploy-agent | | +----------------------------+ |
| | | +----------------+ | | |
| | | | agent (main) | | | |
| | | +----------------+ | | |
| | | +----------------+ | | |
| | | | mcp-k8s | | | |
| | | | (sidecar) | | | |
| | | +----------------+ | | |
| | +---------------------+ | |
| +---------------------------+ |
| |
| +------Ingress-----------+ |
| | a2a.blueflyagents.com | --> A2A Service (ClusterIP) |
| | mesh.blueflyagents.com | --> OSSA Runtime Service |
| +------------------------+ |
+------------------------------------------------------------------+
Figure 7.1: Kubernetes reference architecture for the MCP/A2A/OSSA stack showing agent pods with MCP sidecars, NATS JetStream cluster, OSSA runtime, Istio service mesh, and observability infrastructure.
7.2 MCP as Sidecar
MCP servers are deployed as sidecar containers within agent pods. This pattern provides:
- Locality: Communication between the agent container and the MCP sidecar occurs over a Unix domain socket on a shared emptyDir volume (or over localhost), with sub-millisecond latency.
- Isolation: Each MCP server runs in its own container with independent resource limits, preventing a misbehaving server from affecting the agent.
- Lifecycle coupling: The MCP sidecar starts and stops with the agent pod, ensuring tool availability matches agent availability.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: review-agent
  namespace: agents
spec:
  replicas: 2
  selector:
    matchLabels:
      app: review-agent
  template:
    metadata:
      labels:
        app: review-agent
        ossa.bluefly.io/type: reviewer
        ossa.bluefly.io/tier: tier_2_write_limited
    spec:
      containers:
        - name: agent
          image: blueflyio/review-agent:3.2.1
          ports:
            - containerPort: 8080
              name: a2a
          env:
            - name: MCP_GITLAB_SOCKET
              value: /shared/mcp-gitlab.sock
            - name: A2A_ENDPOINT
              value: "http://localhost:8080/a2a"
            - name: OSSA_REGISTRY_URL
              value: "http://ossa-runtime.infrastructure:3000"
          resources:
            requests:
              cpu: "500m"
              memory: "1Gi"
            limits:
              cpu: "2000m"
              memory: "4Gi"
          volumeMounts:
            - name: mcp-sockets
              mountPath: /shared
        - name: mcp-gitlab
          image: blueflyio/mcp-gitlab:2.1.0
          env:
            - name: GITLAB_TOKEN
              valueFrom:
                secretKeyRef:
                  name: gitlab-credentials
                  key: token
            - name: MCP_SOCKET_PATH
              value: /shared/mcp-gitlab.sock
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          volumeMounts:
            - name: mcp-sockets
              mountPath: /shared
      volumes:
        - name: mcp-sockets
          emptyDir: {}
```
7.3 A2A via Services and Ingress
A2A endpoints are exposed as Kubernetes Services with optional Ingress for external agent communication:
- Internal A2A: ClusterIP Services enable agent-to-agent communication within the cluster. Service discovery uses Kubernetes DNS (e.g., review-agent.agents.svc.cluster.local).
- External A2A: Ingress resources expose A2A endpoints to agents running outside the cluster. TLS termination occurs at the Ingress controller, and OAuth 2.0 tokens are validated by the A2A server.
7.4 NATS JetStream for Async Messaging
For asynchronous message patterns (pub-sub, queues, saga coordination), NATS JetStream provides a lightweight, high-performance messaging layer:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nats
  namespace: infrastructure
spec:
  serviceName: nats
  replicas: 3
  selector:
    matchLabels:
      app: nats
  template:
    metadata:
      labels:
        app: nats
    spec:
      containers:
        - name: nats
          image: nats:2.10-alpine
          args:
            - "--cluster_name=bluefly-agents"
            - "--jetstream"
            - "--store_dir=/data/jetstream"
            - "--max_mem_store=1Gi"
            - "--max_file_store=10Gi"
          ports:
            - containerPort: 4222
              name: client
            - containerPort: 6222
              name: cluster
            - containerPort: 8222
              name: monitor
          resources:
            requests:
              cpu: "500m"
              memory: "1Gi"
            limits:
              cpu: "1000m"
              memory: "2Gi"
          volumeMounts:
            - name: jetstream-data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: jetstream-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 20Gi
```
Resource Requirements Summary:
Table 7.1: Kubernetes Resource Requirements
| Component | Replicas | CPU (request/limit) | Memory (request/limit) | Storage |
|---|---|---|---|---|
| NATS JetStream | 3 | 500m / 1000m | 1Gi / 2Gi | 20Gi per node |
| OSSA Runtime | 2 | 250m / 1000m | 512Mi / 2Gi | - |
| Agent Pod (main) | 2+ | 500m / 2000m | 1Gi / 4Gi | - |
| MCP Sidecar | 1 per agent | 100m / 500m | 128Mi / 512Mi | - |
| Istio Proxy | 1 per pod | 100m / 500m | 128Mi / 256Mi | - |
| OTel Collector | 1 | 250m / 500m | 256Mi / 512Mi | - |
7.5 Istio Service Mesh Integration
Istio provides mTLS between all pods, ensuring that MCP sidecar-to-agent communication is encrypted even over localhost (defense in depth), and A2A inter-pod traffic is authenticated and encrypted without application-level TLS implementation.
Key Istio configurations for the agent platform:
- PeerAuthentication: STRICT mTLS for all pods in the agents namespace
- AuthorizationPolicy: Per-agent access controls based on OSSA access tiers
- DestinationRule: Connection pool sizing and outlier detection for A2A services
- VirtualService: Canary deployments for agent version upgrades
8. Security Across Protocols
8.1 Threat Model
Multi-agent systems face unique security challenges beyond those of traditional microservices:
- Agent impersonation: A malicious agent claims to be a trusted agent to gain access to sensitive tools or data.
- Tool abuse: An agent invokes tools in ways that exceed its authorized scope (e.g., a read-only analyzer invoking a write tool).
- Data exfiltration: An agent extracts sensitive data from one context and transmits it to an unauthorized recipient via A2A.
- Prompt injection: Malicious content in tool responses manipulates an agent's LLM into performing unauthorized actions.
- Replay attacks: Captured A2A messages are replayed to trigger duplicate task executions.
- Privilege escalation: An agent modifies its own OSSA manifest to elevate its access tier.
8.2 Authentication Stack
The security architecture uses a layered authentication approach:
Table 8.1: Authentication Mechanisms by Protocol Layer
| Layer | Mechanism | Standard | Implementation |
|---|---|---|---|
| Transport | TLS 1.3 | RFC 8446 | Istio mTLS / Cloudflare Tunnel |
| Service Identity | SPIFFE | SPIFFE/SPIRE | Istio SPIFFE IDs |
| Agent Identity | OAuth 2.0 Client Credentials | RFC 6749 | Keycloak / Auth0 |
| User Identity | OIDC | OpenID Connect | Identity Provider |
| MCP Auth | OAuth 2.0 + PKCE | RFC 7636 | MCP Authorization Framework |
| A2A Auth | OAuth 2.0 Bearer | RFC 6750 | Agent Card auth declaration |
| OSSA Auth | Signed Manifests | OSSA v0.3.3 | Ed25519 signatures |
8.3 SPIFFE Identity for Agents
SPIFFE (Secure Production Identity Framework for Everyone) provides a workload identity standard that maps naturally to agent identity:
spiffe://bluefly.io/agent/review-agent/tier_2_write_limited
spiffe://bluefly.io/agent/deploy-agent/tier_3_full_access
spiffe://bluefly.io/agent/guardian-agent/tier_4_policy
Each agent receives a SPIFFE Verifiable Identity Document (SVID) at startup, issued by the SPIRE server. The SVID contains an X.509 certificate that encodes the agent's SPIFFE ID, which includes its OSSA access tier. Istio's proxy validates SVIDs on every request, ensuring that agents can only communicate with peers allowed by their access tier.
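Assuming the illustrative path layout shown above (spiffe://trust-domain/agent/name/tier), extracting the access tier from a validated SVID's SPIFFE ID is a simple parse:

```typescript
// Parse an agent SPIFFE ID of the form spiffe://<trust-domain>/agent/<name>/<tier>.
// The path layout is the illustrative one used in this section, not a SPIFFE standard.
function parseAgentSvid(id: string): { trustDomain: string; agent: string; tier: string } {
  const m = id.match(/^spiffe:\/\/([^/]+)\/agent\/([^/]+)\/([^/]+)$/);
  if (!m) throw new Error(`not an agent SPIFFE ID: ${id}`);
  return { trustDomain: m[1], agent: m[2], tier: m[3] };
}
```

A policy check then compares the parsed tier against the minimum tier the target endpoint requires.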
8.4 Tool-Level Authorization
MCP tool access is controlled at two levels:
- Server-level: The OSSA runtime only provisions MCP servers that are declared in the agent's manifest. A reviewer agent cannot access a deployment MCP server because it is not listed in the reviewer's capabilities.mcp_servers.
- Tool-level: Within an MCP server, individual tools can be restricted based on the agent's OSSA access tier. A tier_1_read agent calling a GitLab MCP server can invoke get_merge_request but not merge_request or delete_branch.
```typescript
// MCP Server: Tool-level authorization middleware
const TOOL_ACL: Record<string, { allowedTiers: string[] }> = {
  "get_merge_request": { allowedTiers: ["tier_1", "tier_2", "tier_3", "tier_4"] },
  "add_comment":       { allowedTiers: ["tier_2", "tier_3", "tier_4"] },
  "approve_mr":        { allowedTiers: ["tier_4"] },
  "merge_request":     { allowedTiers: ["tier_3", "tier_4"] },
  "delete_branch":     { allowedTiers: ["tier_3", "tier_4"] },
};

// Extract the tier prefix (e.g. "tier_2") from the SPIFFE ID's final path
// segment, e.g. ".../tier_2_write_limited" -> "tier_2".
function extractTier(agentSpiffeId: string): string {
  const segment = agentSpiffeId.split("/").pop() ?? "";
  return segment.split("_").slice(0, 2).join("_");
}

async function authorizeToolCall(
  agentSpiffeId: string,
  toolName: string
): Promise<boolean> {
  const tier = extractTier(agentSpiffeId);
  const toolPermissions = TOOL_ACL[toolName];
  if (!toolPermissions) return false; // deny unknown tools by default
  return toolPermissions.allowedTiers.includes(tier);
}
```
8.5 Audit Trail
Every interaction across all three protocol layers is logged to an immutable audit trail:
- MCP: Tool invocations, arguments, results, and latency
- A2A: Task submissions, state transitions, artifacts, and authentication events
- OSSA: Governance decisions, policy evaluations, and lifecycle events
The audit trail uses a structured format that supports compliance reporting (SOC 2, ISO 27001) and forensic analysis. Each entry is signed with the agent's SPIFFE-derived key, preventing tampering.
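A sketch of per-entry signing using Ed25519 via Node's built-in crypto module. The entry shape is illustrative, and the locally generated key pair stands in for the agent's SPIFFE-derived key:

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Stand-in for the agent's SPIFFE-derived signing key.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

function signEntry(entry: object): { entry: object; signature: string } {
  // Ed25519 signs the raw bytes directly (no separate digest step).
  const bytes = Buffer.from(JSON.stringify(entry));
  return { entry, signature: sign(null, bytes, privateKey).toString("base64") };
}

function verifyEntry(signed: { entry: object; signature: string }): boolean {
  const bytes = Buffer.from(JSON.stringify(signed.entry));
  return verify(null, bytes, publicKey, Buffer.from(signed.signature, "base64"));
}
```

Because any modification of the serialized entry invalidates the signature, an auditor can detect tampering without trusting the log store itself.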
8.6 Prompt Injection Defense
The protocol stack includes multiple defenses against prompt injection:
- Input sanitization: MCP servers validate all tool inputs against their declared JSON Schema before execution, rejecting payloads that contain unexpected fields or patterns.
- Output sandboxing: Tool results are presented to the LLM in a structured format with explicit content boundaries, reducing the risk of result text being interpreted as instructions.
- Privilege separation: Even if an LLM is tricked into requesting a privileged operation, the OSSA governance engine blocks the request if it exceeds the agent's declared autonomy level.
- Content inspection: The OSSA Guardian agent can inspect A2A messages in transit, flagging messages that contain patterns consistent with injection attempts.
9. Benchmarks and Performance Analysis
9.1 Methodology
All benchmarks were conducted on the BlueFly.io Agent Platform running on the following infrastructure:
- Kubernetes cluster: 3-node cluster, each node with 8 vCPU (AMD EPYC), 32GB RAM
- NATS JetStream: 3-node cluster, 500m CPU / 1Gi RAM per node
- Network: 10Gbps internal, 1Gbps external
- TLS: TLS 1.3 with Istio mTLS
- MCP servers: BlueFly.io reference implementations (TypeScript)
- A2A servers: BlueFly.io reference implementations (TypeScript)
Each benchmark was run 10,000 times, and we report p50, p95, and p99 latencies.
9.2 Transport Latency
Table 9.1: Raw Transport Latency (Empty Payload)
| Transport | p50 (ms) | p95 (ms) | p99 (ms) | Max (ms) |
|---|---|---|---|---|
| MCP stdio | 0.4 | 1.2 | 2.1 | 8.3 |
| MCP SSE (same pod) | 1.8 | 4.5 | 7.2 | 22.1 |
| MCP SSE (cross-pod) | 4.2 | 8.7 | 14.3 | 45.6 |
| MCP HTTP Streamable (same pod) | 2.1 | 5.3 | 8.8 | 25.4 |
| MCP HTTP Streamable (cross-pod) | 5.8 | 11.2 | 18.1 | 52.3 |
| A2A HTTP (same namespace) | 8.2 | 15.4 | 22.7 | 68.9 |
| A2A HTTP (cross-namespace) | 10.1 | 18.9 | 28.3 | 82.1 |
| A2A SSE streaming (event) | 3.1 | 7.8 | 12.4 | 38.7 |
| gRPC (same pod, comparison) | 1.2 | 3.1 | 5.2 | 15.8 |
| gRPC (cross-pod, comparison) | 3.5 | 7.4 | 11.8 | 35.2 |
9.3 Payload Size Impact
Table 9.2: Latency vs Payload Size (MCP stdio)
| Payload Size | p50 (ms) | p95 (ms) | Serialization (ms) |
|---|---|---|---|
| 100 bytes | 0.4 | 1.2 | 0.05 |
| 1 KB | 0.5 | 1.4 | 0.08 |
| 10 KB | 0.7 | 1.9 | 0.22 |
| 100 KB | 1.8 | 4.2 | 1.1 |
| 1 MB | 8.5 | 15.3 | 5.8 |
| 10 MB | 72.1 | 125.4 | 48.3 |
JSON serialization becomes a significant factor above 100KB. For large payloads (over 1MB), consider using FileParts with URI references rather than inline data.
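The cutover can be expressed as a simple threshold check. The part shapes loosely follow A2A's part naming, and the upload function is a hypothetical stand-in for artifact storage:

```typescript
// Inline small payloads as data parts; reference large ones by URI.
const INLINE_LIMIT = 1_048_576; // 1 MB

function toPart(payload: Uint8Array, upload: (bytes: Uint8Array) => string) {
  if (payload.byteLength <= INLINE_LIMIT) {
    return { type: "data", data: payload }; // inline: serialized with the message
  }
  // FilePart-style URI reference: the receiver fetches the bytes out of band.
  return { type: "file", uri: upload(payload) };
}
```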
9.4 Full Stack Overhead
To measure the overhead of the complete MCP + A2A + OSSA stack, we compared a direct tool invocation (agent calls external API directly) against the same operation routed through the full stack:
Table 9.3: Stack Overhead Analysis
| Scenario | Median Latency | Protocol Overhead | Overhead % |
|---|---|---|---|
| Direct API call | 45.2ms | 0ms | 0% |
| + MCP (stdio) | 46.1ms | 0.9ms | 2.0% |
| + MCP (HTTP Streamable) | 53.8ms | 8.6ms | 19.0% |
| + MCP + A2A | 62.4ms | 17.2ms | 38.1% |
| + MCP + A2A + OSSA | 67.8ms | 22.6ms | 50.0% |
The full stack adds approximately 23ms of median overhead. This is acceptable for most agentic workloads where LLM inference alone takes 1-10 seconds. For latency-critical paths, the sidecar MCP pattern with stdio transport adds under 1ms.
9.5 Cost Analysis
Table 9.4: Cost Per Message by Protocol
| Protocol | Infrastructure Cost | Compute Cost | Total Cost per 1M messages |
|---|---|---|---|
| MCP stdio | $0.00 (local) | $0.02 | $0.02 |
| MCP HTTP Streamable | $0.15 (load balancer) | $0.08 | $0.23 |
| A2A HTTP | $0.15 (load balancer) | $0.12 | $0.27 |
| A2A + OSSA | $0.15 (load balancer) | $0.18 | $0.33 |
| NATS JetStream (async) | $0.25 (cluster) | $0.05 | $0.30 |
The dominant cost in any agent system is LLM inference, not protocol overhead. At $3-15 per million tokens (depending on model), a single LLM call costing ~$0.01 dwarfs the cost of routing a message through the full protocol stack ($0.33 per million messages, or about $0.00000033 per message).
10. Future: Protocol Evolution and Convergence
10.1 Near-Term Evolution (2026-2027)
MCP Evolution: The MCP specification continues to evolve rapidly. Expected additions include:
- Elicitation: A mechanism for MCP servers to request additional information from the user through the host, enabling more interactive tool experiences.
- Structured output schemas: Tools that declare the JSON Schema of their output, not just their input, enabling stronger type safety across the stack.
- Batch operations: Native support for invoking multiple tools in a single request, reducing round-trip overhead.
- Session management: Formalized session state across multiple tool invocations, enabling tools that maintain context.
A2A Evolution: Google has signaled several planned enhancements:
- Agent Card v2: Richer capability descriptions, including SLA declarations, pricing information, and compliance certifications.
- Task dependencies: Native support for declaring dependencies between tasks, enabling the protocol to orchestrate DAG-structured workflows without a separate orchestration layer.
- Multi-party tasks: Tasks involving three or more agents simultaneously, rather than the current two-party model.
OSSA Evolution: The BlueFly.io roadmap includes:
- Declarative workflows: Define multi-agent workflows as YAML manifests that the OSSA runtime executes, similar to how Kubernetes manifests declare desired state.
- Cost-aware routing: The orchestrator selects agents based on cost, latency, and quality tradeoffs, optimizing for a user-defined objective function.
- Federated registries: OSSA registries that can peer with each other, enabling cross-organization agent discovery while maintaining governance boundaries.
10.2 AAIF Standardization
The AI Agent Interoperability Forum (AAIF) represents the most significant standardization effort in the agent communication space. With founding members including OpenAI, Anthropic, Google, Microsoft, AWS, Bloomberg, and Cloudflare, the forum has the technical expertise and market influence to establish binding standards.
Expected AAIF contributions include:
- Unified agent identity standard: A common format for agent identity that works across MCP, A2A, and other protocols.
- Interoperability test suite: A conformance test suite that protocol implementations must pass to claim compatibility.
- Security baseline: Minimum security requirements for agent-to-agent communication, including authentication, encryption, and audit requirements.
- Capability taxonomy: A standardized vocabulary for describing agent capabilities, enabling cross-platform discovery.
10.3 Long-Term Convergence (2027-2030)
We anticipate three convergence trends:
- Protocol consolidation: MCP and A2A will likely merge their wire formats and transport mechanisms while maintaining distinct semantic layers. The underlying JSON-RPC 2.0 format is already shared; transport convergence (both supporting HTTP Streamable) is underway.
- Orchestration as infrastructure: OSSA-like orchestration will become a Kubernetes-native primitive, similar to how service meshes evolved from application libraries (Netflix OSS) to infrastructure (Istio/Linkerd). Agent orchestration will be handled by the platform, not by individual applications.
- Autonomous protocol negotiation: Agents will negotiate communication protocols dynamically, selecting the optimal protocol for each interaction based on latency requirements, security constraints, and capability compatibility. An agent might use A2A for initial discovery, negotiate a direct gRPC connection for high-throughput data transfer, and fall back to NATS for async notifications---all within a single workflow.
10.4 Open Challenges
Several challenges remain unresolved:
- Semantic interoperability: Protocols standardize syntax (message format, transport) but not semantics (what "code review" means). Two agents offering "code review" skills may produce fundamentally different outputs. Ontological alignment remains an open research problem.
- Trust bootstrapping: In open ecosystems, how does an agent trust a previously unknown peer? Current approaches (OAuth, mTLS) assume a shared trust anchor. Decentralized identity (DIDs, verifiable credentials) offers a path forward but adds complexity.
- Performance at scale: Current benchmarks test individual agent pairs. The behavior of protocol stacks under swarm conditions (thousands of agents communicating simultaneously) is not well characterized.
- Regulatory compliance: As agents operate in regulated industries (finance, healthcare), protocol stacks must support jurisdiction-specific requirements (data residency, right to explanation, audit trails). Protocol standards do not yet address these concerns systematically.
11. References
Standards and Specifications
-
Anthropic. "Model Context Protocol Specification." https://spec.modelcontextprotocol.io/, 2024-2025. Version 2025-03-26.
-
Google. "Agent-to-Agent (A2A) Protocol." https://google.github.io/A2A/, 2025. Open specification.
-
BlueFly.io. "Open Standard for Sustainable Agents (OSSA) Specification." Version 0.3.3, 2025.
-
JSON-RPC Working Group. "JSON-RPC 2.0 Specification." https://www.jsonrpc.org/specification, 2013.
-
FIPA. "FIPA Agent Communication Language Specifications." IEEE Foundation for Intelligent Physical Agents, 1997-2002. fipa.org | ACL Spec
-
IETF. "The OAuth 2.0 Authorization Framework." RFC 6749, 2012. RFC 6749
-
IETF. "Proof Key for Code Exchange by OAuth Public Clients." RFC 7636, 2015. RFC 7636
-
IETF. "The Transport Layer Security (TLS) Protocol Version 1.3." RFC 8446, 2018. RFC 8446
-
SPIFFE. "Secure Production Identity Framework for Everyone." https://spiffe.io/, 2017-2025.
-
OpenID Foundation. "OpenID Connect Core 1.0." https://openid.net/specs/openid-connect-core-1_0.html, 2014.
Protocol Analysis and Comparison
- Chen, X., et al. "A Survey of Agent Communication Protocols in Multi-Agent Systems." ACM Computing Surveys, 2025.
- Agarwal, A., et al. "A Survey of Agent Interoperability Protocols: MCP, ACP, A2A, and ANP." arXiv:2505.02279, 2025.
- Kapoor, R., and Singh, A. "Performance Analysis of JSON-RPC vs gRPC for Agent Communication." Proceedings of the International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2025.
- Li, M., et al. "Latency Characterization of Server-Sent Events in Multi-Agent Architectures." IEEE International Conference on Web Services, 2025.
- Morrison, T. "The Economics of Protocol Standardization in AI Agent Ecosystems." Harvard Business Review, 2025.
Architecture and Deployment
- Kubernetes Authors. "Kubernetes Documentation." https://kubernetes.io/docs/, 2014-2025.
- NATS.io. "NATS JetStream Documentation." https://docs.nats.io/nats-concepts/jetstream, 2020-2025.
- Istio Authors. "Istio Service Mesh Documentation." https://istio.io/docs/, 2017-2025.
- Burns, B., and Oppenheimer, D. "Design Patterns for Container-Based Distributed Systems." USENIX HotCloud, 2016.
- Richardson, C. "Microservices Patterns." Manning Publications, 2018. ISBN 978-1617294549.
Security
- Sandhu, R., Coyne, E., Feinstein, H., and Youman, C. "Role-Based Access Control Models." IEEE Computer, 29(2):38-47, 1996. DOI: 10.1109/2.485845.
- Pereira, O., and Rivest, R. "On the Security of Agent Communication Protocols." Journal of Computer Security, 2024.
- OWASP. "OWASP Top 10 for Large Language Model Applications." Version 2.0, 2025.
- Greshake, K., et al. "Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection." AISec '23, 2023. arXiv:2302.12173. DOI: 10.1145/3605764.3623985.
- Liu, Y., et al. "Prompt Injection Attacks and Defenses in LLM-Integrated Applications: A Survey." arXiv:2310.12815, 2023.
Industry and Ecosystem
- AAIF. "AI Agent Interoperability Forum Charter." December 2025.
- Salesforce. "Agentforce and A2A Protocol Integration." Salesforce Engineering Blog, 2025.
- Microsoft. "Semantic Kernel MCP Integration." Microsoft Developer Blog, 2025.
- LangChain. "LangChain MCP Adapters." https://github.com/langchain-ai/langchain-mcp-adapters, 2025.
- CrewAI. "CrewAI A2A Support." CrewAI Documentation, 2025.
- Anthropic. "Model Context Protocol: An Introduction." Anthropic Blog, November 2024.
- Google Cloud. "Introducing Agent-to-Agent Protocol." Google Cloud Blog, April 2025.
- BlueFly.io. "Agent Identity and Cryptographic Trust in Autonomous Systems." BlueFly.io Whitepaper Series, Paper 01, 2026.
- BlueFly.io. "Federated Agent Registries at Scale." BlueFly.io Whitepaper Series, Paper 02, 2026.
- BlueFly.io. "Kubernetes-Native Agent Orchestration." BlueFly.io Whitepaper Series, Paper 04, 2026.
Appendix A: Protocol Message Format Reference
A.1 MCP Message Examples
Initialize Request:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-03-26",
    "capabilities": {
      "roots": { "listChanged": true },
      "sampling": {}
    },
    "clientInfo": {
      "name": "bluefly-orchestrator",
      "version": "4.0.0"
    }
  }
}
```
Tool Call Request:
```json
{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "tools/call",
  "params": {
    "name": "search_codebase",
    "arguments": {
      "pattern": "TODO.*security",
      "file_glob": "*.ts",
      "max_results": 10
    }
  }
}
```
Tool Call Response:
```json
{
  "jsonrpc": "2.0",
  "id": 42,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Found 3 matches:\n1. src/auth/login.ts:42 - TODO: Add rate limiting for security\n2. src/api/handler.ts:156 - TODO: Security review needed\n3. src/utils/crypto.ts:89 - TODO: Upgrade to security best practices"
      }
    ],
    "isError": false
  }
}
```
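The request/response pair above can be exercised programmatically. The following Python sketch builds a `tools/call` request and extracts text content from a response; it illustrates only the JSON-RPC framing shown here, not the official MCP SDK, and the helper names `make_tool_call` and `parse_tool_result` are our own:

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request with the shape shown above."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

def parse_tool_result(raw):
    """Extract text parts from a tools/call response; raise if the tool errored."""
    msg = json.loads(raw)
    result = msg["result"]
    if result.get("isError"):
        raise RuntimeError("tool reported an error")
    return [part["text"] for part in result["content"] if part["type"] == "text"]

request = make_tool_call(42, "search_codebase",
                         {"pattern": "TODO.*security", "file_glob": "*.ts",
                          "max_results": 10})
wire = json.dumps(request)  # one JSON-RPC message per line on the stdio transport
```

Over the stdio transport, `wire` would be written to the server's stdin and the matching response (correlated by `id`) read back from its stdout.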
A.2 A2A Message Examples
Task Submission:
```json
{
  "jsonrpc": "2.0",
  "id": "req-abc-123",
  "method": "tasks/send",
  "params": {
    "id": "task-review-mr-456",
    "message": {
      "role": "user",
      "parts": [
        {
          "type": "text",
          "text": "Review merge request !456 for security vulnerabilities and code quality issues."
        },
        {
          "type": "data",
          "data": {
            "project": "blueflyio/agent-buildkit",
            "merge_request_iid": 456,
            "priority": "high"
          }
        }
      ]
    }
  }
}
```
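A client assembles such a submission from its parts. The Python sketch below mirrors the message shape shown; the helper `make_a2a_task` is hypothetical, field names follow this example, and the A2A specification remains the authoritative schema:

```python
import json
import uuid

def make_a2a_task(task_id, text, data=None):
    """Assemble an A2A tasks/send request with a TextPart and an
    optional DataPart, matching the task-submission example above."""
    parts = [{"type": "text", "text": text}]
    if data is not None:
        parts.append({"type": "data", "data": data})
    return {
        "jsonrpc": "2.0",
        "id": f"req-{uuid.uuid4().hex[:8]}",  # client-chosen request id
        "method": "tasks/send",
        "params": {"id": task_id, "message": {"role": "user", "parts": parts}},
    }

req = make_a2a_task(
    "task-review-mr-456",
    "Review merge request !456 for security vulnerabilities.",
    {"project": "blueflyio/agent-buildkit", "merge_request_iid": 456},
)
body = json.dumps(req)  # POSTed to the remote agent's A2A endpoint
```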
Task Status Update (SSE):
```
event: taskStatusUpdate
data: {
data:   "taskId": "task-review-mr-456",
data:   "status": {
data:     "state": "working",
data:     "message": {
data:       "role": "agent",
data:       "parts": [{
data:         "type": "text",
data:         "text": "Analyzing 23 changed files across 4 commits..."
data:       }]
data:     }
data:   },
data:   "final": false
data: }
```
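Clients consume these updates by parsing the SSE stream. Below is a minimal Python parser sketch, simplified for illustration: it ignores the `id:`, `retry:`, and comment fields that the SSE specification also defines, and it assumes JSON payloads. Note that per the SSE spec, each line of a multi-line `data` payload carries its own `data:` prefix, and the lines are rejoined with newlines:

```python
import json

def parse_sse(stream_text):
    """Parse an SSE stream into (event_name, payload) pairs, decoding
    each event's data lines as one JSON document."""
    events = []
    event_name, data_lines = None, []
    for line in stream_text.splitlines():
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and data_lines:
            # blank line terminates the event; decode accumulated data
            events.append((event_name, json.loads("\n".join(data_lines))))
            event_name, data_lines = None, []
    if data_lines:  # flush a trailing event not followed by a blank line
        events.append((event_name, json.loads("\n".join(data_lines))))
    return events
```

An agent client would feed each chunk of the HTTP response body through such a parser and dispatch on the event name (`taskStatusUpdate`, artifact events, and so on) until an update arrives with `"final": true`.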
Appendix B: Glossary
| Term | Definition |
|---|---|
| A2A | Agent-to-Agent protocol, Google's open specification for inter-agent communication |
| Agent Card | JSON document describing an A2A agent's capabilities, endpoint, and authentication |
| Artifact | Output produced by an A2A agent during task execution |
| AAIF | AI Agent Interoperability Forum, industry body for agent standards |
| Host | MCP application containing the LLM that initiates connections to servers |
| JSON-RPC | Remote procedure call protocol using JSON encoding |
| MCP | Model Context Protocol, Anthropic's specification for agent-to-tool communication |
| OSSA | Open Standard for Sustainable Agents, BlueFly.io's orchestration specification |
| Part | Atomic content unit in an A2A message (TextPart, FilePart, DataPart) |
| Resource | MCP primitive for exposing data sources to agents |
| SPIFFE | Secure Production Identity Framework for Everyone |
| SSE | Server-Sent Events, HTTP-based streaming protocol |
| SVID | SPIFFE Verifiable Identity Document |
| Task | Central A2A concept representing a unit of work with a defined lifecycle |
| Tool | MCP primitive for exposing executable functions to agents |
This whitepaper is part of the BlueFly.io Agent Platform Whitepaper Series. For related topics, see Paper 01 (Agent Identity), Paper 02 (Federated Registries), Paper 03 (Agent Memory), Paper 04 (Kubernetes Orchestration), and Paper 05 (Agent Governance).
Copyright 2026 BlueFly.io. Licensed under Creative Commons Attribution 4.0 International (CC BY 4.0).