AgentMaster Multi-Agent Conversational Framework - Multimodal Information Retrieval System Based on A2A and MCP Protocols


🎯 Key Points (TL;DR)

  • Innovative Framework: AgentMaster is the first multi-agent system that simultaneously integrates A2A and MCP protocols
  • Multimodal Support: Supports intelligent processing of various input formats including text, images, and audio
  • High Performance: BERTScore F1 reaches 96.3%, G-Eval score 87.1%
  • Practical Value: Users can interact with the system through natural language without technical background
  • Open Source Deployment: Supports local and AWS cloud deployment based on Flask microservice architecture

Table of Contents

  1. What is the AgentMaster Framework
  2. Core Technical Architecture Analysis
  3. A2A and MCP Protocol Details
  4. Multi-Agent Collaboration Mechanism
  5. Experimental Results and Performance Evaluation
  6. Real-World Application Case Studies
  7. System Limitations Analysis
  8. Technical Deployment and Implementation
  9. Frequently Asked Questions
  10. Summary and Outlook

What is the AgentMaster Framework {#what-is-agentmaster}

AgentMaster is a multi-agent conversational framework developed jointly by researchers at Stanford University and George Mason University. It is presented as the first framework to integrate Anthropic's Model Context Protocol (MCP) and Google's Agent-to-Agent (A2A) communication protocol in a single system.

Core Innovations

  • Unified Conversational Interface: Users can interact with the system through natural language without professional technical knowledge
  • Dynamic Task Decomposition: Automatically decomposes complex queries into executable subtasks
  • Intelligent Routing Mechanism: Automatically selects the most suitable specialized agents based on task characteristics
  • Multimodal Processing: Supports various data formats including text, images, charts, and audio

Figure 1: AgentMaster's General Multi-Agent System Framework

💡 Technical Breakthrough

This is the first multi-agent system to simultaneously implement A2A and MCP protocols in a single framework, filling a technical gap in this field.

Core Technical Architecture Analysis {#system-architecture}

AgentMaster uses a four-layer architecture in which each layer has a distinct responsibility:

1. Unified Conversational Interface Layer

  • Multimodal Input: Supports text, charts, images, and audio input
  • Intelligent Output: Generates text, images, structured data tables, and other formats
  • User-Friendly: Chatbot-like interactive experience

2. Multi-Agent Hub

The system contains three levels of agents:

| Agent Type | Main Responsibilities | Technical Features |
|---|---|---|
| Coordinator Agent | Task decomposition, execution coordination | Central controller responsible for overall scheduling |
| Domain Agents | Specialized function processing | Can be based on LLM or non-LLM technologies |
| General Agents | General reasoning tasks | Each equipped with a dedicated LLM |

Figure 2: Case Study System Architecture

3. Multi-Agent AI Protocol Layer

  • A2A Protocol: Implements structured communication between agents
  • MCP Protocol: Provides unified interface for tool access and context management

4. State Management Layer

  • Vector Database: Provides persistent semantic memory
  • Context Cache: Rapid storage of session data and intermediate results
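
To make the split between the two stores concrete, here is a minimal sketch, not the paper's implementation. `ContextCache` and `VectorMemory` are hypothetical names, the cache is a plain in-memory LRU, and the "vector database" is a brute-force cosine search over vectors supplied by the caller (a real deployment would use learned embeddings and a dedicated store).

```python
import math
from collections import OrderedDict


class ContextCache:
    """Small LRU cache for session data and intermediate results (hypothetical)."""

    def __init__(self, max_items=128):
        self.max_items = max_items
        self._items = OrderedDict()

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.max_items:
            self._items.popitem(last=False)  # evict the least recently used entry

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # reading an entry marks it as recently used
        return self._items[key]


class VectorMemory:
    """Toy stand-in for a vector database: brute-force cosine search."""

    def __init__(self):
        self._rows = []  # list of (vector, text) pairs

    def add(self, vector, text):
        self._rows.append((list(vector), text))

    def search(self, query, k=1):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        ranked = sorted(self._rows, key=lambda row: cosine(query, row[0]), reverse=True)
        return [text for _, text in ranked[:k]]
```

The cache holds short-lived, exact-key data for the current session, while the vector memory answers "what stored text is closest in meaning to this query", which is why the two are kept as separate components.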

A2A and MCP Protocol Details {#protocols-explained}

Agent-to-Agent (A2A) Protocol

The A2A protocol is an open standard for inter-agent communication that Google announced in April 2025:

Core Functions

  • Structured Message Exchange: Standardized communication based on JSON format
  • Task Distribution Mechanism: Supports parallel or sequential execution of subtasks
  • Shared Understanding Construction: Multi-agent collaboration to solve complex problems

Example Message (illustrative field names)

{
  "message_type": "task_delegation",
  "sender": "coordinator_agent",
  "receiver": "sql_agent",
  "task": "query_bridge_data",
  "parameters": {...}
}
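
As a rough sketch of how a coordinator might build and check a message shaped like the JSON above, consider the following. The field names come from that example; `make_delegation`, `parse_message`, and `REQUIRED_FIELDS` are hypothetical helpers, and this is not the A2A wire format itself. The source elides the actual parameters, so a placeholder is used.

```python
import json

REQUIRED_FIELDS = ("message_type", "sender", "receiver", "task", "parameters")


def make_delegation(sender, receiver, task, parameters):
    """Build a task-delegation message shaped like the article's example."""
    return {
        "message_type": "task_delegation",
        "sender": sender,
        "receiver": receiver,
        "task": task,
        "parameters": dict(parameters),
    }


def parse_message(raw):
    """Decode a JSON message and reject it if a required field is missing."""
    message = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in message]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return message


# Placeholder parameters: the source does not show the real ones.
wire = json.dumps(
    make_delegation("coordinator_agent", "sql_agent", "query_bridge_data", {"placeholder": True})
)
received = parse_message(wire)
```

Validating on receipt means a malformed message fails at the boundary between agents rather than deep inside the receiving agent's logic.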

Model Context Protocol (MCP)

MCP is an open protocol for connecting models to external tools and context, released by Anthropic in November 2024:

Main Features

  • Standardized Interface: Unified access to various tools and resources
  • Modular Design: Enhances system interoperability
  • State Management: Supports stateful multi-agent interactions
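
MCP messages are JSON-RPC 2.0, and a tool is invoked through the `tools/call` method with a tool name and an arguments object. The sketch below builds such a request and answers it with a toy in-process handler. It does not use the official MCP SDK; `handle_request`, the `count_bridges` tool, and its data are assumptions made up for illustration.

```python
from itertools import count

_ids = count(1)


def tool_call_request(tool_name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def _error(request, code, message):
    return {"jsonrpc": "2.0", "id": request["id"], "error": {"code": code, "message": message}}


def handle_request(request, tools):
    """Toy server side: look up the tool and wrap its output as text content."""
    if request.get("method") != "tools/call":
        return _error(request, -32601, "Method not found")
    params = request["params"]
    tool = tools.get(params["name"])
    if tool is None:
        return _error(request, -32602, f"Unknown tool: {params['name']}")
    text = str(tool(**params["arguments"]))
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"content": [{"type": "text", "text": text}], "isError": False},
    }


# Hypothetical tool backed by made-up data, for illustration only.
tools = {"count_bridges": lambda state: {"VA": 3}.get(state, 0)}
response = handle_request(tool_call_request("count_bridges", {"state": "VA"}), tools)
```

Because every tool is reached through the same request shape, an agent can call a database, a vision API, or a retriever without tool-specific client code.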

⚠️ Important Note

Currently, few systems in the industry integrate both protocols simultaneously. AgentMaster is pioneering work in this field.

Multi-Agent Collaboration Mechanism {#multi-agent-collaboration}

Coordinator Agent Workflow

graph TD
    A[Receive User Query] --> B[Complexity Assessment]
    B --> C{Multi-agent Collaboration Needed?}
    C -->|Yes| D[Task Decomposition]
    C -->|No| E[Direct Route to MCP Client]
    D --> F[Agent Selection]
    F --> G[Parallel/Sequential Execution]
    G --> H[Result Aggregation]
    H --> I[Generate Final Answer]
    E --> I
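
The flowchart above maps onto a small control loop. The sketch below is a simplification under stated assumptions: complexity is judged by counting question marks (the real system would rely on an LLM for this), subtasks run sequentially, and `split_questions`, `coordinate`, and the stub agents are hypothetical names.

```python
def split_questions(query):
    """Naive decomposition: one subtask per question mark (an assumption)."""
    parts = [part.strip() for part in query.split("?") if part.strip()]
    return [part + "?" for part in parts]


def coordinate(query, select_agent, agents, direct_route):
    """Follow the flowchart: assess, decompose, select, execute, aggregate."""
    subtasks = split_questions(query)
    if len(subtasks) <= 1:  # no multi-agent collaboration needed
        return direct_route(query)
    results = []
    for subtask in subtasks:  # sequential execution of subtasks
        agent_name = select_agent(subtask)
        results.append(agents[agent_name](subtask))
    return " ".join(results)  # result aggregation


# Stub agents that only tag the subtask they received.
agents = {
    "sql": lambda q: f"[sql] {q}",
    "general": lambda q: f"[general] {q}",
}


def select(question):
    return "sql" if "how many" in question.lower() else "general"


answer = coordinate(
    "How many bridges are in Virginia? Why do bridges age?",
    select,
    agents,
    direct_route=agents["general"],
)
```

Passing the selector and agents in as arguments keeps the coordinator itself free of domain knowledge, which mirrors the separation between the coordinator and the specialized agents.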

Specialized Agent Types

The system currently includes four types of specialized agents:

| Agent Type | Processing Domain | Technical Implementation | Application Scenarios |
|---|---|---|---|
| IR Agent | Information Retrieval | Knowledge Base Retrieval | Unstructured Content Queries |
| SQL Agent | Database Queries | SQL Generation and Execution | Structured Data Analysis |
| Image Agent | Image Analysis | External Vision APIs | Multimodal Content Processing |
| General Agent | Open Domain Queries | LLM Reasoning | Fallback and General Tasks |
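
One way to picture agent selection across these four types is a small registry with a fallback. This is only a sketch: the keyword lists are illustrative assumptions rather than rules from the paper, which would more plausibly use an LLM to classify each subtask, and `AgentSpec`, `REGISTRY`, and `route` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    name: str
    domain: str
    keywords: tuple


# Keyword lists are illustrative assumptions, not taken from the paper.
REGISTRY = (
    AgentSpec("sql_agent", "Database Queries", ("how many", "count", "average")),
    AgentSpec("image_agent", "Image Analysis", ("image", "photo", "picture")),
    AgentSpec("ir_agent", "Information Retrieval", ("summarize", "report", "document")),
)
FALLBACK = AgentSpec("general_agent", "Open Domain Queries", ())


def route(task):
    """Return the first agent whose keywords match; otherwise the general agent."""
    text = task.lower()
    for spec in REGISTRY:
        if any(word in text for word in spec.keywords):
            return spec
    return FALLBACK
```

For example, `route("How many bridges were built in 2019?")` selects the SQL agent, while a greeting with no matching keyword falls through to the general agent, matching its "fallback" role in the table.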

Agent Communication Examples

Figure 3a: Frontend Interaction Example

Figure 3c: Backend Processing Flow

Experimental Results and Performance Evaluation {#experimental-results}

Evaluation Methodology

The research team adopted a multi-dimensional evaluation system:

  • Agent Metrics: Task completion rate and accuracy
  • LLM-as-a-Judge: Using large language models to evaluate output quality
  • Human Evaluation: Gold standard for validation benchmarks

Core Performance Indicators

| Evaluation Dimension | Metric Name | Score | Description |
|---|---|---|---|
| Semantic Similarity | BERTScore F1 | 96.3% | Semantic matching with reference output |
| Overall Quality | G-Eval | 87.1% | LLM-evaluated comprehensive quality score |
| Answer Relevancy | Answer Relevancy | High Score | Relevance of answers to questions |
| Hallucination Detection | Hallucination Rate | Low Score | Rate of false information generation |
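
For readers unfamiliar with BERTScore, the F1 value combines a precision and a recall that are each computed by greedily matching every token embedding on one side to its most similar token on the other. The sketch below shows only that matching step with toy vectors. Real BERTScore takes contextual embeddings from a pretrained model and can add IDF weighting and baseline rescaling, none of which is reproduced here; `bertscore_f1` is a hypothetical helper name.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def bertscore_f1(candidate_vecs, reference_vecs):
    """Greedy-matching precision, recall, and F1 over token embeddings."""
    # Precision: each candidate token is matched to its closest reference token.
    precision = sum(
        max(cosine(c, r) for r in reference_vecs) for c in candidate_vecs
    ) / len(candidate_vecs)
    # Recall: each reference token is matched to its closest candidate token.
    recall = sum(
        max(cosine(r, c) for c in candidate_vecs) for r in reference_vecs
    ) / len(reference_vecs)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

A score near 1 therefore means that almost every token in the system's answer has a close semantic counterpart in the reference answer, and vice versa.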

Complex Query Processing Capability

All six complex test queries were decomposed and answered successfully:

| Query ID | Number of Subproblems | Involved Agents | Processing Status |
|---|---|---|---|
| Q1 | 2 | General + SQL | ✅ Success |
| Q2 | 3 | SQL + General | ✅ Success |
| Q3 | 2 | SQL + General | ✅ Success |
| Q4 | 3 | SQL + IR + General | ✅ Success |
| Q5 | 2 | SQL + General | ✅ Success |
| Q6 | 4 | IR + General | ✅ Success |

Validation Method

The research team decomposed complex queries into simple subproblems and submitted them separately for validation, ensuring consistency and accuracy of system outputs.
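
A consistency check of this kind could be automated roughly as follows. This is a sketch under assumptions, not the team's procedure: it compares answers by overlap of numbers and longer words, where a real check might use a semantic metric or a human reviewer, and `key_tokens` and `consistency` are hypothetical names. The answer strings in the usage note are made up.

```python
import re


def key_tokens(text):
    """Numbers and longer words, lower-cased, as a rough content fingerprint."""
    return set(re.findall(r"\d+|[a-z]{4,}", text.lower()))


def consistency(full_answer, sub_answers):
    """Fraction of each sub-answer's key tokens that the full answer contains."""
    full = key_tokens(full_answer)
    scores = []
    for sub in sub_answers:
        tokens = key_tokens(sub)
        scores.append(len(tokens & full) / len(tokens) if tokens else 1.0)
    return scores
```

With a made-up combined answer such as "Virginia has 120 bridges in total, and 7 were built in 2019.", a separately obtained sub-answer "Virginia has 120 bridges." scores 1.0, while "Virginia has 130 bridges." scores lower because the number disagrees.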

Figure 3b: Complex Query Validation Example

Real-World Application Case Studies {#use-cases}

Case 1: Infrastructure Data Query

User Query: "How many bridges were built in Virginia in total? How many were built in 2019?"

System Processing Flow:

  1. The coordinator agent identifies the query as complex
  2. It decomposes the query into two subproblems
  3. The SQL agent queries the database
  4. The general agent provides background information
  5. The coordinator integrates the results into a complete answer
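
The SQL step in this flow can be illustrated with an in-memory SQLite table. Everything specific here is an assumption for the sake of the example: the `bridges` table, its `state` and `year_built` columns, and the rows are made up and are not FHWA data, and the real SQL agent generates its queries rather than using a fixed template.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bridges (id INTEGER PRIMARY KEY, state TEXT, year_built INTEGER)"
)
# Made-up rows for illustration; not FHWA data.
conn.executemany(
    "INSERT INTO bridges (state, year_built) VALUES (?, ?)",
    [("VA", 1998), ("VA", 2019), ("VA", 2019), ("MD", 2019)],
)


def count_bridges(state, year=None):
    """Answer one subproblem with a parameterized COUNT query."""
    sql = "SELECT COUNT(*) FROM bridges WHERE state = ?"
    args = [state]
    if year is not None:
        sql += " AND year_built = ?"
        args.append(year)
    return conn.execute(sql, args).fetchone()[0]


total = count_bridges("VA")          # subproblem 1: all Virginia bridges
in_2019 = count_bridges("VA", 2019)  # subproblem 2: Virginia bridges built in 2019
answer = f"The sample table has {total} Virginia bridges; {in_2019} were built in 2019."
```

Each subproblem becomes one parameterized query, and the coordinator only has to merge two numbers into a sentence, which is why decomposition makes this kind of compound question tractable.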

Case 2: Multimodal Image Analysis

Application Scenario: Bridge detection and elevation map analysis

Figure 4: Image Agent Single Query Frontend Example

Technical Implementation:

  • Image agent calls external vision APIs
  • Automatically identifies key information in images
  • Generates structured analysis reports

Case 3: Information Retrieval and Summarization

Figure 5: IR Agent Single Query Frontend Example

Processing Capabilities:

  • Retrieves relevant information from large knowledge bases
  • Intelligent summarization and content integration
  • Provides accurate citations and sources

System Limitations Analysis {#limitations}

Current Challenges

  • Accuracy Dependency: System performance is affected by the quality of underlying LLMs and retrieval corpora
  • Complexity Misjudgment: Occasionally misclassifies simple queries as complex queries
  • Limited Collaboration Depth: The degree of collaboration between agents still has room for improvement
  • Database Scale: Limited database size may lead to insufficient information depth

Technical Limitations

  • LLM Reasoning Limitations: May encounter challenges when synthesizing complex information
  • Evaluation Bias: LLM-as-a-Judge method has potential biases
  • Missing Security Mechanisms: Current framework lacks security guarantees for information storage and usage

⚠️ Improvement Directions

The research team has identified these limitations and will focus on addressing them in future work.

Technical Deployment and Implementation {#deployment}

Deployment Architecture

  • Local Deployment: Runs the Flask services on a local machine (calls to the hosted LLM still need network access)
  • Cloud Deployment: AWS-based microservice architecture
  • Technology Stack: Flask + Python + OpenAI GPT-4o mini

Data Sources

The system uses public datasets from the Federal Highway Administration (FHWA) for case studies, covering:

  • Bridge infrastructure data
  • Traffic flow statistics
  • Engineering inspection reports

🤔 Frequently Asked Questions {#faq}

Q: What's the difference between AgentMaster and traditional multi-agent systems?

A: AgentMaster's core innovation is integrating two recent protocols, A2A and MCP, in one system, which gives it:

  • More standardized inter-agent communication
  • Stronger modularity and scalability
  • Better state management and context retention capabilities
  • More unified tool and resource access interfaces

Q: How does the system ensure accuracy in multi-agent collaboration?

A: The system adopts multi-layer validation mechanisms:

  • Task Decomposition Validation: Decomposes complex queries into simple subproblems for validation
  • Multi-dimensional Evaluation: Combines BERTScore, G-Eval, and human evaluation
  • Consistency Checking: Compares consistency between subproblem answers and overall responses
  • Error Recovery Mechanism: Automatically retries and repairs when failures are detected
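
The retry-and-recover idea in the last bullet can be sketched as a small wrapper. This is a generic pattern under assumptions, not the framework's actual mechanism: `run_with_recovery` is a hypothetical helper, the retry count is arbitrary, and the fallback is assumed to be something like the general agent.

```python
def run_with_recovery(primary, fallback, task, max_attempts=3):
    """Retry `primary` up to `max_attempts` times, then hand off to `fallback`."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return primary(task)
        except Exception as error:  # a real system would catch narrower errors
            last_error = error
    # All attempts failed: let the fallback see the task and the last error.
    return fallback(task, last_error)
```

A specialized agent that fails transiently gets another chance, and one that keeps failing is replaced by the fallback instead of breaking the whole answer.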

Q: How can ordinary users use this system?

A: The system is designed with user-friendly interaction methods:

  • Natural Language Interaction: No need to learn special commands or syntax
  • Multimodal Input: Supports various input methods including text, images, and voice
  • Intelligent Understanding: Automatically understands user intent and routes to appropriate processing modules
  • Clear Output: Presents results in easily understandable formats

Q: How scalable is the system?

A: AgentMaster is designed to scale in several ways:

  • Modular Design: New agents can be seamlessly integrated without affecting existing functionality
  • Standardized Interface: Unified communication protocol based on JSON-RPC
  • Flexible Deployment: Supports various deployment methods both locally and in the cloud
  • Open Source Architecture: Convenient for researchers and developers to customize and extend

Q: How does the system perform in real-world applications?

A: According to experimental results, the system performs excellently:

  • High Accuracy: BERTScore F1 reaches 96.3%
  • Strong Consistency: Complex query decomposition and validation show high consistency
  • Wide Applicability: Successfully handles SQL queries, information retrieval, image analysis, and other tasks
  • Stable Performance: Performs stably in both local and cloud deployments

Summary and Outlook {#summary}

AgentMaster marks a notable step in the development of multi-agent systems. By integrating the A2A and MCP protocols in one unified framework, it opens new possibilities for scalable, domain-adaptive conversational AI.

Core Contributions

  1. Technical Innovation: First multi-agent framework to simultaneously integrate A2A and MCP protocols
  2. Architecture Optimization: Unified architecture supporting query decomposition, dynamic routing, and agent orchestration
  3. Practical Value: Complex multimodal task processing through natural language interaction
  4. Performance Validation: System effectiveness proven through rigorous multi-dimensional evaluation

Future Development Directions

  • Security Mechanism Enhancement: Establish comprehensive information security and privacy protection systems
  • Collaboration Depth Improvement: Enhance deep collaboration capabilities between agents
  • Domain Expansion: Support integration of more specialized domain agents
  • Performance Optimization: Continuously improve system accuracy and response speed

🚀 Technical Prospects

AgentMaster provides a powerful technical foundation for building next-generation intelligent assistants and automation systems, with potential to play important roles in research, business, and social services.


Original Paper Link: https://arxiv.org/html/2507.21105v1

Author Information:

  • Callie C. Liao (Stanford University)
  • Duoduo Liao (George Mason University)
  • Sai Surya Gadiraju (George Mason University)

Data Source: Federal Highway Administration (FHWA) public datasets

This article is organized based on the original paper content, aiming to provide readers with a comprehensive technical analysis of the AgentMaster framework.