AgentMaster Multi-Agent Conversational Framework - Multimodal Information Retrieval System Based on A2A and MCP Protocols

🎯 Key Points (TL;DR)

Innovative Framework: AgentMaster is the first multi-agent system that simultaneously integrates A2A and MCP protocols
Multimodal Support: Supports intelligent processing of various input formats including text, images, and audio
High Performance: BERTScore F1 reaches 96.3%, G-Eval score 87.1%
Practical Value: Users can interact with the system through natural language without technical background
Open Source Deployment: Supports local and AWS cloud deployment based on Flask microservice architecture

What is the AgentMaster Framework
Core Technical Architecture Analysis
A2A and MCP Protocol Details
Multi-Agent Collaboration Mechanism
Experimental Results and Performance Evaluation
Real-World Application Case Studies
System Limitations Analysis
Technical Deployment and Implementation
Frequently Asked Questions
Summary and Outlook

What is the AgentMaster Framework {#what-is-agentmaster}

AgentMaster is a next-generation multi-agent conversational framework jointly developed by researchers at Stanford University and George Mason University. It integrates Anthropic's Model Context Protocol (MCP) and Google's Agent-to-Agent communication protocol (A2A) in a single system, which its authors describe as a first.

Core Innovations

Unified Conversational Interface: Users can interact with the system through natural language without professional technical knowledge
Dynamic Task Decomposition: Automatically decomposes complex queries into executable subtasks
Intelligent Routing Mechanism: Automatically selects the most suitable specialized agents based on task characteristics
Multimodal Processing: Supports various data formats including text, images, charts, and audio

Figure 1: AgentMaster's General Multi-Agent System Framework

💡 Technical Breakthrough

This is the first multi-agent system to simultaneously implement A2A and MCP protocols in a single framework, filling a technical gap in this field.

Core Technical Architecture Analysis {#system-architecture}

AgentMaster uses a four-layer architecture in which each layer has a clearly defined responsibility:

1. Unified Conversational Interface Layer

Multimodal Input: Supports text, charts, images, and audio input
Intelligent Output: Generates text, images, structured data tables, and other formats
User-Friendly: Chatbot-like interactive experience

2. Multi-Agent Hub

The hub contains three types of agents:

| Agent Type | Main Responsibilities | Technical Features |
| --- | --- | --- |
| Coordinator Agent | Task decomposition, execution coordination | Central controller responsible for overall scheduling |
| Domain Agents | Specialized function processing | Can be based on LLM or non-LLM technologies |
| General Agents | General reasoning tasks | Each equipped with dedicated LLM |

Figure 2: Case Study System Architecture

3. Multi-Agent AI Protocol Layer

A2A Protocol: Implements structured communication between agents
MCP Protocol: Provides unified interface for tool access and context management

4. State Management Layer

Vector Database: Provides persistent semantic memory
Context Cache: Rapid storage of session data and intermediate results
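The two stores in this layer can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the `VectorMemory` and `ContextCache` names, the toy three-dimensional embeddings, and cosine similarity as the ranking function. The article does not say which vector database or embedding model AgentMaster uses.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class VectorMemory:
    """Hypothetical stand-in for a vector database: stores (embedding, text) pairs."""

    def __init__(self):
        self._items = []

    def add(self, embedding, text):
        self._items.append((embedding, text))

    def search(self, query_embedding, k=1):
        """Return the k stored texts most similar to the query embedding."""
        ranked = sorted(
            self._items,
            key=lambda item: cosine(query_embedding, item[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]


class ContextCache:
    """Hypothetical session cache for intermediate results, keyed by session id."""

    def __init__(self):
        self._sessions = {}

    def put(self, session_id, key, value):
        self._sessions.setdefault(session_id, {})[key] = value

    def get(self, session_id, key, default=None):
        return self._sessions.get(session_id, {}).get(key, default)


# Toy data: real embeddings would come from an embedding model.
memory = VectorMemory()
memory.add([1.0, 0.0, 0.0], "bridge inventory notes")
memory.add([0.0, 1.0, 0.0], "traffic flow notes")

cache = ContextCache()
cache.put("session-1", "last_sql_result", 42)
```

The split mirrors the layer description: the vector store answers "what is semantically close to this query?" across sessions, while the cache only holds short-lived values for one conversation.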

A2A and MCP Protocol Details {#protocols-explained}

Agent-to-Agent (A2A) Protocol

The A2A protocol is an inter-agent communication standard announced by Google in April 2025:

Core Functions

Structured Message Exchange: Standardized communication based on JSON format
Task Distribution Mechanism: Supports parallel or sequential execution of subtasks
Shared Understanding Construction: Multi-agent collaboration to solve complex problems

Technical Advantages

{
  "message_type": "task_delegation",
  "sender": "coordinator_agent",
  "receiver": "sql_agent",
  "task": "query_bridge_data",
  "parameters": {...}
}
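As a rough sketch, a message of this shape could be built, serialized, and checked on receipt as follows. The `DelegationMessage` class and its validation rule are hypothetical, the parameter values are made up because the source elides them, and the layout follows the illustrative message above rather than the official A2A schema.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class DelegationMessage:
    """Mirrors the fields of the illustrative message (not the official A2A schema)."""
    sender: str
    receiver: str
    task: str
    parameters: dict = field(default_factory=dict)
    message_type: str = "task_delegation"

    def to_json(self) -> str:
        """Serialize the message for transport between agents."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "DelegationMessage":
        """Parse a message and reject it if a required field is absent."""
        data = json.loads(raw)
        missing = {"message_type", "sender", "receiver", "task"} - data.keys()
        if missing:
            raise ValueError(f"missing fields: {sorted(missing)}")
        return cls(**data)


# The parameter values are hypothetical; the source elides them.
msg = DelegationMessage(
    sender="coordinator_agent",
    receiver="sql_agent",
    task="query_bridge_data",
    parameters={"state": "Virginia"},
)
wire = msg.to_json()
received = DelegationMessage.from_json(wire)
```

Validating on receipt keeps a malformed delegation from reaching an agent's task logic.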

Model Context Protocol (MCP)

MCP is an open protocol released by Anthropic in November 2024 for connecting models to external tools and data sources:

Main Features

Standardized Interface: Unified access to various tools and resources
Modular Design: Enhances system interoperability
State Management: Supports stateful multi-agent interactions
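To make the "standardized interface" idea concrete, here is a minimal in-process sketch. MCP messages are JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The `ToolRegistry` class and the `count_rows` tool are invented for this example, error handling is simplified to one error code, and the sketch does not use the official MCP SDK.

```python
import json


class ToolRegistry:
    """Hypothetical in-process registry exposing tools behind one call interface."""

    def __init__(self):
        self._tools = {}

    def register(self, name, func):
        self._tools[name] = func

    def handle(self, raw_request: str) -> str:
        """Dispatch a JSON-RPC 2.0 'tools/call' request and return the JSON response."""
        request = json.loads(raw_request)
        params = request.get("params", {})
        tool = self._tools.get(params.get("name"))
        if request.get("method") != "tools/call" or tool is None:
            # Simplified: one error code covers unknown methods and unknown tools.
            error = {"code": -32601, "message": "Method or tool not found"}
            return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), "error": error})
        text = str(tool(**params.get("arguments", {})))
        result = {"content": [{"type": "text", "text": text}]}
        return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), "result": result})


registry = ToolRegistry()
# 'count_rows' is a made-up tool backed by a made-up row count.
registry.register("count_rows", lambda table: {"bridges": 3}.get(table, 0))

request = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "count_rows", "arguments": {"table": "bridges"}},
})
response = json.loads(registry.handle(request))
```

Because every tool is reached through the same request shape, adding a tool means registering one more function rather than teaching each agent a new API.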

⚠️ Important Note

Currently, few systems in the industry integrate both protocols simultaneously. AgentMaster is pioneering work in this field.

Multi-Agent Collaboration Mechanism {#multi-agent-collaboration}

Coordinator Agent Workflow

graph TD
    A[Receive User Query] --> B[Complexity Assessment]
    B --> C{Multi-agent Collaboration Needed?}
    C -->|Yes| D[Task Decomposition]
    C -->|No| E[Direct Route to MCP Client]
    D --> F[Agent Selection]
    F --> G[Parallel/Sequential Execution]
    G --> H[Result Aggregation]
    H --> I[Generate Final Answer]
    E --> I
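
The flow above can be sketched in a few functions. This is an illustration under stated assumptions, not the authors' implementation: the agents are stubs, sentence splitting and keyword matching stand in for the LLM-driven complexity assessment and agent selection, and the simple branch calls a single agent directly instead of an MCP client.

```python
import re


def general_agent(query):
    """Stub for the LLM-backed general agent."""
    return f"[general] {query}"


def sql_agent(query):
    """Stub for the SQL agent."""
    return f"[sql] {query}"


AGENTS = {"sql": sql_agent, "general": general_agent}


def decompose(query):
    """Split a query into sub-questions at sentence-ending punctuation (toy heuristic)."""
    parts = re.split(r"(?<=[?.!])\s+", query.strip())
    return [p for p in parts if p]


def select_agent(subquery):
    """Keyword routing as a stand-in for LLM-based agent selection."""
    if re.search(r"\bhow many\b|\bcount\b", subquery, re.IGNORECASE):
        return "sql"
    return "general"


def coordinate(query):
    """Assess complexity, decompose if needed, route, and aggregate the answers."""
    subqueries = decompose(query)
    if len(subqueries) <= 1:
        # Simple query: route directly without decomposition.
        return AGENTS[select_agent(query)](query)
    # Complex query: run sub-questions sequentially, then aggregate.
    answers = [AGENTS[select_agent(s)](s) for s in subqueries]
    return "\n".join(answers)


result = coordinate("How many bridges were built in Virginia in total? What is a culvert?")
```

Here the first sub-question goes to the SQL stub and the second to the general stub; a single-sentence query skips decomposition entirely.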

Specialized Agent Types

The system currently includes four types of specialized agents:

| Agent Type | Processing Domain | Technical Implementation | Application Scenarios |
| --- | --- | --- | --- |
| IR Agent | Information Retrieval | Knowledge Base Retrieval | Unstructured Content Queries |
| SQL Agent | Database Queries | SQL Generation and Execution | Structured Data Analysis |
| Image Agent | Image Analysis | External Vision APIs | Multimodal Content Processing |
| General Agent | Open Domain Queries | LLM Reasoning | Fallback and General Tasks |

Agent Communication Examples

Figure 3a: Frontend Interaction Example

Figure 3c: Backend Processing Flow

Experimental Results and Performance Evaluation {#experimental-results}

Evaluation Methodology

The research team adopted a multi-dimensional evaluation system:

Agent Metrics: Task completion rate and accuracy
LLM-as-a-Judge: Using large language models to evaluate output quality
Human Evaluation: Serves as the gold-standard benchmark against which the automated scores are validated

Core Performance Indicators

| Evaluation Dimension | Metric Name | Score | Description |
| --- | --- | --- | --- |
| Semantic Similarity | BERTScore F1 | 96.3% | Semantic matching with reference output |
| Overall Quality | G-Eval | 87.1% | LLM-evaluated comprehensive quality score |
| Answer Relevancy | Answer Relevancy | High Score | Relevance of answers to questions |
| Hallucination Detection | Hallucination Rate | Low Score | Rate of false information generation |
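
For readers unfamiliar with BERTScore, the sketch below shows how its F1 is formed: each candidate token is matched to its most similar reference token (precision), each reference token to its most similar candidate token (recall), and the two are combined harmonically. The two-dimensional vectors are toy values chosen for this example; a real evaluation uses contextual embeddings from a BERT-family model, and nothing here reproduces the paper's evaluation pipeline.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def greedy_f1(candidate, reference):
    """BERTScore-style F1 from token embeddings via greedy max-cosine matching."""
    precision = sum(max(cosine(c, r) for r in reference) for c in candidate) / len(candidate)
    recall = sum(max(cosine(r, c) for c in candidate) for r in reference) / len(reference)
    # Assumes precision + recall > 0, which holds for the toy inputs below.
    return 2 * precision * recall / (precision + recall)


# Toy "token embeddings" for a two-token reference answer.
reference = [[1.0, 0.0], [0.0, 1.0]]
identical = [[1.0, 0.0], [0.0, 1.0]]  # candidate covering both reference tokens
partial = [[1.0, 0.0]]                # candidate covering only one of them

score_identical = greedy_f1(identical, reference)
score_partial = greedy_f1(partial, reference)
```

The partial candidate keeps perfect precision but only half the recall, so its F1 drops to two thirds. This is why a high F1 indicates that the output both stays on topic and covers the reference.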

Complex Query Processing Capability

All six complex test queries were decomposed and answered successfully:

| Query ID | Number of Subproblems | Involved Agents | Processing Status |
| --- | --- | --- | --- |
| Q1 | 2 | General + SQL | ✅ Success |
| Q2 | 3 | SQL + General | ✅ Success |
| Q3 | 2 | SQL + General | ✅ Success |
| Q4 | 3 | SQL + IR + General | ✅ Success |
| Q5 | 2 | SQL + General | ✅ Success |
| Q6 | 4 | IR + General | ✅ Success |

✅ Validation Method

The research team decomposed complex queries into simple subproblems and submitted them separately for validation, ensuring consistency and accuracy of system outputs.
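
One simple way to automate part of such a check is to compare the numbers stated in each sub-answer with those in the combined answer. This is only a hypothetical proxy, not the team's validation procedure; the helper names and all figures in the example are made up and are not FHWA data.

```python
import re


def extract_numbers(text):
    """Collect the numeric tokens in a text, ignoring thousands separators."""
    return {n.replace(",", "") for n in re.findall(r"\d[\d,]*", text)}


def is_consistent(combined_answer, sub_answers):
    """True if every number stated in a sub-answer also appears in the combined answer."""
    combined = extract_numbers(combined_answer)
    return all(extract_numbers(sub) <= combined for sub in sub_answers)


# Made-up figures used only to exercise the check.
combined = "There are 120 bridges in total, of which 7 were built in 2019."
matching = ["120 bridges in total.", "7 bridges were built in 2019."]
conflicting = ["120 bridges in total.", "8 bridges were built in 2019."]

ok = is_consistent(combined, matching)
mismatch = is_consistent(combined, conflicting)
```

A check like this only catches numeric disagreements; differences in wording or reasoning would still need human review.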

Figure 3b: Complex Query Validation Example

Real-World Application Case Studies {#use-cases}

Case 1: Infrastructure Data Query

User Query: "How many bridges were built in Virginia in total? How many were built in 2019?"

System Processing Flow:

The coordinator agent classifies the query as complex
It decomposes the query into two subproblems
The SQL agent queries the database
The general agent provides background information
The coordinator integrates the results into a complete answer
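
The SQL step in this case can be illustrated with an in-memory SQLite table. The `bridges` schema, the `count_bridges` helper, and the four rows are invented for the sketch; they do not reflect the actual FHWA dataset or the SQL that AgentMaster generates.

```python
import sqlite3

# Hypothetical schema and sample rows, not FHWA data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bridges (id INTEGER PRIMARY KEY, state TEXT, year_built INTEGER)"
)
conn.executemany(
    "INSERT INTO bridges (state, year_built) VALUES (?, ?)",
    [("Virginia", 1998), ("Virginia", 2019), ("Virginia", 2019), ("Maryland", 2019)],
)


def count_bridges(state, year=None):
    """Count bridges in a state, optionally restricted to one construction year."""
    sql = "SELECT COUNT(*) FROM bridges WHERE state = ?"
    params = [state]
    if year is not None:
        sql += " AND year_built = ?"
        params.append(year)
    return conn.execute(sql, params).fetchone()[0]


# One query per subproblem, then a combined answer.
total = count_bridges("Virginia")
built_2019 = count_bridges("Virginia", 2019)
answer = f"The sample table lists {total} Virginia bridges; {built_2019} were built in 2019."
```

Each subproblem maps to one parameterized query, and the combined sentence is assembled only after both counts are available.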

Case 2: Multimodal Image Analysis

Application Scenario: Bridge detection and elevation map analysis

Figure 4: Image Agent Single Query Frontend Example

Technical Implementation:

Image agent calls external vision APIs
Automatically identifies key information in images
Generates structured analysis reports

Case 3: Information Retrieval and Summarization

Figure 5: IR Agent Single Query Frontend Example

Processing Capabilities:

Retrieves relevant information from large knowledge bases
Intelligent summarization and content integration
Provides accurate citations and sources

System Limitations Analysis {#limitations}

Current Challenges

Accuracy Dependency: System performance is affected by the quality of underlying LLMs and retrieval corpora
Complexity Misjudgment: Occasionally misclassifies simple queries as complex queries
Limited Collaboration Depth: The degree of collaboration between agents still has room for improvement
Database Scale: Limited database size may lead to insufficient information depth

Technical Limitations

LLM Reasoning Limitations: May encounter challenges when synthesizing complex information
Evaluation Bias: LLM-as-a-Judge method has potential biases
Missing Security Mechanisms: Current framework lacks security guarantees for information storage and usage

⚠️ Improvement Directions

The research team has identified these limitations and will focus on addressing them in future work.

Technical Deployment and Implementation {#deployment}

Deployment Architecture

Local Deployment: Runs the Flask services on a local machine (calls to a hosted LLM such as OpenAI GPT-4o mini still require network access)
Cloud Deployment: AWS-based microservice architecture
Technology Stack: Flask + Python + OpenAI GPT-4o mini

Data Sources

The system uses public datasets from the Federal Highway Administration (FHWA) for case studies, covering:

Bridge infrastructure data
Traffic flow statistics
Engineering inspection reports

🤔 Frequently Asked Questions {#faq}

Q: What's the difference between AgentMaster and traditional multi-agent systems?

A: AgentMaster's core innovation is that it integrates two recent protocols, A2A and MCP, in one system, which gives it:

More standardized inter-agent communication
Stronger modularity and scalability
Better state management and context retention capabilities
More unified tool and resource access interfaces

Q: How does the system ensure accuracy in multi-agent collaboration?

A: The system adopts multi-layer validation mechanisms:

Task Decomposition Validation: Decomposes complex queries into simple subproblems for validation
Multi-dimensional Evaluation: Combines BERTScore, G-Eval, and human evaluation
Consistency Checking: Compares consistency between subproblem answers and overall responses
Error Recovery Mechanism: Automatically retries and repairs when failures are detected

Q: How can ordinary users use this system?

A: The system is designed with user-friendly interaction methods:

Natural Language Interaction: No need to learn special commands or syntax
Multimodal Input: Supports various input methods including text, images, and voice
Intelligent Understanding: Automatically understands user intent and routes to appropriate processing modules
Clear Output: Presents results in easily understandable formats

Q: How scalable is the system?

A: AgentMaster has excellent scalability:

Modular Design: New agents can be seamlessly integrated without affecting existing functionality
Standardized Interface: Unified communication protocol based on JSON-RPC
Flexible Deployment: Supports various deployment methods both locally and in the cloud
Open Source Architecture: Convenient for researchers and developers to customize and extend

Q: How does the system perform in real-world applications?

A: According to experimental results, the system performs excellently:

High Accuracy: BERTScore F1 reaches 96.3%
Strong Consistency: Complex query decomposition and validation show high consistency
Wide Applicability: Successfully handles SQL queries, information retrieval, image analysis, and other tasks
Stable Performance: Performs stably in both local and cloud deployments

Summary and Outlook {#summary}

AgentMaster marks an important milestone in the development of multi-agent systems. By integrating two cutting-edge protocols, A2A and MCP, in a unified framework, it opens new possibilities for scalable, domain-adaptive conversational AI.

Core Contributions

Technical Innovation: First multi-agent framework to simultaneously integrate A2A and MCP protocols
Architecture Optimization: Unified architecture supporting query decomposition, dynamic routing, and agent orchestration
Practical Value: Complex multimodal task processing through natural language interaction
Performance Validation: System effectiveness proven through rigorous multi-dimensional evaluation

Future Development Directions

Security Mechanism Enhancement: Establish comprehensive information security and privacy protection systems
Collaboration Depth Improvement: Enhance deep collaboration capabilities between agents
Domain Expansion: Support integration of more specialized domain agents
Performance Optimization: Continuously improve system accuracy and response speed

🚀 Technical Prospects

AgentMaster provides a powerful technical foundation for building next-generation intelligent assistants and automation systems, with potential to play important roles in research, business, and social services.

Original Paper Link: https://arxiv.org/html/2507.21105v1

Author Information:

Callie C. Liao (Stanford University)
Duoduo Liao (George Mason University)
Sai Surya Gadiraju (George Mason University)

Data Source: Federal Highway Administration (FHWA) public datasets

This article summarizes the original paper and aims to give readers a comprehensive technical analysis of the AgentMaster framework.