AgentMaster Multi-Agent Conversational Framework - Multimodal Information Retrieval System Based on A2A and MCP Protocols

🎯 Key Points (TL;DR)
- Innovative Framework: AgentMaster is the first multi-agent system that simultaneously integrates A2A and MCP protocols
- Multimodal Support: Supports intelligent processing of various input formats including text, images, and audio
- High Performance: Achieves a BERTScore F1 of 96.3% and a G-Eval score of 87.1%
- Practical Value: Users can interact with the system in natural language without a technical background
- Open Source Deployment: Supports local and AWS cloud deployment based on Flask microservice architecture
Table of Contents
- What is the AgentMaster Framework
- Core Technical Architecture Analysis
- A2A and MCP Protocol Details
- Multi-Agent Collaboration Mechanism
- Experimental Results and Performance Evaluation
- Real-World Application Case Studies
- System Limitations Analysis
- Technical Deployment and Implementation
- Frequently Asked Questions
- Summary and Outlook
What is the AgentMaster Framework {#what-is-agentmaster}
AgentMaster is a multi-agent conversational framework developed jointly by researchers at Stanford University and George Mason University. It integrates Anthropic's Model Context Protocol (MCP) and Google's Agent-to-Agent communication protocol (A2A) in a single system.
Core Innovations
- Unified Conversational Interface: Users can interact with the system through natural language without professional technical knowledge
- Dynamic Task Decomposition: Automatically decomposes complex queries into executable subtasks
- Intelligent Routing Mechanism: Automatically selects the most suitable specialized agents based on task characteristics
- Multimodal Processing: Supports various data formats including text, images, charts, and audio
Figure 1: AgentMaster's General Multi-Agent System Framework
💡 Technical Breakthrough
This is the first multi-agent system to simultaneously implement A2A and MCP protocols in a single framework, filling a technical gap in this field.
Core Technical Architecture Analysis {#system-architecture}
AgentMaster adopts a four-layer architecture in which each layer has a clearly defined responsibility:
1. Unified Conversational Interface Layer
- Multimodal Input: Supports text, charts, images, and audio input
- Intelligent Output: Generates text, images, structured data tables, and other formats
- User-Friendly: Chatbot-like interactive experience
2. Multi-Agent Hub
The system contains three levels of agents:
Agent Type | Main Responsibilities | Technical Features |
---|---|---|
Coordinator Agent | Task decomposition, execution coordination | Central controller responsible for overall scheduling |
Domain Agents | Specialized function processing | Can be based on LLM or non-LLM technologies |
General Agents | General reasoning tasks | Each equipped with dedicated LLM |
Figure 2: Case Study System Architecture
3. Multi-Agent AI Protocol Layer
- A2A Protocol: Implements structured communication between agents
- MCP Protocol: Provides unified interface for tool access and context management
4. State Management Layer
- Vector Database: Provides persistent semantic memory
- Context Cache: Rapid storage of session data and intermediate results
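The two state components above can be illustrated with a minimal sketch. This is not the paper's implementation: the class names `VectorMemory` and `ContextCache` are hypothetical, the vectors are hand-written stand-ins for real embeddings, and a production system would use an actual vector database rather than a linear scan.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class VectorMemory:
    """Toy persistent semantic memory: stores (vector, text) pairs (hypothetical)."""

    def __init__(self):
        self._items = []

    def add(self, vector, text):
        self._items.append((list(vector), text))

    def search(self, query_vector, top_k=1):
        # Rank stored texts by similarity to the query vector, best first.
        ranked = sorted(self._items,
                        key=lambda item: cosine(query_vector, item[0]),
                        reverse=True)
        return [text for _, text in ranked[:top_k]]


class ContextCache:
    """Toy per-session cache for intermediate results (hypothetical)."""

    def __init__(self):
        self._sessions = {}

    def put(self, session_id, key, value):
        self._sessions.setdefault(session_id, {})[key] = value

    def get(self, session_id, key, default=None):
        return self._sessions.get(session_id, {}).get(key, default)
```

The split mirrors the description above: long-lived knowledge is looked up by similarity, while short-lived session data is looked up by exact key.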
A2A and MCP Protocol Details {#protocols-explained}
Agent-to-Agent (A2A) Protocol
The A2A protocol is an inter-agent communication standard announced by Google in April 2025:
Core Functions
- Structured Message Exchange: Standardized communication based on JSON format
- Task Distribution Mechanism: Supports parallel or sequential execution of subtasks
- Shared Understanding Construction: Multi-agent collaboration to solve complex problems
Example Message
{
"message_type": "task_delegation",
"sender": "coordinator_agent",
"receiver": "sql_agent",
"task": "query_bridge_data",
"parameters": {...}
}
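As a rough illustration, a message of this shape could be built and validated in Python as follows. The field names are taken from the example above; the `AgentMessage` class, the `parse_message` helper, and the contents of `parameters` are hypothetical, and this mirrors the article's simplified message rather than the official A2A schema.

```python
import json
from dataclasses import asdict, dataclass, field

# Fields that appear in the article's example message.
REQUIRED_FIELDS = ("message_type", "sender", "receiver", "task", "parameters")


@dataclass
class AgentMessage:
    message_type: str
    sender: str
    receiver: str
    task: str
    parameters: dict = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize the message to a JSON string."""
        return json.dumps(asdict(self))


def parse_message(raw: str) -> AgentMessage:
    """Parse a JSON string and reject messages that lack a required field."""
    data = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return AgentMessage(**{name: data[name] for name in REQUIRED_FIELDS})


message = AgentMessage(
    message_type="task_delegation",
    sender="coordinator_agent",
    receiver="sql_agent",
    task="query_bridge_data",
    parameters={"state": "Virginia"},  # hypothetical parameter
)
assert parse_message(message.to_json()) == message
```

Validating on receipt means a malformed delegation fails at the protocol boundary instead of inside the receiving agent.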
Model Context Protocol (MCP)
MCP is an open protocol released by Anthropic in November 2024:
Main Features
- Standardized Interface: Unified access to various tools and resources
- Modular Design: Enhances system interoperability
- State Management: Supports stateful multi-agent interactions
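To make the "standardized interface" idea concrete, here is a small in-process sketch. MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method with a tool `name` and `arguments`. Everything else below is an assumption for illustration: the `count_bridges` tool, the helper names, and the dispatch logic are hypothetical and do not use an MCP SDK or reflect how AgentMaster implements its MCP layer.

```python
def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request in the shape of an MCP tools/call message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def handle_request(request, tools):
    """Dispatch a tools/call request to a registry of plain Python callables."""
    base = {"jsonrpc": "2.0", "id": request.get("id")}
    if request.get("method") != "tools/call":
        # -32601 is the standard JSON-RPC "Method not found" code.
        return {**base, "error": {"code": -32601, "message": "Method not found"}}
    params = request.get("params", {})
    tool = tools.get(params.get("name"))
    if tool is None:
        # -32602 is the standard JSON-RPC "Invalid params" code.
        return {**base, "error": {"code": -32602,
                                  "message": f"Unknown tool: {params.get('name')}"}}
    text = str(tool(**params.get("arguments", {})))
    return {**base, "result": {"content": [{"type": "text", "text": text}],
                               "isError": False}}


# Hypothetical tool registry with one toy tool returning placeholder values.
TOOLS = {"count_bridges": lambda state: 3 if state == "Virginia" else 0}

response = handle_request(make_tool_call(1, "count_bridges", {"state": "Virginia"}), TOOLS)
print(response["result"]["content"][0]["text"])
```

Because every tool is reached through the same request shape, adding a tool only means adding an entry to the registry; callers do not change.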
⚠️ Important Note
Currently, few systems in the industry integrate both protocols simultaneously. AgentMaster is pioneering work in this field.
Multi-Agent Collaboration Mechanism {#multi-agent-collaboration}
Coordinator Agent Workflow
graph TD
A[Receive User Query] --> B[Complexity Assessment]
B --> C{Multi-agent Collaboration Needed?}
C -->|Yes| D[Task Decomposition]
C -->|No| E[Direct Route to MCP Client]
D --> F[Agent Selection]
F --> G[Parallel/Sequential Execution]
G --> H[Result Aggregation]
H --> I[Generate Final Answer]
E --> I
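The workflow above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' code: splitting on question marks stands in for LLM-based complexity assessment and decomposition, subtasks run sequentially only, and `coordinate`, `decompose`, and the callables passed in are hypothetical names.

```python
def decompose(query):
    """Split a query into sub-questions on '?' (naive stand-in for LLM decomposition)."""
    parts = [part.strip() for part in query.split("?") if part.strip()]
    return [part + "?" for part in parts]


def coordinate(query, agents, select_agent, direct_route):
    """Follow the flowchart: assess, decompose, select agents, execute, aggregate.

    agents       -- dict mapping agent name to a callable(subtask) -> str
    select_agent -- callable(subtask) -> agent name
    direct_route -- callable(query) -> str, used when no collaboration is needed
    """
    subtasks = decompose(query)
    if len(subtasks) <= 1:
        # Simple query: skip decomposition and route it directly.
        return direct_route(query)
    # Complex query: run each subtask on its selected agent (sequentially here).
    results = [agents[select_agent(subtask)](subtask) for subtask in subtasks]
    # Aggregate partial results into one final answer.
    return " ".join(results)
```

Passing `select_agent` in as a parameter keeps the orchestration logic independent of how routing decisions are made.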
Specialized Agent Types
The system currently includes four types of specialized agents:
Agent Type | Processing Domain | Technical Implementation | Application Scenarios |
---|---|---|---|
IR Agent | Information Retrieval | Knowledge Base Retrieval | Unstructured Content Queries |
SQL Agent | Database Queries | SQL Generation and Execution | Structured Data Analysis |
Image Agent | Image Analysis | External Vision APIs | Multimodal Content Processing |
General Agent | Open Domain Queries | LLM Reasoning | Fallback and General Tasks |
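One simple way to picture routing across these four agent types is a keyword table with the general agent as the fallback. The keywords and the `select_agent` function below are hypothetical; the article does not describe the routing rules, and the real system presumably relies on an LLM rather than substring matching.

```python
# Ordered routing rules: the first rule with a matching keyword wins.
# Agent names follow the table above; the keyword lists are illustrative guesses.
ROUTING_RULES = [
    ("image_agent", ("image", "photo", "picture")),
    ("sql_agent", ("how many", "count", "average", "total")),
    ("ir_agent", ("summarize", "report", "document", "policy")),
]


def select_agent(query):
    """Return the name of the agent that should handle the query."""
    text = query.lower()
    for agent_name, keywords in ROUTING_RULES:
        if any(keyword in text for keyword in keywords):
            return agent_name
    # Nothing matched: fall back to the general agent.
    return "general_agent"


for example in ("How many bridges were built in 2019?",
                "Summarize the inspection report",
                "Describe this image of a bridge",
                "What is a suspension bridge?"):
    print(example, "->", select_agent(example))
```

Substring matching is deliberately crude (for example, "count" also matches "county"), which is one reason a learned or LLM-based router is the more realistic choice.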
Agent Communication Examples
Figure 3a: Frontend Interaction Example
Figure 3c: Backend Processing Flow
Experimental Results and Performance Evaluation {#experimental-results}
Evaluation Methodology
The research team adopted a multi-dimensional evaluation system:
- Agent Metrics: Task completion rate and accuracy
- LLM-as-a-Judge: Using large language models to evaluate output quality
- Human Evaluation: Gold standard for validation benchmarks
Core Performance Indicators
Evaluation Dimension | Metric Name | Score | Description |
---|---|---|---|
Semantic Similarity | BERTScore F1 | 96.3% | Semantic matching with reference output |
Overall Quality | G-Eval | 87.1% | LLM-evaluated comprehensive quality score |
Answer Relevancy | Answer Relevancy | High Score | Relevance of answers to questions |
Hallucination Detection | Hallucination Rate | Low Score | Rate of false information generation |
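For readers unfamiliar with BERTScore, the sketch below shows how its F1 is formed from greedy token matching: precision averages each candidate token's best cosine similarity against the reference, recall does the same in the other direction, and F1 is their harmonic mean. The vectors here are hand-written toy embeddings; the real metric uses contextual embeddings from a BERT-style model and optionally applies IDF weighting and baseline rescaling, none of which is modeled here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def bertscore_f1(candidate_vecs, reference_vecs):
    """Greedy-matching F1 over token embeddings (simplified BERTScore)."""
    # Precision: how well each candidate token is covered by the reference.
    precision = sum(max(cosine(c, r) for r in reference_vecs)
                    for c in candidate_vecs) / len(candidate_vecs)
    # Recall: how well each reference token is covered by the candidate.
    recall = sum(max(cosine(r, c) for c in candidate_vecs)
                 for r in reference_vecs) / len(reference_vecs)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: the candidate covers only one of two reference tokens.
candidate = [[1.0, 0.0]]
reference = [[1.0, 0.0], [0.0, 1.0]]
print(round(bertscore_f1(candidate, reference), 4))  # precision 1.0, recall 0.5 -> 0.6667
```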
Complex Query Processing Capability
All six complex test queries listed below were processed successfully:
Query ID | Number of Subproblems | Involved Agents | Processing Status |
---|---|---|---|
Q1 | 2 | General + SQL | ✅ Success |
Q2 | 3 | SQL + General | ✅ Success |
Q3 | 2 | SQL + General | ✅ Success |
Q4 | 3 | SQL + IR + General | ✅ Success |
Q5 | 2 | SQL + General | ✅ Success |
Q6 | 4 | IR + General | ✅ Success |
✅ Validation Method
The research team decomposed complex queries into simple subproblems and submitted them separately for validation, ensuring consistency and accuracy of system outputs.
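A very rough automated version of this check might compare the numeric facts in each separately obtained sub-answer against the composite answer. This is an assumption-laden sketch: the paper describes the validation only at a high level, the function names are hypothetical, and the figures in the usage example are made-up placeholders, not results from the study.

```python
import re


def numbers(text):
    """Extract numeric tokens from text, ignoring thousands separators."""
    return set(re.findall(r"\d+(?:\.\d+)?", text.replace(",", "")))


def consistency_report(composite_answer, sub_answers):
    """For each sub-question, check that its numbers also appear in the composite answer.

    sub_answers -- dict mapping sub-question label to the answer obtained
                   when that sub-question was submitted on its own
    """
    found = numbers(composite_answer)
    return {label: numbers(answer) <= found for label, answer in sub_answers.items()}


# Placeholder values for illustration only.
composite = "Virginia has 13,000 bridges in total; 45 were built in 2019."
separate = {
    "total": "There are 13000 bridges in Virginia.",
    "built_2019": "45 bridges were built in 2019.",
}
print(consistency_report(composite, separate))
```

Comparing only numbers catches mismatched counts but not wording differences, so it complements rather than replaces human or LLM-based review.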
Figure 3b: Complex Query Validation Example
Real-World Application Case Studies {#use-cases}
Case 1: Infrastructure Data Query
User Query: "How many bridges were built in Virginia in total? How many were built in 2019?"
System Processing Flow:
- The coordinator agent classifies the query as complex
- It decomposes the query into two subproblems
- The SQL agent queries the database
- The general agent provides background information
- The coordinator integrates the results into a complete answer
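The SQL step for this query can be sketched with Python's built-in `sqlite3` module. The `bridges` table, its columns, and the four rows are hypothetical placeholders, not the FHWA schema or data; in the real system the SQL agent generates the queries rather than having them hard-coded.

```python
import sqlite3


def answer_bridge_questions(conn, state, year):
    """Answer both sub-questions with parameterized queries and merge the results."""
    total = conn.execute(
        "SELECT COUNT(*) FROM bridges WHERE state = ?", (state,)
    ).fetchone()[0]
    built_in_year = conn.execute(
        "SELECT COUNT(*) FROM bridges WHERE state = ? AND year_built = ?",
        (state, year),
    ).fetchone()[0]
    return f"{state} has {total} bridges in total; {built_in_year} were built in {year}."


# In-memory database with placeholder rows for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bridges (id INTEGER PRIMARY KEY, state TEXT, year_built INTEGER)"
)
conn.executemany(
    "INSERT INTO bridges (state, year_built) VALUES (?, ?)",
    [("Virginia", 1998), ("Virginia", 2019), ("Virginia", 2019), ("Maryland", 2019)],
)

print(answer_bridge_questions(conn, "Virginia", 2019))
# prints: Virginia has 3 bridges in total; 2 were built in 2019.
```

Using `?` placeholders instead of string formatting matters when query values come from user input or from LLM output.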
Case 2: Multimodal Image Analysis
Application Scenario: Bridge detection and elevation map analysis
Figure 4: Image Agent Single Query Frontend Example
Technical Implementation:
- Image agent calls external vision APIs
- Automatically identifies key information in images
- Generates structured analysis reports
Case 3: Information Retrieval and Summarization
Figure 5: IR Agent Single Query Frontend Example
Processing Capabilities:
- Retrieves relevant information from large knowledge bases
- Intelligent summarization and content integration
- Provides accurate citations and sources
System Limitations Analysis {#limitations}
Current Challenges
- Accuracy Dependency: System performance is affected by the quality of underlying LLMs and retrieval corpora
- Complexity Misjudgment: Occasionally misclassifies simple queries as complex queries
- Limited Collaboration Depth: The degree of collaboration between agents still has room for improvement
- Database Scale: Limited database size may lead to insufficient information depth
Technical Limitations
- LLM Reasoning Limitations: May encounter challenges when synthesizing complex information
- Evaluation Bias: LLM-as-a-Judge method has potential biases
- Missing Security Mechanisms: Current framework lacks security guarantees for information storage and usage
⚠️ Improvement Directions
The research team has identified these limitations and will focus on addressing them in future work.
Technical Deployment and Implementation {#deployment}
Deployment Architecture
- Local Deployment: Supports completely offline operation
- Cloud Deployment: AWS-based microservice architecture
- Technology Stack: Flask + Python + OpenAI GPT-4o mini
Data Sources
The system uses public datasets from the Federal Highway Administration (FHWA) for case studies, covering:
- Bridge infrastructure data
- Traffic flow statistics
- Engineering inspection reports
🤔 Frequently Asked Questions {#faq}
Q: What's the difference between AgentMaster and traditional multi-agent systems?
A: AgentMaster's core innovation is that it integrates two recent protocols, A2A and MCP, in one system. This gives it:
- More standardized inter-agent communication
- Stronger modularity and scalability
- Better state management and context retention capabilities
- More unified tool and resource access interfaces
Q: How does the system ensure accuracy in multi-agent collaboration?
A: The system adopts multi-layer validation mechanisms:
- Task Decomposition Validation: Decomposes complex queries into simple subproblems for validation
- Multi-dimensional Evaluation: Combines BERTScore, G-Eval, and human evaluation
- Consistency Checking: Compares consistency between subproblem answers and overall responses
- Error Recovery Mechanism: Automatically retries and repairs when failures are detected
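The article does not say how retries are implemented, so the following is only a generic sketch of the retry idea. The `call_with_retry` helper, its parameters, and the flaky agent in the usage example are all hypothetical.

```python
import time


def call_with_retry(agent, task, max_attempts=3, delay_seconds=0.0):
    """Call agent(task), retrying on any exception up to max_attempts times."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return agent(task)
        except Exception as error:  # broad on purpose: this is a generic wrapper
            last_error = error
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    raise RuntimeError(f"agent failed after {max_attempts} attempts") from last_error


# Hypothetical agent that fails twice before succeeding.
calls = {"count": 0}


def flaky_agent(task):
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return f"done: {task}"


print(call_with_retry(flaky_agent, "query_bridge_data"))  # succeeds on the third attempt
```

Chaining the final error with `from last_error` preserves the original failure for debugging when all attempts are exhausted.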
Q: How can ordinary users use this system?
A: The system is designed with user-friendly interaction methods:
- Natural Language Interaction: No need to learn special commands or syntax
- Multimodal Input: Supports various input methods including text, images, and voice
- Intelligent Understanding: Automatically understands user intent and routes to appropriate processing modules
- Clear Output: Presents results in easily understandable formats
Q: How scalable is the system?
A: AgentMaster has excellent scalability:
- Modular Design: New agents can be seamlessly integrated without affecting existing functionality
- Standardized Interface: Unified communication protocol based on JSON-RPC
- Flexible Deployment: Supports various deployment methods both locally and in the cloud
- Open Source Architecture: Convenient for researchers and developers to customize and extend
Q: How does the system perform in real-world applications?
A: According to the reported experimental results, the system performs well:
- High Accuracy: BERTScore F1 reaches 96.3%
- Strong Consistency: Complex query decomposition and validation show high consistency
- Wide Applicability: Successfully handles SQL queries, information retrieval, image analysis, and other tasks
- Stable Performance: Performs stably in both local and cloud deployments
Summary and Outlook {#summary}
AgentMaster represents an important milestone in the development of multi-agent systems. By integrating the A2A and MCP protocols in a unified framework, it opens new possibilities for scalable, domain-adaptive conversational AI.
Core Contributions
- Technical Innovation: First multi-agent framework to simultaneously integrate A2A and MCP protocols
- Architecture Optimization: Unified architecture supporting query decomposition, dynamic routing, and agent orchestration
- Practical Value: Complex multimodal task processing through natural language interaction
- Performance Validation: System effectiveness proven through rigorous multi-dimensional evaluation
Future Development Directions
- Security Mechanism Enhancement: Establish comprehensive information security and privacy protection systems
- Collaboration Depth Improvement: Enhance deep collaboration capabilities between agents
- Domain Expansion: Support integration of more specialized domain agents
- Performance Optimization: Continuously improve system accuracy and response speed
🚀 Technical Prospects
AgentMaster provides a powerful technical foundation for building next-generation intelligent assistants and automation systems, with potential to play important roles in research, business, and social services.
Original Paper Link: https://arxiv.org/html/2507.21105v1
Author Information:
- Callie C. Liao (Stanford University)
- Duoduo Liao (George Mason University)
- Sai Surya Gadiraju (George Mason University)
Data Source: Federal Highway Administration (FHWA) public datasets
This article is based on the original paper and aims to give readers a comprehensive technical analysis of the AgentMaster framework.