Project Introduction
This project demonstrates how to build an intelligent agent that supports MCP (Model Context Protocol) using the AG2 framework (formerly AutoGen), and communicates through the A2A (Agent2Agent) protocol for standardized communication.
The core features of this project include:
- MCP Tool Integration: Access various external tools and capabilities through the MCP protocol
- YouTube Subtitle Processing: Specialized intelligent agent for downloading and analyzing YouTube video subtitles
- A2A Protocol Support: Provides standardized inter-agent communication interface
- Real-time Streaming Processing: Supports real-time status updates during task execution
- Cross-framework Compatibility: Demonstrates interoperability between different agent frameworks
How to Run
1. Clone the Code
git clone https://github.com/sing1ee/a2a-mcp-ag2-sample.git
cd a2a-mcp-ag2-sample
2. Environment Setup
Use the uv package manager to create a virtual environment and install dependencies:
# Create virtual environment
uv venv
# Activate virtual environment and sync dependencies
uv sync
3. Set Environment Variables
Create a .env
file and add your OpenAI API key:
echo "OPENAI_API_KEY=your_api_key_here" > .env
4. Install MCP YouTube Tool
uv tool install git+https://github.com/sparfenyuk/mcp-youtube
5. Run the Agent
# Run with default configuration
uv run .
# Custom host and port
uv run . --host 0.0.0.0 --port 8080
6. Debugging and Testing
Refer to A2A Inspector for debugging. A2A Inspector is a powerful tool specifically designed for debugging A2A applications, which can help you:
- Monitor inter-agent communication
- Inspect A2A protocol messages
- Debug task execution flows
- Validate agent response formats
Example Usage
After starting the agent, you can send the following request to test the YouTube subtitle functionality:
Summarize this video: https://www.youtube.com/watch?v=kQmXtrmQ5Zg
Project Flow Sequence Diagram
sequenceDiagram
participant Client as A2A Client
participant Server as A2A Server
participant Agent as AG2 Agent
participant MCP as MCP Server
participant YouTube as YouTube MCP Tool
Client->>Server: Send task request
Server->>Agent: Forward query to AG2 agent
Note over Server,Agent: Real-time status updates (streaming)
Agent->>MCP: Request available tool list
MCP->>Agent: Return tool definitions
Agent->>Agent: LLM decides to use YouTube tool
Agent->>MCP: Send tool execution request
MCP->>YouTube: Call YouTube subtitle download tool
YouTube->>YouTube: Download video subtitles
YouTube->>MCP: Return subtitle data
MCP->>Agent: Return tool execution result
Agent->>Agent: LLM processes subtitle data and generates response
Agent->>Server: Return complete response
Server->>Client: Respond with task result
Technical Architecture
Core Components
- YoutubeMCPAgent: Core agent implementation based on AG2 AssistantAgent
- AG2AgentExecutor: A2A protocol adapter that handles task execution and event queues
- MCP Tool Integration: Connects to MCP server through stdio client
- A2A Server: Provides standardized agent communication interface
Key Features
- Response Models: Uses Pydantic models to ensure structured output
- Asynchronous Processing: Supports concurrent task processing and streaming responses
- Error Handling: Complete error capture and recovery mechanisms
- Tool Registration: Dynamic registration and management of MCP tools
Summary
With the rapid development and proliferation of AI agent products, more and more agent frameworks and solutions have emerged in the market, such as LangGraph, CrewAI, AG2, etc. Each framework has its unique advantages and applicable scenarios, but this also brings challenges in interoperability.
The Important Significance of A2A Protocol:
- Standardized Communication: The A2A protocol serves as a universal language for inter-agent communication, eliminating barriers between different frameworks
- Ecosystem Interconnection: Enables agents from different technology stacks to collaborate seamlessly, forming a more powerful AI ecosystem
- Reduced Integration Costs: Developers no longer need to develop separate adapters for each framework, greatly reducing the complexity of system integration
- Promoting Innovation: Through standardized protocols, developers can focus on improving agent capabilities rather than protocol adaptation
- Future Scalability: Lays a solid foundation for building complex multi-agent systems
This project demonstrates that the A2A protocol will become an important bridge connecting the AI agent ecosystem, driving the entire industry towards a more open and interconnected direction.