MCP Server Development Guide
Overview
Create MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. The quality of an MCP server is measured by how well it enables LLMs to accomplish real-world tasks.
Process
🚀 High-Level Workflow
Creating a high-quality MCP server involves four main phases:
Phase 1: Deep Research and Planning
1.1 Understand Modern MCP Design
API Coverage vs. Workflow Tools:
Balance comprehensive API endpoint coverage with specialized workflow tools. Workflow tools can be more convenient for specific tasks, while comprehensive coverage gives agents flexibility to compose operations. Performance varies by client—some clients benefit from code execution that combines basic tools, while others work better with higher-level workflows. When uncertain, prioritize comprehensive API coverage.
Tool Naming and Discoverability:
Clear, descriptive tool names help agents find the right tools quickly. Use consistent prefixes (e.g., github_create_issue, github_list_repos) and action-oriented naming.
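As a rough illustration of the naming advice, the sketch below checks a list of tool names against one possible convention: lowercase snake_case with a shared service prefix. The pattern and the check_tool_names helper are assumptions made for this example, not part of any MCP SDK.

```python
import re

# Hypothetical convention: <service>_<verb>_<noun> in lowercase snake_case.
TOOL_NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)+$")


def check_tool_names(service_prefix: str, names: list[str]) -> list[str]:
    """Return human-readable problems; an empty list means every name passes."""
    problems = []
    for name in names:
        if not TOOL_NAME_PATTERN.match(name):
            problems.append(f"{name}: use lowercase snake_case")
        elif not name.startswith(service_prefix + "_"):
            problems.append(f"{name}: missing '{service_prefix}_' prefix")
    return problems


print(check_tool_names("github", ["github_create_issue", "github_list_repos", "listRepos"]))
```

Running a check like this in CI is one way to keep prefixes consistent as the tool list grows.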
Context Management:
Agents benefit from concise tool descriptions and the ability to filter/paginate results. Design tools that return focused, relevant data. Some clients support code execution which can help agents filter and process data efficiently.
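One way to keep responses focused is to return a single page of results plus enough metadata for the agent to request the next page. The sketch below is a minimal, SDK-independent illustration; the field names (items, has_more, next_offset) and the default limit are assumptions, not something the protocol prescribes.

```python
from typing import Any


def paginate(items: list[Any], limit: int = 20, offset: int = 0) -> dict[str, Any]:
    """Slice one page out of items and describe how to fetch the next one."""
    if limit < 1 or offset < 0:
        raise ValueError("limit must be >= 1 and offset must be >= 0")
    page = items[offset:offset + limit]
    next_offset = offset + len(page)
    has_more = next_offset < len(items)
    return {
        "items": page,
        "total": len(items),
        "count": len(page),
        "has_more": has_more,
        # None signals that there is nothing further to request.
        "next_offset": next_offset if has_more else None,
    }


first_page = paginate(list(range(45)), limit=20)
print(first_page["count"], first_page["has_more"], first_page["next_offset"])
```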
Actionable Error Messages:
Error messages should guide agents toward solutions with specific suggestions and next steps.
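A minimal sketch of what "actionable" could look like in practice: each failure is mapped to a message that names a concrete next step. The status codes covered, the wording, and the actionable_error helper are illustrative assumptions; a real server would tailor them to the service it wraps.

```python
def actionable_error(status: int, resource: str) -> str:
    """Translate an HTTP status into a message that suggests what to try next."""
    hints = {
        401: "Authentication failed. Check that the API token is set and has not expired.",
        403: f"Access to {resource} was denied. Confirm the token has the required scope.",
        404: f"{resource} was not found. List the available items first to confirm the identifier.",
        429: "Rate limit reached. Wait before retrying, or request fewer items per call.",
    }
    # Fall back to a generic message that still tells the agent what to do.
    default = f"Request failed with HTTP {status}. Retry once, then report the status code to the user."
    return hints.get(status, default)


print(actionable_error(404, "Issue 42"))
```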
1.2 Study MCP Protocol Documentation
Navigate the MCP specification:
Start with the sitemap to find relevant pages: https://modelcontextprotocol.io/sitemap.xml
Then fetch specific pages with .md suffix for markdown format (e.g., https://modelcontextprotocol.io/specification/draft.md).
Key pages to review:
- Specification overview and architecture
- Transport mechanisms (streamable HTTP, stdio)
- Tool, resource, and prompt definitions
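The sitemap-then-.md lookup described above can be sketched as follows. This assumes the sitemap follows the standard sitemaps.org XML format; the embedded sample and its second URL are made up so that the example runs without network access.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap format (assumed to apply here).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def markdown_urls(sitemap_xml: str, keyword: str) -> list[str]:
    """Return .md URLs for every sitemap entry whose URL contains keyword."""
    root = ET.fromstring(sitemap_xml)
    urls = [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text]
    return [url.rstrip("/") + ".md" for url in urls if keyword in url]


# Hypothetical sitemap excerpt used only for illustration.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://modelcontextprotocol.io/specification/draft</loc></url>
  <url><loc>https://modelcontextprotocol.io/docs/example-page</loc></url>
</urlset>"""

print(markdown_urls(SAMPLE_SITEMAP, "specification"))
```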
1.3 Study Framework Documentation
Recommended stack:
- Language: TypeScript (high-quality SDK support and good compatibility in many execution environments, e.g. MCPB; AI models are also good at generating TypeScript code, which benefits from its broad usage, static typing, and good linting tools)
- Transport: Streamable HTTP for remote servers, using stateless JSON (simpler to scale and maintain, as opposed to stateful sessions and streaming responses). stdio for local servers.
Load framework documentation:
For TypeScript (recommended):
- TypeScript SDK: Use WebFetch to load https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md
- ⚡ TypeScript Guide - TypeScript patterns and examples
For Python:
- Python SDK: Use WebFetch to load https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md
- 🐍 Python Guide - Python patterns and examples
1.4 Plan Your Implementation
Understand the API:
Review the service's API documentation to identify key endpoints, authentication requirements, and data models. Use web search and WebFetch as needed.
Tool Selection:
Prioritize comprehensive API coverage. List endpoints to implement, starting with the most common operations.
Phase 2: Implementation
2.1 Set Up Project Structure
See the language-specific guides (⚡ TypeScript Guide, 🐍 Python Guide) for project setup.
2.2 Implement Core Infrastructure
Create shared utilities:
- API client with authentication
- Error handling helpers
- Response formatting (JSON/Markdown)
- Pagination support
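As a small illustration of the first utility in this list, the sketch below builds an authenticated request in one shared place so that individual tools never assemble headers themselves. Bearer-token authentication, the example base URL, and the build_request helper are assumptions; the service being wrapped determines the real scheme.

```python
import urllib.parse
import urllib.request
from typing import Optional


def build_request(base_url: str, path: str, token: str,
                  params: Optional[dict] = None) -> urllib.request.Request:
    """Create a GET request with auth headers; sending it is left to the caller."""
    query = "?" + urllib.parse.urlencode(params) if params else ""
    url = base_url.rstrip("/") + "/" + path.lstrip("/") + query
    headers = {
        "Authorization": f"Bearer {token}",  # assumed auth scheme
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)


# api.example.com is a placeholder host, not a real service.
request = build_request("https://api.example.com/", "/issues", "t0ken", {"page": 2})
print(request.full_url)
```

Keeping request construction separate from sending also makes the helper easy to test without network access.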
2.3 Implement Tools
For each tool:
Input Schema:
- Use Zod (TypeScript) or Pydantic (Python)
- Include constraints and clear descriptions
- Add examples in field descriptions
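To make the three points above concrete without depending on Zod or Pydantic, the sketch below writes an input schema directly as JSON Schema, with constraints, descriptions, and examples embedded in the descriptions. The tool, its fields, and the hand-rolled validate_list_issues_args checker are hypothetical; in a real server the schema library would perform this validation.

```python
import re
from typing import Any

# Hypothetical input schema for a "list issues" tool.
LIST_ISSUES_INPUT_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {
        "repo": {
            "type": "string",
            "pattern": "^[^/]+/[^/]+$",
            "description": "Repository in owner/name form, e.g. 'octocat/hello-world'.",
        },
        "limit": {
            "type": "integer",
            "minimum": 1,
            "maximum": 100,
            "default": 20,
            "description": "Maximum number of issues to return, e.g. 20.",
        },
    },
    "required": ["repo"],
    "additionalProperties": False,
}


def validate_list_issues_args(args: dict[str, Any]) -> list[str]:
    """Check only the constraints used above; this is not a full JSON Schema validator."""
    props = LIST_ISSUES_INPUT_SCHEMA["properties"]
    errors = [f"unknown field '{key}'" for key in args if key not in props]
    repo = args.get("repo")
    if not isinstance(repo, str) or not re.match(props["repo"]["pattern"], repo):
        errors.append("repo must look like 'owner/name'")
    limit = args.get("limit", props["limit"]["default"])
    in_range = (isinstance(limit, int) and not isinstance(limit, bool)
                and props["limit"]["minimum"] <= limit <= props["limit"]["maximum"])
    if not in_range:
        errors.append("limit must be an integer from 1 to 100")
    return errors


print(validate_list_issues_args({"repo": "bad", "limit": 500}))
```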
Output Schema:
- Define outputSchema where possible for structured data
- Use structuredContent in tool responses (TypeScript SDK feature)
- Helps clients understand and process tool outputs
Tool Description:
- Concise summary of functionality
- Parameter descriptions
- Return type schema
Implementation:
- Async/await for I/O operations
- Proper error handling with actionable messages
- Support pagination where applicable
- Return both text content and structured data when using modern SDKs
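The last point can be sketched as a pair of helpers that build plain dictionaries shaped like the protocol's tool result, carrying a text rendering alongside the structured data. The helper names are assumptions, and how a given SDK exposes these fields is not shown here.

```python
import json
from typing import Any


def tool_result(data: dict[str, Any], summary: str) -> dict[str, Any]:
    """Return the same data twice: as readable text and as structured content."""
    text = summary + "\n" + json.dumps(data, indent=2)
    return {
        "content": [{"type": "text", "text": text}],
        "structuredContent": data,
    }


def tool_error(message: str) -> dict[str, Any]:
    """Report a failure as a tool result so the agent can read the message and react."""
    return {"content": [{"type": "text", "text": message}], "isError": True}


result = tool_result({"count": 2, "issues": [101, 102]}, "Found 2 issues")
print(result["content"][0]["text"])
```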
Annotations:
- readOnlyHint: true/false
- destructiveHint: true/false
- idempotentHint: true/false
- openWorldHint: true/false
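A small sketch of how the annotation hints might be assembled per tool. The make_annotations helper, its defaults, and the rule that a read-only tool cannot also be destructive are assumptions chosen for illustration; check the specification for the defaults clients actually apply.

```python
def make_annotations(read_only: bool, destructive: bool = False,
                     idempotent: bool = False, open_world: bool = True) -> dict[str, bool]:
    """Build the annotation hints for one tool, rejecting a contradictory combination."""
    if read_only and destructive:
        raise ValueError("a read-only tool cannot also be destructive")
    return {
        "readOnlyHint": read_only,
        "destructiveHint": destructive,
        "idempotentHint": idempotent,
        "openWorldHint": open_world,
    }


# A listing tool only reads; a delete tool is destructive but safe to repeat.
print(make_annotations(read_only=True, idempotent=True))
print(make_annotations(read_only=False, destructive=True, idempotent=True))
```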
Phase 3: Review and Test
3.1 Code Quality
Review for:
- No duplicated code (DRY principle)
- Consistent error handling
- Full type coverage
- Clear tool descriptions
3.2 Build and Test
TypeScript:
- Run npm run build to verify compilation
- Test with MCP Inspector: npx @modelcontextprotocol/inspector
Python:
- Verify syntax: python -m py_compile your_server.py
- Test with MCP Inspector
See language-specific guides for detailed testing approaches and quality checklists.
Phase 4: Create Evaluations
After implementing your MCP server, create comprehensive evaluations to test its effectiveness.
Load ✅ Evaluation Guide for complete evaluation guidelines.
4.1 Understand Evaluation Purpose
Use evaluations to test whether LLMs can effectively use your MCP server to answer realistic, complex questions.
4.2 Create 10 Evaluation Questions
To create effective evaluations, follow the process outlined in the evaluation guide:
1. Tool Inspection: List available tools and understand their capabilities
2. Content Exploration: Use READ-ONLY operations to explore available data
3. Question Generation: Create 10 complex, realistic questions
4. Answer Verification: Solve each question yourself to verify answers
4.3 Evaluation Requirements
Ensure each question is:
- Independent: Not dependent on other questions
- Read-only: Only non-destructive operations required
- Complex: Requiring multiple tool calls and deep exploration
- Realistic: Based on real use cases humans would care about
- Verifiable: Single, clear answer that can be verified by string comparison
- Stable: Answer won't change over time
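The "verifiable by string comparison" requirement can be illustrated with a tiny checker. The normalization used here (trimming, collapsing whitespace, ignoring case) is an assumption; the evaluation guide may call for a stricter, exact comparison.

```python
def normalize(text: str) -> str:
    """Trim, collapse internal whitespace, and ignore case before comparing."""
    return " ".join(text.split()).casefold()


def answers_match(expected: str, actual: str) -> bool:
    """True when the model's answer equals the reference answer after normalization."""
    return normalize(expected) == normalize(actual)


print(answers_match("3", " 3\n"))
print(answers_match("3", "three"))
```

The second comparison fails, which is one reason answers are easier to verify when each question pins down a single canonical form, such as a number or an identifier.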
4.4 Output Format
Create an XML file with this structure:
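```xml
<evaluation>
  <qa_pair>
    <question>Find discussions about AI model launches with animal codenames. One model needed a specific safety designation that uses the format ASL-X. What number X was being determined for the model named after a spotted wild cat?</question>
    <answer>3</answer>
  </qa_pair>
  <!-- More qa_pairs... -->
</evaluation>
```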
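For a quick sanity check of an evaluation file before running it, a loader along these lines may help. It assumes each question/answer pair is wrapped in elements named qa_pair, question, and answer; those element names and the sample question are assumptions, so follow the XML format specification in the evaluation guide if it differs.

```python
import xml.etree.ElementTree as ET


def load_qa_pairs(xml_text: str) -> list[tuple[str, str]]:
    """Extract (question, answer) tuples, skipping pairs with a missing part."""
    root = ET.fromstring(xml_text)
    pairs = []
    for pair in root.iter("qa_pair"):  # assumed element name
        question = (pair.findtext("question") or "").strip()
        answer = (pair.findtext("answer") or "").strip()
        if question and answer:
            pairs.append((question, answer))
    return pairs


# Hypothetical evaluation file content used only for illustration.
SAMPLE_EVALUATION = """<evaluation>
  <qa_pair>
    <question>How many open issues mention pagination?</question>
    <answer>3</answer>
  </qa_pair>
</evaluation>"""

print(load_qa_pairs(SAMPLE_EVALUATION))
```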
Reference Files
📚 Documentation Library
Load these resources as needed during development:
Core MCP Documentation (Load First)
- MCP Protocol: Start with the sitemap at https://modelcontextprotocol.io/sitemap.xml, then fetch specific pages with the .md suffix
- 📋 MCP Best Practices - Universal MCP guidelines including:
  - Server and tool naming conventions
  - Response format guidelines (JSON vs Markdown)
  - Pagination best practices
  - Transport selection (streamable HTTP vs stdio)
  - Security and error handling standards
SDK Documentation (Load During Phase 1/2)
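- Python SDK: Fetch from https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md
- TypeScript SDK: Fetch from https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md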
Language-Specific Implementation Guides (Load During Phase 2)
- 🐍 Python Guide, covering:
  - Server initialization patterns
  - Pydantic model examples
  - Tool registration with @mcp.tool
  - Complete working examples
  - Quality checklist
- ⚡ TypeScript Guide, covering:
  - Project structure
  - Zod schema patterns
  - Tool registration with server.registerTool
  - Complete working examples
  - Quality checklist
Evaluation Guide (Load During Phase 4)
- Question creation guidelines
- Answer verification strategies
- XML format specifications
- Example questions and answers
- Running an evaluation with the provided scripts