ODPS (MaxCompute) Data Query

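Setup (first-time only)

1. Copy the credential template and fill in your values:

```bash
cd mcp-odps/
cp config.example.env .env
# Edit .env and fill in your Alibaba Cloud credentials
```

2. Activate your Python environment and install the dependency:

```bash
# conda users:
conda activate <your-env-name>
# venv users:
source .venv/bin/activate

pip install pyodps
```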

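Executing Commands

Activate your Python environment first, then run all commands from the project root. The commands in this guide refer to the helper script through this variable:

```bash
SCRIPT=mcp-odps/scripts/odps_helper.py
```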

List tables

```bash
python $SCRIPT --list-tables
```

Filter by name:

```bash
python $SCRIPT --list-tables --pattern <keyword>
```

Get table schema

```bash
python $SCRIPT --describe <table-name>
```

Execute SQL query

```bash
python $SCRIPT --query "<SQL>" [--limit <rows>]
```

The default limit is 100 rows.
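SQL text usually contains spaces and single quotes, so quoting it by hand for the shell is error-prone. The sketch below assembles the invocation from Python instead. It assumes `--query` takes the SQL text as its next argument; `build_query_command` is a hypothetical helper name and not part of `odps_helper.py`.

```python
import shlex

SCRIPT = "mcp-odps/scripts/odps_helper.py"  # script path used in this guide


def build_query_command(sql, limit=None):
    """Return the argv list for one --query call.

    Assumption: --query takes the SQL text as its next argument.
    """
    argv = ["python", SCRIPT, "--query", sql]
    if limit is not None:
        # Leaving --limit out keeps the helper's default of 100 rows.
        argv += ["--limit", str(limit)]
    return argv


argv = build_query_command(
    "SELECT * FROM table_name WHERE dt = '2024-01-01'", limit=10
)
# shlex.join quotes every argument so the line can be pasted into a POSIX shell.
print(shlex.join(argv))
```

Passing the list directly to `subprocess.run(argv)` avoids the shell altogether, which sidesteps the quoting question.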

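Workflow for Data Tasks

Follow this pattern when the user asks about ODPS data:

1. Discover: if the table name is unknown, run `--list-tables --pattern <keyword>` to find it.
2. Inspect: run `--describe <table>` to understand the columns, types, and partition structure.
3. Query: construct the SQL and run `--query`. For partitioned tables, always add a partition filter (`WHERE dt = '...'`) to avoid a full table scan.
4. Present: summarize the results clearly for the user.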
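The first three steps map onto three helper invocations. The sketch below only builds the command lines in order, using the flags documented above. The keyword, table name, and SQL are hypothetical values, and `plan_workflow` is an illustrative name. In practice the table name comes from the output of the discover step, so the commands are run one at a time.

```python
SCRIPT = "mcp-odps/scripts/odps_helper.py"  # script path used in this guide


def plan_workflow(keyword, table, sql):
    """Return the helper invocations for discover, inspect, and query, in order."""
    return [
        ["python", SCRIPT, "--list-tables", "--pattern", keyword],  # discover
        ["python", SCRIPT, "--describe", table],                    # inspect
        ["python", SCRIPT, "--query", sql],                         # query
    ]


steps = plan_workflow(
    keyword="orders",       # hypothetical search keyword
    table="orders_daily",   # hypothetical table name
    sql="SELECT * FROM orders_daily WHERE dt = '2024-01-01'",
)
for argv in steps:
    print(argv)
```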

ODPS SQL Key Differences from Standard SQL

| Feature | Standard SQL | ODPS SQL |
|---|---|---|
| String concat | `a \|\| b` | `CONCAT(a, b)` |
| Current time | | |
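Following the string-concatenation row above, a query ported from another dialect can have its `||` chains rewritten as `CONCAT(...)` calls. This is a minimal sketch with hypothetical function names. It handles only flat chains such as `a || b || c`, and does not understand parentheses, comments, or double-quoted identifiers.

```python
def split_concat_operands(expr):
    """Split expr on || operators that sit outside single-quoted strings."""
    operands, current, in_string = [], [], False
    i = 0
    while i < len(expr):
        ch = expr[i]
        if ch == "'":
            # An escaped quote ('') toggles twice, so the state ends up unchanged.
            in_string = not in_string
            current.append(ch)
            i += 1
        elif not in_string and expr.startswith("||", i):
            operands.append("".join(current).strip())
            current = []
            i += 2
        else:
            current.append(ch)
            i += 1
    operands.append("".join(current).strip())
    return operands


def to_odps_concat(expr):
    """Rewrite a flat `a || b || c` chain as CONCAT(a, b, c)."""
    operands = split_concat_operands(expr)
    if len(operands) == 1:
        return expr  # no || operator found, nothing to rewrite
    return "CONCAT(" + ", ".join(operands) + ")"


print(to_odps_concat("first_name || ' ' || last_name"))
```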

A partition filter is required for partitioned tables (the partition column is usually `dt`):

```sql
SELECT * FROM table_name WHERE dt = '2024-01-01' LIMIT 100
```
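A quick pre-flight check can catch a missing partition filter before the query is submitted. The sketch below is a regex heuristic rather than a SQL parser: it only looks for a comparison on the partition column somewhere after the first `WHERE`. The function name and the default column `dt` are assumptions for illustration.

```python
import re


def has_partition_filter(sql, partition_col="dt"):
    """Heuristic: True if the text after WHERE compares partition_col to something."""
    match = re.search(r"\bwhere\b(.*)", sql, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return False  # no WHERE clause at all
    where_clause = match.group(1)
    pattern = (
        r"\b" + re.escape(partition_col) + r"\b\s*"
        r"(=|>=|<=|>|<|\bin\b|\bbetween\b)"
    )
    return re.search(pattern, where_clause, flags=re.IGNORECASE) is not None


print(has_partition_filter("SELECT * FROM table_name WHERE dt = '2024-01-01' LIMIT 100"))
print(has_partition_filter("SELECT * FROM table_name LIMIT 100"))
```

Because it is a heuristic, it can be fooled by subqueries or by a column name that appears inside a string literal; treat a `False` result as a prompt to review the query rather than as proof.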

See mcp-odps/references/odps_sql_guide.md for a full SQL reference.

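Error Handling

- pyodps not found → run the install command from the Setup steps above
- Missing credentials → check that mcp-odps/.env exists and that all four fields are filled in
- Table not found → use `--list-tables --pattern` to find the correct name
- SQL syntax error → check the ODPS SQL differences table above; avoid MySQL/PostgreSQL-specific syntax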
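For the missing-credentials case, the check can be scripted. The sketch below scans `.env`-style text for entries whose value is empty. It makes no assumption about the field names; the ones in the sample are hypothetical, and the real names are those in `config.example.env`.

```python
def find_empty_fields(env_text):
    """Return the names of KEY=VALUE entries whose value is empty."""
    empty = []
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and lines that are not assignments
        key, _, value = line.partition("=")
        # Treat a value that is only whitespace or empty quotes as missing.
        if not value.strip().strip("'\""):
            empty.append(key.strip())
    return empty


# Hypothetical field names, for illustration only.
sample = "ACCESS_ID=abc\nACCESS_KEY=\n# comment\nPROJECT=demo\nENDPOINT=''\n"
print(find_empty_fields(sample))
```

To check the real file, pass its contents, for example `find_empty_fields(open("mcp-odps/.env").read())`.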
