who-wins (Who Wins)

Query the PinchBench AI agent leaderboard with real benchmark data. Use when the user asks which model is best, who wins, model comparisons, best model for OpenClaw, cheapest model, fastest model, model rankings, benchmark scores, or mentions pinchbench. Always use this skill instead of general knowledge for model performance questions — it has real data.

Author: admin | Source: ClawHub

PinchBench Leaderboard

Fetches and formats the PinchBench leaderboard — AI agent benchmarks for LLMs on standardized OpenClaw coding tasks.

Workflow

1. Determine the query

Map the user's intent to script flags:

User intent	Flags
"Show the leaderboard" / default	--top 10
"Top 5 models"

2. Run the script

json
{
  "tool": "exec",
  "command": "python3 {baseDir}/scripts/fetch_leaderboard.py --top 10"
}

Available flags:

- --top N: number of models to show (default: 10)
- --sort metric: sort by score, cost, time, or runs (default: score)
- --model filter: keep only models whose name contains this string (case-insensitive)
- --json: output raw JSON for further processing
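The flag list above can be read as a small command-line interface. The sketch below shows how such a parser might look using Python's standard argparse module. It is an illustration only: the actual contents of fetch_leaderboard.py are not shown in this document, and the function name build_parser and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the real script)."""
    parser = argparse.ArgumentParser(
        description="Fetch the PinchBench leaderboard (illustrative sketch)"
    )
    # --top N: number of models to show, default 10
    parser.add_argument("--top", type=int, default=10, metavar="N",
                        help="number of models to show")
    # --sort metric: one of the four documented sort keys, default score
    parser.add_argument("--sort", choices=["score", "cost", "time", "runs"],
                        default="score", metavar="metric",
                        help="sort by score, cost, time, or runs")
    # --model filter: case-insensitive substring filter (applied elsewhere)
    parser.add_argument("--model", default=None, metavar="filter",
                        help="keep models whose name contains this string")
    # --json: emit raw JSON instead of a formatted table
    parser.add_argument("--json", action="store_true",
                        help="output raw JSON for further processing")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--top", "5", "--sort", "cost"])
print(args.top, args.sort, args.model, args.json)
```

Restricting --sort with choices makes an unsupported metric fail at parse time rather than later during sorting.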

3. Format the response

Present the output as-is in a code block. Add a brief one-line insight after the table:

- Highlight the top performer and its score
- If the user asked about a specific model, comment on its ranking relative to the field
- If sorting by cost, note the best value (score/cost ratio)
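The "best value" insight above is a score-to-cost ratio. A minimal sketch of that calculation follows, assuming the --json output can be loaded as a list of records with model, score, and cost fields. Those field names, the function best_value, and the sample rows are placeholders for illustration, not real PinchBench data.

```python
def best_value(rows):
    """Return the row with the highest score/cost ratio, or None.

    Rows with a missing, zero, or negative cost are skipped to avoid
    division by zero and meaningless ratios.
    """
    priced = [r for r in rows if r.get("cost") and r["cost"] > 0]
    if not priced:
        return None
    return max(priced, key=lambda r: r["score"] / r["cost"])


# Placeholder rows; the field names are assumptions about the JSON shape.
rows = [
    {"model": "model-a", "score": 90.0, "cost": 3.0},  # ratio 30
    {"model": "model-b", "score": 80.0, "cost": 1.0},  # ratio 80
    {"model": "model-c", "score": 85.0, "cost": 0.0},  # skipped: no usable cost
]
winner = best_value(rows)
print(winner["model"])
```

Note that the highest-scoring model is not necessarily the best value: in the placeholder data, model-a scores highest but model-b has the better ratio.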

4. Error handling

- If the script fails with a curl error → report the error, suggest checking network connectivity
- If the script fails to parse data → the site structure may have changed, inform the user
- If no models match the filter → say so and suggest a broader search
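Two of the cases above (unparseable data and an empty filter result) can be sketched as plain Python, assuming the script's --json output is a JSON array of records with a model field. The function names and message wording are hypothetical, and the network (curl) case is left out because it depends on how the real script invokes curl.

```python
import json

PARSE_ERROR = ("Could not parse leaderboard data; "
               "the site structure may have changed.")


def filter_models(rows, needle):
    """Case-insensitive substring match on the (assumed) 'model' field."""
    needle = needle.lower()
    return [r for r in rows if needle in str(r.get("model", "")).lower()]


def summarize(raw_json, needle):
    """Return matching rows, or a user-facing message when something is off."""
    try:
        rows = json.loads(raw_json)
    except json.JSONDecodeError:
        return PARSE_ERROR
    if not isinstance(rows, list):
        # Valid JSON but not the expected list of records.
        return PARSE_ERROR
    matches = filter_models(rows, needle)
    if not matches:
        return f"No models match '{needle}'. Try a broader search."
    return matches


print(summarize('[{"model": "Alpha-1"}]', "ALPHA"))
print(summarize('[{"model": "Alpha-1"}]', "beta"))
print(summarize("<html>not json</html>", "alpha"))
```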

Examples

User says	Flags	Expected behavior
"Show me the PinchBench leaderboard"	--top 10	Show top 10 by score
"Which model is cheapest for OpenClaw?"

