Avenir-Web
What this skill does
This skill operates Avenir-Web for reliable web-task execution and iteration.
Responsibilities:
- run single tasks and batch tasks
- choose a mode (headless / headed / demo)
- improve instruction quality before execution
- analyze run outputs and recommend the next best change
- execute one atomic action without strategy/checklist overhead
- read the current page by taking a screenshot and asking the main model a question
Use this skill for requests like:
- run a task on a website
- run a task list and summarize outcomes
- improve success rate with better instructions/config
Canonical entrypoints
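Single task:
```bash
python example.py --task "<instruction>" --website "<url>" --mode headless
```
Atomic action:
```bash
python scripts/atomic_action.py --action CLICK --website "<url>" --coords 500,500
```
Read page:
```bash
python scripts/read_page.py --website "<url>" --question "<question>"
```
Batch:
```bash
cd src
python run_agent.py -c config/batch_experiment.toml
```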
Prefer these scripts over ad-hoc commands.
Quick usage example
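Single task:
```bash
python example.py \
  --task "On openrouter.ai, list distillable models that support image input, sorted by price ascending." \
  --website "https://openrouter.ai/" \
  --mode demo
```
Batch:
```bash
cd src
python run_agent.py -c config/batch_experiment.toml
```
Atomic action:
```bash
python scripts/atomic_action.py \
  --action TYPE \
  --website "https://example.com/" \
  --coords 500,420 \
  --value "hello"
```
Read page:
```bash
python scripts/read_page.py \
  --website "https://openrouter.ai/" \
  --question "Which models or prices are visible on this page?"
```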
Run modes
| mode | behavior | best for |
|---|---|---|
| `headless` | no visible browser window | fast, reproducible runs and large batch jobs |
| `headed` | visible browser window | manual observation without demo overlay |
| `demo` | visible window + overlay/dashboard controls | live debugging and demonstrations |
Notes:
- if mode is missing, use `headless`
- `demo` improves observability, not model intelligence
Mode selection:
1. benchmark/batch -> `headless`
2. visual debugging -> `headed`
3. demo/control flow visibility -> `demo`
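The defaulting rule for modes can be sketched in a few lines. This is an illustrative helper, not part of Avenir-Web; `resolve_mode` is a hypothetical name, and rejecting unknown values is an assumption about sensible behavior.

```python
from typing import Optional

VALID_MODES = ("headless", "headed", "demo")


def resolve_mode(mode: Optional[str]) -> str:
    """Return a valid run mode, falling back to headless when none is given."""
    if mode is None or not mode.strip():
        return "headless"  # documented default when the mode is missing
    normalized = mode.strip().lower()
    if normalized not in VALID_MODES:
        # Assumption: an unknown mode should fail loudly rather than be guessed.
        raise ValueError(f"unknown mode {mode!r}; expected one of {VALID_MODES}")
    return normalized
```

Calling `resolve_mode(None)` yields `headless`, which keeps batch jobs reproducible when the caller forgets the flag.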
Instruction design
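`confirmed_task` should include:
1. objective
2. constraints
3. completion condition
Template:
- On <website>, <objective>. Apply constraints: <constraints>. Done when <observable completion state>.
Keep it single-goal, specific, and verifiable.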
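The three parts of an instruction (objective, constraints, completion condition) can also be assembled programmatically. The following is a minimal sketch; `build_instruction` and its parameter names are hypothetical and do not exist in the repository.

```python
def build_instruction(website: str, objective: str,
                      constraints: list[str], done_when: str) -> str:
    """Assemble a single-goal instruction from its three parts (hypothetical helper)."""
    if not objective.strip() or not done_when.strip():
        raise ValueError("objective and completion condition are both required")
    parts = [f"On {website}, {objective.strip().rstrip('.')}."]
    if constraints:
        # Constraints are optional; join several with semicolons.
        parts.append("Apply constraints: " + "; ".join(constraints) + ".")
    parts.append(f"Done when {done_when.strip().rstrip('.')}.")
    return " ".join(parts)


print(build_instruction(
    "openrouter.ai",
    "list models that support image input",
    ["sort by price ascending"],
    "a sorted model list is visible",
))
```

Requiring an explicit completion condition is the point of the sketch: an instruction without one cannot be verified after the run.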
Single-task workflow
Input:
- `task`
- `website`
- optional `mode`, `task-id`, `output-dir`
Steps:
1. check environment and API key
2. validate instruction quality
3. run `example.py`
4. inspect outputs
5. report status + cause + next action
Recommended report fields:
- `task_id`
- status: success / partial / failed
- evidence summary
- one-line cause
- one recommended next step
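The recommended report fields map naturally onto a small record type. The sketch below assumes the status values `success`, `partial`, and `failed`; the class name `RunReport` and the attribute names other than `task_id` are illustrative choices, not an existing Avenir-Web API.

```python
import json
from dataclasses import asdict, dataclass

ALLOWED_STATUSES = {"success", "partial", "failed"}


@dataclass
class RunReport:
    task_id: str
    status: str      # success / partial / failed
    evidence: str    # short evidence summary
    cause: str       # one-line cause
    next_step: str   # one recommended next step

    def __post_init__(self) -> None:
        # Reject anything outside the three documented status values.
        if self.status not in ALLOWED_STATUSES:
            raise ValueError(f"status must be one of {sorted(ALLOWED_STATUSES)}")

    def to_json(self) -> str:
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)
```

Serializing to JSON keeps single-task reports in the same parseable shape that batch summaries can aggregate later.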
Atomic action workflow
Use scripts/atomic_action.py when you need exactly one browser operation and do not want strategist/checklist generation.
Typical uses:
- one click
- one type
- one goto
- one scroll
Properties:
- disables strategy generation
- disables checklist generation
- executes exactly one action
- returns structured JSON with result, URL, screenshot path, and output directory
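When an atomic action is triggered from another Python program, building the argument list explicitly avoids shell-quoting mistakes. This sketch uses only the flags that appear in this document's usage examples (`--action`, `--website`, `--coords`, `--value`); the helper name `build_atomic_action_cmd` is hypothetical, and which actions require which flags is an assumption.

```python
from typing import Optional, Tuple


def build_atomic_action_cmd(action: str, website: str,
                            coords: Optional[Tuple[int, int]] = None,
                            value: Optional[str] = None) -> list[str]:
    """Build the argv list for scripts/atomic_action.py (illustrative helper)."""
    cmd = ["python", "scripts/atomic_action.py",
           "--action", action.upper(), "--website", website]
    if coords is not None:
        x, y = coords
        cmd += ["--coords", f"{x},{y}"]  # same "x,y" form as the CLI examples
    if value is not None:
        cmd += ["--value", value]
    return cmd


print(build_atomic_action_cmd("click", "https://example.com/", coords=(500, 500)))
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`. Whether the script's standard output is pure JSON that `json.loads` can parse directly is not confirmed here and should be checked against the script itself.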
Read-page workflow
Use scripts/read_page.py when you want to inspect the current page by screenshot and ask the main model a direct question.
Properties:
- opens the page
- captures a screenshot
- sends the screenshot plus page metadata to the main model
- returns structured JSON with the answer and screenshot path
Batch workflow
Task file schema
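```json
[
  {
    "task_id": "example_task_001",
    "confirmed_task": "Find distillable models that support image input, sorted by price ascending.",
    "website": "https://openrouter.ai/"
  }
]
```
Required per task:
- `task_id`
- `confirmed_task`
- `website`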
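A task file can be checked before a batch run starts. The sketch below assumes the required field names are `task_id`, `confirmed_task`, and `website`, and that each must be a non-empty string; `validate_tasks` is a hypothetical helper, not a function shipped with Avenir-Web.

```python
import json

REQUIRED_FIELDS = ("task_id", "confirmed_task", "website")


def validate_tasks(raw: str) -> list[str]:
    """Return a list of problems found in a task file; empty means valid."""
    try:
        tasks = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(tasks, list):
        return ["top-level value must be a JSON array of task objects"]
    problems = []
    for i, task in enumerate(tasks):
        if not isinstance(task, dict):
            problems.append(f"task[{i}]: not an object")
            continue
        for field in REQUIRED_FIELDS:
            value = task.get(field)
            # Assumption: every required field is a non-empty string.
            if not isinstance(value, str) or not value.strip():
                problems.append(f"task[{i}]: missing or empty '{field}'")
    return problems
```

Returning all problems at once, rather than stopping at the first, lets a large task list be fixed in one pass.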
Config checklist (src/config/batch_experiment.toml)
- `[basic].save_file_dir`
- `[experiment].task_file_path`
- `[experiment].max_op`
- `[playwright].mode`
- `[model].name`
- API key source
Batch execution
1. validate JSON schema and config paths
2. choose mode and `max_op`
3. run batch command
4. summarize per-task outcomes
5. provide one global improvement recommendation
Recommended batch report fields:
- total/completed/failed counts
- per-task status list
- recurring issue patterns
- one highest-impact next change
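The batch report fields can be derived from a list of per-task results. This is a sketch under stated assumptions: each result is a dict with `task_id`, `status`, and an optional `cause`, "completed" is taken to mean status `success`, and a cause counts as recurring when it appears more than once. The name `summarize_batch` is hypothetical.

```python
from collections import Counter


def summarize_batch(results: list[dict]) -> dict:
    """Aggregate per-task results into batch report fields (illustrative)."""
    statuses = Counter(r["status"] for r in results)
    # Count causes only for tasks that did not succeed.
    causes = Counter(
        r["cause"] for r in results
        if r["status"] != "success" and r.get("cause")
    )
    return {
        "total": len(results),
        "completed": statuses["success"],
        "partial": statuses["partial"],
        "failed": statuses["failed"],
        "per_task": {r["task_id"]: r["status"] for r in results},
        "recurring_issues": [c for c, n in causes.most_common() if n > 1],
    }
```

The most frequent recurring issue is a reasonable starting point for the single highest-impact recommendation, though choosing it remains a judgment call.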
API requirements
Required credential:
- `OPENROUTER_API_KEY` (preferred)
Resolution order:
1. environment variable `OPENROUTER_API_KEY`
2. `[api_keys].openrouter_api_key` in TOML (fallback)
Rules:
- never hardcode real keys in source files
- never print full keys in logs/outputs/reports
- fail fast if key is missing with an actionable message
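The resolution order and the logging rule can be sketched together. The code assumes the TOML config has already been parsed into a dict (for example with `tomllib` on Python 3.11+) and that the fallback lives under `[api_keys].openrouter_api_key`; `resolve_api_key` and `mask_key` are hypothetical helpers, not Avenir-Web functions.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(config: Mapping, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key: environment variable first, TOML value as fallback."""
    env = os.environ if env is None else env
    key = env.get("OPENROUTER_API_KEY") or config.get("api_keys", {}).get("openrouter_api_key")
    if not key:
        # Fail fast with a message that says how to fix the problem.
        raise RuntimeError(
            "OPENROUTER_API_KEY is not set. Export it in the environment "
            "or set [api_keys].openrouter_api_key in the TOML config."
        )
    return key


def mask_key(key: str, visible: int = 4) -> str:
    """Hide all but the last few characters so logs never show the full key."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Passing `env` explicitly makes the function easy to test without touching the real environment.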
Script usage rules
- script-first: use repository entrypoints before custom commands
- non-interactive CLI only
- explicit flags and paths
- deterministic behavior preferred
- clear, actionable error messages
If adding helper scripts:
1. place under `scripts/`
2. use CLI flags (no prompts)
3. return stable, parseable summaries
4. document usage in this file
Environment checklist
Before running:
1. Python environment available
2. dependencies installed (`pip install -e src`)
3. Playwright Chromium installed (`python -m playwright install chromium`)
4. API key configured
5. config/task paths valid
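Part of this checklist can be automated as a preflight step. The sketch below is illustrative only: `preflight` is a hypothetical helper, it checks only the environment variable (not the TOML fallback), and it does not verify that the Chromium browser binary itself is installed.

```python
import importlib.util
import os
from pathlib import Path


def preflight(config_path: str, task_path: str) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    # The Playwright Python package must be importable.
    if importlib.util.find_spec("playwright") is None:
        problems.append("playwright package not importable; install dependencies first")
    # Only the environment variable is checked here, not the TOML fallback.
    if not os.environ.get("OPENROUTER_API_KEY"):
        problems.append("OPENROUTER_API_KEY is not set in the environment")
    for label, path in (("config", config_path), ("task file", task_path)):
        if not Path(path).is_file():
            problems.append(f"{label} not found: {path}")
    return problems
```

Running it before a batch and aborting on a non-empty result matches the fail-fast rule for missing keys.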
Output contract
Each run summary should include:
1. execution metadata: run type, mode, task IDs
2. outcome: status and evidence summary
3. diagnosis: root-cause hypothesis
4. next action: one highest-impact recommendation
Boundaries
- do not claim completion without evidence
- do not skip issue summary
- avoid large refactors before instruction/config fixes
- avoid interactive prompts in core workflow
One-line identity
Avenir-Web execution and reliability skill: mode selection + instruction design + run analysis + iteration planning.
Avenir-Web
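What this skill does
This skill operates Avenir-Web for reliable web-task execution and iteration.
Responsibilities:
- run single tasks and batch tasks
- choose a mode (headless / headed / demo)
- improve instruction quality before execution
- analyze run outputs and recommend the next best change
- execute one atomic action without strategy/checklist overhead
- read the current page by taking a screenshot and asking the main model a question
Use this skill for requests like:
- run a task on a website
- run a task list and summarize outcomes
- improve success rate with better instructions/config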
Canonical entrypoints
Single task:
```bash
python example.py --task "<instruction>" --website "<url>" --mode headless
```
Atomic action:
```bash
python scripts/atomic_action.py --action CLICK --website "<url>" --coords 500,500
```
Read page:
```bash
python scripts/read_page.py --website "<url>" --question "<question>"
```
Batch:
```bash
cd src
python run_agent.py -c config/batch_experiment.toml
```
Prefer these scripts over ad-hoc commands.
Quick usage example
Single task:
```bash
python example.py \
  --task "On openrouter.ai, list distillable models that support image input, sorted by price ascending." \
  --website "https://openrouter.ai/" \
  --mode demo
```
Batch:
```bash
cd src
python run_agent.py -c config/batch_experiment.toml
```
Atomic action:
```bash
python scripts/atomic_action.py \
  --action TYPE \
  --website "https://example.com/" \
  --coords 500,420 \
  --value "hello"
```
Read page:
```bash
python scripts/read_page.py \
  --website "https://openrouter.ai/" \
  --question "Which models or prices are visible on this page?"
```
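Run modes
| mode | behavior | best for |
|---|---|---|
| `headless` | no visible browser window | fast, reproducible runs and large batch jobs |
| `headed` | visible browser window | manual observation without demo overlay |
| `demo` | visible window + overlay/dashboard controls | live debugging and demonstrations |
Notes:
- if mode is missing, use `headless`
- `demo` improves observability, not model intelligence
Mode selection:
1. benchmark/batch -> `headless`
2. visual debugging -> `headed`
3. demo/control flow visibility -> `demo`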
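Instruction design
`confirmed_task` should include:
1. objective
2. constraints
3. completion condition
Template:
- On <website>, <objective>. Apply constraints: <constraints>. Done when <observable completion state>.
Keep it single-goal, specific, and verifiable.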
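Single-task workflow
Input:
- `task`
- `website`
- optional `mode`, `task-id`, `output-dir`
Steps:
1. check environment and API key
2. validate instruction quality
3. run `example.py`
4. inspect outputs
5. report status + cause + next action
Recommended report fields:
- `task_id`
- status: success / partial / failed
- evidence summary
- one-line cause
- one recommended next step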
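Atomic action workflow
Use scripts/atomic_action.py when you need exactly one browser operation and do not want strategist/checklist generation.
Typical uses:
- one click
- one type
- one goto
- one scroll
Properties:
- disables strategy generation
- disables checklist generation
- executes exactly one action
- returns structured JSON with result, URL, screenshot path, and output directory
Read-page workflow
Use scripts/read_page.py when you want to inspect the current page by screenshot and ask the main model a direct question.
Properties:
- opens the page
- captures a screenshot
- sends the screenshot plus page metadata to the main model
- returns structured JSON with the answer and screenshot path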
Batch workflow
Task file schema
```json
[
  {
    "task_id": "example_task_001",
    "confirmed_task": "Find distillable models that support image input, sorted by price ascending.",
    "website": "https://openrouter.ai/"
  }
]
```
Required per task:
- `task_id`
- `confirmed_task`
- `website`
Config checklist (src/config/batch_experiment.toml)
- `[basic].save_file_dir`
- `[experiment].task_file_path`
- `[experiment].max_op`
- `[playwright].mode`
- `[model].name`
- API key source
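Batch execution
1. validate JSON schema and config paths
2. choose mode and `max_op`
3. run batch command
4. summarize per-task outcomes
5. provide one global improvement recommendation
Recommended batch report fields:
- total/completed/failed counts
- per-task status list
- recurring issue patterns
- one highest-impact next change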
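API requirements
Required credential:
- `OPENROUTER_API_KEY` (preferred)
Resolution order:
1. environment variable `OPENROUTER_API_KEY`
2. `[api_keys].openrouter_api_key` in TOML (fallback)
Rules:
- never hardcode real keys in source files
- never print full keys in logs/outputs/reports
- fail fast if key is missing with an actionable message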
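Script usage rules
- script-first: use repository entrypoints before custom commands
- non-interactive CLI only
- explicit flags and paths
- deterministic behavior preferred
- clear, actionable error messages
If adding helper scripts:
1. place under `scripts/`
2. use CLI flags (no prompts)
3. return stable, parseable summaries
4. document usage in this file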
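Environment checklist
Before running:
1. Python environment available
2. dependencies installed (`pip install -e src`)
3. Playwright Chromium installed (`python -m playwright install chromium`)
4. API key configured
5. config/task paths valid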
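Output contract
Each run summary should include:
1. execution metadata: run type, mode, task IDs
2. outcome: status and evidence summary
3. diagnosis: root-cause hypothesis
4. next action: one highest-impact recommendation
Boundaries
- do not claim completion without evidence
- do not skip issue summary
- avoid large refactors before instruction/config fixes
- avoid interactive prompts in core workflow
One-line identity
Avenir-Web execution and reliability skill: mode selection + instruction design + run analysis + iteration planning.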