Web Autopilot
Record once in any web app, let AI handle it from now on.
Overview
Record → Analyze → Confirm Fields → Generate → Test → Register as Tool
CODEBLOCK0
Task Types
📊 Query / Export
Data extraction and report generation. Scripts run and output results automatically — no manual intervention needed.
Examples: pull sales reports, extract project data, export revenue details
📝 Submit
Submit forms such as expense reports, travel requests, payment requests, etc.
Each run requires dynamic parameters.
Examples: submit travel request, submit expense report, submit payment request
The key challenge for submit tasks: correctly distinguishing which fields are fixed vs. which change every time, and confirming with the user before generating the script.
Skill Directory
CODEBLOCK1
Skill scripts: /opt/homebrew/lib/node_modules/openclaw/skills/web-autopilot/scripts/
Commands
1. record — Record a workflow
Ask user: task name, login URL or app URL.
CODEBLOCK2
Run in PTY mode (pty: true, background: true). User operates browser, types "done" when finished.
Note: --sso-url is a legacy parameter name; it works for any login URL (SSO, OAuth, plain login page, etc.).
2. analyze — Analyze the recording (AI does this)
Read recording.json, separate login traffic from business traffic, identify core APIs.
Key steps:
1. Read ~/.openclaw/rpa/recordings/<task>/summary.txt for an overview
2. Parse recording.json to extract all API calls to the app domain
3. For each POST/PUT/PATCH with a meaningful body:
   - Classify fields: FIXED / DYNAMIC / SESSION / RELATIONAL
   - Detect protocol: rest-json / graphql / form-urlencoded / multipart
4. Map the complete API sequence (prerequisites → main operation → follow-ups)
5. Analyze ALL response fields and create field-mapping.json with human-readable labels
6. Create task-meta.json
7. [Submit tasks] After analysis, present the field classification confirmation table to the user (see below)
Field Classification
| Type | Meaning | Handling |
|---|---|---|
| FIXED | Same value every submission (approval flow ID, company entity, currency, expense type enums…) | Hardcoded in script |
| DYNAMIC | Different each submission (amount, date, reason, attachment path…) | Becomes a CLI --parameter |
| SESSION | Auth tokens/cookies, auto-managed | Injected by session.ts |
| RELATIONAL | Requires a lookup from another API to get the ID (e.g., project ID, person ID…) | Auto-queried in script, or exposed as DYNAMIC parameter |
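The four categories can be pictured as a tagged type plus a first-pass heuristic. The sketch below is only an illustration under stated assumptions: `classifyField`, `classifyBody`, and the name patterns are hypothetical and not taken from the skill's scripts, and for Submit tasks the user's confirmation still decides the final split.

```typescript
// Hypothetical first-pass classifier; the name patterns are illustrative guesses.
type FieldClass = "FIXED" | "DYNAMIC" | "SESSION" | "RELATIONAL";

interface ClassifiedField {
  name: string;
  sample: unknown; // value seen in the recording
  kind: FieldClass;
}

const SESSION_PATTERN = /(token|cookie|csrf|session|authorization)/i;
const RELATIONAL_PATTERN = /(project|person|user|approver|department)_?id$/i;
const DYNAMIC_PATTERN = /(amount|date|reason|attachment|destination)/i;

function classifyField(name: string, sample: unknown): ClassifiedField {
  // Default: assume the recorded value is reused on every submission.
  let kind: FieldClass = "FIXED";
  if (SESSION_PATTERN.test(name)) kind = "SESSION";
  else if (RELATIONAL_PATTERN.test(name)) kind = "RELATIONAL";
  else if (DYNAMIC_PATTERN.test(name)) kind = "DYNAMIC";
  return { name, sample, kind };
}

function classifyBody(body: Record<string, unknown>): ClassifiedField[] {
  // Object.entries keeps the original key order for non-numeric keys.
  return Object.entries(body).map(([name, sample]) => classifyField(name, sample));
}
```

A guess like this is only a starting point: a field that looks FIXED in one recording (for example a project ID) may need to be DYNAMIC in real use.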
Field Analysis Rules (MANDATORY)
Every field must have a human-readable label, including fields with system-generated names.
Inference priority:
1. Data value type: timestamp (10^12-13) / monetary amount (contextual) / enum (fixed values) / URL / JSON object
2. Field name pattern: time/date/at → datetime | amount/price/cost → monetary | id/key → ID | status/state → status
3. Business context: infer from related fields, API endpoint names
4. If uncertain → annotate as INLINECODE6
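The inference priority above can be sketched as a small function. This is a minimal illustration, not the skill's implementation: `inferLabel`, the exact regular expressions, and the wording of the returned labels (including the "unknown" marker) are assumptions.

```typescript
// Hypothetical label inference following the priority order above:
// value type first, then field-name pattern, then an explicit "unknown" marker.
function inferLabel(name: string, sample: unknown): string {
  // 1. Value type: 10^12..10^13 is the range of millisecond epoch timestamps.
  if (typeof sample === "number" && sample >= 1e12 && sample < 1e13) {
    return "datetime (ms timestamp)";
  }
  if (typeof sample === "string" && /^https?:\/\//.test(sample)) return "URL";
  if (typeof sample === "object" && sample !== null) return "JSON object";

  // 2. Field-name pattern.
  if (/(time|date|at)$/i.test(name)) return "datetime";
  if (/(amount|price|cost)/i.test(name)) return "monetary";
  if (/(id|key)$/i.test(name)) return "ID";
  if (/(status|state)/i.test(name)) return "status";

  // 3. Business context needs the surrounding fields and endpoint names,
  //    so it is left to the analyst. 4. Otherwise flag the field as unknown.
  return `unknown (sample: ${JSON.stringify(sample)})`;
}
```

Name patterns like these produce false positives (a field called `format` ends in "at"), which is why the value-type check runs first and uncertain fields are surfaced rather than guessed.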
Field Confirmation Step (MANDATORY for Submit tasks)
After analysis, you must present the following confirmation table to the user and wait for confirmation before generating the script:
CODEBLOCK3
Only proceed to the Generate step after user confirmation.
CSV Export Rules (MANDATORY)
- Keep ALL fields, including hidden fields, dynamic fields, and system fields; never crop
- Field order: preserve original order from data, never sort (sorting causes column misalignment)
- JSON/object fields → convert to JSON string for storage
- Use csv.writer + proper quoting to handle JSON fields containing commas
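The rules above mention csv.writer, which is Python's CSV module. Since the generated scripts are TypeScript, the sketch below shows the same rules with hand-rolled RFC 4180-style quoting instead. The helper names `toCell` and `toCsv` and the shape of the `labels` argument (a field-name to label map, as field-mapping.json might provide) are assumptions for illustration.

```typescript
// Minimal CSV serializer: keeps every field, preserves order, quotes when needed.
type Row = Record<string, unknown>;

function toCell(value: unknown): string {
  if (value === null || value === undefined) return "";
  // Objects and arrays are stored as JSON strings.
  const text = typeof value === "object" ? JSON.stringify(value) : String(value);
  // Quote cells containing a comma, quote, or line break; double inner quotes.
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function toCsv(rows: Row[], labels: Record<string, string>): string {
  // Column order = order of first appearance in the data; never sorted.
  const columns: string[] = [];
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      if (!columns.includes(key)) columns.push(key);
    }
  }
  // Use the human-readable label when one exists, else the raw field name.
  const header = columns.map((c) => toCell(labels[c] ?? c)).join(",");
  const lines = rows.map((row) => columns.map((c) => toCell(row[c])).join(","));
  return [header, ...lines].join("\r\n");
}
```

Building the column list from all rows, rather than from the first row only, keeps fields that appear in just some records, which matches the "never crop" rule.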
3. generate — Generate the task script
Pre-generation checklist (Query/Export tasks):
- ✅ All fields are in field-mapping.json
- ✅ All fields have human-readable labels
- ✅ CSV export uses field-mapping.json for column headers
- ✅ Field order preserves original order
Pre-generation checklist (Submit tasks):
- ✅ User has confirmed field classification (FIXED / DYNAMIC / RELATIONAL)
- ✅ All DYNAMIC fields converted to CLI parameters (with type, example value, required/optional)
- ✅ RELATIONAL fields have auto-query logic or corresponding parameters
- ✅ Script has --dry-run mode (prints request body without submitting, for testing)
- ✅ Script outputs submission result (success/failure + document number/link)
Submit task invocation example (written to task-meta.json usage field after generation):
CODEBLOCK4
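To make the checklist concrete, here is a minimal sketch of how a generated script might turn DYNAMIC fields into CLI parameters and honor --dry-run. The functions `parseArgs` and `buildBody` are hypothetical; only the flag names (--dry-run, --amount, and so on) come from this document.

```typescript
// Hypothetical argument handling for a generated run.ts.
interface SubmitArgs {
  dryRun: boolean;
  params: Record<string, string>; // values arrive as strings; convert as needed
}

function parseArgs(argv: string[]): SubmitArgs {
  const result: SubmitArgs = { dryRun: false, params: {} };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--dry-run") {
      result.dryRun = true;
    } else if (arg.startsWith("--")) {
      const value = argv[i + 1];
      if (value === undefined || value.startsWith("--")) {
        throw new Error(`Missing value for ${arg}`);
      }
      result.params[arg.slice(2)] = value;
      i++; // skip the value that was just consumed
    } else {
      throw new Error(`Unexpected argument: ${arg}`);
    }
  }
  return result;
}

// Merges hardcoded FIXED values with DYNAMIC parameters, checking required ones.
function buildBody(
  fixed: Record<string, unknown>,
  args: SubmitArgs,
  required: string[],
): Record<string, unknown> {
  for (const name of required) {
    if (!(name in args.params)) throw new Error(`Missing required parameter --${name}`);
  }
  return { ...fixed, ...args.params };
}
```

In a real script, `parseArgs(process.argv.slice(2))` would feed `buildBody`, and when `dryRun` is true the script would print the body with `JSON.stringify(body, null, 2)` and exit without sending the request. Values containing spaces must be quoted on the command line (for example `--destination "New York"`).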
4. test — Iterative test loop (max 5 rounds)
Run script → check output → if error: diagnose → fix → repeat.
| Error | Cause | Fix |
|---|---|---|
| 401/403 | Session expired / wrong auth | Re-check auth headers, re-login |
| 400 | Wrong field name/type | Compare with recording |
| 404 | Wrong URL | Check URL exactly |
| JSON parse error | Response is HTML | Log resp.raw |
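The loop and the table above can be sketched together as follows. This is an illustration under assumptions: `RunResult`, `diagnose`, and `testLoop` are hypothetical names, and the `attempt` and `fix` callbacks stand in for running the script and applying a repair.

```typescript
// Hypothetical diagnosis helper mirroring the error table above.
interface RunResult {
  status: number;
  body: string; // raw response text
}

function diagnose(result: RunResult): string | null {
  if (result.status === 401 || result.status === 403) return "Re-check auth headers, re-login";
  if (result.status === 400) return "Compare field names/types with the recording";
  if (result.status === 404) return "Check the URL exactly";
  try {
    JSON.parse(result.body);
  } catch {
    return "Response is not JSON (possibly HTML); log the raw response";
  }
  return null; // no known problem detected
}

const MAX_ROUNDS = 5;

// Runs `attempt` up to MAX_ROUNDS times, passing each diagnosis to `fix`.
async function testLoop(
  attempt: () => Promise<RunResult>,
  fix: (hint: string) => Promise<void>,
): Promise<{ ok: boolean; rounds: number }> {
  for (let round = 1; round <= MAX_ROUNDS; round++) {
    const hint = diagnose(await attempt());
    if (hint === null) return { ok: true, rounds: round };
    await fix(hint);
  }
  return { ok: false, rounds: MAX_ROUNDS };
}
```

Capping the loop keeps a script that cannot be repaired automatically from retrying forever; after the fifth failed round the problem goes back to the user.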
5. run — Execute a registered task
CODEBLOCK5
6. list — List all tasks
CODEBLOCK6
Session & Credential Management
Session (Cookie/Token Storage)
Sessions are cookie-based and work with any login method:
- SSO (OIDC, SAML, CAS, etc.)
- OAuth / OAuth2
- Username + password forms
- Any browser-based authentication
Session files: ~/.openclaw/rpa/sessions/<domain>.session.json
Credentials (Encrypted Storage)
Login credentials are stored encrypted (AES-256-GCM) in a separate file — never stored in plaintext.
File: ~/.openclaw/rpa/credentials.enc
- Encryption key = machine identity (hostname+username) + optional RPA_CREDENTIAL_KEY env var
- File permissions: 0600 (owner only)
- Supports automatic extraction and encrypted storage from recording.json
CODEBLOCK7
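For readers unfamiliar with AES-256-GCM, the following sketch shows one way such a store could work with Node's built-in crypto module. It is not the skill's credentials.ts: the scrypt key derivation, the `salt | iv | tag | ciphertext` layout, and all function names are assumptions made for illustration.

```typescript
import * as crypto from "node:crypto";
import * as os from "node:os";

// Assumed identity string: hostname + username + optional env secret.
function machineIdentity(): string {
  const extra = process.env.RPA_CREDENTIAL_KEY ?? "";
  return `${os.hostname()}:${os.userInfo().username}:${extra}`;
}

// Assumption: scrypt stretches the identity into a 32-byte (256-bit) key.
function deriveKey(identity: string, salt: Buffer): Buffer {
  return crypto.scryptSync(identity, salt, 32);
}

function encrypt(plaintext: string, identity: string): Buffer {
  const salt = crypto.randomBytes(16);
  const iv = crypto.randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = crypto.createCipheriv("aes-256-gcm", deriveKey(identity, salt), iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // Hypothetical layout: salt (16) | iv (12) | auth tag (16) | ciphertext
  return Buffer.concat([salt, iv, cipher.getAuthTag(), data]);
}

function decrypt(blob: Buffer, identity: string): string {
  const salt = blob.subarray(0, 16);
  const iv = blob.subarray(16, 28);
  const tag = blob.subarray(28, 44);
  const data = blob.subarray(44);
  const decipher = crypto.createDecipheriv("aes-256-gcm", deriveKey(identity, salt), iv);
  decipher.setAuthTag(tag); // final() throws if the blob was modified
  return Buffer.concat([decipher.update(data), decipher.final()]).toString("utf8");
}
```

A caller would pass `machineIdentity()` as the identity and could write the result with `fs.writeFileSync(path, blob, { mode: 0o600 })` to get owner-only permissions. Because the key depends on the machine identity, a file copied to another host or user account would fail to decrypt.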
Auto-Login Flow
When a session expires, the auto-login flow kicks in:
CODEBLOCK8
Login Flow Types
When generating scripts, you must identify the login type from the recording and write it to the loginFlow field in task-meta.json:
| type | Scenario | Auto-login method | Example |
|---|---|---|---|
| api | SSO/app provides a REST login endpoint, single POST completes auth | Call API directly → follow redirects | Enterprise SSO (POST /api/sso/login) |
| form | Single-page login form (username + password on same page) | Fill form fields → click submit | Common admin dashboards |
| multi-step | Multi-step login (email → next page → password → next page → possible 2FA) | Execute step sequence | Google, Microsoft, Okta |
| manual-only | Has CAPTCHA/2FA/risk control, cannot be fully automated | Open headed browser directly | Banking systems, strong CAPTCHA sites |
loginFlow Schema (task-meta.json)
CODEBLOCK9
Login Identification Guide for Analyze Step (MANDATORY)
During the analyze step, you must complete the following login analysis:
1. Extract credentials → credentials.ts extract <recording.json> (auto-detects username/password in POST body)
2. Identify login type → Inspect the login flow in the recording:
   - Has a clear POST login/auth API → type = api
   - Has form fill actions (password type input) on the same page → type = form
   - Has multiple form fill actions with page navigations in between → type = multi-step
   - Has CAPTCHA image requests or reCAPTCHA scripts → type = manual-only
3. Document the SSO → app redirect path:
   - Does it use an appId forward?
   - Does it use a redirect_uri callback?
   - Where is the token: in the URL query, the response body, or a cookie?
4. Write loginFlow → Write all fields to INLINECODE30
5. Sanitize → Replace passwords in recording.json with [REDACTED]
⚠️ If credentials.ts extract cannot extract credentials (e.g., Google multi-step login), prompt the user to save credentials manually:
CODEBLOCK10
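The identification rules in step 2 can be expressed as a small decision function. The sketch below is illustrative only: the `LoginSignals` shape is a hypothetical summary of what the analyzer observed, and the precedence (CAPTCHA first, then API, then multi-step, then form) is an assumption, since the guide lists the rules without ranking them.

```typescript
type LoginType = "api" | "form" | "multi-step" | "manual-only";

// Hypothetical summary of the login portion of a recording.
interface LoginSignals {
  hasCaptcha: boolean;             // CAPTCHA image requests or reCAPTCHA scripts seen
  hasLoginApiPost: boolean;        // a clear POST to a login/auth API
  formFillCount: number;           // form fill actions involving credentials
  navigationsBetweenFills: number; // page navigations between those fills
}

function identifyLoginType(s: LoginSignals): LoginType {
  // A CAPTCHA rules out full automation regardless of other signals.
  if (s.hasCaptcha) return "manual-only";
  if (s.hasLoginApiPost) return "api";
  if (s.formFillCount > 1 && s.navigationsBetweenFills > 0) return "multi-step";
  if (s.formFillCount > 0) return "form";
  // Nothing recognizable: fall back to a manual, headed-browser login.
  return "manual-only";
}
```

The result would then be written to the loginFlow field in task-meta.json, and it has to be re-derived for each new app recording rather than copied from an earlier task.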
Login Code Templates for Script Generation
Choose the auto-login implementation based on loginFlow.type:
type=api (REST API login):
CODEBLOCK11
type=form:
CODEBLOCK12
type=multi-step:
CODEBLOCK13
type=manual-only:
CODEBLOCK14
task-meta.json loginFlow example (SSO → enterprise app):
CODEBLOCK15
⚠️ Security Rules (MANDATORY)
1. Passwords in recording.json must be sanitized immediately after analysis (replace with [REDACTED])
2. credentials.enc is an encrypted binary file; do not attempt to read or edit it directly
3. credentials.enc and the sessions/ directory must never be committed to version control or shared
4. Skill packages (.skill) must not contain any credentials, sessions, or recording data
5. Generated task scripts (run.ts) must never hardcode any passwords
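Rule 1 can be sketched as a recursive redaction pass. This is a minimal illustration, assuming the recording has already been parsed into plain objects; the `sanitize` function and the key pattern are hypothetical, and request bodies stored as raw strings would need to be parsed before a pass like this could reach them.

```typescript
const REDACTED = "[REDACTED]";
// Assumed pattern for secret-looking keys: password, passwd, pwd, secret.
const SECRET_KEY = /passw(or)?d|pwd|secret/i;

// Returns a deep copy of `value` with every secret-looking string field replaced.
function sanitize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(sanitize);
  if (typeof value === "object" && value !== null) {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = SECRET_KEY.test(key) && typeof inner === "string" ? REDACTED : sanitize(inner);
    }
    return out;
  }
  return value; // primitives are returned unchanged
}
```

Returning a copy instead of mutating in place makes it easy to verify the sanitized result before overwriting recording.json.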
Known Issues & Lessons Learned
🔐 Login flow must match app — don't assume one-size-fits-all
- Currently implemented scripts use type=api mode (enterprise SSO → business app)
- Each new app recording must re-identify the login type; do not reuse login logic from old scripts
- Google/Microsoft multi-step logins require type=multi-step + a steps sequence
- Sites with CAPTCHA/2FA can only use type=manual-only
- Inlining login logic into run.ts (rather than importing external login.ts) is more stable due to Node ESM/CJS compatibility issues
⚠️ Node v25 ESM compatibility
- Node v25 defaults to ESM, so require() is unavailable
- Solution: place tsconfig.json in the task directory to force INLINECODE39
- Dependencies like Playwright need full-path require: INLINECODE40
- Cross-directory .ts imports under ts-node are unstable; inlining critical logic into run.ts is recommended
⚠️ Multi-tab traffic capture (fixed)
Some login flows or apps open new tabs. The recorder listens at the browser-context level, using context.on('request') and context.on('response'), so traffic from ALL tabs is captured.
📋 CSV must include ALL fields with human-readable labels
- Never crop fields; include everything from the API response
- System-generated field names (e.g. field_*, attr_*, custom_*) must be analyzed from sample data
- Create field-mapping.json for every task
- Field order: preserve original order from data, never sort
- Use proper CSV quoting to handle JSON fields with commas
📝 Submit tasks: always confirm field classification before generating
- Never skip the field confirmation step; a wrong FIXED/DYNAMIC split breaks every future submission
- Fields that look fixed (e.g. a hardcoded project ID) might actually need to be dynamic in real use
- Always include --dry-run in generated scripts so users can verify the request body before committing
- RELATIONAL fields (e.g. approver ID looked up by name) should be auto-resolved in the script and exposed as human-readable params
File Locations
| Item | Path |
|---|---|
| Recorder | scripts/record.ts |
| Task runner | scripts/run-task.ts |
| Session utility | scripts/utils/session.ts |
| Login helper | scripts/utils/login.ts |
| Recordings | ~/.openclaw/rpa/recordings/<task>/ |
| Generated tasks | ~/.openclaw/rpa/tasks/<task>/ |
| Sessions | ~/.openclaw/rpa/sessions/<domain>.session.json |
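Web Autopilot
Record once in any web app, let AI handle it from now on.
Overview
Record → Analyze → Confirm Fields → Generate → Test → Register as Tool
🎬 Record: the user performs the workflow once in a real browser (after logging in)
🔍 Analyze: AI analyzes the network traffic and classifies fixed / dynamic / session fields
✅ Confirm Fields: [required for Submit tasks] the user confirms the field classification
📝 Generate: produce a reusable TS script plus field mapping
🧪 Test: iterative test loop, up to 5 rounds of automatic fixes
🔧 Register: register as an OpenClaw tool that can be called directly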
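Task Types
📊 Query / Export
Data extraction and report generation. Scripts run and output results automatically, with no manual intervention needed.
Examples: pull sales reports, extract project data, export revenue details
📝 Submit
Submit forms such as expense reports, travel requests, payment requests, etc.
Each run requires dynamic parameters.
Examples: submit travel request, submit expense report, submit payment request
The key challenge for submit tasks: correctly distinguishing which fields are fixed vs. which change every time, and confirming with the user before generating the script.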
Skill Directory
~/.openclaw/rpa/
├── recordings/<task>/recording.json
├── tasks/<task>/
│   ├── task-meta.json
│   ├── run.ts
│   └── field-mapping.json
└── sessions/<domain>.session.json
Skill scripts: /opt/homebrew/lib/node_modules/openclaw/skills/web-autopilot/scripts/
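Commands
1. record - Record a workflow
Ask user: task name, login URL or app URL.
cd /opt/homebrew/lib/node_modules/openclaw/skills/web-autopilot
# Option A: start from the login page (SSO, OAuth, username/password, etc.)
npx ts-node scripts/record.ts --name my-task --sso-url https://login.example.com
# Option B: start directly from the app (if already logged in or no login is needed)
npx ts-node scripts/record.ts --name my-task --app-url https://app.example.com
Run in PTY mode (pty: true, background: true). The user operates the browser and types "done" when finished.
Note: --sso-url is a legacy parameter name; it works for any login URL (SSO, OAuth, plain login page, etc.).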
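2. analyze - Analyze the recording (AI does this)
Read recording.json, separate login traffic from business traffic, identify core APIs.
Key steps:
1. Read ~/.openclaw/rpa/recordings/<task>/summary.txt for an overview
2. Parse recording.json to extract all API calls to the app domain
3. For each POST/PUT/PATCH with a meaningful body:
   - Classify fields: FIXED / DYNAMIC / SESSION / RELATIONAL
   - Detect protocol: rest-json / graphql / form-urlencoded / multipart
4. Map the complete API sequence (prerequisites → main operation → follow-ups)
5. Analyze ALL response fields and create field-mapping.json with human-readable labels
6. Create task-meta.json
7. [Submit tasks] After analysis, present the field classification confirmation table to the user (see below)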
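Field Classification
| Type | Meaning | Handling |
|---|---|---|
| FIXED | Same value every submission (approval flow ID, company entity, currency, expense type enums, etc.) | Hardcoded in script |
| DYNAMIC | Different each submission (amount, date, reason, attachment path, etc.) | Becomes a CLI --parameter |
| SESSION | Auth tokens/cookies, auto-managed | Injected by session.ts |
| RELATIONAL | Requires a lookup from another API to get the ID (e.g., project ID, person ID) | Auto-queried in script, or exposed as DYNAMIC parameter |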
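Field Analysis Rules (MANDATORY)
Every field must have a human-readable label, including fields with system-generated names.
Inference priority:
1. Data value type: timestamp (10^12-13) / monetary amount (contextual) / enum (fixed values) / URL / JSON object
2. Field name pattern: time/date/at → datetime | amount/price/cost → monetary | id/key → ID | status/state → status
3. Business context: infer from related fields, API endpoint names
4. If uncertain → annotate as (meaning unknown: sample value)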
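Field Confirmation Step (MANDATORY for Submit tasks)
After analysis, you must present the following confirmation table to the user and wait for confirmation before generating the script:
📋 Field Classification Confirmation - <task name>
✅ FIXED (hardcoded):
- approvalFlowId: xxx → Approval flow ID
- companyId: yyy → Company entity
- currency: CNY → Currency
🔄 DYNAMIC (passed as parameters on each run):
- amount → Amount (example: --amount 1500)
- startDate → Start date (example: --startDate 2026-03-10)
- endDate → End date (example: --endDate 2026-03-12)
- destination → Destination (example: --destination "New York")
- reason → Reason (example: --reason "Client visit")
- attachments → Attachment path (example: --attachments ~/Desktop/receipt.jpg)
🔗 RELATIONAL (auto-queried):
- projectId → Project ID (looked up automatically by project name, --projectName "Project X")
❓ Needs confirmation (AI is unsure):
- field_abc123 → Meaning unknown (recorded value: 0); suggestion: FIXED (0) or DYNAMIC?
Please confirm the classification above, or point out any fields that need adjusting.
Only proceed to the Generate step after user confirmation.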
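CSV Export Rules (MANDATORY)
- Keep ALL fields, including hidden fields, dynamic fields, and system fields; never crop
- Field order: preserve original order from data, never sort (sorting causes column misalignment)
- JSON/object fields → convert to JSON string for storage
- Use csv.writer + proper quoting to handle JSON fields containing commas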
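3. generate - Generate the task script
Pre-generation checklist (Query/Export tasks):
- ✅ All fields are in field-mapping.json
- ✅ All fields have human-readable labels
- ✅ CSV export uses field-mapping.json for column headers
- ✅ Field order preserves original order
Pre-generation checklist (Submit tasks):
- ✅ User has confirmed field classification (FIXED / DYNAMIC / RELATIONAL)
- ✅ All DYNAMIC fields converted to CLI parameters (with type, example value, required/optional)
- ✅ RELATIONAL fields have auto-query logic or corresponding parameters
- ✅ Script has --dry-run mode (prints request body without submitting, for testing)
- ✅ Script outputs submission result (success/failure + document number/link)
Submit task invocation example (written to the task-meta.json usage field after generation):
# Preview (does not actually submit)
npx ts-node run.ts --dry-run --amount 1500 --startDate 2026-03-10 ...
# Actual submission
npx ts-node run.ts --amount 1500 --startDate 2026-03-10 --destination "New York" --reason "Client visit"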
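4. test - Iterative test loop (max 5 rounds)
Run script → check output → if error: diagnose → fix → repeat.
| Error | Cause | Fix |
|---|---|---|
| 401/403 | Session expired / wrong auth | Re-check auth headers, re-login |
| 400 | Wrong field name/type | Compare with recording |
| 404 | Wrong URL | Check URL exactly |
| JSON parse error | Response is HTML | Log resp.raw |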
5. run - Execute a registered task
npx ts-node ~/.openclaw/rpa/tasks/<task>/run.ts --param1 value1
6. list - List all tasks
npx ts-node /opt/homebrew/lib/node_modules/openclaw/skills/web-autopilot/scripts/run-task.ts --list
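Session & Credential Management
Session (Cookie/Token Storage)
Sessions are cookie-based and work with any login method:
- SSO (OIDC, SAML, CAS, etc.)
- OAuth / OAuth2
- Username + password forms
- Any browser-based authentication
Session files: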