Web Collection
Use this skill for browser-extension collection tasks on:
- - Douyin
- TikTok
- Xiaohongshu
- Amazon
- Bilibili
Human Tutorial
This skill is implementation-oriented. For end-user onboarding and step-by-step human instructions, use:
- - https://vcn5grhrq8y0.feishu.cn/wiki/BGUhwC0cKimwTJkXDSuc0YFBnib
When the user asks "how to use it" or needs UI-level operation guidance, prefer this tutorial and keep the response aligned with its wording.
Core Rules
- 1. Use the user's normal Chrome environment, not the isolated
openclaw browser profile. - Prefer the connector flow over generic browser tooling.
- Default to synchronous closed-loop execution.
- Do not reply before the collection script finishes.
- Choose one execution mode first:
-
local: talk to the local bridge directly
-
cloud: call the cloud connector dispatch API, which forwards to the user's connected local connector
- 6. In
cloud mode, do not rewrite the collection payload. Only wrap it in:
-
device_id
-
action
- INLINECODE6
First-Time Setup
This skill uses one preferences file:
INLINECODE7
Fallback:
INLINECODE8
Helper script:
CODEBLOCK0
Required defaults:
- - INLINECODE9
- INLINECODE10
- INLINECODE11
- INLINECODE12
Mode-specific defaults:
- no extra required key if the default local bridge URL works
- cloud base URL is fixed to
https://i-sync.cn by default
-
defaultCloudDeviceId
- INLINECODE17
INLINECODE18 enforces this. If these are incomplete, collection must not start.
First-run flow
On first use:
- 1. Determine the execution mode first:
-
local
-
cloud
- 2. If the mode is
cloud, collect these values first:
-
defaultCloudDeviceId
-
defaultCloudToken
- cloud URL is fixed and does not need user input
- 3. Then handle the common defaults:
- 导出方式
- 默认采集条数
- 是否默认采集详情
- 默认采集速度
- 4. Ask only one question for the common defaults:
-
推荐配置
-
自己配置
- 5. If the user chooses
推荐配置, run:
CODEBLOCK1
- 6. If the user chooses
自己配置, ask for all four values in one message, not one by one. - Only continue when
check passes.
Preferred mode prompt:
CODEBLOCK2
Preferred cloud prompt:
CODEBLOCK3
Preferred quick-reply prompt for common defaults:
CODEBLOCK4
Preferred custom-config prompt:
CODEBLOCK5
Recommended defaults:
- - 运行位置: INLINECODE29
- 导出方式: INLINECODE30
- 采集条数: INLINECODE31
- 采集详情: INLINECODE32
- 采集速度: INLINECODE33
Cloud Mode
Use cloud mode when the collection request should be sent to the platform backend first, and then dispatched to the user's connected local connector.
Cloud responsibilities:
- - call INLINECODE35
- authenticate with INLINECODE36
- include INLINECODE37
- keep the collection body unchanged inside INLINECODE38
- enforce a strict cloud payload template before dispatch to avoid missing fields
- default fallback when missing:
maxItems=20,
mode=search,
interval=300,
fetchDetail=true,
detailSpeed=fast
- - poll
/api/v1/connector/cloud/commands/{command_id} for final status and result - if single-command query is unavailable, fallback to INLINECODE45
- treat
result + task_updates as the source of completion snapshot
Do not:
- - call the user's local
19820 port from the cloud path - rewrite
payload semantics - mix local admin token logic into cloud requests
Connector Command Ladder
When collection fails, parameters look incomplete, or status is unclear, run connector checks in this order instead of guessing.
Layer 1: capability
- - INLINECODE50
- INLINECODE51
- INLINECODE52 (or platform/method scoped)
Layer 2: diagnostics
- - INLINECODE53
- INLINECODE54
- INLINECODE55
- INLINECODE56 with the final request body
Layer 3: execution and tracking
- - INLINECODE57
- INLINECODE58 (local mode)
- INLINECODE59 (cloud mode, preferred)
- INLINECODE60 (cloud fallback)
- INLINECODE61 or
POST /api/reset when stuck
Local command template (admin token required):
CODEBLOCK6
Cloud command template (async result):
CODEBLOCK7
Export Behavior
- run with
--export-target bitable
- expect
export.tableUrl on success
- run with
--export-target csv
- do not require a table link in the final reply
Entry Point
Preferred wrapper:
CODEBLOCK8
The wrapper:
- - applies stored preferences
- enforces required setup
- runs either local bridge mode or cloud dispatch mode
- prefers the connector repo's real local bridge loop when available
- falls back to the bundled local loop only if needed
Common Commands
Douyin keyword search:
CODEBLOCK9
Douyin keyword search via cloud dispatch:
CODEBLOCK10
Amazon keyword search:
CODEBLOCK11
Bilibili keyword search:
CODEBLOCK12
Platform Defaults
Wrapper defaults:
- -
douyin => INLINECODE69 - INLINECODE70 => INLINECODE71
- INLINECODE72 => INLINECODE73
- INLINECODE74 => INLINECODE75
- INLINECODE76 => INLINECODE77
Supported methods:
- -
douyin: videoKeyword, creatorKeyword, creatorLink, creatorVideo, videoComment, videoInfo, INLINECODE85 - INLINECODE86 :
keywordSearch, userVideo, tiktokComment, tiktokCreatorKeyword, INLINECODE91 - INLINECODE92 :
keywordSearch, creatorNote, creatorLink, creatorKeyword, noteLink, INLINECODE98 - INLINECODE99 :
keywordSearch, productLink, INLINECODE102 - INLINECODE103 :
keywordSearch, videoInfo, creatorVideo, INLINECODE107
Closed Loop
INLINECODE108 mode:
- 1. verify INLINECODE109
- wait for idle state
- start INLINECODE110
- handle
TASK_RUNNING via INLINECODE112 - poll
/api/tasks/<taskId> until completed or INLINECODE115 - if export is required, verify the expected export result
INLINECODE116 mode:
- 1. query INLINECODE117
- dispatch
action=collect to INLINECODE119 - keep querying command result (
/api/v1/connector/cloud/commands/{command_id} preferred) - after each poll, refresh current collection state from command status
- wait for
completed or terminal error state - on completion, read
result and task_updates for records/count/export snapshot and include key fields in the final reply
Quick query examples:
CODEBLOCK13
CODEBLOCK14
Final Reply
When successful:
- 1. Mention whether the run used
local or cloud mode. - If
cloud mode was used, include the command status. - If export mode is
bitable and export.tableUrl exists, include the table link first. - If export mode is
csv, explicitly say export mode is CSV. - Then include:
- status
- export status
- collected count
- short analysis
When bitable export is expected but no table link exists, explicitly say export did not finish correctly.
Troubleshooting
- Chrome/plugin is not connected to the bridge
- in
local mode, ensure collect, status, and stop all use the same local base URL
- - cloud dispatch auth failed
- check
defaultCloudToken
- - cloud dispatch could not reach device
- check
defaultCloudDeviceId and whether the local connector is online
- use stop + retry, or
--force-stop-before-start
- - long record output hiding key fields
- trust the connector loop's compact summary output rather than raw task JSON
网页采集
使用此技能完成以下平台的浏览器扩展采集任务:
人工教程
本技能以实现为导向。如需终端用户入门指南和分步人工操作说明,请使用:
- - https://vcn5grhrq8y0.feishu.cn/wiki/BGUhwC0cKimwTJkXDSuc0YFBnib
当用户询问如何使用或需要UI级操作指导时,优先使用本教程,并保持回复与其措辞一致。
核心规则
- 1. 使用用户正常的Chrome环境,而非隔离的openclaw浏览器配置文件。
- 优先使用连接器流程而非通用浏览器工具。
- 默认采用同步闭环执行。
- 采集脚本完成前不进行回复。
- 首先选择一种执行模式:
- local:直接与本地桥接通信
- cloud:调用云端连接器分发API,该API会转发至用户已连接的本地连接器
- 6. 在cloud模式下,不重写采集负载。仅将其包裹在以下结构中:
- device_id
- action
- payload
首次设置
本技能使用一个偏好文件:
$OPENCLAWSTATEDIR/skill-state/web-collection/preferences.json
备用路径:
$HOME/.openclaw/skill-state/web-collection/preferences.json
辅助脚本:
bash
{baseDir}/scripts/export_preference.sh show
{baseDir}/scripts/export_preference.sh check
{baseDir}/scripts/export_preference.sh apply-recommended
{baseDir}/scripts/export_preference.sh set-key defaultConnectionMode cloud
{baseDir}/scripts/export_preference.sh set-key defaultCloudDeviceId desktop-local-smoke-fix
{baseDir}/scripts/exportpreference.sh set-key defaultCloudToken api_key>
{baseDir}/scripts/export_preference.sh set-key defaultExportMode csv
必需的默认配置:
- - defaultExportMode
- defaultMaxItems
- defaultFetchDetail
- defaultDetailSpeed
特定模式的默认配置:
- 如果默认本地桥接URL可用,则无需额外必需键
- 云端基础URL默认固定为https://i-sync.cn
- defaultCloudDeviceId
- defaultCloudToken
run.sh强制执行此规则。如果这些配置不完整,采集不得启动。
首次运行流程
首次使用时:
- 1. 首先确定执行模式:
- local
- cloud
- 2. 如果模式为cloud,先收集以下值:
- defaultCloudDeviceId
- defaultCloudToken
- 云端URL固定,无需用户输入
- 3. 然后处理通用默认配置:
- 导出方式
- 默认采集条数
- 是否默认采集详情
- 默认采集速度
- 4. 针对通用默认配置只问一个问题:
- 推荐配置
- 自己配置
- 5. 如果用户选择推荐配置,运行:
bash
{baseDir}/scripts/export_preference.sh apply-recommended
- 6. 如果用户选择自己配置,一次性询问所有四个值,而非逐个询问。
- 仅当check通过后才继续。
推荐模式提示:
text
第一次使用网页采集,需要先确认运行环境。
请选择:
[[quick_replies: 本地, 云端]]
推荐云端提示:
text
检测到你要走云端分发,还需要这两个配置:
请一次性发给我。
通用默认配置的推荐快速回复提示:
text
常用配置还需要确认一次。
这些配置包括:
- - 导出方式
- 默认采集条数
- 是否默认采集详情
- 默认采集速度
你可以直接用推荐配置,也可以自己配置。
[[quick_replies: 推荐配置, 自己配置]]
推荐自定义配置提示:
text
好,我们一次性把默认配置定好。请直接按下面格式回复:
导出方式:CSV / 多维表格
默认采集条数:10 / 20 / 50 / 100
是否默认采集详情:是 / 否
默认采集速度:fast / medium / slow
说明:
- - 多维表格:适合查看、筛选、分享
- CSV:适合本地保存
- 采集详情:开启后结果更完整,但一般更慢
- 采集速度:推荐 fast
推荐默认值:
- - 运行位置:local
- 导出方式:多维表格
- 采集条数:20
- 采集详情:true
- 采集速度:fast
云端模式
当采集请求应先发送至平台后端,再分发给用户已连接的本地连接器时,使用cloud模式。
云端职责:
- - 调用/api/v1/connector/cloud/dispatch
- 使用Authorization: Bearer apikey>进行身份验证
- 包含device_id
- 在payload内保持采集主体不变
- 在分发前强制执行严格的云端负载模板,以避免字段缺失
- 缺失时的默认回退值:maxItems=20、mode=search、interval=300、fetchDetail=true、detailSpeed=fast
- - 轮询/api/v1/connector/cloud/commands/{commandid}以获取最终状态和结果
- 如果单命令查询不可用,回退至/api/v1/connector/cloud/commands?deviceid=...
- 将result和task_updates视为完成快照的来源
禁止:
- - 从云端路径调用用户本地的19820端口
- 重写payload语义
- 将本地管理员令牌逻辑混入云端请求
连接器命令阶梯
当采集失败、参数不完整或状态不明确时,按此顺序运行连接器检查,而非猜测。
第一层:能力
- - GET /api/help
- GET /api/routes
- GET /api/filters(或按平台/方法限定范围)
第二层:诊断
- - GET /api/status
- GET /api/platform-state
- GET /api/cloud/status
- POST /api/preflight(使用最终请求体)
第三层:执行与跟踪
- - POST /api/collect
- GET /api/tasks/:id(本地模式)
- GET /api/v1/connector/cloud/commands/{commandid}(云端模式,推荐)
- GET /api/v1/connector/cloud/commands?deviceid=...(云端回退)
- POST /api/stop或POST /api/reset(卡住时)
本地命令模板(需要管理员令牌):
bash
TOKEN=$(cat ~/.meixi-connector/bridge-admin-token.txt)
curl -s -H x-connector-admin-token: $TOKEN http://127.0.0.1:19820/api/status
云端命令模板(异步结果):
bash
curl -s -H Authorization: Bearer orapi_key> \
https://i-sync.cn/api/v1/connector/cloud/commands/
导出行为
- 使用--export-target bitable运行
- 成功时预期export.tableUrl
- 使用--export-target csv运行
- 最终回复中不需要表格链接
入口点
推荐的包装器:
bash
bash {baseDir}/scripts/run.sh ...
包装器:
- - 应用存储的偏好设置
- 强制执行必需的设置
- 运行本地桥接模式或云端分发模式
- 优先使用连接器仓库的实际本地桥接循环(可用时)
- 仅在必要时回退至捆绑的本地循环
常用命令
抖音关键词搜索:
bash
bash {baseDir}/scripts/run.sh \
--platform douyin \
--keyword AI \
--ensure-bridge
通过云端分发进行抖音关键词搜索:
bash
bash {baseDir}/scripts/run.sh \
--connection-mode cloud \
--cloud-device-id desktop-local-smoke-fix \
--cloud-token apikey> \
--platform douyin \
--keyword AI员工
亚马逊关键词搜索:
bash
bash {baseDir}/scripts/run.sh \
--platform amazon \
--keyword Chinese porcelain \
--ensure-bridge
哔哩哔哩关键词搜索:
bash
bash {baseDir}/scripts/run.sh \
--platform bilibili \
--keyword 古董 \
--ensure-bridge
平台默认值
包装器默认值:
- - douyin => videoKeyword
- tiktok => keywordSearch
-