Browser Bridge CLI
When to use
Use this skill when you need to control a real Chrome tab. Typical situations:
- browser automation with live user browser context
- page observation (interactive elements and DOM snapshots)
- remote tab actions (navigate, click, type, press_key, scroll)
- troubleshooting connection state between agent and browser
Project:
- https://github.com/NmadeleiDev/browseragentbridge
What this gives you
This workflow has three connected parts:
- Browser extension in Chrome receives tab commands.
- Bridge server routes messages between browser and operator.
- Operator CLI sends commands and reads results.
CLI commands used:
- `browser-bridge-server` to run the server
- `browser-bridge` to run operator actions
Prerequisites
- Python 3.10+
- Chrome browser
- Terminal access
- Ability to load an unpacked Chrome extension
Agent responsibility before startup
Before starting the server, generate strong tokens. Do not use weak defaults.
Example token generation:
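```bash
python3 - <<'PY'
import secrets
# Print two independent random tokens, one per role.
print("BRIDGE_SHARED_TOKEN=" + secrets.token_urlsafe(32))
print("BRIDGE_OPERATOR_TOKEN=" + secrets.token_urlsafe(32))
PY
```
Use the generated values when starting the server. Share only the client token (`BRIDGE_SHARED_TOKEN`) with the user for extension setup. Keep the operator token (`BRIDGE_OPERATOR_TOKEN`) for agent CLI usage.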
Install the CLI
```bash
python3 -m pip install --user pipx
python3 -m pipx ensurepath
pipx install browser-agent-bridge
```
Upgrade later:
```bash
pipx upgrade browser-agent-bridge
```
Start the bridge server
Use static auth for straightforward local setup:
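```bash
export BRIDGE_AUTH_MODE=static
# Placeholder values: substitute the tokens generated earlier.
export BRIDGE_SHARED_TOKEN='change-me-strong-token'
export BRIDGE_OPERATOR_TOKEN='Str0ng!Operator#42'
# Run in the background, keeping the log and PID for later checks.
browser-bridge-server >/tmp/browser-bridge-server.log 2>&1 &
echo $! >/tmp/browser-bridge-server.pid
```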
Start browser-bridge-server in the background. Do not leave it attached to the current shell, because the agent needs that shell for follow-up CLI commands, status checks, and diagnostics. If startup needs verification, inspect the log file or process state after backgrounding it.
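One way to verify startup after backgrounding is to poll the listening port. The sketch below is a minimal illustration, not part of the project: `wait_for_port` is a hypothetical helper, and it only assumes that the server accepts TCP connections on the host and port of the default endpoints (`127.0.0.1:8765`).

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 10.0,
                  interval_s: float = 0.2) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect only shows that something is listening;
            # it does not authenticate or speak the WebSocket protocol.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

For example, `wait_for_port("127.0.0.1", 8765, timeout_s=5.0)` can be called right after starting the server. If it returns `False`, inspect `/tmp/browser-bridge-server.log`.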
Default endpoints:
- Extension client WS: `ws://127.0.0.1:8765/ws/client`
- Operator CLI WS: `ws://127.0.0.1:8765/ws/operator`
Connect the Chrome extension (tell your human to do this)
1. Open `chrome://extensions`.
2. Enable Developer mode.
3. Click Load unpacked.
4. Select the extension provided by this project from https://github.com/NmadeleiDev/browseragentbridge (`extension/` folder).
5. Open the Browser Bridge extension popup.
6. Fill fields:
   - Bridge Server WS URL: `ws://127.0.0.1:8765/ws/client`
   - Instance ID: `local-instance`
   - Client ID: `chrome-main`
   - Auth Token / JWT: value of `BRIDGE_SHARED_TOKEN` generated by the agent
7. Click Save, then Connect.
8. Confirm popup status is connected to the server started by the agent.
Operator CLI usage
All examples use:
- `instance_id=local-instance`
- `client_id=chrome-main`
- operator token `Str0ng!Operator#42`
- operator websocket `ws://127.0.0.1:8765/ws/operator`
You can pass the operator token either with --token or by exporting BRIDGE_OPERATOR_TOKEN. The examples below use --token explicitly for clarity.
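When an agent drives the CLI from code, building the argument list directly avoids shell quoting problems with characters such as `!` and `#` in tokens and with JSON payloads. The following is a minimal sketch using only flags that appear in this document; `build_send_command` is a hypothetical helper, not part of the project.

```python
import json


def build_send_command(command_type, payload, *,
                       instance_id="local-instance",
                       client_id="chrome-main",
                       server_ws_url="ws://127.0.0.1:8765/ws/operator",
                       token=None):
    """Build an argv list for `browser-bridge send-command`.

    When token is None, --token is omitted and the CLI is expected to
    read BRIDGE_OPERATOR_TOKEN from the environment instead.
    """
    argv = ["browser-bridge", "--server-ws-url", server_ws_url]
    if token is not None:
        argv += ["--token", token]
    argv += [
        "send-command",
        "--instance-id", instance_id,
        "--client-id", client_id,
        "--type", command_type,
        # Compact JSON; no shell quoting is needed in an argv list.
        "--payload", json.dumps(payload, separators=(",", ":")),
    ]
    return argv
```

The resulting list can be passed to `subprocess.run(argv, env=dict(os.environ, BRIDGE_OPERATOR_TOKEN=...))` so the token stays out of the process arguments.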
List connected browser clients:
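```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' list-clients
```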
Check whether the specific client is connected:
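```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  connect-status --instance-id local-instance --client-id chrome-main
```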
Check whether tab command channel is ready:
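```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  ping-tab --instance-id local-instance --client-id chrome-main
```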
Observe interactive nodes on current page:
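```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  observe --instance-id local-instance --client-id chrome-main --max-nodes 150
```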
Get page HTML snapshot:
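```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type get_html --payload '{"max_chars":40000}'
```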
Navigate with adaptive load wait:
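```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type navigate --payload '{"url":"https://example.com","wait_for_load":true,"wait_for_load_ms":7000}'
```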
Click without load wait:
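```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type click --payload '{"selector":"a[href]","wait_for_load":false}'
```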
Type into an element:
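```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type type --payload '{"selector":"input[name=q]","text":"browser bridge"}'
```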
Press a special key:
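```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type press_key --payload '{"key":"Enter","selector":"input[name=q]"}'
```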
`press_key` supports:
- keys: `Enter`, `Tab`, `Escape`, `Backspace`, `Delete`, `ArrowUp`, `ArrowDown`, `ArrowLeft`, `ArrowRight`, `Home`, `End`, `PageUp`, `PageDown`, `Space`
- aliases: `return`, `esc`, `del`, `up`, `down`, `left`, `right`, `spacebar`
- modifiers: `alt_key`, `ctrl_key`, `meta_key`, `shift_key`
- target selection via `selector`, `ref`, `click_ref`, or `locator`
- default target: current `document.activeElement` when no selector/ref is provided
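The options above can be combined into a single `press_key` payload. The sketch below is illustrative and rests on two assumptions that should be checked against the project: that each alias maps to the key its name suggests, and that modifiers are passed as boolean payload fields. `press_key_payload` is a hypothetical helper.

```python
# Assumed alias -> canonical key pairing, inferred from the names only.
KEY_ALIASES = {
    "return": "Enter", "esc": "Escape", "del": "Delete",
    "up": "ArrowUp", "down": "ArrowDown",
    "left": "ArrowLeft", "right": "ArrowRight",
    "spacebar": "Space",
}
MODIFIERS = ("alt_key", "ctrl_key", "meta_key", "shift_key")


def press_key_payload(key, *, selector=None, modifiers=()):
    """Build a press_key payload dict; reject unknown modifier names."""
    unknown = [m for m in modifiers if m not in MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifiers: {unknown}")
    # Normalize aliases case-insensitively; pass other keys through unchanged.
    payload = {"key": KEY_ALIASES.get(key.lower(), key)}
    if selector is not None:
        payload["selector"] = selector
    for name in modifiers:
        payload[name] = True  # assumption: modifiers are boolean flags
    return payload
```

Omitting `selector` leaves target selection to the default (`document.activeElement`). Because the CLI accepts the aliases itself, the client-side normalization is optional.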
Recommended execution flow for agents
1. Ensure server process is running.
2. Ensure extension popup is connected with matching `instance_id`, `client_id`, and token.
3. Run `list-clients`.
4. Run `connect-status`.
5. Run `ping-tab`.
6. Run `observe` before action commands.
7. Run `send-command` actions (`navigate`, `click`, `type`, `press_key`, `scroll`, `get_html`).
8. Re-run `observe` to confirm page state after actions.
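The read-only checks in this flow can be scripted so that an agent stops at the first failing step. The following is a rough sketch under stated assumptions: the CLI exits with a non-zero status on failure (not confirmed by this document), and the operator token is supplied through `BRIDGE_OPERATOR_TOKEN`. Both helper names are hypothetical.

```python
import subprocess

PREFLIGHT_STEPS = ("list-clients", "connect-status", "ping-tab", "observe")


def preflight_commands(instance_id="local-instance", client_id="chrome-main",
                       server_ws_url="ws://127.0.0.1:8765/ws/operator"):
    """Return one argv list per preflight step, in the recommended order."""
    base = ["browser-bridge", "--server-ws-url", server_ws_url]
    target = ["--instance-id", instance_id, "--client-id", client_id]
    commands = []
    for step in PREFLIGHT_STEPS:
        argv = base + [step]
        if step != "list-clients":  # list-clients takes no client target
            argv += target
        commands.append(argv)
    return commands


def run_preflight(commands, run=subprocess.run):
    """Run the commands in order; return the first failing step name, or None."""
    for argv in commands:
        result = run(argv, capture_output=True, text=True)
        if result.returncode != 0:  # assumption: non-zero exit means failure
            return argv[3]  # index 3 holds the subcommand name
    return None
```

A return value of `None` means every step exited cleanly. Any other value names the step to look up under Troubleshooting.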
Troubleshooting
- Client missing or not connected
  - Verify popup shows connected.
  - Verify `instance_id` and `client_id` exactly match CLI flags.
  - Reconnect extension and retry.
- Operator auth failed or auth errors
  - Verify `--token` matches `BRIDGE_OPERATOR_TOKEN`.
- Commands time out
  - Increase `--timeout-s`.
  - For action commands, disable or reduce load wait in payload.
  - Confirm active tab is a normal webpage (not restricted pages like `chrome://*`).
  - Retry once; extension can reinject content script when needed.
- Slow responses on action commands
  - Use `wait_for_load=false` for immediate response.
  - Or set smaller `wait_for_load_ms`.
Security notes
- Treat tokens as secrets.
- For non-local deployments, use TLS (`wss://`) and strong secrets.
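The TLS recommendation can be enforced before any connection is attempted. This is a minimal sketch, not project code: `check_ws_url` is a hypothetical helper, and the set of hosts treated as local is an assumption.

```python
from urllib.parse import urlsplit

# Assumption: only loopback names count as "local" for plain ws://.
LOCAL_HOSTS = {"127.0.0.1", "localhost", "::1"}


def check_ws_url(url: str) -> str:
    """Return url unchanged if acceptable; raise ValueError otherwise."""
    parts = urlsplit(url)
    if parts.scheme not in ("ws", "wss"):
        raise ValueError(f"not a WebSocket URL: {url}")
    if not parts.hostname:
        raise ValueError(f"missing host: {url}")
    # Plain ws:// is tolerated only for loopback; everything else needs TLS.
    if parts.scheme == "ws" and parts.hostname not in LOCAL_HOSTS:
        raise ValueError(f"use wss:// for non-local host {parts.hostname}")
    return url
```

Calling `check_ws_url` on the value passed to `--server-ws-url` rejects an unencrypted remote endpoint before the token is ever sent.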
Done criteria
1. `list-clients` returns expected client.
2. `connect-status` is connected.
3. `ping-tab` reports ready.
4. `observe` returns page data.
5. `send-command` actions return valid results.
Browser Bridge CLI
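When to use
Use this skill when you need to control a real Chrome tab. Typical situations:
- browser automation with live user browser context
- page observation (interactive elements and DOM snapshots)
- remote tab actions (navigate, click, type, press_key, scroll)
- troubleshooting connection state between agent and browser
Project:
- https://github.com/NmadeleiDev/browseragentbridge
What this gives you
This workflow has three connected parts:
- Browser extension in Chrome receives tab commands.
- Bridge server routes messages between browser and operator.
- Operator CLI sends commands and reads results.
CLI commands used:
- `browser-bridge-server` to run the server
- `browser-bridge` to run operator actions
Prerequisites
- Python 3.10+
- Chrome browser
- Terminal access
- Ability to load an unpacked Chrome extension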
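Agent responsibility before startup
Before starting the server, generate strong tokens. Do not use weak defaults.
Example token generation:
```bash
python3 - <<'PY'
import secrets
# Print two independent random tokens, one per role.
print("BRIDGE_SHARED_TOKEN=" + secrets.token_urlsafe(32))
print("BRIDGE_OPERATOR_TOKEN=" + secrets.token_urlsafe(32))
PY
```
Use the generated values when starting the server. Share only the client token (`BRIDGE_SHARED_TOKEN`) with the user for extension setup. Keep the operator token for agent CLI usage.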
Install the CLI
```bash
python3 -m pip install --user pipx
python3 -m pipx ensurepath
pipx install browser-agent-bridge
```
Upgrade later:
```bash
pipx upgrade browser-agent-bridge
```
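Start the bridge server
Use static auth for straightforward local setup:
```bash
export BRIDGE_AUTH_MODE=static
# Placeholder values: substitute the tokens generated earlier.
export BRIDGE_SHARED_TOKEN='change-me-strong-token'
export BRIDGE_OPERATOR_TOKEN='Str0ng!Operator#42'
# Run in the background, keeping the log and PID for later checks.
browser-bridge-server >/tmp/browser-bridge-server.log 2>&1 &
echo $! >/tmp/browser-bridge-server.pid
```
Start `browser-bridge-server` in the background. Do not leave it attached to the current shell, because the agent needs that shell for follow-up CLI commands, status checks, and diagnostics. If startup needs verification, inspect the log file or process state after backgrounding it.
Default endpoints:
- Extension client WS: `ws://127.0.0.1:8765/ws/client`
- Operator CLI WS: `ws://127.0.0.1:8765/ws/operator`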
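Connect the Chrome extension (tell your human to do this)
1. Open `chrome://extensions`.
2. Enable Developer mode.
3. Click Load unpacked.
4. Select the extension provided by this project from https://github.com/NmadeleiDev/browseragentbridge (`extension/` folder).
5. Open the Browser Bridge extension popup.
6. Fill fields:
   - Bridge Server WS URL: `ws://127.0.0.1:8765/ws/client`
   - Instance ID: `local-instance`
   - Client ID: `chrome-main`
   - Auth Token / JWT: value of `BRIDGE_SHARED_TOKEN` generated by the agent
7. Click Save, then Connect.
8. Confirm popup status is connected to the server started by the agent.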
Operator CLI usage
All examples use:
- `instance_id=local-instance`
- `client_id=chrome-main`
- operator token `Str0ng!Operator#42`
- operator websocket `ws://127.0.0.1:8765/ws/operator`
You can pass the operator token either with `--token` or by exporting `BRIDGE_OPERATOR_TOKEN`. The examples below use `--token` explicitly for clarity.
List connected browser clients:
```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' list-clients
```
Check whether the specific client is connected:
```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  connect-status --instance-id local-instance --client-id chrome-main
```
Check whether tab command channel is ready:
```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  ping-tab --instance-id local-instance --client-id chrome-main
```
Observe interactive nodes on current page:
```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  observe --instance-id local-instance --client-id chrome-main --max-nodes 150
```
Get page HTML snapshot:
```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type get_html --payload '{"max_chars":40000}'
```
Navigate with adaptive load wait:
```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type navigate --payload '{"url":"https://example.com","wait_for_load":true,"wait_for_load_ms":7000}'
```
Click without load wait:
```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type click --payload '{"selector":"a[href]","wait_for_load":false}'
```
Type into an element:
```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type type --payload '{"selector":"input[name=q]","text":"browser bridge"}'
```
Press a special key:
```bash
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type press_key --payload '{"key":"Enter","selector":"input[name=q]"}'
```
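`press_key` supports:
- keys: `Enter`, `Tab`, `Escape`, `Backspace`, `Delete`, `ArrowUp`, `ArrowDown`, `ArrowLeft`, `ArrowRight`, `Home`, `End`, `PageUp`, `PageDown`, `Space`
- aliases: `return`, `esc`, `del`, `up`, `down`, `left`, `right`, `spacebar`
- modifiers: `alt_key`, `ctrl_key`, `meta_key`, `shift_key`
- target selection via `selector`, `ref`, `click_ref`, or `locator`
- default target: current `document.activeElement` when no selector/ref is provided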
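Recommended execution flow for agents
1. Ensure server process is running.
2. Ensure extension popup is connected with matching `instance_id`, `client_id`, and token.
3. Run `list-clients`.
4. Run `connect-status`.
5. Run `ping-tab`.
6. Run `observe` before action commands.
7. Run `send-command` actions (`navigate`, `click`, `type`, `press_key`, `scroll`, `get_html`).
8. Re-run `observe` to confirm page state after actions.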
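Troubleshooting
- Verify popup shows connected.
- Verify `instance_id` and `client_id` exactly match CLI flags.
- Reconnect extension and retry.
- Verify `--token` matches `BRIDGE_OPERATOR_TOKEN`.
- Increase `--timeout-s`.
- For action commands, disable or reduce load wait in payload.
- Confirm active tab is a normal webpage (not restricted pages like `chrome://*`).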