返回顶部
b

browser-js

Lightweight CDP browser control for AI agents. Token-efficient alternative to the built-in browser tool — 3-10x fewer tokens per interaction. Use when browsing websites, clicking elements, filling forms, uploading files, or extracting page content. Requires a Chrome/Chromium browser running with --remote-debugging-port (OpenClaw browser works out of the box). Signed-in sessions carry over automatically.

作者: admin | 来源: ClawHub
源自
ClawHub
版本
V 1.5.0
安全检测
已通过
1,020
下载量
0
收藏
概述
安装方式
版本历史

browser-js

# browser-js Lightweight CLI that talks to Chrome via CDP (Chrome DevTools Protocol). Returns minimal, indexed output that agents can act on immediately — no accessibility tree parsing, no ref hunting. ## Setup ```bash # Install dependency (one-time, in the skill scripts/ dir) cd scripts && npm install # Ensure browser is running with CDP enabled. # With OpenClaw: # browser start profile=openclaw # Or manually: # google-chrome --remote-debugging-port=18800 --user-data-dir=~/.browser-data ``` The tool connects to `http://127.0.0.1:18800` by default. Override with `CDP_URL` env var. ## Alias setup (optional) ```bash mkdir -p ~/.local/bin cat > ~/.local/bin/bjs << 'WRAPPER' #!/bin/bash exec node /path/to/scripts/browser.js "$@" WRAPPER chmod +x ~/.local/bin/bjs ``` ## Commands ``` bjs tabs List open tabs bjs open <url> Navigate to URL bjs tab <index> Switch to tab bjs newtab [url] Open new tab bjs close [index] Close tab bjs elements [selector] List interactive elements (indexed) bjs click <index> Click element by index bjs type <index> <text> Type into element bjs upload <path> [selector] Upload file to input (bypasses OS dialog) bjs text [selector] Extract visible page text bjs html <selector> Get element HTML bjs eval <js> Run JavaScript in page bjs screenshot [path] Save screenshot bjs scroll <up|down|top|bottom> [px] bjs url Current URL bjs back / forward / refresh bjs wait <ms> Coordinate commands (cross-origin iframes, captchas, overlays): bjs click-xy <x> <y> Click at page coordinates via CDP Input bjs click-xy <x> <y> --double Double-click at coordinates bjs click-xy <x> <y> --right Right-click at coordinates bjs hover-xy <x> <y> Hover at page coordinates bjs drag-xy <x1> <y1> <x2> <y2> Drag between coordinates bjs iframe-rect <selector> Get iframe bounding box (for click-xy targeting) ``` ## How it works `elements` scans the page for all interactive elements (links, buttons, inputs, selects, etc.) — **including those inside shadow DOM** (web components). This means sites like Reddit, GitHub, and other modern SPAs that use shadow DOM are fully supported. The scan recursively pierces all shadow roots. Returns a compact numbered list: ``` [0] (link) Hacker News → https://news.ycombinator.com/news [1] (link) new → https://news.ycombinator.com/newest [2] (input:text) q [3] (button) Submit ``` Then `click 3` or `type 2 search query` — immediately actionable, no interpretation needed. **Auto-indexing:** `click` and `type` auto-index elements if not already indexed. You can skip calling `elements` first and go straight to `click`/`type` after `open`. Call `elements` explicitly when you need to see what's on the page. **After navigation or AJAX changes:** Elements get re-indexed automatically on next `click`/`type` if stamps are stale. For manual re-index, call `elements` again. **Real mouse events:** `click` uses CDP `Input.dispatchMouseEvent` (mousePressed + mouseReleased) instead of JS `.click()`. This triggers React/Vue/Angular synthetic event handlers that ignore plain `.click()` calls. Works reliably on SPAs like Instagram, GitHub, LinkedIn. ## File uploads `upload` uses CDP's `DOM.setFileInputFiles` to inject files directly into hidden `<input type="file">` elements — no OS file picker dialog. Works with Instagram, Twitter, any site with file uploads. ```bash bjs upload ~/photos/image.jpg # auto-finds input[type=file] bjs upload ~/docs/resume.pdf "input.file-drop" # specific selector ``` ## Token efficiency | Approach | Tokens per interaction | Notes | |----------|----------------------|-------| | **bjs** | ~50-200 | Indexed list, 1-line responses | | browser tool (snapshot) | ~2,000-5,000 | Full accessibility tree | | browser tool + thinking | ~3,000-8,000 | Plus reasoning to find refs | Over a 10-step flow: **~1,500 tokens (bjs) vs ~30,000-80,000 (browser tool)**. ## Typical flow ```bash bjs open https://example.com # Navigate bjs elements # See what's clickable bjs click 5 # Click element [5] bjs type 12 "hello world" # Type into element [12] bjs text # Read page content bjs screenshot /tmp/result.png # Verify visually ``` ## Shadow DOM support bjs automatically pierces shadow DOM boundaries. Sites built with web components (Reddit, GitHub, etc.) work out of the box — `elements`, `click`, `type`, and `text` all recurse into shadow roots. No special flags needed. ## Coordinate commands (iframes, captchas, overlays) When you can't use `click` by index — e.g. the target is inside a **cross-origin iframe** (captcha checkbox, payment form, OAuth widget) — use coordinate-based commands that dispatch real CDP Input events at the OS level. These bypass all DOM boundaries. **Workflow for clicking inside an iframe:** ```bash bjs iframe-rect 'iframe[title*="hCaptcha"]' # Get bounding box # Output: x=95 y=440 w=302 h=76 center=(246, 478) bjs click-xy 125 458 # Click checkbox position ``` `iframe-rect` returns the iframe's position on the page. Add offsets to target specific elements inside it (e.g. a checkbox is typically near the left side). **Other uses:** - `hover-xy` — trigger hover menus, tooltips that need mouse position - `drag-xy` — slider controls, drag-and-drop, canvas interactions - `click-xy --double` — double-click to select text, expand items - `click-xy --right` — context menus **When to use coordinate commands vs `click`:** - `click <index>` — always preferred when the element shows up in `elements` - `click-xy` — only when the target is inside a cross-origin iframe or otherwise unreachable by DOM indexing ## Tips - `elements` with a CSS selector narrows scope: `bjs elements ".modal"` - `eval` runs arbitrary JS and returns the result — use for custom extraction - `text` caps at 8KB — enough for most pages, won't blow up context - `html <selector>` caps at 10KB — for inspecting specific elements - Pipe through `grep` to filter: `bjs elements | grep -i "submit\|login"`

标签

skill ai

通过对话安装

该技能支持在以下平台通过对话安装:

OpenClaw WorkBuddy QClaw Kimi Claude

方式一:安装 SkillHub 和技能

帮我安装 SkillHub 和 browser-js-1776419960 技能

方式二:设置 SkillHub 为优先技能安装源

设置 SkillHub 为我的优先技能安装源,然后帮我安装 browser-js-1776419960 技能

通过命令行安装

skillhub install browser-js-1776419960

下载 Zip 包

⬇ 下载 browser-js v1.5.0

文件大小: 12.55 KB | 发布时间: 2026-4-17 19:26

v1.5.0 最新 2026-4-17 19:26
- Adds coordinate-based commands (`click-xy`, `hover-xy`, `drag-xy`, `iframe-rect`) for interacting with elements inside cross-origin iframes, captchas, overlays, and other non-indexable UI.
- Updates command list and documentation to explain when and how to use coordinate-based input vs. standard indexed element commands.
- Clarifies workflow and usage details for cross-origin iframe interaction, including coordinate calculation and tips.

Archiver·手机版·闲社网·闲社论坛·羊毛社区· 多链控股集团有限公司 · 苏ICP备2025199260号-1

Powered by Discuz! X5.0   © 2024-2025 闲社网·线报更新论坛·羊毛分享社区·http://xianshe.com

p2p_official_large
返回顶部