Tabstack — Web & PDF Tools for AI Agents
Tabstack is a web execution API for reading, extracting, transforming, and
interacting with web pages and PDF documents. It handles JavaScript-rendered
sites, structured data extraction, AI-powered content transformation, and
multi-step browser automation.
Setup (first use only)
Install dependencies from the skill's directory:
bash
cd <skill-dir> && npm install
Where <skill-dir> is the directory containing this SKILL.md file.
Operations
All operations are run via the exec tool. First cd into the skill directory,
then run the command with a relative path:
bash
<skill-dir>/scripts/run.sh <command> <args>
Execution strategy: Always run tabstack commands in the foreground —
call exec and wait for completion. Background execution requires manual
polling and is unreliable.
JSON arguments: Any JSON argument (schema, --data) can be passed inline
or as a file path prefixed with @ (e.g. @/tmp/schema.json). Use file
paths for complex schemas to avoid shell quoting issues.
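As a minimal sketch of the file-path form described above, the helper below writes a schema to a file and prints the matching @-prefixed argument. The function name write_schema and the schema fields (title, price) are illustrative assumptions, not part of the skill.

```shell
# Write an example JSON Schema to the given path and print "@<path>",
# the form accepted wherever a JSON argument is expected.
# The schema contents are a made-up illustration.
write_schema() {
  schema_path="$1"
  cat > "$schema_path" <<'EOF'
{
  "type": "object",
  "properties": {
    "title": { "type": "string" },
    "price": { "type": "number" }
  }
}
EOF
  printf '@%s\n' "$schema_path"
}
```

The printed value could then be passed in place of an inline schema, for example `./scripts/run.sh extract-json <url> "$(write_schema /tmp/schema.json)"`, assuming the schema follows the URL on the command line.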
1. extract-markdown — Read a page or PDF as clean Markdown
Best for: reading articles, documentation, PDF reports. This is the cheapest
operation — prefer it when you just need to read content.
bash
<skill-dir>/scripts/run.sh extract-markdown <url>
Returns the page/PDF as Markdown. For web pages, includes YAML frontmatter
metadata (title, author, etc.).
Optional flags:
- --metadata: return metadata as a separate JSON block
- --nocache: bypass caching and get fresh content
- --geo CC: fetch from a specific country (ISO 3166-1 alpha-2, e.g. US, GB)
2. extract-json — Pull structured data from a page or PDF
Best for: prices, product details, tables, invoices, any document with
predictable repeating structure.
Without a schema (Tabstack infers structure):
bash
<skill-dir>/scripts/run.sh extract-json <url>
With a JSON Schema (inline or from file):
bash
<skill-dir>/scripts/run.sh extract-json <url> @/tmp/schema.json
Optional flags: --nocache, --geo CC.
See references/examples.md for common JSON schema
patterns (products, articles, events, tables, contacts).
3. generate — Transform web/PDF content into a custom JSON shape
Best for: summaries, categorization, sentiment analysis, reformatting. Unlike
extract-json (which pulls existing data), generate uses an LLM to create
new content. May be slower due to LLM processing.
bash
<skill-dir>/scripts/run.sh \
  generate <url> <schema> <instructions>
Optional flags: --nocache, --geo CC.
Example — categorise and summarise HN posts:
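bash
<skill-dir>/scripts/run.sh \
  generate https://news.ycombinator.com \
  '{"type":"object","properties":{"stories":{"type":"array","items":{"type":"object","properties":{"title":{"type":"string"},"category":{"type":"string"},"summary":{"type":"string"}}}}}}' \
  "For each story, categorize it as tech/business/science/other and write a one-sentence summary"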
See references/examples.md for more schema and
instruction examples.
4. automate — Multi-step browser task in natural language
Best for: tasks needing real browser interaction — clicking, navigating across
pages, filling forms. Does NOT support PDFs or --geo.
bash
<skill-dir>/scripts/run.sh \
  automate "<natural-language task>" --url <url>
Optional flags:
- --url <url>: starting URL for the task. When omitted, automate uses its
  own built-in web search to find relevant pages, which can be cheaper and
  faster than research for simple factual questions.
- --max-iterations N: limit steps (default 50, range 1-100)
- --guardrails "...": safety constraints (e.g. "browse only, don't submit
  forms")
- --data '{"key":"val"}'|@file: JSON context for form filling
Timeout: May take 30-120 seconds. Use at least 420s exec timeout.
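The documented range for --max-iterations (1-100) can be checked before a call is made. A minimal sketch, where valid_max_iterations is a hypothetical helper and not part of the skill:

```shell
# Succeed only when the argument is a whole number between 1 and 100,
# the range documented for --max-iterations.
valid_max_iterations() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 100 ]
}
```

A caller might guard the flag with `valid_max_iterations "$n" || n=50`, falling back to the documented default.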
Example — fill a contact form with guardrails:
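bash
<skill-dir>/scripts/run.sh \
  automate "Fill out the contact form with my information" \
  --url https://example.com/contact \
  --data '{"name":"Alex","email":"alex@example.com","message":"Hello"}' \
  --guardrails "Only fill in and submit the contact form, do not navigate to other pages"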
Example — simple search (no URL, uses built-in web search):
bash
<skill-dir>/scripts/run.sh \
  automate "Find the current price of the MacBook Air M4"
5. research — AI-powered deep web research
Searches the web, analyzes multiple sources, and synthesizes a comprehensive
answer with citations. Unlike the other operations, research doesn't need
a URL — you give it a question and it finds the answers.
For simple factual lookups, automate without a --url may be faster and
cheaper. Use research when you need depth, multiple perspectives, or
cited sources.
Use cases:
- Complex questions that need multiple sources ("What are the pros and cons
  of Rust vs Go for CLI tools?")
- Fact-checking and verification ("Is it true that...")
- Current events and recent information
- Topic deep-dives and literature reviews
- Competitive research ("Compare X vs Y vs Z")
bash
<skill-dir>/scripts/run.sh research "<query>"
Optional flags:
- --mode fast|balanced: fast for quick single-source answers, balanced
  (default) for deeper multi-source research with more iterations
- --geo CC: research from a specific country's perspective
Timeout: May take 60-120 seconds. Use at least 420s exec timeout.
Example — quick factual lookup:
bash
<skill-dir>/scripts/run.sh research "What is the current LTS version of Node.js?" --mode fast
Example — deep research:
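bash
<skill-dir>/scripts/run.sh research "Compare WebSocket, SSE, and long polling for real-time web applications"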
Reference: Examples & Recipes
Read references/examples.md when you need to:
- Build a JSON schema for extract-json: patterns for products, articles,
  events, tables, contacts, invoices
- Write effective instructions for generate: recipes for summarization,
  sentiment analysis, competitive analysis, content digests
- Recover from a failed attempt: if a command doesn't produce good
  results, check for a better approach
Choosing the Right Operation
| Operation | Use when... | Cost | Timeout |
|---|---|---|---|
| extract-markdown | Read/summarise a page or PDF | Lowest | 60s |
| extract-json | Structured data from a page or PDF | Medium | 60s |
| generate | AI-transformed content from a page or PDF | Medium | 60s |
| research | Answers from multiple web sources | Medium | 420s |
| automate | Browser interaction or simple web search (no PDF) | Highest | 420s |
Prefer cheaper operations when they suffice. Use extract-markdown for
simple reading. Only use automate when the task requires clicking,
navigating, or form interaction.
Inform the user before triggering multiple automate calls — they are the
most expensive.
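The timeouts in the table above can be captured in a small lookup so that each call is given the right exec timeout. This is only a sketch; tabstack_timeout is a hypothetical helper name, and the values are copied from the table.

```shell
# Print the exec timeout in seconds listed for an operation in the table.
# Unknown operations produce an error on stderr and a non-zero status.
tabstack_timeout() {
  case "$1" in
    extract-markdown|extract-json|generate) echo 60 ;;
    research|automate) echo 420 ;;
    *) echo "unknown operation: $1" >&2; return 1 ;;
  esac
}
```

The printed value would be supplied as the timeout of the exec call that runs the corresponding command.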
Error Handling
| Error | Meaning |
|---|---|
| 401 Unauthorized | TABSTACK_API_KEY is missing or invalid |
| 422 Unprocessable | URL is malformed or page is unreachable |
| 400 Bad Request | Malformed request; check arguments |
| No output | Task timed out or page blocked automation |
On automate failures, retry once. If it fails again, fall back to
extract-markdown for read-only tasks.
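The retry-then-fall-back rule above could be wrapped roughly as follows. This sketch assumes that a failed command exits with a non-zero status, which the document does not state explicitly, and that the runner is invoked as ./scripts/run.sh from the skill directory. The names automate_read_only and TABSTACK_RUN are hypothetical.

```shell
# Try automate, retry it once, then fall back to extract-markdown.
# Only suitable for read-only tasks: the fallback cannot click or submit.
# TABSTACK_RUN is a hypothetical override for the runner (handy for testing).
# Note: POSIX sh has no local variables, so task/url/run are globals.
automate_read_only() {
  task="$1"
  url="$2"
  run="${TABSTACK_RUN:-./scripts/run.sh}"

  "$run" automate "$task" --url "$url" && return 0
  "$run" automate "$task" --url "$url" && return 0   # the single retry
  "$run" extract-markdown "$url"                     # read-only fallback
}
```

Because a "No output" result may not surface as a non-zero exit status, a real wrapper may also need to check for empty output.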
Environment Configuration
This skill requires a TABSTACK_API_KEY to function. Get one from
tabstack.ai (Mozilla-backed, free tier available).
Set the key via the CLI:
bash
openclaw config set env.TABSTACK_API_KEY your-key-here
The skill will exit with an error if the key is not set.
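A caller that wants to fail early, before spending an exec call, could run a preflight check along these lines. It is a sketch only: require_api_key is a hypothetical helper, and it tests only that the variable is non-empty, not that the key is valid.

```shell
# Return non-zero with a message on stderr when TABSTACK_API_KEY is
# unset or empty in the current environment.
require_api_key() {
  if [ -z "${TABSTACK_API_KEY:-}" ]; then
    echo "TABSTACK_API_KEY is not set" >&2
    return 1
  fi
}
```

An invalid key would still only be detected by the API itself, as the 401 Unauthorized error listed under Error Handling.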
Security & Privacy
- API key: This skill requires a TABSTACK_API_KEY. All requests are
  sent to the Tabstack API (api.tabstack.ai) using this key for
  authentication. The key is read from the environment, not hardcoded.
- Data sent to Tabstack: URLs you process, JSON schemas, instructions,
  and any --data payloads are sent to Tabstack's servers for processing.
  Do not pass passwords, authentication tokens, or other secrets via
  --data unless you explicitly trust the Tabstack service.
- Browser automation: The automate command drives a remote browser
  that can click, navigate, fill forms, and submit data. Use --guardrails
  to constrain what the browser can do (e.g. "browse only, don't submit
  forms").
- Dependencies: This skill installs @tabstack/sdk and tsx from npm.
  A package-lock.json is provided for reproducible installs.
- No persistence: The skill does not modify agent configuration, store
  credentials, or run outside of its own directory.
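Tabstack: Web & PDF Tools for AI Agents
Tabstack is a web execution API for reading, extracting, and transforming web pages and PDF documents, and for interacting with them. It supports JavaScript-rendered sites, structured data extraction, AI-powered content transformation, and multi-step browser automation.
Setup (first use only)
Install dependencies from the skill directory:
bash
cd <skill-dir> && npm install
Where <skill-dir> is the directory containing this SKILL.md file.
Operations
All operations are run via the exec tool. First cd into the skill directory, then run the command with a relative path:
bash
<skill-dir>/scripts/run.sh <command> <args>
Execution strategy: Always run tabstack commands in the foreground: call exec and wait for completion. Background execution requires manual polling and is unreliable.
JSON arguments: Any JSON argument (schema, --data) can be passed inline or as a file path prefixed with @ (e.g. @/tmp/schema.json). Use file paths for complex schemas to avoid shell quoting issues.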
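1. extract-markdown: Read a page or PDF as clean Markdown
Best for: reading articles, documentation, PDF reports. This is the cheapest operation; prefer it when you just need to read content.
bash
<skill-dir>/scripts/run.sh extract-markdown <url>
Returns the page/PDF as Markdown. For web pages, includes YAML frontmatter metadata (title, author, etc.).
Optional flags:
- --metadata: return metadata as a separate JSON block
- --nocache: bypass caching and get fresh content
- --geo CC: fetch from a specific country (ISO 3166-1 alpha-2, e.g. US, GB)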
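2. extract-json: Pull structured data from a page or PDF
Best for: prices, product details, tables, invoices, any document with predictable repeating structure.
Without a schema (Tabstack infers structure):
bash
<skill-dir>/scripts/run.sh extract-json <url>
With a JSON Schema (inline or from file):
bash
<skill-dir>/scripts/run.sh extract-json <url> @/tmp/schema.json
Optional flags: --nocache, --geo CC.
See references/examples.md for common JSON schema patterns (products, articles, events, tables, contacts).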
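3. generate: Transform web/PDF content into a custom JSON shape
Best for: summaries, categorization, sentiment analysis, reformatting. Unlike extract-json (which pulls existing data), generate uses an LLM to create new content. May be slower due to LLM processing.
bash
<skill-dir>/scripts/run.sh \
  generate <url> <schema> <instructions>
Optional flags: --nocache, --geo CC.
Example: categorize and summarize HN posts:
bash
<skill-dir>/scripts/run.sh \
  generate https://news.ycombinator.com \
  '{"type":"object","properties":{"stories":{"type":"array","items":{"type":"object","properties":{"title":{"type":"string"},"category":{"type":"string"},"summary":{"type":"string"}}}}}}' \
  "For each story, categorize it as tech/business/science/other and write a one-sentence summary"
See references/examples.md for more schema and instruction examples.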
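4. automate: Multi-step browser task in natural language
Best for: tasks needing real browser interaction, such as clicking, navigating across pages, and filling forms. Does NOT support PDFs or --geo.
bash
<skill-dir>/scripts/run.sh \
  automate "<natural-language task>" --url <url>
Optional flags:
- --url <url>: starting URL for the task. When omitted, automate uses its built-in web search to find relevant pages, which can be cheaper and faster than research for simple factual questions.
- --max-iterations N: limit steps (default 50, range 1-100)
- --guardrails "...": safety constraints (e.g. "browse only, don't submit forms")
- --data '{"key":"val"}'|@file: JSON context for form filling
Timeout: May take 30-120 seconds. Use at least 420s exec timeout.
Example: fill a contact form with guardrails:
bash
<skill-dir>/scripts/run.sh \
  automate "Fill out the contact form with my information" \
  --url https://example.com/contact \
  --data '{"name":"Alex","email":"alex@example.com","message":"Hello"}' \
  --guardrails "Only fill in and submit the contact form, do not navigate to other pages"
Example: simple search (no URL, uses built-in web search):
bash
<skill-dir>/scripts/run.sh \
  automate "Find the current price of the MacBook Air M4"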
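5. research: AI-powered deep web research
Searches the web, analyzes multiple sources, and synthesizes a comprehensive answer with citations. Unlike the other operations, research doesn't need a URL: you give it a question and it finds the answers.
For simple factual lookups, automate without a --url may be faster and cheaper. Use research when you need depth, multiple perspectives, or cited sources.
Use cases:
- Complex questions that need multiple sources ("What are the pros and cons of Rust vs Go for CLI tools?")
- Fact-checking and verification ("Is it true that...")
- Current events and recent information
- Topic deep-dives and literature reviews
- Competitive research ("Compare X vs Y vs Z")
bash
<skill-dir>/scripts/run.sh research "<query>"
Optional flags:
- --mode fast|balanced: fast for quick single-source answers, balanced (default) for deeper multi-source research with more iterations
- --geo CC: research from a specific country's perspective
Timeout: May take 60-120 seconds. Use at least 420s exec timeout.
Example: quick factual lookup:
bash
<skill-dir>/scripts/run.sh research "What is the current LTS version of Node.js?" --mode fast
Example: deep research:
bash
<skill-dir>/scripts/run.sh research "Compare WebSocket, SSE, and long polling for real-time web applications"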
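Reference: Examples & Recipes
Read references/examples.md when you need to:
- Build a JSON schema for extract-json: patterns for products, articles, events, tables, contacts, invoices
- Write effective instructions for generate: recipes for summarization, sentiment analysis, competitive analysis, content digests
- Recover from a failed attempt: if a command doesn't produce good results, check for a better approach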
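Choosing the Right Operation
| Operation | Use when... | Cost | Timeout |
|---|---|---|---|
| extract-markdown | Read/summarize a page or PDF | Lowest | 60s |
| extract-json | Structured data from a page or PDF | Medium | 60s |
| generate | AI-transformed content from a page or PDF | Medium | 60s |
| research | Answers from multiple web sources | Medium | 420s |
| automate | Browser interaction or simple web search (no PDF) | Highest | 420s |
Prefer cheaper operations when they suffice. Use extract-markdown for simple reading. Only use automate when the task requires clicking, navigating, or form interaction.
Inform the user before triggering multiple automate calls, because they are the most expensive.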
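Error Handling
| Error | Meaning |
|---|---|
| 401 Unauthorized | TABSTACK_API_KEY is missing or invalid |
| 422 Unprocessable | URL is malformed or page is unreachable |
| 400 Bad Request | Malformed request; check arguments |
| No output | Task timed out or page blocked automation |
On automate failures, retry once. If it fails again, fall back to extract-markdown for read-only tasks.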
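Environment Configuration
This skill requires a TABSTACK_API_KEY to function. Get one from tabstack.ai (Mozilla-backed, free tier available).
Set the key via the CLI:
bash
openclaw config set env.TABSTACK_API_KEY your-key-here
The skill will exit with an error if the key is not set.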
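Security & Privacy
- API key: This skill requires a TABSTACK_API_KEY. All requests are sent to the Tabstack API (api.tabstack.ai) using this key for authentication. The key is read from the environment, not hardcoded.
- Data sent to Tabstack: URLs you process, JSON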