When to Use

- User provides a URL and wants to extract or read its content
- Another skill needs to parse source material from a URL before generation
- User says "parse this URL" or "extract content from this link"
- User says "解析链接" ("parse link") or "提取内容" ("extract content")

When NOT to Use

- User already has text content and doesn't need URL parsing
- User wants to generate audio/video content (not content extraction)
- User wants to read a local file (use standard file reading tools)

Purpose

Extract and normalize content from URLs across supported platforms. Returns structured data including content body, metadata, and references. Useful as a preprocessing step for content generation skills or standalone content extraction.

Hard Constraints

- No shell scripts. Construct curl commands from the API reference files listed in Resources
- Always read shared/authentication.md for API key and headers
- Follow shared/common-patterns.md for polling, errors, and interaction patterns
- URL must be a valid HTTP(S) URL
- Always read config following shared/config-pattern.md before any interaction
- Never save files to ~/Downloads/ or .listenhub/; save to the current working directory

Use the AskUserQuestion tool for every multiple-choice step — do NOT print options as plain text. Ask one question at a time. Wait for the user's answer before proceeding to the next step. After collecting URL and options, confirm with the user before calling the extraction API.
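The "valid HTTP(S) URL" constraint can be checked before any question is asked or any API call is made. A minimal sketch in bash, assuming a hypothetical helper named is_http_url and a deliberately loose pattern (lowercase scheme, non-empty host, no whitespace). It does not replace the normalization rules in references/supported-platforms.md:

```shell
#!/usr/bin/env bash
# is_http_url is a hypothetical helper, not part of the skill's shared files.
# It accepts http:// or https:// followed by a non-empty host and no whitespace.
is_http_url() {
  local url="$1"
  local pattern='^https?://[^[:space:]/]+(/[^[:space:]]*)?$'
  [[ "$url" =~ $pattern ]]
}

if is_http_url "https://en.wikipedia.org/wiki/Topology"; then
  echo "valid"
else
  echo "invalid"
fi
```

A check this loose only filters obvious non-URLs such as local paths or other schemes; the API remains the final authority on what it can parse.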

Step -1: API Key Check

Follow shared/config-pattern.md § API Key Check. If the key is missing, stop immediately.

Step 0: Config Setup

Follow shared/config-pattern.md Step 0.

If the file doesn't exist, ask for the location, then create it immediately:

    mkdir -p ".listenhub/content-parser"
    echo '{"autoDownload":true}' > ".listenhub/content-parser/config.json"
    CONFIG_PATH=".listenhub/content-parser/config.json"
    # (or $HOME/.listenhub/content-parser/config.json for global)

Then run Setup Flow below.

If the file exists, read the config, display a summary, and confirm:

Current config (content-parser):
  Auto-download: {yes / no}

Ask: "Use the saved config?" → Confirm and continue / Reconfigure
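Reading the config and printing the summary might look like the sketch below. It assumes jq is available (the skill's own commands already use it) and introduces a hypothetical helper name, summarize_config. Treating a missing autoDownload key as true mirrors the default written at creation and is an assumption:

```shell
#!/usr/bin/env bash
# Sketch only: summarize_config is a hypothetical helper name.
# Requires jq. Prints the two-line summary for the config file given as $1.
summarize_config() {
  local config_path="$1"
  local auto
  # has() is used instead of "// true" because the // operator would also
  # replace an explicit false with true.
  auto=$(jq -r 'if has("autoDownload") then .autoDownload else true end' "$config_path") || return 1
  echo "Current config (content-parser):"
  if [ "$auto" = "true" ]; then
    echo "  Auto-download: yes"
  else
    echo "  Auto-download: no"
  fi
}
```

Calling summarize_config "$CONFIG_PATH" after the file check prints the summary that precedes the "use saved config?" question.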

Setup Flow (first run or reconfigure)

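1. autoDownload: "Automatically save extracted content to the current directory?"

- "Yes (recommended)" → autoDownload: true
- "No" → autoDownload: false

Save immediately:

    # Replace {true/false} with the user's choice
    NEW_CONFIG=$(echo "$CONFIG" | jq --argjson dl {true/false} '. + {"autoDownload": $dl}')
    echo "$NEW_CONFIG" > "$CONFIG_PATH"
    CONFIG=$(cat "$CONFIG_PATH")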

Interaction Flow

Step 1: URL Input

Free text input. Ask the user:

What URL would you like to extract content from?

Step 2: Options (optional)

Ask if the user wants to configure extraction options:

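    Question: "Do you want to configure extraction options?"
    Options:
      - "No, use defaults": extract with default settings
      - "Yes, configure options": set summarize, max length, or Twitter tweet count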

If "Yes", ask follow-up questions:

- Summarize: "Generate a summary of the content?" (Yes/No)
- Max Length: "Set maximum content length?" (Free text, e.g., "5000")
- Twitter count (only if URL is Twitter/X profile): "How many tweets to fetch?" (1-100, default 20)
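The tweet count arrives as free text, so it is worth normalizing before it goes into the request. A minimal sketch, assuming a hypothetical helper named normalize_tweet_count; the 1-100 range and the default of 20 come from the question above, while rejecting (rather than clamping) out-of-range input is a design assumption:

```shell
#!/usr/bin/env bash
# normalize_tweet_count is a hypothetical helper.
# Empty input yields the default of 20; anything outside 1-100 is rejected.
normalize_tweet_count() {
  local raw="${1:-}"
  if [ -z "$raw" ]; then
    echo 20
    return 0
  fi
  # Digits only: rejects signs, decimals, and stray text.
  if [[ ! "$raw" =~ ^[0-9]+$ ]]; then
    return 1
  fi
  # Force base 10 so input such as "08" is not parsed as octal.
  local n=$((10#$raw))
  if (( n < 1 || n > 100 )); then
    return 1
  fi
  echo "$n"
}

normalize_tweet_count 50
normalize_tweet_count ""
normalize_tweet_count 250 || echo "out of range"
```

On a non-zero return the agent would ask the question again instead of sending the value to the API.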

Step 3: Confirm & Extract

Summarize:

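    Ready to extract content:

      URL: {url}
      Options: {summarize: true, maxLength: 5000, twitter.count: 50} / default

    Proceed?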

Wait for explicit confirmation before calling the API.

Workflow

1. Validate URL: Must be HTTP(S). Normalize if needed (see references/supported-platforms.md)
2. Build request body:

   {
     "source": {
       "type": "url",
       "uri": "{url}"
     },
     "options": {
       "summarize": true/false,
       "maxLength": 5000,
       "twitter": {
         "count": 50
       }
     }
   }

Omit options if user chose defaults.
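Building the body with jq avoids hand-escaping the URL and makes it easy to leave out options the user did not set. The sketch below assumes jq is installed and uses a hypothetical helper, build_extract_body; the field names follow the request body shown above, and passing empty strings for unset options is an illustrative convention:

```shell
#!/usr/bin/env bash
# build_extract_body is a hypothetical helper, not part of the skill.
# Arguments: url, summarize ("true"/"false" or empty),
#            maxLength (integer or empty), twitter count (integer or empty).
# Empty arguments are omitted; if all are empty, "options" is omitted entirely.
build_extract_body() {
  local url="$1" summarize="${2:-}" max_length="${3:-}" tweet_count="${4:-}"
  jq -n -c \
    --arg url "$url" \
    --arg summarize "$summarize" \
    --arg maxLength "$max_length" \
    --arg count "$tweet_count" '
    {source: {type: "url", uri: $url}}
    + ( ( {}
          + (if $summarize != "" then {summarize: ($summarize == "true")} else {} end)
          + (if $maxLength != "" then {maxLength: ($maxLength | tonumber)} else {} end)
          + (if $count != "" then {twitter: {count: ($count | tonumber)}} else {} end)
        ) as $opts
        | if ($opts | length) > 0 then {options: $opts} else {} end )
  '
}
```

For example, build_extract_body "https://x.com/elonmusk" "" "" 50 produces a body with only options.twitter.count set, and calling it with just a URL produces a body with no options key. The result can be passed to curl with -d.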

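3. Submit (foreground): POST /v1/content/extract → extract taskId
4. Tell the user extraction is in progress
5. Poll (background): Run the following exact bash command with run_in_background: true and timeout: 300000. Note: the status field is .data.status (not processStatus), the interval is 5s, and the values are processing/completed/failed:

    TASK_ID="<id-from-step-3>"
    for i in $(seq 1 60); do
      RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/content/extract/$TASK_ID" \
        -H "Authorization: Bearer $LISTENHUB_API_KEY" 2>/dev/null)
      # Strip control characters before parsing; default to "processing" if status is absent
      STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.status // "processing"')
      case "$STATUS" in
        completed) echo "$RESULT"; exit 0 ;;
        failed) echo "FAILED: $RESULT" >&2; exit 1 ;;
        *) sleep 5 ;;
      esac
    done
    echo "TIMEOUT" >&2; exit 2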

6. When notified, download and present result:

If autoDownload is true:
- Write {taskId}-extracted.md to the current directory — full extracted content in markdown
- Write {taskId}-extracted.json to the current directory — full raw API response data

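    # CONTENT_MD holds the extracted markdown; RESULT is the raw API response
    echo "$CONTENT_MD" > "${TASK_ID}-extracted.md"
    echo "$RESULT" > "${TASK_ID}-extracted.json"

Present:

    Content extraction complete!

    Source: {url}
    Title: {metadata.title}
    Length: ~{character count} characters
    Credits used: {credits}

    Saved to the current directory:
      {taskId}-extracted.md
      {taskId}-extracted.json

7. Show a preview of the extracted content (first ~500 chars)
8. Offer to use content in another skill (e.g. /podcast, /tts)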

Estimated time: 10-30 seconds depending on content size and platform.
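The length line and the ~500-character preview can both be derived from the extracted text with plain parameter expansion. A minimal sketch, assuming a hypothetical helper named preview_content that receives the extracted markdown as its first argument (how that text is pulled out of the API response is not shown here):

```shell
#!/usr/bin/env bash
# preview_content is a hypothetical helper. It prints the character count
# and the first N characters (default 500) of the text passed as $1.
preview_content() {
  local content="$1" limit="${2:-500}"
  echo "Length: ~${#content} characters"
  if (( ${#content} > limit )); then
    # Substring expansion counts characters in a UTF-8 locale, bytes in the C locale.
    printf '%s...\n' "${content:0:limit}"
  else
    printf '%s\n' "$content"
  fi
}

preview_content "Topology is the branch of mathematics concerned with the properties of a geometric object." 20
```

The trailing "..." is added only when the text was actually cut, so short extractions are shown in full.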

API Reference

- Content extract: shared/api-content-extract.md
- Supported platforms: references/supported-platforms.md
- Polling: shared/common-patterns.md § Async Polling
- Error handling: shared/common-patterns.md § Error Handling
- Config pattern: shared/config-pattern.md

Example

User: "Parse this article: https://en.wikipedia.org/wiki/Topology"

Agent workflow:

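1. URL: https://en.wikipedia.org/wiki/Topology
2. Options: defaults (omit options)
3. Submit extraction:

    curl -sS -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
      -H "Authorization: Bearer $LISTENHUB_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "source": {
          "type": "url",
          "uri": "https://en.wikipedia.org/wiki/Topology"
        }
      }'

4. Poll until complete:

    curl -sS "https://api.marswave.ai/openapi/v1/content/extract/69a7dac700cf95938f86d9bb" \
      -H "Authorization: Bearer $LISTENHUB_API_KEY"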

5. Present extracted content preview and offer next actions.

User: "Extract recent tweets from @elonmusk, get 50 tweets"

Agent workflow:

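1. URL: https://x.com/elonmusk
2. Options: {"twitter": {"count": 50}}
3. Submit extraction:

    curl -sS -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
      -H "Authorization: Bearer $LISTENHUB_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "source": {
          "type": "url",
          "uri": "https://x.com/elonmusk"
        },
        "options": {
          "twitter": {
            "count": 50
          }
        }
      }'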

4. Poll until complete, present results.

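When to Use

- User provides a URL and wants to extract or read its content
- Another skill needs to parse source material from a URL before generation
- User says "parse this URL" or "extract content from this link"
- User says "解析链接" ("parse link") or "提取内容" ("extract content")

When NOT to Use

- User already has text content and doesn't need URL parsing
- User wants to generate audio/video content (not content extraction)
- User wants to read a local file (use standard file reading tools)

Purpose

Extract and normalize content from URLs across supported platforms. Returns structured data including content body, metadata, and references. Useful as a preprocessing step for content generation skills or as a standalone content extraction tool.

Hard Constraints

- No shell scripts. Construct curl commands from the API reference files listed in Resources
- Always read shared/authentication.md for the API key and headers
- Follow shared/common-patterns.md for polling, error handling, and interaction patterns
- URL must be a valid HTTP(S) URL
- Always read config following shared/config-pattern.md before any interaction
- Never save files to ~/Downloads/ or .listenhub/; save to the current working directory

Use the AskUserQuestion tool for every multiple-choice step; do not print options as plain text. Ask one question at a time. Wait for the user's answer before moving to the next step. After collecting the URL and options, confirm with the user before calling the extraction API.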

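Step -1: API Key Check

Follow shared/config-pattern.md § API Key Check. If the key is missing, stop immediately.

Step 0: Config Setup

Follow shared/config-pattern.md Step 0.

If the file doesn't exist, ask for the location, then create it immediately:

    mkdir -p ".listenhub/content-parser"
    echo '{"autoDownload":true}' > ".listenhub/content-parser/config.json"
    CONFIG_PATH=".listenhub/content-parser/config.json"

(or use $HOME/.listenhub/content-parser/config.json for a global config)

Then run the Setup Flow below.

If the file exists, read the config, display a summary, and confirm:

Current config (content-parser):
  Auto-download: {yes / no}

Ask: "Use the saved config?" → Confirm and continue / Reconfigure

Setup Flow (first run or reconfigure)

1. autoDownload: "Automatically save extracted content to the current directory?"

- "Yes (recommended)" → autoDownload: true
- "No" → autoDownload: false

Save immediately:

    # Replace {true/false} with the user's choice
    NEW_CONFIG=$(echo "$CONFIG" | jq --argjson dl {true/false} '. + {"autoDownload": $dl}')
    echo "$NEW_CONFIG" > "$CONFIG_PATH"
    CONFIG=$(cat "$CONFIG_PATH")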

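Interaction Flow

Step 1: URL Input

Free text input. Ask the user:

What URL would you like to extract content from?

Step 2: Options (optional)

Ask if the user wants to configure extraction options:

    Question: "Do you want to configure extraction options?"
    Options:
      - "No, use defaults": extract with default settings
      - "Yes, configure options": set summarize, max length, or Twitter tweet count

If "Yes", ask follow-up questions:

- Summarize: "Generate a summary of the content?" (Yes/No)
- Max Length: "Set maximum content length?" (Free text, e.g., "5000")
- Twitter count (only if URL is Twitter/X profile): "How many tweets to fetch?" (1-100, default 20)

Step 3: Confirm & Extract

Summarize:

    Ready to extract content:

      URL: {url}
      Options: {summarize: true, maxLength: 5000, twitter.count: 50} / default

    Proceed?

Wait for explicit confirmation before calling the API.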

Workflow

1. Validate URL: Must be HTTP(S). Normalize if needed (see references/supported-platforms.md)
2. Build request body:

   {
     "source": {
       "type": "url",
       "uri": "{url}"
     },
     "options": {
       "summarize": true/false,
       "maxLength": 5000,
       "twitter": {
         "count": 50
       }
     }
   }

Omit options if the user chose defaults.

3. Submit (foreground): POST /v1/content/extract → extract taskId
4. Tell the user extraction is in progress
5. Poll (background): Run the following exact bash command with run_in_background: true and timeout: 300000. Note: the status field is .data.status (not processStatus), the interval is 5s, and the values are processing/completed/failed:

    TASK_ID="<id-from-step-3>"
    for i in $(seq 1 60); do
      RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/content/extract/$TASK_ID" \
        -H "Authorization: Bearer $LISTENHUB_API_KEY" 2>/dev/null)
      STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.status // "processing"')
      case "$STATUS" in
        completed) echo "$RESULT"; exit 0 ;;
        failed) echo "FAILED: $RESULT" >&2; exit 1 ;;
        *) sleep 5 ;;
      esac
    done
    echo "TIMEOUT" >&2; exit 2

6. When notified, download and present the result:

If autoDownload is true:
- Write {taskId}-extracted.md to the current directory: full extracted content in markdown
- Write {taskId}-extracted.json to the current directory: full raw API response data

    echo "$CONTENT_MD" > "${TASK_ID}-extracted.md"
    echo "$RESULT" > "${TASK_ID}-extracted.json"

Present:

    Content extraction complete!

    Source: {url}
    Title: {metadata.title}
    Length: ~{character count} characters
    Credits used: {credits}

    Saved to the current directory:
      {taskId}-extracted.md
      {taskId}-extracted.json

7. Show a preview of the extracted content (first ~500 chars)
8. Offer to use content in another skill (e.g. /podcast, /tts)

Estimated time: 10-30 seconds depending on content size and platform.

API Reference

- Content extract: shared/api-content-extract.md
- Supported platforms: references/supported-platforms.md
- Polling: shared/common-patterns.md § Async Polling
- Error handling: shared/common-patterns.md § Error Handling
- Config pattern: shared/config-pattern.md

Example

User: "Parse this article: https://en.wikipedia.org/wiki/Topology"

Agent workflow:

1. URL: https://en.wikipedia.org/wiki/Topology
2. Options: defaults (omit options)
3. Submit extraction:

    curl -sS -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
      -H "Authorization: Bearer $LISTENHUB_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "source": {
          "type": "url",
          "uri": "https://en.wikipedia.org/wiki/Topology"
        }
      }'

4. Poll until complete:

    curl -sS "https://api.marswave.ai/openapi/v1/content/extract/69a7dac700cf95938f86d9bb" \
      -H "Authorization: Bearer $LISTENHUB_API_KEY"

5. Present extracted content preview and offer next actions.

User: "Extract recent tweets from @elonmusk, get 50 tweets"

Agent workflow:

1. URL: https://x.com/elonmusk
2. Options: {"twitter": {"count": 50}}
3. Submit extraction
4. Poll until complete, present results.
