Fetch
Turn public URLs into usable local content.
Core Philosophy
- Fetch only public web content.
- Prefer clean extracted text over noisy raw HTML.
- Save both the raw response and structured extraction locally.
- Keep a simple local job history so previous fetches are easy to inspect.
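The preference for clean text over raw HTML can be sketched with the standard library alone, in keeping with the no-external-packages requirement. This is a minimal illustration, not the skill's actual extractor: the `TextExtractor` class and `extract_text` function are hypothetical names, and the choice to skip `script`, `style`, and `noscript` content is an assumption.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style/noscript contents."""

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # greater than 0 while inside a skipped tag
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        # One line per text chunk found between tags.
        return "\n".join(self._chunks)


def extract_text(html: str) -> str:
    """Return the visible text of an HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

A real extractor would likely also drop navigation and footer markup; this sketch only removes tags and non-visible content.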
Runtime Requirements
- Python 3 must be available as `python3`
- No external packages required
Safety Boundaries
- Public URLs only
- No login flows
- No cookies or browser automation
- No API keys or credentials
- No external uploads or cloud sync
- All fetched data is stored locally only
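One way to enforce the "public URLs only" and "no credentials" boundaries before a request is made is a small pre-flight check. The sketch below is an assumption about how such a check might look, not the skill's own code: `is_public_url` is a hypothetical helper, and it only inspects the URL string (scheme, embedded credentials, `localhost`, and literal IP addresses). It does not resolve DNS, so a hostname that points at a private address would still pass.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_public_url(url: str) -> bool:
    """Return True if the URL string looks like a public, credential-free web URL."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example, a broken IPv6 literal) are rejected.
        return False
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    # Reject embedded credentials such as https://user:pass@host/
    if parts.username is not None or parts.password is not None:
        return False
    host = parts.hostname
    if not host:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so treat it as an ordinary hostname.
        # Assumption: DNS resolution is not checked in this sketch.
        return True
    # Loopback, private, and link-local addresses are not global.
    return addr.is_global
```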
Local Storage
All data is stored under:
- `~/.openclaw/workspace/memory/fetch/jobs.json`
- `~/.openclaw/workspace/memory/fetch/pages/`
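A minimal sketch of initializing that layout, assuming the history file `jobs.json` and the `pages/` directory both live under `~/.openclaw/workspace/memory/fetch/`. The function name `init_storage` and the choice to start `jobs.json` as an empty JSON list are assumptions for illustration. The real file format is not specified here.

```python
import json
from pathlib import Path

# Storage root named in this document; "~" expands to the user's home directory.
DEFAULT_ROOT = Path("~/.openclaw/workspace/memory/fetch").expanduser()


def init_storage(root: Path = DEFAULT_ROOT) -> Path:
    """Create pages/ and an empty jobs.json if they are missing.

    Returns the path to jobs.json. Existing history is left untouched.
    """
    pages_dir = root / "pages"
    pages_dir.mkdir(parents=True, exist_ok=True)
    jobs_file = root / "jobs.json"
    if not jobs_file.exists():
        # Assumption: the history is stored as a JSON list of job records.
        jobs_file.write_text(json.dumps([]), encoding="utf-8")
    return jobs_file
```

Calling the function a second time is harmless, because existing directories and an existing `jobs.json` are kept as they are.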
Key Workflows
- Fetch URL: `fetch_url.py --url https://example.com`
- Save cleaned output: `save_output.py --url https://example.com --title Example`
- List history: `list_jobs.py`
- Show job details: `show_job.py --id JOB-XXXX`
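The list and show workflows both depend on a local job history. The sketch below shows one way such a history could be kept, assuming it is a JSON list of records. The record fields, the helper names (`load_jobs`, `add_job`, `find_job`), and the zero-padded counter used for the `JOB-XXXX` identifier are all hypothetical. The skill's actual ID scheme and schema may differ.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def load_jobs(jobs_file: Path) -> list:
    """Return all saved job records, or an empty list if no history exists."""
    if not jobs_file.exists():
        return []
    return json.loads(jobs_file.read_text(encoding="utf-8"))


def add_job(jobs_file: Path, url: str, title: str = "") -> dict:
    """Append one job record to the history and return it."""
    jobs = load_jobs(jobs_file)
    job = {
        # Hypothetical numbering: a 4-digit counter in the JOB-XXXX shape.
        "id": f"JOB-{len(jobs) + 1:04d}",
        "url": url,
        "title": title,
        "fetched_at": datetime.now(timezone.utc).isoformat(),
    }
    jobs.append(job)
    jobs_file.write_text(json.dumps(jobs, indent=2), encoding="utf-8")
    return job


def find_job(jobs_file: Path, job_id: str) -> Optional[dict]:
    """Return the record with the given ID, or None if it is not found."""
    for job in load_jobs(jobs_file):
        if job["id"] == job_id:
            return job
    return None
```

Keeping the history in one human-readable JSON file means previous fetches can be inspected with any text editor as well as with the scripts.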
Scripts
| Script | Purpose |
|---|---|
| `init_storage.py` | Initialize local storage |
| `fetch_url.py` | Fetch a public URL and extract content |
| `save_output.py` | Save cleaned output with a custom title |
| `list_jobs.py` | List previous fetch jobs |
| `show_job.py` | Show one saved fetch job |