System Dependencies

- uv must already be installed because this skill is executed with uv run, and uv installs the Python dependencies declared in src/main.py.
- ffmpeg is needed for tts because the speech output is normalized and written as an .mp3 file through ffmpeg.
- tesseract is needed for ocr because it performs the actual optical character recognition on scanned page images.
- pdfimages is also needed for ocr because it extracts page images from PDFs before those images are passed to tesseract; pdfimages comes from poppler.
- pandoc is optional for convert because it can convert between many document formats when text-based conversion is possible.
- libreoffice is an optional alternative to pandoc for convert because it can handle document conversions that pandoc may not support well.
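
The dependency list above pairs each optional command with the binaries it needs. As a minimal sketch of how a caller might check that pairing before invoking a command, using only the standard library: the names `REQUIRED_BINARIES`, `CONVERT_ALTERNATIVES`, and `missing_binaries` are illustrative assumptions, not part of the skill, and the skill's own doctor command may work differently.

```python
import shutil

# Hypothetical mapping, copied from the dependency list above.
REQUIRED_BINARIES = {
    "tts": ["ffmpeg"],
    "ocr": ["tesseract", "pdfimages"],
}
# convert needs at least one of these, not both.
CONVERT_ALTERNATIVES = ["pandoc", "libreoffice"]


def missing_binaries(command, which=shutil.which):
    """Return the binaries `command` needs that are not found on PATH."""
    if command == "convert":
        # Any single converter is enough.
        if any(which(name) for name in CONVERT_ALTERNATIVES):
            return []
        return list(CONVERT_ALTERNATIVES)
    needed = REQUIRED_BINARIES.get(command, [])
    return [name for name in needed if which(name) is None]
```

Calling `missing_binaries("ocr")` returns an empty list when both tesseract and pdfimages are on PATH. The `which` parameter exists only so the lookup can be replaced in tests.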

File Access And Network Behavior

- This skill operates on the file paths provided by the caller. It can read from and write to any host path the caller supplies; it is not limited to the OpenClaw workspace.
- The /root/.openclaw/workspace/... paths in the command examples show where the skill entrypoint lives. They do not restrict which files the skill can access.
- The tts command uses edge-tts, which sends the input text to an external text-to-speech service over the network to generate audio.
- Do not use tts with sensitive or private text unless you are comfortable sending that text off-host.
- All other commands run locally on the host, subject to the optional local binaries documented below.

Skill: PDF Toolkit

When to use

- User wants to extract text, tables, or images from a PDF.
- User wants to get metadata or page count from a PDF.
- User wants to merge, split, or rotate a PDF.
- User wants to create a new PDF from plain text or Markdown.
- User wants to read or write a DOCX file.
- User wants to OCR a scanned PDF (requires tesseract on host).
- User wants to convert text or a document to an MP3 audio file (requires ffmpeg on host).
- User wants to convert between document formats (requires pandoc or libreoffice on host).
- User wants to check which optional system tools are available.

When NOT to use

- User wants to view or render a PDF visually: use a PDF viewer.
- User wants to fill in PDF form fields: this skill does not support AcroForms.
- User wants to edit an existing PDF's text in-place: use a dedicated PDF editor.

Commands

Check available tools

```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py doctor
```

Get PDF metadata and page count

```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py info <file.pdf>
```

Extract text from a PDF

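```bash
# All pages
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text <file.pdf>

# Specific pages (1-indexed, comma-separated or ranges)
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text <file.pdf> --pages 1,3,5-8
```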

Extract tables from a PDF

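```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-tables <file.pdf>
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-tables <file.pdf> --pages 2-4
```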

Extract images from a PDF

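```bash
# Saves to the current directory by default
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-images <file.pdf>
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-images <file.pdf> --output-dir /path/to/output
```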

Merge PDFs

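```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py merge <file1.pdf> <file2.pdf> [<file3.pdf> ...] --output merged.pdf
```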

Split a PDF

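```bash
# Split into individual pages
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py split <file.pdf> --output-dir /path/to/output

# Extract a page range into a new PDF
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py split <file.pdf> --pages 2-5 --output extracted.pdf
```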

Rotate pages in a PDF

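```bash
# Rotate all pages 90 degrees clockwise
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py rotate <file.pdf> --degrees 90 --output rotated.pdf

# Rotate specific pages
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py rotate <file.pdf> --degrees 180 --pages 1,3 --output rotated.pdf
```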

Create a PDF from text

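```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py create-pdf --text "Hello, world!" --output hello.pdf
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py create-pdf --file input.txt --output document.pdf
```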

Read a DOCX file

```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py read-docx <file.docx>
```

Write a DOCX file

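```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py write-docx --text "Content here" --output document.docx
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py write-docx --file input.txt --output document.docx
```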

OCR a scanned PDF (requires tesseract)

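```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr <file.pdf>
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr <file.pdf> --pages 1-3 --lang eng
```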

Convert text or document to speech (requires ffmpeg)

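```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --text "Hello, world!" --output speech.mp3
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --file input.txt --output speech.mp3
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --file document.pdf --output speech.mp3
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --text "Hello" --voice en-GB-SoniaNeural --output speech.mp3
```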

Convert document formats (requires pandoc or libreoffice)

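```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py convert <input path> --output <output path>
```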

Examples

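```bash
# Inspect a PDF
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py info report.pdf

# Extract text from pages 1-3
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text report.pdf --pages 1-3

# Merge two PDFs
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py merge a.pdf b.pdf --output combined.pdf

# OCR a scanned document
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr scan.pdf

# Read a Word document
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py read-docx report.docx

# Text to MP3
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --text "Welcome to the future." --output welcome.mp3

# Check which tools are available on this host
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py doctor
```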

Chat Delivery

- When this skill is used in a chat interface that supports file attachments, such as Telegram, any generated output file should be sent back to the user as an attachment after successful creation or conversion.
- This applies to commands that create files, including create-pdf, write-docx, extract-images, merge, split, rotate, tts, and convert.
- If a temporary output file is created in the Claw runtime temporary folder for delivery, delete that temporary file immediately after the file has been sent successfully to the user.
- Do not delete files that were written to a user-requested destination outside the Claw temporary folder.
- If the chat environment cannot send file attachments, report the output path clearly instead of claiming the file was delivered.
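
The delivery rules above amount to a small decision procedure: send, then delete only if the file lives in the temporary folder, and fall back to reporting the path. A rough sketch follows, under stated assumptions: `deliver_output` and `send_attachment` are hypothetical names, the chat integration is modelled as a callable that returns True on success (or None when attachments are unsupported), and the Claw temporary folder is approximated by a `temp_root` argument that defaults to the system temp directory.

```python
import tempfile
from pathlib import Path


def deliver_output(path, send_attachment, temp_root=None):
    """Send `path` as an attachment and clean up only temporary files.

    `send_attachment` is a hypothetical callable provided by the chat
    integration; pass None when the chat cannot send attachments.
    """
    path = Path(path).resolve()
    temp_root = Path(temp_root or tempfile.gettempdir()).resolve()

    if send_attachment is None:
        # No attachment support: report the path, never claim delivery.
        return f"Output written to {path}"
    if not send_attachment(path):
        # Keep the file so the user can still retrieve it.
        return f"Delivery failed; output is at {path}"
    if path.is_relative_to(temp_root):
        # Only files inside the temporary folder are deleted after sending.
        path.unlink()
    return f"Sent {path.name}"
```

Files written to a user-requested destination fall outside `temp_root`, so they are left in place. `Path.is_relative_to` requires Python 3.9 or later.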

Output

- Plain text with labeled sections separated by blank lines.
- Errors are prefixed with Error:.
- The doctor command shows a table of available and missing tools.
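
A caller that consumes this output could split it on blank lines and treat an Error: prefix as a failure. The sketch below assumes the error prefix appears at the start of a line; the function name `parse_output` and the sample section contents in the usage note are hypothetical, since the exact section labels are not documented here.

```python
import re


def parse_output(text):
    """Split command output into sections, raising if a line reports an error."""
    for line in text.splitlines():
        if line.startswith("Error:"):
            raise RuntimeError(line[len("Error:"):].strip())
    # Sections are separated by one or more blank lines.
    blocks = re.split(r"\n\s*\n", text.strip())
    return [block.strip() for block in blocks if block.strip()]
```

For example, `parse_output("Title: A\nPages: 3\n\nAuthor: B\n")` returns two sections, while output beginning with `Error: file not found` raises `RuntimeError("file not found")`.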

Notes

- uv run reads the inline # /// script dependency block in main.py and auto-installs Python packages in an isolated environment, so no pip install or venv setup is needed.
- Core features (info, extract-text, extract-tables, merge, split, rotate, create-pdf, read-docx, write-docx) work with uv alone; no system binaries are required.
- OCR requires tesseract installed on the host (brew install tesseract / apt install tesseract-ocr). It also needs pdfimages from poppler (brew install poppler / apt install poppler-utils).
- TTS requires ffmpeg installed on the host (brew install ffmpeg / apt install ffmpeg).
- Document conversion requires pandoc or libreoffice on the host.
- Run doctor first if you are unsure which features are available.
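
The inline # /// script block mentioned in the first note follows the inline script metadata format (PEP 723): a run of comment lines between `# /// script` and `# ///` that holds TOML. The sketch below shows what such a block looks like and how its contents can be pulled out of a source file. The package names are placeholders and not the skill's real dependency list, and `read_metadata_block` is an illustrative helper, not how uv itself is implemented.

```python
import re

# Example header built line by line; the package names are placeholders.
EXAMPLE_SCRIPT = "\n".join([
    "# /// script",
    '# requires-python = ">=3.11"',
    "# dependencies = [",
    '#   "example-package-a",',
    '#   "example-package-b>=1.0",',
    "# ]",
    "# ///",
    'print("hello")',
]) + "\n"

# Opening marker, one or more comment lines, closing marker.
BLOCK = re.compile(r"(?m)^# /// script$\n(?P<body>(?:^#(?: .*)?$\n)+?)^# ///$")


def read_metadata_block(source):
    """Return the TOML text inside the `# /// script` block, or None."""
    match = BLOCK.search(source)
    if match is None:
        return None
    lines = match.group("body").splitlines()
    # Drop the leading "# " (or a bare "#") from every comment line.
    return "\n".join(line[2:] if line.startswith("# ") else line[1:] for line in lines)
```

`read_metadata_block(EXAMPLE_SCRIPT)` returns the TOML between the markers, which a tool can then parse to learn which packages to install before running the script.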

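System Dependencies

- uv must already be installed because this skill is executed with uv run, and uv installs the Python dependencies declared in src/main.py.
- tts needs ffmpeg because the speech output is normalized and written as an .mp3 file through ffmpeg.
- ocr needs tesseract because it performs the actual optical character recognition on scanned page images.
- ocr also needs pdfimages because it extracts page images from PDFs before those images are passed to tesseract; pdfimages comes from poppler.
- pandoc is optional for convert because it can convert between many document formats when text-based conversion is possible.
- libreoffice is an optional alternative to pandoc for convert because it can handle document conversions that pandoc may not support well.

File Access And Network Behavior

- This skill operates on the file paths provided by the caller. It can read from and write to any host path the caller supplies; it is not limited to the OpenClaw workspace.
- The /root/.openclaw/workspace/... paths in the command examples only show where the skill entrypoint lives. They do not restrict which files the skill can access.
- The tts command uses edge-tts, which sends the input text to an external text-to-speech service over the network to generate audio.
- Do not use tts with sensitive or private text unless you are comfortable sending that text off-host.
- All other commands run locally on the host, subject to the optional local binaries described below.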

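Skill: PDF Toolkit

When to use

- User wants to extract text, tables, or images from a PDF.
- User wants to get metadata or page count from a PDF.
- User wants to merge, split, or rotate a PDF.
- User wants to create a new PDF from plain text or Markdown.
- User wants to read or write a DOCX file.
- User wants to OCR a scanned PDF (requires tesseract on host).
- User wants to convert text or a document to an MP3 audio file (requires ffmpeg on host).
- User wants to convert between document formats (requires pandoc or libreoffice on host).
- User wants to check which optional system tools are available.

When NOT to use

- User wants to view or render a PDF visually: use a PDF viewer.
- User wants to fill in PDF form fields: this skill does not support AcroForms.
- User wants to edit an existing PDF's text in-place: use a dedicated PDF editor.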

Commands

Check available tools

```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py doctor
```

Get PDF metadata and page count

```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py info <file.pdf>
```

Extract text from a PDF

```bash
# All pages
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text <file.pdf>

# Specific pages (1-indexed, comma-separated or ranges)
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text <file.pdf> --pages 1,3,5-8
```
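
The --pages value combines 1-indexed page numbers and ranges, separated by commas. As an illustration of how a specification such as 1,3,5-8 can be expanded, here is a small sketch; `parse_pages` is a hypothetical helper, and the skill's own parsing and validation rules may differ.

```python
def parse_pages(spec, page_count):
    """Expand a spec such as "1,3,5-8" into sorted 1-based page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range such as "5-8" includes both endpoints.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end > page_count or start > end:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(parse_pages("1,3,5-8", page_count=10))  # [1, 3, 5, 6, 7, 8]
```

Rejecting out-of-range pages up front is a design choice in this sketch; a tool could equally clamp ranges to the document length.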

Extract tables from a PDF

```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-tables <file.pdf>
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-tables <file.pdf> --pages 2-4
```

Extract images from a PDF

```bash
# Saves to the current directory by default
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-images <file.pdf>
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-images <file.pdf> --output-dir /path/to/output
```

Merge PDFs

```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py merge <file1.pdf> <file2.pdf> [<file3.pdf> ...] --output merged.pdf
```

Split a PDF

```bash
# Split into individual pages
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py split <file.pdf> --output-dir /path/to/output

# Extract a page range into a new PDF
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py split <file.pdf> --pages 2-5 --output extracted.pdf
```

Rotate pages in a PDF

```bash
# Rotate all pages 90 degrees clockwise
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py rotate <file.pdf> --degrees 90 --output rotated.pdf

# Rotate specific pages
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py rotate <file.pdf> --degrees 180 --pages 1,3 --output rotated.pdf
```

Create a PDF from text

```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py create-pdf --text "Hello, world!" --output hello.pdf
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py create-pdf --file input.txt --output document.pdf
```

Read a DOCX file

```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py read-docx <file.docx>
```

Write a DOCX file

```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py write-docx --text "Content here" --output document.docx
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py write-docx --file input.txt --output document.docx
```

OCR a scanned PDF (requires tesseract)

```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr <file.pdf>
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr <file.pdf> --pages 1-3 --lang eng
```

Convert text or document to speech (requires ffmpeg)

```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --text "Hello, world!" --output speech.mp3
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --file input.txt --output speech.mp3
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --file document.pdf --output speech.mp3
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --text "Hello" --voice en-GB-SoniaNeural --output speech.mp3
```

Convert document formats (requires pandoc or libreoffice)

```bash
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py convert <input path> --output <output path>
```

Examples

```bash
# Inspect a PDF
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py info report.pdf

# Extract text from pages 1-3
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text report.pdf --pages 1-3

# Merge two PDFs
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py merge a.pdf b.pdf --output combined.pdf

# OCR a scanned document
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr scan.pdf

# Read a Word document
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py read-docx report.docx

# Text to MP3
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --text "Welcome to the future." --output welcome.mp3

# Check which tools are available on this host
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py doctor
```

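Chat Delivery

- When this skill is used in a chat interface that supports file attachments, such as Telegram, any generated output file should be sent back to the user as an attachment after successful creation or conversion.
- This applies to commands that create files.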
