Doc-Scan — DEPRECATED

This skill is deprecated. All functionality has been merged into doc-process v4.0.0 with a significantly improved scanner engine. Please use doc-process instead.
To scan a document photo: install doc-process and say "scan this photo", "correct perspective", "dewarp this document", or any equivalent phrase.

Doc-Scan — Document Scanner Skill (archived)

Converts a photo of a document (whiteboard, printed page, handwritten note, form, receipt, book page, etc.) into a clean scanned-looking image with perspective correction and enhancement.

Step 1 — Validate the Input

Read the provided image visually. Assess:

Check	Yes/No	Notes
Is this an image file?		.jpg, .jpeg, .png, .heic, .webp, .bmp, .tiff
Does the image contain a document?

Non-Document Detection

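If the image does not appear to contain a document, respond:

⚠ This image does not appear to contain a document.

I detected: [brief description of the image content, e.g., landscape photo, portrait of a person, blank wall]

Doc-Scan works best with:

- Printed documents (forms, letters, reports)
- Handwritten notes or whiteboards
- Receipts, invoices, or business cards
- Book or magazine pages
- Any flat document photographed from above or at an angle

If you meant to upload a photo of a document, try retaking it in better lighting with the document clearly visible. If you would like something else done with this image, I can help with that too.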

Do not proceed with scanning if this check fails.
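The file-type half of this check can be automated before the image is inspected visually. The following is a minimal sketch, assuming the extension list in the table above is exhaustive; the helper name `is_supported_image` is hypothetical and is not part of doc_scanner.py. Deciding whether the image actually contains a document still requires looking at it.

```python
from pathlib import Path

# Extensions listed in the Step 1 table.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".heic", ".webp", ".bmp", ".tiff"}


def is_supported_image(path: str) -> bool:
    """Return True if the file name ends in one of the listed extensions."""
    # Compare case-insensitively so "photo.JPG" is accepted as well.
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

The check looks only at the file name, so a mislabeled file would still pass; it is a cheap first filter, not a replacement for the visual check.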

Step 2 — Pre-Scan Assessment

Report what you see before scanning:

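Document Photo Assessment

Attribute	Detected value
Document type	[e.g., printed letter, handwritten note, receipt, form]
Orientation

Recommended enhancements:

- [x] Perspective correction
- [x] Background removal / edge cropping
- [ ] Binarization (black and white), suited to text-only documents
- [x] Contrast enhancement
- [x] Shadow removal
- [ ] Color preservation, suited to documents with colored content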

Step 3 — Run the Scanner Script

python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.png

Common Options

# Black and white output (best for text documents)
python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.png --mode bw

# Color-preserved output (best for forms, diagrams, colored content)
python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.png --mode color

# Grayscale output (middle ground)
python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.png --mode gray

# Output as PDF
python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.pdf --format pdf

# Multiple images into one PDF (multi-page scan)
python skills/doc-scan/scripts/doc_scanner.py --input page1.jpg page2.jpg page3.jpg --output document.pdf --format pdf

# Manual corner specification (if auto-detection fails)
python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.png --corners "50,30 800,20 820,1100 40,1120"

# High-resolution output
python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.png --dpi 300

# Skip perspective correction (if photo is already flat)
python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.png --no-warp
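When the scanner is driven from another script, the command lines above can be assembled programmatically. This is a sketch only: it uses the script path and the flags shown above (`--input`, `--output`, `--mode`, `--format`, `--corners`, `--dpi`, `--no-warp`), while the function name `build_scan_command` and its parameters are hypothetical. It builds the argument list without running anything.

```python
from typing import List, Optional

SCANNER = "skills/doc-scan/scripts/doc_scanner.py"


def build_scan_command(
    inputs: List[str],
    output: str,
    mode: Optional[str] = None,
    fmt: Optional[str] = None,
    corners: Optional[str] = None,
    dpi: Optional[int] = None,
    no_warp: bool = False,
) -> List[str]:
    """Assemble an argv list using only the flags shown in Common Options."""
    if not inputs:
        raise ValueError("at least one input image is required")
    cmd = ["python", SCANNER, "--input", *inputs, "--output", output]
    if mode is not None:
        # Only the three modes listed above are accepted here.
        if mode not in ("bw", "color", "gray"):
            raise ValueError(f"unknown mode: {mode}")
        cmd += ["--mode", mode]
    if fmt is not None:
        cmd += ["--format", fmt]
    if corners is not None:
        # Passed as one argument, e.g. "50,30 800,20 820,1100 40,1120".
        cmd += ["--corners", corners]
    if dpi is not None:
        cmd += ["--dpi", str(dpi)]
    if no_warp:
        cmd.append("--no-warp")
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, capture_output=True, text=True)`; passing a list rather than a shell string avoids quoting problems with the corner coordinates.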

Step 4 — Interpret Script Output

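The script writes a JSON status block to stderr. Parse it and report to the user:

{
  "status": "success",
  "corners_detected": true,
  "corners": [[50,30],[800,20],[820,1100],[40,1120]],
  "warp_applied": true,
  "enhancement_mode": "bw",
  "input_size": [3024, 4032],
  "output_size": [2480, 3508],
  "output_dpi": 300,
  "pages": 1,
  "output_file": "scanned.png",
  "warnings": []
}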

Status Handling

"status": "success": Report completion with key stats.

"corners_detected": false: Auto-detection failed. Offer one of:

- Manual corner hints: "Auto edge-detection could not find the document corners. I can try with manual corner hints. Please describe approximately where the four corners of the document appear in the photo (e.g., top-left at about 10% from left and 5% from top)."
- --no-warp mode, which at least applies enhancement without perspective correction

warnings array: Report any warnings to the user, e.g., "Low contrast image", "Detected significant blur", "Partial document visible"
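The status handling above can be sketched as a small parser. This is an illustration under assumptions: the field names (`status`, `corners_detected`, `pages`, `output_file`, `warnings`) follow the example status block in this document, and the function name `summarize_status` and the message wording are hypothetical.

```python
import json
from typing import List


def summarize_status(raw: str) -> List[str]:
    """Turn the scanner's JSON status block into user-facing messages."""
    status = json.loads(raw)
    messages = []
    if status.get("status") == "success":
        messages.append(
            f"Scan complete: {status.get('pages')} page(s) saved to "
            f"{status.get('output_file')}"
        )
    # Test for False explicitly so a missing key is not treated as a failure.
    if status.get("corners_detected") is False:
        messages.append(
            "Auto edge-detection could not find the document corners; "
            "offer manual corner hints or --no-warp."
        )
    # Every warning is passed on to the user unchanged.
    for warning in status.get("warnings", []):
        messages.append(f"Warning: {warning}")
    return messages
```

Because the block arrives on stderr, the caller would read it from the `stderr` of the finished process rather than from standard output.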

Step 5 — Post-Scan Quality Check

After the script completes, read the output image visually and assess:

Quality Check	Pass / Fail	Notes
Document edges are straight		No barrel distortion remaining
Text is legible

If any check fails, report the issue and offer:

- Re-run with different settings (different mode, manual corners, contrast level)
- Re-photograph tips (see Step 8)

Step 6 — Output Report

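Scan Complete ✓

Attribute	Value
Output file	scanned.png
Output size	2480 × 3508 pixels (A4, 300 DPI)
Mode	Black and white
Perspective correction	Applied
Shadow removal	Applied
Processing time	~2.3 s

Enhancements Applied

- Edge detection and four-corner extraction
- Perspective warp to standard A4 dimensions
- Adaptive thresholding (Sauvola method) for clean black-and-white text
- Shadow compensation via background normalization
- Border cropped to the document edges

Before → After

[original photo] → [scanned output] (both available at their file paths)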

Step 7 — Multi-Page Documents

If the user provides multiple photos (or a folder of images):

1. Process each image individually
2. Sort by filename or user-specified order
3. Combine into a single PDF:

python skills/doc-scan/scripts/doc_scanner.py \
  --input page1.jpg page2.jpg page3.jpg \
  --output document.pdf --format pdf

4. Report: "3-page document scanned and saved to document.pdf"

For very large batches (10+ pages), process in batches of 10 and combine.
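The ordering and batching rule can be sketched as follows. It assumes pages are ordered by plain filename sort and uses the batch size of 10 stated above; the function name `plan_batches` is hypothetical.

```python
from typing import List


def plan_batches(filenames: List[str], batch_size: int = 10) -> List[List[str]]:
    """Sort pages by filename and split them into batches of batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ordered = sorted(filenames)
    # Slice the ordered list into consecutive chunks; the last may be shorter.
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]
```

A plain string sort places page10.jpg before page2.jpg, so zero-padded names (page02.jpg) or an explicit user-specified order are safer for long documents.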

Step 8 — Photography Tips (on poor quality input)

If the input image quality is poor or auto-detection fails, provide tips:

CODEBLOCK7

Step 9 — Integration with Doc-Process

After a successful scan, offer to process the scanned document with doc-process:

CODEBLOCK8

If the user confirms, pass the scanned output file to the appropriate doc-process mode.

Step 10 — Timeline Logging

After a successful scan, silently log to the timeline:

python skills/doc-process/scripts/timeline_manager.py add \
  --type "Doc Scan" \
  --source "<input filename>" \
  --summary "<N> page(s) scanned, perspective corrected, saved as <output filename>"
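The placeholders in the logging command can be filled in programmatically. This sketch builds the argument list from the command shown above; the function name `build_timeline_command` and its parameters are hypothetical, and it assumes the summary wording is used exactly as given.

```python
from typing import List

TIMELINE_MANAGER = "skills/doc-process/scripts/timeline_manager.py"


def build_timeline_command(source: str, pages: int, output: str) -> List[str]:
    """Fill the <input filename>, <N>, and <output filename> placeholders."""
    summary = f"{pages} page(s) scanned, perspective corrected, saved as {output}"
    return [
        "python", TIMELINE_MANAGER, "add",
        "--type", "Doc Scan",
        "--source", source,
        "--summary", summary,
    ]
```

If the scan was run with --no-warp, the "perspective corrected" part of the summary would no longer be accurate and should be adjusted by the caller.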

General Rules

- Never process the photo if it does not contain a document; explain what was detected instead
- Always report the detected document type so the user can confirm before scanning
- Auto-detect corners when possible; fall back gracefully to manual or no-warp mode
- Default output mode: bw for text documents, color for anything with color content
- Default output format: PNG (lossless); PDF only if explicitly requested or for multi-page
- Default DPI: 300 (print quality); 150 for screen-only use
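The three default rules can be sketched as one small function. It is an illustration only: the function name `choose_defaults` and its parameters are hypothetical, and whether a document has color content or is meant for screen-only use is assumed to be decided by the caller.

```python
from typing import Dict, Optional, Union


def choose_defaults(
    has_color_content: bool,
    page_count: int = 1,
    requested_format: Optional[str] = None,
    screen_only: bool = False,
) -> Dict[str, Union[str, int]]:
    """Apply the default mode, format, and DPI rules listed above."""
    # bw for text documents, color for anything with color content.
    mode = "color" if has_color_content else "bw"
    # PNG unless a format was explicitly requested or the scan is multi-page.
    if requested_format is not None:
        fmt = requested_format
    else:
        fmt = "pdf" if page_count > 1 else "png"
    # 300 DPI for print quality, 150 for screen-only use.
    dpi = 150 if screen_only else 300
    return {"mode": mode, "format": fmt, "dpi": dpi}
```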

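Doc-Scan — DEPRECATED

This skill is deprecated. All functionality has been merged into doc-process v4.0.0, which has a significantly improved scanner engine. Please use doc-process instead.
To scan a document photo: install doc-process and say "scan this photo", "correct perspective", "dewarp this document", or any equivalent phrase.

Doc-Scan — Document Scanner Skill (archived)

Converts a photo of a document (whiteboard, printed page, handwritten note, form, receipt, book page, etc.) into a clean scanned image with perspective correction and enhancement.

Step 1 — Validate the Input

Read the provided image visually. Assess:

Check	Yes/No	Notes
Is this an image file?		.jpg, .jpeg, .png, .heic, .webp, .bmp, .tiff
Does the image contain a document?

Non-Document Detection

If the image does not appear to contain a document, respond:

⚠ This image does not appear to contain a document.

I detected: [brief description of the image content, e.g., landscape photo, portrait of a person, blank wall]

Doc-Scan works best with:

- Printed documents (forms, letters, reports)
- Handwritten notes or whiteboards
- Receipts, invoices, or business cards
- Book or magazine pages
- Any flat document photographed from above or at an angle

If you meant to upload a photo of a document, try retaking it in better lighting with the document clearly visible. If you would like something else done with this image, I can help with that too.

Do not proceed with scanning if this check fails.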

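Step 2 — Pre-Scan Assessment

Report what you see before scanning:

Document Photo Assessment

Attribute	Detected value
Document type	[e.g., printed letter, handwritten note, receipt, form]
Orientation

Recommended enhancements:

- [x] Perspective correction
- [x] Background removal / edge cropping
- [ ] Binarization (black and white), suited to text-only documents
- [x] Contrast enhancement
- [x] Shadow removal
- [ ] Color preservation, suited to documents with colored content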

Step 3 — Run the Scanner Script

python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.png

Common Options

# Black and white output (best for text documents)
python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.png --mode bw

# Color-preserved output (best for forms, diagrams, colored content)
python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.png --mode color

# Grayscale output (middle ground)
python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.png --mode gray

# Output as PDF
python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.pdf --format pdf

# Multiple images into one PDF (multi-page scan)
python skills/doc-scan/scripts/doc_scanner.py --input page1.jpg page2.jpg page3.jpg --output document.pdf --format pdf

# Manual corner specification (if auto-detection fails)
python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.png --corners "50,30 800,20 820,1100 40,1120"

# High-resolution output
python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.png --dpi 300

# Skip perspective correction (if photo is already flat)
python skills/doc-scan/scripts/doc_scanner.py --input photo.jpg --output scanned.png --no-warp

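Step 4 — Interpret Script Output

The script writes a JSON status block to stderr. Parse it and report to the user:

{
  "status": "success",
  "corners_detected": true,
  "corners": [[50,30],[800,20],[820,1100],[40,1120]],
  "warp_applied": true,
  "enhancement_mode": "bw",
  "input_size": [3024, 4032],
  "output_size": [2480, 3508],
  "output_dpi": 300,
  "pages": 1,
  "output_file": "scanned.png",
  "warnings": []
}

Status Handling

"status": "success": Report completion with key stats.

"corners_detected": false: Auto-detection failed. Offer one of:

- Manual corner hints: "Auto edge-detection could not find the document corners. I can try with manual corner hints. Please describe approximately where the four corners of the document appear in the photo (e.g., top-left at about 10% from left and 5% from top)."
- --no-warp mode, which at least applies enhancement without perspective correction

warnings array: Report any warnings to the user, e.g., "Low contrast image", "Detected significant blur", "Partial document visible"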

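Step 5 — Post-Scan Quality Check

After the script completes, read the output image visually and assess:

Quality Check	Pass / Fail	Notes
Document edges are straight		No barrel distortion remaining
Text is legible

If any check fails, report the issue and offer:

- Re-run with different settings (different mode, manual corners, contrast level)
- Re-photograph tips (see Step 8)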

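Step 6 — Output Report

Scan Complete ✓

Attribute	Value
Output file	scanned.png
Output size	2480 × 3508 pixels (A4, 300 DPI)
Mode	Black and white
Perspective correction	Applied
Shadow removal	Applied
Processing time	~2.3 s

Enhancements Applied

- Edge detection and four-corner extraction
- Perspective warp to standard A4 dimensions
- Adaptive thresholding (Sauvola method) for clean black-and-white text
- Shadow compensation via background normalization
- Border cropped to the document edges

Before → After

[original photo] → [scanned output] (both available at their file paths)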
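The example output size of 2480 × 3508 pixels follows from the A4 paper size and the 300 DPI setting. A short worked calculation, assuming the ISO 216 A4 dimensions of 210 × 297 mm and rounding to the nearest pixel; the function name `a4_pixels` is hypothetical and is not taken from doc_scanner.py.

```python
from typing import Tuple

MM_PER_INCH = 25.4
A4_WIDTH_MM = 210.0   # ISO 216 A4 width
A4_HEIGHT_MM = 297.0  # ISO 216 A4 height


def a4_pixels(dpi: int) -> Tuple[int, int]:
    """Convert the A4 page size to pixels at the given DPI."""
    # millimetres -> inches -> pixels, rounded to whole pixels
    width = round(A4_WIDTH_MM / MM_PER_INCH * dpi)
    height = round(A4_HEIGHT_MM / MM_PER_INCH * dpi)
    return width, height
```

At 300 DPI this gives (2480, 3508), matching the output_size in the example status block.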

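Step 7 — Multi-Page Documents

If the user provides multiple photos (or a folder of images):

1. Process each image individually
2. Sort by filename or user-specified order
3. Combine into a single PDF:

python skills/doc-scan/scripts/doc_scanner.py \
  --input page1.jpg page2.jpg page3.jpg \
  --output document.pdf --format pdf

4. Report: "3-page document scanned and saved to document.pdf"

For very large batches (10+ pages), process in batches of 10 and combine.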

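Step 8 — Photography Tips (on poor quality input)

If the input image quality is poor or auto-detection fails, provide tips:

Tips for Better Scans

Lighting:

- Scan in bright, even light (avoid direct sunlight, which causes glare)
- Avoid casting shadows with your hand or body
- A well-lit indoor setting works best

Camera position:

- Hold the camera directly above the document whenever possible
- Keep the camera parallel to the document surface
- The whole document should be visible, with a small margin around it

Background:

- Place the document on a contrasting background (a dark table for white paper, a white surface for dark paper)
- Avoid patterned or …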
