Scan to Markdown - OCR for Scanned Docs
Extract text from scanned documents and images using OCR via MinerU Open API. No API key required.
Quick Start
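    # OCR a scanned PDF
    mineru-open-api flash-extract scanned.pdf

    # OCR a photographed page
    mineru-open-api flash-extract page-photo.jpg

    # OCR a document at a URL
    mineru-open-api flash-extract https://example.com/scanned.pdf

    # Set the language hint (default: ch)
    mineru-open-api flash-extract scanned.pdf --language en

    # Write the result to a directory
    mineru-open-api flash-extract scanned.pdf -o ./output/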
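For a folder of scans, one flash-extract call per file is a reasonable pattern. The sketch below is a minimal illustration, not part of the CLI: ocr_dir and the MINERU_CLI override are hypothetical names introduced here, and it assumes --language and -o can be combined in a single invocation.

```shell
#!/bin/sh
# Batch OCR sketch: run flash-extract once per file in a directory.
# ocr_dir and MINERU_CLI are hypothetical helpers, not features of the CLI.
# Setting MINERU_CLI=echo prints the commands instead of running them.

ocr_dir() {
    src_dir=$1
    out_dir=$2
    lang=${3:-ch}                          # ch is the documented default hint
    cli=${MINERU_CLI:-mineru-open-api}

    mkdir -p "$out_dir"
    # Only three extensions are listed here; extend the globs as needed.
    for f in "$src_dir"/*.pdf "$src_dir"/*.jpg "$src_dir"/*.png; do
        [ -f "$f" ] || continue            # skip glob patterns that matched nothing
        # Assumption: --language and -o may be passed together.
        "$cli" flash-extract "$f" --language "$lang" -o "$out_dir" \
            || echo "failed: $f" >&2
    done
}
```

A dry run such as `MINERU_CLI=echo; ocr_dir ./scans ./output en` prints each command without uploading anything, which is a cheap way to check the file list first.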
Language Rule
You MUST reply to the user in the SAME language they use. This is non-negotiable.
Capabilities
- OCR for scanned PDFs, photographed documents, and images
- Supports PDF, PNG, JPG, WebP, BMP, TIFF
- Supports both local files and URLs directly
- Language hint with --language (default: ch, use en for English)
- No API key, no signup, no authentication
- Max 10MB / 20 pages per document
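The format and size limits above can be checked locally before anything is uploaded. The following is a minimal sketch under stated assumptions: check_ocr_input is a hypothetical helper, "10MB" is read as 10 MiB, .jpeg and .tif are treated as aliases of JPG and TIFF, and the 20-page limit is not checked.

```shell
#!/bin/sh
# Pre-flight check for flash-extract inputs.
# check_ocr_input is an illustrative helper, not part of the CLI.

MAX_BYTES=$((10 * 1024 * 1024))   # assumption: "10MB" means 10 MiB

check_ocr_input() {
    input=$1

    # URLs are accepted as-is; their size and format cannot be verified locally.
    case $input in
        http://*|https://*) echo "ok: url"; return 0 ;;
    esac

    [ -f "$input" ] || { echo "error: not a file: $input"; return 1; }

    # Compare the lowercased extension against the supported formats.
    ext=$(printf '%s' "${input##*.}" | tr '[:upper:]' '[:lower:]')
    case $ext in
        pdf|png|jpg|jpeg|webp|bmp|tif|tiff) ;;   # jpeg/tif are assumed aliases
        *) echo "error: unsupported format: .$ext"; return 1 ;;
    esac

    # Reject files above the per-document size limit.
    size=$(wc -c < "$input" | tr -d ' ')
    if [ "$size" -gt "$MAX_BYTES" ]; then
        echo "error: $size bytes exceeds the 10 MB limit"
        return 1
    fi
    echo "ok: file"
}
```

Calling `check_ocr_input scanned.pdf && mineru-open-api flash-extract scanned.pdf` would skip the upload when the file is obviously out of range.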
When to Use
- User asks to "OCR" a document or image
- User has a scanned PDF that needs text extraction
- User shares a photo of a page and wants the text
- User mentions "scan", "handwriting", or "recognize text"
CLI Reference
Run mineru-open-api flash-extract --help for all available options.
Data Privacy
- flash-extract uploads the document to MinerU's cloud API for processing and returns the result. No account or API key is required.
- Documents are processed in real time and are not stored after extraction.
- For details, see https://mineru.net
Notes
- Best results with clear, high-resolution scans
- For higher-precision OCR with full layout preservation, use mineru-open-api extract --ocr (requires auth via mineru-open-api auth)
- If the CLI cannot be installed via npm/uv/go, download it from https://mineru.net/ecosystem?tab=cli
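Before running any of the commands, it can help to confirm that the binary is actually on PATH. A minimal sketch, assuming a POSIX shell; require_cli is a hypothetical helper name, and the fallback message only repeats the download page given in the notes:

```shell
#!/bin/sh
# Check that the CLI is installed; otherwise point at the download page.
# require_cli is an illustrative helper, not part of mineru-open-api.

require_cli() {
    name=${1:-mineru-open-api}
    if command -v "$name" >/dev/null 2>&1; then
        echo "found: $name"
    else
        echo "missing: $name (download: https://mineru.net/ecosystem?tab=cli)"
        return 1
    fi
}
```

A script could start with `require_cli || exit 1` so that a missing install fails early with a pointer to the download page.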
Scan to Markdown - OCR for Scanned Documents
Extract text from scanned documents and images using OCR via the MinerU Open API. No API key required.
Quick Start
    # OCR a scanned PDF
    mineru-open-api flash-extract scanned.pdf

    # OCR a photographed page
    mineru-open-api flash-extract page-photo.jpg

    # OCR a document at a URL
    mineru-open-api flash-extract https://example.com/scanned.pdf

    # Set the language hint (default: ch)
    mineru-open-api flash-extract scanned.pdf --language en

    # Write the result to a directory
    mineru-open-api flash-extract scanned.pdf -o ./output/
Language Rule
You MUST reply to the user in the same language they use. This is non-negotiable.
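Capabilities
- OCR for scanned PDFs, photographed documents, and images
- Supports PDF, PNG, JPG, WebP, BMP, TIFF
- Supports both local files and URLs directly
- Language hint with --language (default: ch, use en for English)
- No API key, no signup, no authentication
- Max 10MB / 20 pages per document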
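When to Use
- User asks to OCR a document or image
- User has a scanned PDF that needs text extraction
- User shares a photo of a page and wants the text
- User mentions scanning, handwriting, or recognizing text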
CLI Reference
Run mineru-open-api flash-extract --help for all available options.
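Data Privacy
- flash-extract uploads the document to MinerU's cloud API for processing and returns the result. No account or API key is required.
- Documents are processed in real time and are not stored after extraction.
- For details, see https://mineru.net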
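Notes
- Best results with clear, high-resolution scans
- For higher-precision OCR with full layout preservation, use mineru-open-api extract --ocr (requires authentication via mineru-open-api auth)
- If the CLI cannot be installed via npm/uv/go, download it from https://mineru.net/ecosystem?tab=cli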