Parse, extract, and analyze documents using the LlamaParse API (LlamaCloud). Use when the user asks to parse PDFs, images, spreadsheets, or other documents into markdown/text/structured data, extract tables or charts from documents, do OCR on scans, batch-process a folder of files, or use LlamaParse for any document processing task. Triggers on phrases like "parse this PDF", "extract text from document", "OCR this scan", "convert PDF to markdown", "extract tables", "parse with LlamaParse", "llam
Use the LlamaParse API to parse documents (PDFs, images, spreadsheets, presentations, and 130+ other formats) into LLM-ready text, Markdown, and structured data.
Verify your configuration:

```bash
pip install "llama-cloud>=1.0"
export LLAMA_CLOUD_API_KEY=llx-...
```
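Before making any API call, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch of such a preflight check; the helper name `require_api_key` is hypothetical and not part of the `llama-cloud` package.

```python
import os


def require_api_key(env=None, var="LLAMA_CLOUD_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper, not part of llama-cloud. `env` defaults to
    os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"{var} is not set; export it before running the parser")
    return key
```

Calling `require_api_key()` at the top of a script fails early with a readable message instead of an authentication error partway through a parse.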
```python
import asyncio

from llama_cloud import AsyncLlamaCloud


async def parse_document(file_path: str):
    client = AsyncLlamaCloud()  # Reads the LLAMA_CLOUD_API_KEY environment variable
    # Upload the file first, then parse the uploaded file by its ID
    file = await client.files.create(file=file_path, purpose="parse")
    result = await client.parsing.parse(
        file_id=file.id,
        tier="agentic",
        version="latest",
        expand=["markdown", "text"],
    )
    return result


result = asyncio.run(parse_document("document.pdf"))
print(result.markdown.pages[0].markdown)
```
| Tier | Use case | Cost |
|---|---|---|
| agentic_plus | Highest accuracy, complex layouts, charts | Highest |
| agentic |
Always specify both `tier` and `version`. Use `version="latest"` in development; for reproducibility in production, pin a date string such as `2026-01-08`.
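One way to keep the tier and version choice in a single place is a small settings helper. This is only a sketch: `parse_settings` and the environment labels are hypothetical names, and the pinned date is simply the example date string mentioned above.

```python
PINNED_VERSION = "2026-01-08"  # example date string from the text; pin your own


def parse_settings(environment, tier="agentic"):
    """Return tier/version keyword arguments for a parse call.

    `environment` is a hypothetical label: "production" gets the pinned
    date string, anything else gets "latest".
    """
    version = PINNED_VERSION if environment == "production" else "latest"
    return {"tier": tier, "version": version}
```

The result can be unpacked into the call, for example `client.parsing.parse(file_id=file.id, expand=["markdown"], **parse_settings("production"))`.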
Request one or more of the following in the `expand` list:
Access the results with `result.markdown.pages[i].markdown`, `result.text.pages[i].text`, and `result.items.pages[i].items`.
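Because each expanded output is a list of pages, a whole document is usually reassembled by walking `pages`. The sketch below assumes only the attribute paths listed above; `join_markdown` is a hypothetical helper, and `fake_result` is a stand-in object so the snippet runs without calling the API.

```python
from types import SimpleNamespace


def join_markdown(result, separator="\n\n"):
    """Concatenate the markdown of every page, in page order."""
    return separator.join(page.markdown or "" for page in result.markdown.pages)


# Stand-in with the same attribute layout as a parse result requested
# with expand=["markdown"]; a real result comes from client.parsing.parse().
fake_result = SimpleNamespace(
    markdown=SimpleNamespace(
        pages=[
            SimpleNamespace(markdown="# Page 1"),
            SimpleNamespace(markdown="Page 2 body"),
        ]
    )
)

full_text = join_markdown(fake_result)
```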
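Control Markdown rendering:

```python
# Pass as output_options=... to client.parsing.parse()
output_options = {
    "markdown": {
        "tables": {
            "output_tables_as_markdown": True,  # Set to False for HTML tables
        },
    },
    "images_to_save": ["screenshot"],  # Save page screenshots
}
```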
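```python
# Pass as processing_options=... to client.parsing.parse()
processing_options = {
    "ignore": {"ignore_diagonal_text": True},
    "ocr_parameters": {"languages": ["en"]},  # OCR language hint
    "specialized_chart_parsing": "agentic_plus",  # Extract charts as structured data
}
```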
Steer the parser the way you would instruct an LLM. This is useful for extracting specific data or transforming the output:

```python
from llama_cloud.types.parsing_create_params import (
    ProcessingOptions,
    ProcessingOptionsAutoModeConfiguration,
    ProcessingOptionsAutoModeConfigurationParsingConf,
)

# Runs inside an async function, with `client` and `file` set up as in the earlier example
result = await client.parsing.parse(
    file_id=file.id,
    tier="agentic",
    version="latest",
    expand=["markdown"],
    processing_options=ProcessingOptions(
        auto_mode_configuration=[ProcessingOptionsAutoModeConfiguration(
            parsing_conf=ProcessingOptionsAutoModeConfigurationParsingConf(
                custom_prompt="Extract only the prices and totals from this receipt."
            )
        )]
    ),
)
```
Use `scripts/parse_document.py`:

```bash
python scripts/parse_document.py document.pdf --tier agentic --output markdown,text
```
Use `scripts/batch_parse.py`:

```bash
python scripts/batch_parse.py ./documents/ --tier agentic --max-concurrent 5
```
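The source of `scripts/batch_parse.py` is not shown here, so the following is not its implementation. It is a minimal sketch of one common way to enforce a limit like `--max-concurrent 5` with `asyncio.Semaphore`; `run_bounded` and `worker` are hypothetical names, and `worker` stands for any coroutine that parses one file.

```python
import asyncio


async def run_bounded(items, worker, max_concurrent=5):
    """Run worker(item) for every item, at most max_concurrent at a time.

    Results come back in input order. Exceptions are returned in place of
    results rather than raised, so one bad file does not abort the batch.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item):
        # Only max_concurrent coroutines can hold the semaphore at once
        async with semaphore:
            return await worker(item)

    return await asyncio.gather(
        *(guarded(item) for item in items), return_exceptions=True
    )
```

With the `parse_document` coroutine from the quick-start example, a folder could be processed as `asyncio.run(run_bounded(paths, parse_document, max_concurrent=5))`, checking each result with `isinstance(r, Exception)` afterwards.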
Request `items` in `expand`, then filter for table items:

```python
for page in result.items.pages:
    for item in page.items:
        if hasattr(item, "rows"):  # Table item
            print(f"Table on page {page.page_number}: {len(item.rows)} rows")
            # item.csv, item.html, and item.md are also available
```
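The loop above can be wrapped into helpers that collect tables and serialize them. This sketch assumes that `item.rows` is a list of rows, each a list of cell values; that shape is an assumption, not something the text confirms. `collect_tables` and `rows_to_csv` are hypothetical names, and `fake_result` is a stand-in so the snippet runs offline.

```python
import csv
import io
from types import SimpleNamespace


def collect_tables(result):
    """Return (page_number, rows) pairs for every item that has `rows`."""
    return [
        (page.page_number, item.rows)
        for page in result.items.pages
        for item in page.items
        if hasattr(item, "rows")
    ]


def rows_to_csv(rows):
    """Serialize a list of rows (each a list of cell values) to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Stand-in result: one page holding a table item and a non-table item
fake_result = SimpleNamespace(
    items=SimpleNamespace(
        pages=[
            SimpleNamespace(
                page_number=1,
                items=[
                    SimpleNamespace(rows=[["item", "price"], ["tea", "3.50"]]),
                    SimpleNamespace(text="not a table"),
                ],
            )
        ]
    )
)

tables = collect_tables(fake_result)
```

When the item already exposes `item.csv`, that is likely the simpler route; building CSV from `rows` is mainly useful if the rows need filtering first.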
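Enable specialized chart parsing, then extract the table rows from chart pages:

```python
# Runs inside an async function, with `client` and `file` set up as in the earlier example
result = await client.parsing.parse(
    file_id=file.id,
    tier="agentic_plus",
    version="latest",
    processing_options={"specialized_chart_parsing": "agentic_plus"},
    expand=["items"],
)
```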
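```python
import re

import httpx

# Runs inside an async function, with `client` and `file` set up as in the earlier example
result = await client.parsing.parse(
    file_id=file.id,
    tier="agentic",
    version="latest",
    output_options={"images_to_save": ["screenshot"]},
    expand=["images_content_metadata"],
)

# Download each page screenshot through its presigned URL, reusing one HTTP client
async with httpx.AsyncClient() as http:
    for img in result.images_content_metadata.images:
        if img.presigned_url and re.match(r"^page_\d+\.jpg$", img.filename):
            resp = await http.get(img.presigned_url)
            resp.raise_for_status()
            with open(img.filename, "wb") as f:
                f.write(resp.content)
```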
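The filename filter can be tested on its own, without any network access. The sketch below assumes screenshots are named `page_<n>.jpg`, as the pattern in the download code suggests; `screenshot_pages` is a hypothetical helper that also sorts by page number, so that `page_10.jpg` comes after `page_2.jpg`.

```python
import re

# Assumed naming scheme for page screenshots: page_<number>.jpg
PAGE_SCREENSHOT = re.compile(r"^page_(\d+)\.jpg$")


def screenshot_pages(filenames):
    """Return (page_number, filename) pairs for page screenshots, sorted by page."""
    matches = ((PAGE_SCREENSHOT.match(name), name) for name in filenames)
    return sorted((int(m.group(1)), name) for m, name in matches if m)
```

Other images embedded in the document do not match the pattern and are skipped.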
See `references/api-reference.md` for full API details.
This skill uses the LlamaParse API (https://cloud.llamaindex.ai), a cloud document parsing service provided by LlamaIndex.
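This skill can be installed through conversation on the following platforms:
Help me install SkillHub and the llamaparse-1776115632 skill
Set SkillHub as my preferred skill installation source, then help me install the llamaparse-1776115632 skill
skillhub install llamaparse-1776115632
File size: 10.31 KB | Published: 2026-4-15 13:15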