📝 Doc Summarize Pro — Document Analysis Toolkit
Pure-bash document summarizer: extract summaries, keywords, outlines, stats, and more, with no external dependencies.
Commands
summarize <file>
Generate a document summary by extracting key sentences from each paragraph (first and last sentences, plus topic sentences).
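The first-and-last-sentence extraction described above can be sketched as follows. This is an illustration only, not the toolkit's own code: the function name `summarize_paragraphs` is hypothetical, paragraphs are assumed to be separated by blank lines, sentences are assumed to end in `.`, `!`, or `?` followed by a space, and topic-sentence detection is left out. The sketch uses `awk` for brevity.

```bash
# Hypothetical helper: print the first and last sentence of each paragraph.
# Assumes blank-line-separated paragraphs and ". ", "! " or "? " as sentence ends.
summarize_paragraphs() {
  awk 'BEGIN { RS = "" }                 # paragraph mode: one record per paragraph
  {
    gsub(/\n/, " ")                      # join wrapped lines into one line
    gsub(/[.!?] +/, "&\n")               # mark each sentence boundary with a newline
    n = split($0, s, /\n/)               # s[1..n] are the sentences
    for (i = 1; i <= n; i++) sub(/ +$/, "", s[i])
    if (n == 1) print s[1]
    else print s[1] " " s[n]
  }' "$1"
}
```

Abbreviations such as "e.g. " will be treated as sentence ends by this simple rule, so real input may need a more careful splitter.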
keywords <file>
Extract keywords via word-frequency analysis. Filters common stop-words and ranks by occurrence count.
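A minimal sketch of this kind of word-frequency ranking is shown below. The function name `top_keywords`, the very short stop-word list, and the three-character minimum word length are all assumptions made for illustration; the toolkit's actual stop-word list and pipeline may differ. The sketch relies on the standard `tr`, `grep`, `sort`, `uniq`, `head`, and `awk` utilities.

```bash
# Hypothetical helper: print the N most frequent non-stop-words as "word count".
top_keywords() {
  file=$1
  limit=${2:-15}
  # Tiny illustrative stop-word list (an assumption, not the toolkit's list).
  stop='the|and|for|with|that|this|are|was|from'
  tr '[:upper:]' '[:lower:]' < "$file" |
    tr -cs '[:alnum:]' '\n' |      # one word per line
    grep -Evx "$stop" |            # drop stop-words (whole-line match)
    awk 'length($0) >= 3' |        # drop empty and very short tokens
    sort | uniq -c |               # count occurrences of each word
    sort -k1,1nr -k2,2 |           # highest count first, ties alphabetical
    head -n "$limit" |
    awk '{ print $2, $1 }'
}
```

For example, `top_keywords paper.txt 20` would print up to 20 lines, most frequent word first.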
outline <file>
Extract document structure and outline by detecting heading lines (Markdown # headers, ALL-CAPS lines, numbered sections).
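The three heading patterns named above can be approximated with a single extended regular expression. The sketch below is an assumption about how such detection might look, not the toolkit's implementation; the function name `outline` and the exact patterns (for instance, requiring a dot in section numbers and at least three characters in an ALL-CAPS line) are illustrative choices.

```bash
# Hypothetical helper: print candidate heading lines, prefixed with line numbers.
outline() {
  # Alternatives, in order:
  #   1. Markdown headers:   "# Title" through "###### Title"
  #   2. Numbered sections:  "1. Title" or "2.3 Title" (capitalised title)
  #   3. ALL-CAPS lines:     capitals, digits and spaces only, 3+ characters
  LC_ALL=C grep -nE \
    '^(#{1,6} +[^ ].*|([0-9]+\.)+[0-9]* +[A-Z].*|[A-Z][A-Z0-9 ]{2,})$' "$1"
}
```

Note that a numbered list item such as "1. Buy milk" also matches the second pattern, so some false positives are expected with a rule this simple.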
stats <file>
Document statistics: word count, character count, paragraph count, sentence count, unique words, and estimated reading time.
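As a rough sketch of how these figures could be computed: the function name `doc_stats`, the output format, and the reading speed of 200 words per minute are assumptions (the toolkit's reading-speed constant is not documented here). Sentences are approximated by counting `.`, `!`, and `?` characters, and paragraphs are taken to be blocks separated by blank lines.

```bash
# Hypothetical helper: print basic statistics for a text file as key=value lines.
doc_stats() {
  file=$1
  wpm=${2:-200}   # assumed reading speed in words per minute
  words=$(wc -w < "$file" | tr -d ' ')
  chars=$(wc -m < "$file" | tr -d ' ')
  # Paragraphs: records in awk's paragraph mode (blank-line separated).
  paragraphs=$(awk 'BEGIN { RS = "" } END { print NR }' "$file")
  # Sentences: count of terminator characters, a rough proxy.
  sentences=$(tr -cd '.!?' < "$file" | wc -c | tr -d ' ')
  # Unique words: case-insensitive, alphanumeric tokens only.
  unique=$(tr '[:upper:]' '[:lower:]' < "$file" |
    tr -cs '[:alnum:]' '\n' | sort -u | grep -c .)
  # Reading time: words / wpm, rounded up to whole minutes.
  minutes=$(( (words + wpm - 1) / wpm ))
  printf 'words=%s\nchars=%s\nparagraphs=%s\nsentences=%s\nunique_words=%s\nreading_minutes=%s\n' \
    "$words" "$chars" "$paragraphs" "$sentences" "$unique" "$minutes"
}
```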
compare <file1> <file2>
Compare two documents side by side: word count difference, shared keywords, and keywords unique to each file.
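The shared and unique sets can be derived with `comm` on two sorted word lists. The sketch below is a simplified illustration under stated assumptions: `word_set` and `compare_words` are hypothetical names, and every word of three or more characters is treated as a keyword rather than applying the stop-word filtering the real command uses.

```bash
# Hypothetical helper: sorted, de-duplicated, lowercase words of a file.
word_set() {
  tr '[:upper:]' '[:lower:]' < "$1" | tr -cs '[:alnum:]' '\n' |
    awk 'length($0) >= 3' | sort -u
}

# Hypothetical helper: report words shared by two files and unique to each.
compare_words() {
  a=$(mktemp)
  b=$(mktemp)
  word_set "$1" > "$a"
  word_set "$2" > "$b"
  # comm needs sorted input; -12 keeps common lines, -23 and -13 keep the rest.
  printf 'shared: %s\n' "$(comm -12 "$a" "$b" | paste -sd ' ' -)"
  printf 'only in first: %s\n' "$(comm -23 "$a" "$b" | paste -sd ' ' -)"
  printf 'only in second: %s\n' "$(comm -13 "$a" "$b" | paste -sd ' ' -)"
  rm -f "$a" "$b"
}
```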
batch <dir>
Batch-summarize all text files in a directory. Processes .txt, .md, .rst, and .log files and outputs a summary for each.
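The directory walk can be sketched as a loop over the four supported extensions. The function name `batch_apply` and the `== name ==` header are hypothetical; the sketch runs whatever command it is given on each file and does not descend into subdirectories, which may or may not match the real command.

```bash
# Hypothetical helper: run a command on every supported text file in a directory.
# Usage: batch_apply <dir> <command> [args...]
batch_apply() {
  dir=${1%/}     # drop a trailing slash, if any
  shift
  for f in "$dir"/*.txt "$dir"/*.md "$dir"/*.rst "$dir"/*.log; do
    [ -f "$f" ] || continue          # skip patterns that matched nothing
    printf '== %s ==\n' "${f##*/}"   # header with the bare file name
    "$@" "$f"
  done
}
```

For example, `batch_apply ~/Documents/notes wc -w` would print a word count under a header for each matching file.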
export <file> <format>
Export a file's summary in a specified format. Supported formats: md (Markdown), txt (plain text), json.
history
Display the processing history: all previously run commands with their timestamps.
config
View or update configuration. Settings: summary_sentences (sentences per paragraph in summary), keyword_count (max keywords to display).
help
Show usage information and available commands.
version
Print the current version number.
Examples
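```bash
# Summarize a document
bash scripts/script.sh summarize ~/Documents/report.md

# Extract keywords from a file
bash scripts/script.sh keywords paper.txt

# Get the document outline
bash scripts/script.sh outline thesis.md

# Show file statistics
bash scripts/script.sh stats notes.txt

# Compare two documents
bash scripts/script.sh compare draft-v1.md draft-v2.md

# Batch-summarize a directory
bash scripts/script.sh batch ~/Documents/notes/

# Export a summary as JSON
bash scripts/script.sh export report.md json

# View processing history
bash scripts/script.sh history

# View or update configuration
bash scripts/script.sh config
bash scripts/script.sh config summary_sentences 3
bash scripts/script.sh config keyword_count 20
```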
Configuration
Settings are stored in $HOME/.doc-summarize-pro/config:
| Key | Default | Description |
|---|---|---|
| `summary_sentences` | `2` | Sentences extracted per paragraph |
| `keyword_count` | `15` | Maximum keywords to display |
Update via config <key> <value> or edit the config file directly.
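For readers editing the file by hand or scripting around it, the sketch below shows one way a key-value update could work. It assumes the file holds one `key=value` pair per line, which is not confirmed here; `config_set` and `config_get` are hypothetical names, and the optional last argument exists only so the sketch can be tried on a scratch file.

```bash
# Hypothetical helper: set KEY to VALUE in a key=value config file.
# Assumes one "key=value" pair per line; the real file format may differ.
config_set() {
  key=$1
  value=$2
  file=${3:-$HOME/.doc-summarize-pro/config}
  mkdir -p "$(dirname "$file")"
  touch "$file"
  tmp=$(mktemp)
  # Keep every line except an existing entry for this key, then append the new one.
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
}

# Hypothetical helper: print the value stored for KEY (nothing if unset).
config_get() {
  file=${2:-$HOME/.doc-summarize-pro/config}
  sed -n "s/^$1=//p" "$file"
}
```

An updated key moves to the end of the file, which is harmless for a flat key-value store but does not preserve the original ordering.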
Data Storage
All data is stored under $HOME/.doc-summarize-pro/:
| File | Purpose |
|---|---|
| `config` | Key-value configuration file |
| `history.log` | Processing history with timestamps |
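A timestamped history entry could be appended along these lines. The line layout (`timestamp | command`), the function name `log_history`, and the `HISTORY_FILE` override are assumptions made for illustration; the actual log format is not specified here.

```bash
# Hypothetical helper: append one timestamped entry to the history log.
# HISTORY_FILE is an illustrative override; the default path follows the table above.
log_history() {
  file=${HISTORY_FILE:-$HOME/.doc-summarize-pro/history.log}
  mkdir -p "$(dirname "$file")"
  # Assumed layout: "YYYY-MM-DD HH:MM:SS | command and arguments"
  printf '%s | %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$file"
}
```

Calling `log_history summarize report.md` would then add one line recording the command and the time it ran.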
Powered by BytesAgain | bytesagain.com