Chinese Ebook Downloader

Download Chinese ebooks from multiple sources with automatic fallback and format conversion.

Quick Start

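    # Download a single book (multi-source fallback)
    python scripts/download_book.py --title "超越百岁" --author "彼得·阿提亚"

    # Batch download from multiple sources (A→B→C fallback + EPUB→PDF conversion)
    python scripts/multi_source_download.py ~/Books/

    # Search Anna's Archive directly
    python scripts/search_source_c.py "book title author"

    # Convert an EPUB to PDF
    python scripts/epub_to_pdf.py book.epub book.pdf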

Download Sources (Priority Order)

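Source	Coverage	Limit	Notes
Source A (online book library)	~100%	None	Primary; high coverage for popular Chinese books
Source B (secondary library)	~8%	None	Fallback for missing books
Source C (Anna's Archive)	Broad	Rate-limited	Last resort; uses the libgen.li mirror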

Note: Z-Library is no longer used because of its limit of 10 downloads per day.

Multi-Source Fallback

The multi_source_download.py script automatically tries sources in order:

    Source A → Source B → Source C → EPUB→PDF conversion

Workflow per book:

1. Try Source A (ZIP → extract PDF/EPUB)
2. If that fails, try Source B (file host download)
3. If that fails, try Source C (Anna's Archive via libgen.li)
4. If only an EPUB is found, auto-convert it to PDF using weasyprint
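
The fallback order above can be sketched in a few lines of Python. This is a minimal illustration rather than the code in multi_source_download.py: the function name `download_with_fallback`, the `(name, fetch)` source pairs, and the `convert` callback are hypothetical, and each fetcher is assumed to return the saved file path or None on failure.

```python
from pathlib import Path
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical signature: a fetcher takes a title and returns the saved path, or None.
Fetcher = Callable[[str], Optional[Path]]


def download_with_fallback(
    title: str,
    sources: Sequence[Tuple[str, Fetcher]],
    convert: Callable[[Path], Path],
) -> Optional[Path]:
    """Try each source in priority order; convert an EPUB-only result to PDF."""
    for name, fetch in sources:
        try:
            path = fetch(title)
        except Exception as exc:  # one failing source must not abort the chain
            print(f"{name} failed for {title!r}: {exc}")
            continue
        if path is None:
            continue  # source had no match, move on to the next one
        if path.suffix.lower() == ".epub":
            return convert(path)  # only an EPUB was found
        return path
    return None  # every source failed
```

Passing the sources as an ordered list keeps the priority (A, then B, then C) in one place, so reordering or dropping a source does not touch the loop.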

Usage:

    # Edit the BOOKS list in the script, then run:
    python scripts/multi_source_download.py ~/Books/

EPUB → PDF Conversion

When only EPUB format is available, auto-convert using weasyprint:

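    # Single file
    python scripts/epub_to_pdf.py input.epub output.pdf

    # Batch-convert a directory
    python scripts/epub_to_pdf.py --batch ~/Books/

Requirements: ebooklib, weasyprint, and installed CJK fonts.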
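
As a rough idea of how such a conversion can work with these two libraries, the sketch below merges the chapter bodies into one HTML document and renders it. It is not the bundled conversion script: the names `merge_chapters` and `epub_to_pdf` and the font stack in `CJK_CSS` are assumptions, and images and stylesheets embedded in the EPUB are ignored.

```python
import re
from typing import Iterable

# Assumed font stack; substitute whichever CJK fonts are installed locally.
CJK_CSS = 'body { font-family: "Noto Sans CJK SC", "Source Han Sans SC", sans-serif; }'

_BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def merge_chapters(chapters: Iterable[str]) -> str:
    """Join the <body> contents of each XHTML chapter into one HTML document."""
    bodies = []
    for xhtml in chapters:
        match = _BODY_RE.search(xhtml)
        bodies.append(match.group(1) if match else xhtml)
    return (
        "<html><head><meta charset='utf-8'><style>" + CJK_CSS + "</style></head><body>"
        + "\n".join(bodies)
        + "</body></html>"
    )


def epub_to_pdf(epub_path: str, pdf_path: str) -> None:
    """Read chapter documents with ebooklib and render them with weasyprint."""
    # Imported lazily so merge_chapters stays usable without these packages.
    import ebooklib
    from ebooklib import epub
    from weasyprint import HTML

    book = epub.read_epub(epub_path)
    chapters = [
        item.get_content().decode("utf-8", errors="replace")
        for item in book.get_items_of_type(ebooklib.ITEM_DOCUMENT)
    ]
    HTML(string=merge_chapters(chapters)).write_pdf(pdf_path)
```

The explicit font-family rule is the reason CJK fonts are listed as a requirement: without a font that covers Chinese glyphs, the rendered PDF would show empty boxes.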

Scripts Reference

Script	Purpose
download_book.py	Primary download from Source A
search_secondary_source.py

Source A Workflow (Primary)

    Search → get file host link → decrypt → wait for countdown → fetch via API → download with curl → extract ZIP

Step 1: Search

Search the primary library for the book title. Navigate to the download page and extract the file host URL and password.

Step 2: Decrypt

Navigate to the file host URL, enter the password, and click decrypt.

Step 3: Wait for countdown

The file hosting service enforces a countdown before the download becomes available. Do not skip it.

Step 4: Fetch real download URL

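Get page variables:

    JSON.stringify({apiserver, userid, fileid, shareid, filechk, starttime, waitseconds, verifycode})

Call API:

    (async () => {
      var url = apiserver + '/getfile_url.php?uid=' + userid
        + '&fid=' + fileid + '&folderid=0&shareid=' + shareid
        + '&filechk=' + filechk + '&starttime=' + starttime
        + '&waitseconds=' + waitseconds + '&mb=0&app=0&acheck=0'
        + '&verifycode=' + verifycode + '&rd=' + Math.random();
      var headers = typeof getAjaxHeaders === 'function' ? getAjaxHeaders() : {};
      var resp = await fetch(url, {headers: headers});
      return JSON.stringify(await resp.json());
    })()

A response with code: 200 means that downurl holds the real download URL.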
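
For scripted use, the query string of this step and the check on the response can also be expressed in Python. This is a sketch under assumptions: the helper names `build_file_url` and `extract_download_url` are hypothetical, the parameter names are copied from the browser call documented for this step, and the code field is compared as text because its JSON type is not stated here.

```python
import random
from typing import Mapping, Optional
from urllib.parse import urlencode


def build_file_url(page_vars: Mapping[str, object]) -> str:
    """Assemble the API request URL from the variables read off the page."""
    params = {
        "uid": page_vars["userid"],
        "fid": page_vars["fileid"],
        "folderid": 0,
        "shareid": page_vars["shareid"],
        "filechk": page_vars["filechk"],
        "starttime": page_vars["starttime"],
        "waitseconds": page_vars["waitseconds"],
        "mb": 0,
        "app": 0,
        "acheck": 0,
        "verifycode": page_vars["verifycode"],
        "rd": random.random(),  # cache-busting value, as in the browser call
    }
    return str(page_vars["apiserver"]) + "/getfile_url.php?" + urlencode(params)


def extract_download_url(response: Mapping[str, object]) -> Optional[str]:
    """Return downurl when the API reports code 200, otherwise None."""
    if str(response.get("code")) == "200" and response.get("downurl"):
        return str(response["downurl"])
    return None
```

Note that the browser call also sends the headers returned by getAjaxHeaders() when that function exists; a request made outside the page would not carry them, which may be why the workflow runs this step in the browser.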

Step 5: Download

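    curl -L -o book.zip "DOWNURL" \
      -H "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)" \
      --max-time 1200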

Step 6: Extract ZIP (GBK encoding)

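    import os
    import zipfile

    with zipfile.ZipFile("book.zip", "r") as z:
        for info in z.infolist():
            # Names without the UTF-8 flag are decoded as cp437; re-decode them as GBK
            try:
                name = info.filename.encode("cp437").decode("gbk")
            except (UnicodeEncodeError, UnicodeDecodeError):
                name = info.filename
            ext = os.path.splitext(name)[1].lower()
            if ext in (".epub", ".azw3", ".mobi", ".pdf", ".txt"):
                data = z.read(info.filename)
                with open(os.path.basename(name), "wb") as f:
                    f.write(data)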
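
The GBK handling in this step relies on one fact: Python's zipfile decodes entry names as cp437 when the archive does not set the UTF-8 flag, so a GBK-encoded name arrives garbled but with its raw bytes recoverable. The round trip below illustrates this; `fix_zip_name` is a hypothetical helper name, not part of the scripts.

```python
def fix_zip_name(name: str) -> str:
    """Recover a GBK file name that zipfile decoded as cp437."""
    try:
        return name.encode("cp437").decode("gbk")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The name was already decoded correctly (e.g. a UTF-8 flagged entry).
        return name


# Simulate what zipfile reports for a GBK-named entry without the UTF-8 flag.
garbled = "超越百岁.pdf".encode("gbk").decode("cp437")
print(fix_zip_name(garbled))         # prints 超越百岁.pdf
print(fix_zip_name("超越百岁.pdf"))  # already correct, returned unchanged
```

On Python 3.11 and later, opening the archive with `zipfile.ZipFile(path, metadata_encoding="gbk")` is an alternative to re-decoding each name by hand.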

Book Name Matching Strategy

When a book title is long or contains several names (e.g. box sets), search keywords are derived as follows:

- Remove subtitles (the text after "：" or ":")
- Remove parenthetical content ("（...）", "(...)")
- Remove "套装共X册" bundle descriptions
- Split "+"-connected titles into individual books
- Try each keyword until a match is found
- Fall back to the full title plus the author
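
The rules above can be sketched as a small keyword generator. This is an illustrative reading, not the scripts' implementation: the name `candidate_keywords` is hypothetical, the "+"-joined names of a bundle are assumed to follow the series prefix when a colon is present, and the fallback is assumed to concatenate the shortened title with the author.

```python
import re
from typing import List

_PARENS = re.compile(r"（[^）]*）|\([^)]*\)")            # full-width and ASCII parentheses
_BUNDLE = re.compile(r"套装共?[0-9一二三四五六七八九十]+册")  # "套装共X册" descriptions


def candidate_keywords(title: str, author: str = "") -> List[str]:
    """Return search keywords in the order they should be tried."""
    cleaned = _BUNDLE.sub("", _PARENS.sub("", title)).strip()
    parts = re.split(r"[：:]", cleaned, maxsplit=1)
    main = parts[0].strip()
    rest = parts[1].strip() if len(parts) == 2 else ""
    if "+" in rest:
        names = rest.split("+")   # bundle listed after a series prefix
    elif "+" in main:
        names = main.split("+")   # bundle without a prefix
    else:
        names = [main]            # plain title with its subtitle removed
    keywords = [n.strip() for n in names if n.strip()]
    if author:
        keywords.append(main + author)  # fallback: title plus author
    return keywords
```

A caller would try each returned keyword in turn and stop at the first search that yields a match.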

Examples:

- "杨定一全部生命系列：真原医+静坐+好睡（套装3册）" → tries "真原医", "静坐", "好睡"
- "超越百岁：长寿的科学与艺术" → tries "超越百岁", then "超越百岁彼得·阿提亚"

Format Selection

Flag	Description
--format pdf	PDF only (default, preferred for NotebookLM)
--format epub

Batch Download

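    python scripts/batch_download.py --book-list books.json --output-dir ~/Books/

JSON format:

    [
      {"title": "超越百岁", "file_url": "<file host URL>", "password": "<password>"}
    ]

Features: resume via _progress.json, skipping of existing files, and rate limiting.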
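
The three features can be combined in a loop like the one below. It is a sketch under assumptions, not the logic of batch_download.py: the layout of _progress.json (a JSON list of finished titles), the `<title>.pdf` target name, the `run_batch` function, and the `download` callback are all hypothetical.

```python
import json
import time
from pathlib import Path
from typing import Callable, Dict, List


def run_batch(
    books: List[Dict[str, str]],
    output_dir: Path,
    download: Callable[[Dict[str, str], Path], bool],
    delay_seconds: float = 5.0,
) -> List[str]:
    """Download each book once, recording finished titles in _progress.json."""
    output_dir.mkdir(parents=True, exist_ok=True)
    progress_file = output_dir / "_progress.json"
    done = set()
    if progress_file.exists():
        done = set(json.loads(progress_file.read_text(encoding="utf-8")))
    for book in books:
        title = book["title"]
        target = output_dir / f"{title}.pdf"
        if title in done or target.exists():
            continue  # resume: skip finished or already present books
        if download(book, target):
            done.add(title)
            # Persist after every success so an interrupted run can resume.
            progress_file.write_text(
                json.dumps(sorted(done), ensure_ascii=False), encoding="utf-8"
            )
        time.sleep(delay_seconds)  # rate limiting between requests
    return sorted(done)
```

Writing the progress file after each book, rather than once at the end, is what makes an interrupted run resumable.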

Troubleshooting

Problem	Solution
IP blocking	Use browser tool, not web_fetch
Link 404

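Chinese Ebook Downloader

Download Chinese ebooks from multiple sources, with automatic fallback and format conversion.

Quick Start

    # Download a single book (multi-source fallback)
    python scripts/download_book.py --title "超越百岁" --author "彼得·阿提亚"

    # Batch download from multiple sources (A→B→C fallback + EPUB→PDF conversion)
    python scripts/multi_source_download.py ~/Books/

    # Search Anna's Archive directly
    python scripts/search_source_c.py "book title author"

    # Convert an EPUB to PDF
    python scripts/epub_to_pdf.py book.epub book.pdf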

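Download Sources (Priority Order)

Source	Coverage	Limit	Notes
Source A (online book library)	~100%	None	Primary; high coverage for popular Chinese books
Source B (secondary library)	~8%	None	Fallback for missing books
Source C (Anna's Archive)	Broad	Rate-limited	Last resort; uses the libgen.li mirror

Note: Z-Library is no longer used because of its limit of 10 downloads per day.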

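Multi-Source Fallback

The multi_source_download.py script automatically tries the sources in order:

    Source A → Source B → Source C → EPUB→PDF conversion

Workflow per book:

1. Try Source A (ZIP → extract PDF/EPUB)
2. If that fails, try Source B (file host download)
3. If that fails, try Source C (Anna's Archive via libgen.li)
4. If only an EPUB is found, auto-convert it to PDF using weasyprint

Usage:

    # Edit the BOOKS list in the script, then run:
    python scripts/multi_source_download.py ~/Books/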

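EPUB → PDF Conversion

When only the EPUB format is available, auto-convert it using weasyprint:

    # Single file
    python scripts/epub_to_pdf.py input.epub output.pdf

    # Batch-convert a directory
    python scripts/epub_to_pdf.py --batch ~/Books/

Requirements: ebooklib, weasyprint, and installed CJK fonts.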

Scripts Reference

Script	Purpose
download_book.py	Primary download from Source A
search_secondary_source.py

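Source A Workflow (Primary)

    Search → get file host link → decrypt → wait for countdown → fetch via API → download with curl → extract ZIP

Step 1: Search

Search the primary library for the book title. Navigate to the download page and extract the file host URL and password.

Step 2: Decrypt

Navigate to the file host URL, enter the password, and click decrypt.

Step 3: Wait for countdown

The file hosting service enforces a countdown before the download becomes available. Do not skip it.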

Step 4: Fetch real download URL

Get page variables:

    JSON.stringify({apiserver, userid, fileid, shareid, filechk, starttime, waitseconds, verifycode})

Call API:

    (async () => {
      var url = apiserver + '/getfile_url.php?uid=' + userid
        + '&fid=' + fileid + '&folderid=0&shareid=' + shareid
        + '&filechk=' + filechk + '&starttime=' + starttime
        + '&waitseconds=' + waitseconds + '&mb=0&app=0&acheck=0'
        + '&verifycode=' + verifycode + '&rd=' + Math.random();
      var headers = typeof getAjaxHeaders === 'function' ? getAjaxHeaders() : {};
      var resp = await fetch(url, {headers: headers});
      return JSON.stringify(await resp.json());
    })()

A response with code: 200 means that downurl holds the real download URL.

Step 5: Download

    curl -L -o book.zip "DOWNURL" \
      -H "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)" \
      --max-time 1200

Step 6: Extract ZIP (GBK encoding)

    import os
    import zipfile

    with zipfile.ZipFile("book.zip", "r") as z:
        for info in z.infolist():
            # Names without the UTF-8 flag are decoded as cp437; re-decode them as GBK
            try:
                name = info.filename.encode("cp437").decode("gbk")
            except (UnicodeEncodeError, UnicodeDecodeError):
                name = info.filename
            ext = os.path.splitext(name)[1].lower()
            if ext in (".epub", ".azw3", ".mobi", ".pdf", ".txt"):
                data = z.read(info.filename)
                with open(os.path.basename(name), "wb") as f:
                    f.write(data)

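Book Name Matching Strategy

When a book title is long or contains several names (e.g. box sets), search keywords are derived as follows:

- Remove subtitles (the text after "：" or ":")
- Remove parenthetical content ("（...）", "(...)")
- Remove "套装共X册" bundle descriptions
- Split "+"-connected titles into individual books
- Try each keyword until a match is found
- Fall back to the full title plus the author

Examples:

- "杨定一全部生命系列：真原医+静坐+好睡（套装3册）" → tries "真原医", "静坐", "好睡"
- "超越百岁：长寿的科学与艺术" → tries "超越百岁", then "超越百岁彼得·阿提亚"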

Format Selection

Flag	Description
--format pdf	PDF only (default, preferred for NotebookLM)
--format epub

Batch Download

    python scripts/batch_download.py --book-list books.json --output-dir ~/Books/

JSON format:

    [
      {"title": "超越百岁", "file_url": "<file host URL>", "password": "<password>"}
    ]

Features: resume via _progress.json, skipping of existing files, and rate limiting.

Troubleshooting

Problem	Solution
IP blocking	Use the browser tool, not web_fetch
Link 404
