WeChat Article Reader
Extract full article content from mp.weixin.qq.com URLs.
When to Use
- User shares a WeChat article link (`mp.weixin.qq.com/s/xxx`)
- Need to read, summarize, analyze, or archive a WeChat article
- ContentPipe Scout node receives a WeChat URL for reference
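
As a rough illustration of the first trigger, a link check might look like the sketch below. The helper name `is_wechat_article_url` and the path rule (`/s/<id>` short links plus `/s?...` long links) are assumptions for illustration, not part of this skill's documented API.

```python
from urllib.parse import urlparse

WECHAT_HOST = "mp.weixin.qq.com"


def is_wechat_article_url(url: str) -> bool:
    """Return True if url looks like an article link on mp.weixin.qq.com.

    Hypothetical helper: the skill itself may use a different rule.
    """
    # Links are often pasted without a scheme, so add one before parsing.
    if "://" not in url:
        url = "https://" + url
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Assumed article paths: /s/<id> (short form) or /s?... (long form).
    is_article_path = parsed.path == "/s" or parsed.path.startswith("/s/")
    return host == WECHAT_HOST and is_article_path
```

A caller could use this to decide whether to route a shared link to this skill or to a generic web fetcher.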
Quick Start
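```bash
# One-time setup (installs headless Chromium, about 200 MB)
python3 SKILL_DIR/scripts/setup.py

# Extract an article
python3 SKILL_DIR/scripts/fetch_article.py https://mp.weixin.qq.com/s/xxx

# Output: JSON with the title, author, publish time, body content, and word count
```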
How It Works
WeChat articles are JS-rendered, so a plain HTTP request only gets an empty shell. This skill uses Playwright with headless Chromium to:
1. Launch a headless browser with anti-detection flags
2. Navigate to the WeChat URL and wait for `networkidle`
3. Wait for `#js_content` (the article body container)
4. Extract the title (`h1#activity-name`), author, publish time, and body text
5. Clean HTML → plain text (strip scripts and styles, compress whitespace)
6. Return structured JSON
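
The steps above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function names `clean_html` and `fetch_article`, the single `--disable-blink-features=AutomationControlled` flag, and the 30-second timeout are all assumptions. Author and publish time are left out because the selectors for them are not given here.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def clean_html(html: str) -> str:
    """Step 5: strip scripts/styles and tags, then compress whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps block elements apart; runs are then collapsed.
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()


def fetch_article(url: str, timeout_ms: int = 30_000) -> dict:
    """Steps 1-4 and 6, assuming Playwright and its Chromium build are installed."""
    # Imported lazily so clean_html stays usable without Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(
            headless=True,
            # Assumed flag; the skill's real anti-detection flags may differ.
            args=["--disable-blink-features=AutomationControlled"],
        )
        try:
            page = browser.new_page()
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            page.wait_for_selector("#js_content", timeout=timeout_ms)
            title = page.inner_text("h1#activity-name").strip()
            body_html = page.inner_html("#js_content")
        finally:
            browser.close()

    return {
        "success": True,
        "title": title,
        "content": clean_html(body_html),
        "source": "playwright",
        "url": url,
    }
```

Keeping the HTML cleanup in a pure function makes it easy to test without launching a browser.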
Fallback: Mirror Search
If Playwright is unavailable, the skill searches Chinese content aggregators (53ai.com, 36kr.com, juejin.cn, woshipm.com) for mirror copies of the article.
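
One way to picture the fallback is as a small dispatcher that tries the browser path first and the mirror search second. The sketch below is only an assumption about the control flow: `fetch_with_fallback` and the two fetcher callables are hypothetical names, and "unavailable" is modeled as an `ImportError` from a missing Playwright package. How the mirror search itself queries the aggregators is not shown.

```python
from typing import Callable, Optional

# A fetcher takes a URL and returns article fields, or None if nothing was found.
Fetcher = Callable[[str], Optional[dict]]


def fetch_with_fallback(url: str, primary: Fetcher, fallback: Fetcher) -> dict:
    """Try the Playwright fetcher, then the mirror fetcher (hypothetical helper)."""
    for source, fetcher in (("playwright", primary), ("mirror", fallback)):
        try:
            result = fetcher(url)
        except ImportError:
            # Assumed meaning of "unavailable": the package is not installed.
            continue
        if result is not None:
            # Record which path produced the article.
            return {**result, "success": True, "source": source, "url": url}
    return {"success": False, "url": url}
```

The `source` value mirrors the `playwright` / `mirror` distinction in the structured result, so callers can tell which path produced the content.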
Python API
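```python
from fetch_article import fetch_wechat_article

result = fetch_wechat_article("https://mp.weixin.qq.com/s/xxx")

# Shape of the returned dict:
# result = {
#     "success": True,
#     "title": "Article title",
#     "author": "Author name",
#     "publish_time": "2026-03-10",
#     "content": "Full body text...",
#     "word_count": 2500,
#     "source": "playwright",  # or "mirror"
#     "url": "https://mp.weixin.qq.com/s/xxx",
# }
```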
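
A caller will usually want to check `success` before touching the other fields. The sketch below shows one hedged way to do that. `describe_result` is a hypothetical helper, and it assumes a failed fetch returns a dict with `success` set to a falsy value and possibly no other fields.

```python
def describe_result(result: dict) -> str:
    """Build a one-line summary from a fetch result (hypothetical helper)."""
    if not result.get("success"):
        # Assumption: failed results may lack every field except "url".
        return f"Could not fetch {result.get('url', 'article')}"
    return (
        f"{result['title']} by {result['author']} "
        f"({result['publish_time']}), "
        f"{result['word_count']} words via {result['source']}"
    )
```

For example, passing the sample result through `describe_result` yields a short line suitable for logging or for prefacing a summary.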
Limitations
- Requires a one-time Chromium install (`python3 scripts/setup.py`)
- First fetch takes ~5-10s (browser startup); subsequent fetches take ~3-5s (browser reuse)
- Cannot bypass WeChat login walls (paid content, follower-only articles)
- Mirror fallback only works for popular or widely shared articles
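WeChat Official Account Article Reader
Extract full article content from mp.weixin.qq.com links.
When to Use
- User shares a WeChat Official Account article link (`mp.weixin.qq.com/s/xxx`)
- Need to read, summarize, analyze, or archive a WeChat Official Account article
- ContentPipe Scout node receives a WeChat Official Account link to use as reference
Quick Start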
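```bash
# One-time setup (installs the headless Chromium browser, about 200 MB)
python3 SKILL_DIR/scripts/setup.py

# Extract an article
python3 SKILL_DIR/scripts/fetch_article.py https://mp.weixin.qq.com/s/xxx

# Output: JSON containing the title, author, publish time, body content, and word count
```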
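How It Works
WeChat Official Account articles are rendered by JavaScript, so a plain HTTP request only retrieves an empty shell page. This skill uses Playwright with headless Chromium to:
1. Launch a headless browser with anti-detection flags
2. Navigate to the article link and wait for the `networkidle` state
3. Wait for `#js_content` (the article body container) to load
4. Extract the title (`h1#activity-name`), author, publish time, and body text
5. Clean HTML → plain text (remove scripts and styles, compress whitespace)
6. Return structured JSON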
Fallback: Mirror Search
If Playwright is unavailable, this skill searches Chinese content aggregators (53ai.com, 36kr.com, Juejin, Woshipm) for mirror copies of the article.
Python API
```python
from fetch_article import fetch_wechat_article

result = fetch_wechat_article("https://mp.weixin.qq.com/s/xxx")

# Shape of the returned dict:
# result = {
#     "success": True,
#     "title": "Article title",
#     "author": "Author name",
#     "publish_time": "2026-03-10",
#     "content": "Full body text...",
#     "word_count": 2500,
#     "source": "playwright",  # or "mirror"
#     "url": "https://mp.weixin.qq.com/s/xxx",
# }
```
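Limitations
- Requires a one-time Chromium install (`python3 scripts/setup.py`)
- First fetch takes about 5-10 seconds (browser startup); subsequent fetches take about 3-5 seconds (browser reuse)
- Cannot bypass WeChat Official Account login walls (paid content, follower-only articles)
- Mirror fallback only works for popular or widely shared articles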