Tech Weekly Briefing Skill
Generate comprehensive weekly tech news briefings from major English-language technology media sources with structured format and interactive navigation.
Core Principle
Data Integrity First: Never fabricate data. All market data and news must be fetched from actual sources before generating reports.
Media Sources
Primary Sources (Active)
| Source | RSS URL | Status | Fetch Method |
|---|---|---|---|
| TechCrunch | https://techcrunch.com/feed/ | ✅ Active | urllib |
| The Verge | https://www.theverge.com/rss/index.xml | ✅ Active | urllib |
| Wired | https://www.wired.com/feed/rss | ✅ Active | urllib |
| Ars Technica | https://arstechnica.com/feed/ | ✅ Active | urllib |
| MIT Technology Review | https://www.technologyreview.com/feed/ | ✅ Active | urllib |
| The Information | https://www.theinformation.com/feed | ✅ Active | curl (bypasses 403) |
Secondary Sources (Configured, To Verify)
| Source | Status | Note |
|---|---|---|
| Axios | ⚠️ Configured | Needs verification |
| Bloomberg Tech | ⚠️ Configured | May have paywall |
| Reuters Tech | ⚠️ Configured | Needs verification |
| WSJ Tech | ⚠️ Configured | Paywall likely |
Data Collection Workflow
Step 1: Daily Data Fetch (Every Day at 00:00)
Command:
cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py daily
What it does:
1. Fetches RSS feeds from all 6 primary sources
2. Filters low-quality content (promotions, sports, lifestyle)
3. Deduplicates articles by title/URL similarity
4. Saves to data/articles_YYYY-MM-DD.json
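The fetch-and-parse step above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of scripts/generate-briefing.py: the function names, the article dictionary keys, and the User-Agent value are assumptions, and the parser only handles RSS 2.0 `<item>` elements (an Atom feed would need `<entry>` handling instead).

```python
import urllib.request
import xml.etree.ElementTree as ET


def parse_rss(xml_text: str, source: str) -> list[dict]:
    """Extract title, link and pubDate from every RSS 2.0 <item>."""
    root = ET.fromstring(xml_text)
    articles = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if title and link:  # skip items that cannot be attributed to a URL
            articles.append({
                "source": source,
                "title": title,
                "url": link,
                "published": (item.findtext("pubDate") or "").strip(),
            })
    return articles


def fetch_feed(url: str, timeout: float = 20.0) -> str:
    """Download a feed with urllib; raises on network or HTTP errors."""
    # The User-Agent value is an assumption, not taken from the real script.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Offline sample so the parser can be tried without network access.
SAMPLE = """<rss version="2.0"><channel>
<item><title>Example story</title><link>https://example.com/a</link>
<pubDate>Mon, 09 Mar 2026 08:00:00 +0000</pubDate></item>
</channel></rss>"""

print(parse_rss(SAMPLE, "Example Feed"))
```

Keeping parsing separate from downloading means the parser can be checked against a saved feed, which fits the rule that reports use only data that was actually fetched.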
Cron Setup:
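# Add to crontab
crontab -e
# Add the following line to run the fetch every day at 00:00:
0 0 * * * cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py daily >> /tmp/tech-weekly-cron.log 2>&1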
Step 2: Weekly Report Generation (Every Saturday 09:00)
Command:
cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py weekly
What it does:
1. Loads articles from the past 7 days
2. Aggregates similar stories across sources
3. Identifies hot news (covered by ≥2 media outlets)
4. Categorizes by company mentions
5. Generates the formatted report
6. Saves to /tmp/tech-weekly-briefing-YYYYMMDD.txt
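A rough sketch of the first and third steps is shown below. The names `recent_article_paths` and `hot_news` are hypothetical, the 7-day window is assumed to include today, and each cluster is assumed to be a list of article dictionaries with a `source` key; the real script may organize this differently.

```python
from datetime import date, timedelta


def recent_article_paths(today: date, days: int = 7) -> list[str]:
    """File names for the last `days` daily fetches, oldest first."""
    return [
        f"data/articles_{(today - timedelta(days=offset)).isoformat()}.json"
        for offset in range(days - 1, -1, -1)
    ]


def hot_news(clusters: list[list[dict]], min_sources: int = 2,
             limit: int = 20) -> list[list[dict]]:
    """Keep clusters covered by >= min_sources distinct outlets, most-covered first."""
    scored = []
    for cluster in clusters:
        sources = {article["source"] for article in cluster}
        if len(sources) >= min_sources:
            scored.append((len(sources), cluster))
    # Stable sort: clusters with equal coverage keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cluster for _, cluster in scored[:limit]]
```

Counting distinct sources rather than articles matters here: two articles from the same outlet do not make a story "hot".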
Cron Setup:
# Every Saturday at 09:00 Beijing Time
0 9 * * 6 cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py weekly >> /tmp/tech-weekly-cron.log 2>&1
Report Format Specification
Structure (4 Sections)
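📊 外媒科技周报 | YYYY-MM-DD
1️⃣ 概览 (Overview) - Chinese only
2️⃣ 🔥 热点新闻 (Hot News) - English titles, all source links
3️⃣ 🚗 Robotaxi Weekly - All autonomous driving news
4️⃣ [Inline Buttons] - Company categories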
Section 1: 概览 / Overview
Language: Chinese only
Content: One paragraph summarizing ALL hot news stories
Format:
📈 概览
本周扫描X家科技媒体,获取X篇文章,聚类为X条独特新闻。
X条热点新闻被≥2家媒体报道:[①新闻摘要;②新闻摘要;③新闻摘要;④新闻摘要]
Section 2: 🔥 热点新闻 / Hot News
Criteria: Stories covered by ≥2 media outlets
Language: English titles only (no Chinese translation in body)
Format:
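🔥 热点新闻(按媒体报道数倒序)
1️⃣ [English title]
📰 X家:
• Source1: https://link1
• Source2: https://link2
• Source3: https://link3
2️⃣ [English title]
📰 X家:
• Source1: https://link1
• Source2: https://link2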
Requirements:
- All source links must be clickable
- Listed by coverage count (descending)
- Maximum 20 hot news items
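As an illustration of the entry layout (a numbered English title, a `📰 X家:` count line, then one bullet per source), a rendering helper might look like the sketch below. The function names are hypothetical, and building keycap labels digit by digit for items past 9 is an assumption; the real script may number them differently.

```python
def keycap(number: int) -> str:
    """Build a keycap emoji label (1 -> 1️⃣) digit by digit."""
    return "".join(digit + "\ufe0f\u20e3" for digit in str(number))


def render_hot_item(index: int, title: str, links: list[tuple[str, str]]) -> str:
    """Render one hot-news entry: title, outlet count, then one link per outlet."""
    lines = [f"{keycap(index)} {title}", f"📰 {len(links)}家:"]
    lines += [f"• {source}: {url}" for source, url in links]
    return "\n".join(lines)


print(render_hot_item(1, "Example headline", [
    ("TechCrunch", "https://example.com/tc"),
    ("Wired", "https://example.com/wired"),
]))
```

The URLs are passed through unchanged, in line with the rule that links must come from the RSS data rather than being generated.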
Section 3: 🚗 Robotaxi Weekly
Scope: All autonomous driving news (not just ≥2 coverage)
Keywords: robotaxi, waymo, zoox, aurora, cruise, autonomous, self-driving
Format:
🚗 Robotaxi Weekly / 自动驾驶一周汇总
1. [Title]
📰 Source | 🔗 https://link
2. [Title]
📰 Source | 🔗 https://link
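One way the keyword scope could be applied is a case-insensitive match on article titles, sketched below. The keyword list is the one given above; the function name and the choice to match only at the start of a word (so that plurals such as "robotaxis" still match) are assumptions.

```python
import re

ROBOTAXI_KEYWORDS = [
    "robotaxi", "waymo", "zoox", "aurora", "cruise", "autonomous", "self-driving",
]

# Leading word boundary only, so suffixes like "robotaxis" are still caught.
_ROBOTAXI_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in ROBOTAXI_KEYWORDS) + r")",
    re.IGNORECASE,
)


def is_robotaxi_news(title: str) -> bool:
    """True if the title mentions any autonomous-driving keyword."""
    return _ROBOTAXI_RE.search(title) is not None
```

Generic words such as "cruise" and "aurora" will also match unrelated stories (a cruise ship, an aurora forecast), so a match is best treated as a candidate rather than a confirmed autonomous-driving item.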
Section 4: Inline Buttons
Layout:
[🔴 OpenAI (X篇)] [🟣 Anthropic (X篇)]
[🔵 Google (X篇)] [🍎 Apple (X篇)]
[🟢 NVIDIA (X篇)] [🚗 Waymo]
[📋 查看全部]
Companies Tracked:
- OpenAI, Anthropic, Google, Apple, Microsoft, Amazon, Meta
- Tesla, NVIDIA, Waymo, Zoox, Aurora, Cruise
- Nintendo, Sony, Netflix, Block, Robinhood
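The per-company counts shown on the buttons could be derived along these lines. This is a sketch under stated assumptions: the dictionary shape of `COMPANY_KEYWORDS` and the alias keywords ("chatgpt", "gemini") are illustrative only, and the button emoji are omitted.

```python
import re
from collections import Counter

# Hypothetical shape: company name -> lowercase keywords. Aliases are examples.
COMPANY_KEYWORDS = {
    "OpenAI": ["openai", "chatgpt"],
    "Google": ["google", "gemini"],
    "Waymo": ["waymo"],
}


def companies_in(title: str) -> list[str]:
    """Companies whose keywords appear as whole words in the title."""
    found = []
    for company, keywords in COMPANY_KEYWORDS.items():
        if any(re.search(r"\b" + re.escape(word) + r"\b", title, re.IGNORECASE)
               for word in keywords):
            found.append(company)
    return found


def button_labels(titles: list[str]) -> list[str]:
    """One label per mentioned company, most-mentioned first."""
    counts = Counter(company for title in titles for company in companies_in(title))
    return [f"[{company} ({count}篇)]" for company, count in counts.most_common()]
```

Whole-word matching is a deliberate choice for names like Block and Meta, which would otherwise match inside ordinary words; even so, such names remain ambiguous and may need stricter rules.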
Content Validation Rules
Rule 1: No Fabricated Data
MUST:
- Execute fetch commands before generating reports
- Use actual fetched data only
- Mark missing data as [获取失败] ("fetch failed") if a command fails
NEVER:
- Fill in prices/percentages without executing commands
- Present cached data as real-time without a timestamp
- Guess or estimate any numeric values
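A minimal sketch of how a fetch wrapper could enforce this rule: a failed fetch yields the `[获取失败]` marker instead of a guessed value, and every result carries a timestamp. The function name and the result dictionary keys are hypothetical.

```python
from datetime import datetime, timezone

FETCH_FAILED = "[获取失败]"  # marker required by Rule 1


def fetch_or_mark(fetch, *args) -> dict:
    """Run a fetch callable; on any failure return the marker, never a guess."""
    fetched_at = datetime.now(timezone.utc).isoformat(timespec="seconds")
    try:
        return {"value": fetch(*args), "fetched_at": fetched_at, "ok": True}
    except Exception as exc:  # network errors, HTTP errors, parse errors
        return {"value": FETCH_FAILED, "fetched_at": fetched_at,
                "ok": False, "error": str(exc)}
```

Recording `fetched_at` on every result is what lets a report distinguish cached data from fresh data.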
Rule 2: Accurate Source Attribution
MUST:
- Every news item must have source link(s)
- Multi-source stories must list ALL sources
- Use actual URLs from RSS, not generated links
Rule 3: Deduplication
Algorithm:
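1. Treat two articles as the same story if the Jaccard similarity of their title keywords is ≥ 18% (0.18)
2. OR if their titles share an exact run of 3 or more consecutive words
3. Remove duplicates, keeping the earliest-published article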
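The rule above might be implemented roughly as follows. This is a sketch, not the actual `is_same_news()` from the script: the tokenizer, the stopword list, and the assumption that `published` is an ISO 8601 string (so that string order equals time order) are all illustrative choices.

```python
import re

# Assumed stopword list; the real script's keyword extraction is not shown.
STOPWORDS = frozenset({"a", "an", "the", "of", "to", "in", "on", "for",
                       "and", "or", "is", "at", "with", "its"})


def tokens(title: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", title.lower())


def jaccard(a: list[str], b: list[str]) -> float:
    """|A ∩ B| / |A ∪ B| over keyword sets with stopwords removed."""
    set_a, set_b = set(a) - STOPWORDS, set(b) - STOPWORDS
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def shares_run(a: list[str], b: list[str], n: int = 3) -> bool:
    """True if both token lists contain the same run of n consecutive words."""
    grams = {tuple(a[i:i + n]) for i in range(len(a) - n + 1)}
    return any(tuple(b[i:i + n]) in grams for i in range(len(b) - n + 1))


def is_same_news(title_a: str, title_b: str, threshold: float = 0.18) -> bool:
    a, b = tokens(title_a), tokens(title_b)
    return jaccard(a, b) >= threshold or shares_run(a, b)


def dedupe(articles: list[dict]) -> list[dict]:
    """Keep the earliest-published article of each group of duplicates."""
    kept = []
    for article in sorted(articles, key=lambda item: item["published"]):
        if not any(is_same_news(article["title"], k["title"]) for k in kept):
            kept.append(article)
    return kept
```

A threshold of 0.18 is permissive: two titles sharing 4 of 9 distinct keywords score about 0.44 and are merged easily, so raising the threshold reduces false merges at the cost of letting more duplicates through.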
Rule 4: Low-Quality Filtering
Filtered Content:
- Sports: "baseball game", "championship", "tournament"
- Promotions: "promo code", "% off", "coupon", "discount"
- Shopping: "mattress firm", "kitchenaid promo", "norton coupon"
- Lifestyle: "bird-watchers", "brew coffee", "sleep week deals"
Logged: All filtered items printed during daily fetch
Manual Operations
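Check Data Collection Status
# List the collected article files
ls -la ~/.openclaw/workspace-group/skills/tech-weekly-briefing/data/
# Count today's articles (run from the skill directory)
python3 -c "import json; data=json.load(open('data/articles_$(date +%Y-%m-%d).json')); print(f'{len(data)} articles today')"
Verify Media Source
# Test RSS accessibility
curl -s https://techcrunch.com/feed/ | head -5
curl -s -A "Mozilla/5.0" https://www.theinformation.com/feed | head -5
Force Re-fetch Today
rm ~/.openclaw/workspace-group/skills/tech-weekly-briefing/data/articles_$(date +%Y-%m-%d).json
python3 scripts/generate-briefing.py daily
Generate Test Report
# Generate a report from existing data
python3 scripts/generate-briefing.py weekly
# View the output
cat /tmp/tech-weekly-briefing-$(date +%Y%m%d).txt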
Troubleshooting
The Information Returns 403
Cause: The site rejects requests from Python's urllib with HTTP 403, while the same request made with curl succeeds
Solution: The script automatically fetches The Information through a curl subprocess
Verify:
curl -s -A "Mozilla/5.0" https://www.theinformation.com/feed | head -10
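The curl fallback could be wrapped in Python roughly like this. It is a sketch, not the script's actual code: the function names, the timeout, and the extra `-f` and `-L` flags are assumptions added so that HTTP errors surface as failures and redirects are followed.

```python
import subprocess


def build_curl_command(url: str, user_agent: str = "Mozilla/5.0",
                       timeout: int = 30) -> list[str]:
    """Argument list for curl: silent, fail on HTTP errors, follow redirects."""
    return ["curl", "-s", "-f", "-L", "--max-time", str(timeout),
            "-A", user_agent, url]


def fetch_with_curl(url: str, **options) -> str:
    """Fetch a URL via a curl subprocess; raises CalledProcessError on failure."""
    result = subprocess.run(build_curl_command(url, **options),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Without `-f`, curl exits with status 0 even on a 403 response, so the error page would be passed on as if it were feed content.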
No Articles Found
Check:
1. Data directory: ls data/
2. Last fetch: cat /tmp/tech-weekly-cron.log
3. RSS status: blogwatcher blogs
Duplicate Articles in Report
Cause: The similarity threshold is mis-tuned: too high lets duplicates through, too low merges distinct stories
Adjust: Edit is_same_news() function in INLINECODE7
Missing Company in Buttons
Add Company:
1. Edit COMPANY_KEYWORDS in INLINECODE9
2. Re-run weekly report generation
File Structure
CODEBLOCK14
Dependencies
- Python 3.10+
- INLINECODE10 CLI for RSS monitoring
- INLINECODE11 for The Information feed
- Standard libraries: json, re, urllib, subprocess, datetime
Key Performance Indicators
| Metric | Target |
|---|---|
| Daily fetch success rate | ≥95% (6/6 sources) |
| Article deduplication accuracy | ≥90% |
| Low-quality filter precision | ≥85% |
| Hot news detection (≥2 sources) | Capture all multi-source stories |
| Report generation time | <30 seconds |
Version History
| Version | Date | Changes |
|---|---|---|
| 1.0.0 | 2026-03-09 | Initial release with 6 sources, bilingual format, company buttons |
Usage Examples
User Request: "生成科技周报" (Generate the tech weekly briefing)
CODEBLOCK15
User Request: "查看OpenAI新闻" (Show OpenAI news)
CODEBLOCK16
User Request: "添加新公司追踪" (Track a new company)
CODEBLOCK17
Last Updated: 2026-03-09
Maintainer: OpenClaw Agent
Status: Production Ready
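Tech Weekly Briefing Skill
Generate structured weekly tech news briefings from major English-language technology media sources, with interactive navigation.
Core Principle
Data Integrity First: Never fabricate data. All market data and news must be fetched from actual sources before generating reports.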
Media Sources
Primary Sources (Active)
| Source | RSS URL | Status | Fetch Method |
|---|---|---|---|
| TechCrunch | https://techcrunch.com/feed/ | ✅ Active | urllib |
| The Verge | https://www.theverge.com/rss/index.xml | ✅ Active | urllib |
| Wired | https://www.wired.com/feed/rss | ✅ Active | urllib |
| Ars Technica | https://arstechnica.com/feed/ | ✅ Active | urllib |
| MIT Technology Review | https://www.technologyreview.com/feed/ | ✅ Active | urllib |
| The Information | https://www.theinformation.com/feed | ✅ Active | curl (bypasses the 403 error) |
Secondary Sources (Configured, To Verify)
| Source | Status | Note |
|---|---|---|
| Axios | ⚠️ Configured | Needs verification |
| Bloomberg Tech | ⚠️ Configured | May have paywall |
| Reuters Tech | ⚠️ Configured | Needs verification |
| WSJ Tech | ⚠️ Configured | May have paywall |
Data Collection Workflow
Step 1: Daily Data Fetch (Every Day at 00:00)
Command:
cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py daily
What it does:
1. Fetches RSS feeds from all 6 primary sources
2. Filters low-quality content (promotions, sports, lifestyle)
3. Deduplicates articles by title/URL similarity
4. Saves to data/articles_YYYY-MM-DD.json
Cron Setup:
# Add to crontab
crontab -e
# Add the following line to run the fetch every day at 00:00:
0 0 * * * cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py daily >> /tmp/tech-weekly-cron.log 2>&1
Step 2: Weekly Report Generation (Every Saturday 09:00)
Command:
cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py weekly
What it does:
1. Loads articles from the past 7 days
2. Aggregates similar stories across sources
3. Identifies hot news (covered by ≥2 media outlets)
4. Categorizes by company mentions
5. Generates the formatted report
6. Saves to /tmp/tech-weekly-briefing-YYYYMMDD.txt
Cron Setup:
# Every Saturday at 09:00 Beijing Time
0 9 * * 6 cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py weekly >> /tmp/tech-weekly-cron.log 2>&1
Report Format Specification
Structure (4 Sections)
📊 外媒科技周报 | YYYY-MM-DD
1️⃣ 概览 (Overview) - Chinese only
2️⃣ 🔥 热点新闻 (Hot News) - English titles, all source links
3️⃣ 🚗 Robotaxi Weekly - All autonomous driving news
4️⃣ [Inline Buttons] - Company categories
Section 1: 概览 / Overview
Language: Chinese only
Content: One paragraph summarizing all hot news stories
Format:
📈 概览
本周扫描X家科技媒体,获取X篇文章,聚类为X条独特新闻。
X条热点新闻被≥2家媒体报道:[①新闻摘要;②新闻摘要;③新闻摘要;④新闻摘要]
Section 2: 🔥 热点新闻 / Hot News
Criteria: Stories covered by ≥2 media outlets
Language: English titles only (no Chinese translation in body)
Format:
🔥 热点新闻(按媒体报道数倒序)
1️⃣ [English title]
📰 X家:
• Source1: https://link1
• Source2: https://link2
• Source3: https://link3
2️⃣ [English title]
📰 X家:
• Source1: https://link1
• Source2: https://link2
Requirements:
- All source links must be clickable
- Listed by coverage count (descending)
- Maximum 20 hot news items
Section 3: 🚗 Robotaxi Weekly
Scope: All autonomous driving news (not just stories with ≥2 outlets)
Keywords: robotaxi, waymo, zoox, aurora, cruise, autonomous, self-driving
Format:
🚗 Robotaxi Weekly / 自动驾驶一周汇总
1. [Title]
📰 Source | 🔗 https://link
2. [Title]
📰 Source | 🔗 https://link
Section 4: Inline Buttons
Layout:
[🔴 OpenAI (X篇)] [🟣 Anthropic (X篇)]
[🔵 Google (X篇)] [🍎 Apple (X篇)]
[🟢 NVIDIA (X篇)] [🚗 Waymo]
[📋 查看全部]
Companies Tracked:
- OpenAI, Anthropic, Google, Apple, Microsoft, Amazon, Meta
- Tesla, NVIDIA, Waymo, Zoox, Aurora, Cruise
- Nintendo, Sony, Netflix, Block, Robinhood
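Content Validation Rules
Rule 1: No Fabricated Data
MUST:
- Execute fetch commands before generating reports
- Use actual fetched data only
- Mark missing data as [获取失败] ("fetch failed") if a command fails
NEVER:
- Fill in prices/percentages without executing commands
- Present cached data as real-time without a timestamp
- Guess or estimate any numeric values
Rule 2: Accurate Source Attribution
MUST:
- Every news item must have source link(s)
- Multi-source stories must list all sources
- Use actual URLs from RSS, not generated links
Rule 3: Deduplication
Algorithm:
1. Jaccard similarity ≥ 18% on title keywords
2. OR an exact match of 3 or more consecutive words
3. Remove duplicates, keeping the earliest published
Rule 4: Low-Quality Filtering
Filtered Content:
- Sports: "baseball game", "championship", "tournament"
- Promotions: "promo code", "% off", "coupon", "discount"
- Shopping: "mattress firm", "kitchenaid promo", "norton coupon"
- Lifestyle: "bird-watchers", "brew coffee", "sleep week deals"
Logged: All filtered items are printed during the daily fetch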
Manual Operations
Check Data Collection Status
# List the collected article files
ls -la ~/.openclaw/workspace-group/skills/tech-weekly-briefing/data/
# Count today's articles (run from the skill directory)
python3 -c "import json; data=json.load(open('data/articles_$(date +%Y-%m-%d).json')); print(f'{len(data)} articles today')"
Verify Media Source
# Test RSS accessibility
curl -s https://techcrunch.com/feed/ | head -5
curl -s -A "Mozilla/5.0" https://www.theinformation.com/feed | head -5
Force Re-fetch Today
rm ~/.openclaw/workspace-group/skills/tech-weekly-briefing/data/articles_$(date +%Y-%m-%d).json
python3 scripts/generate-briefing.py daily
Generate Test Report
# Generate a report from existing data
python3 scripts/generate-briefing.py weekly
# View the output
cat /tmp/tech-weekly-briefing-$(date +%Y%m%d).txt
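Troubleshooting
The Information Returns 403
Cause: Python urllib is blocked; curl works
Solution: The script automatically uses a curl subprocess to fetch The Information
Verify:
curl -s -A "Mozilla/5.0" https://www.theinformation.com/feed | head -10
No Articles Found
Check:
1. Data directory: ls data/
2. Last fetch: cat /tmp/tech-weekly-cron.log
3. RSS status: blogwatcher blogs
Duplicate Articles in Report
Cause: Similarity threshold too low or too high
Adjust: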