Tech Weekly Briefing Skill

Generate comprehensive weekly tech news briefings from major English-language technology media sources with structured format and interactive navigation.

Core Principle

Data Integrity First: Never fabricate data. All market data and news must be fetched from actual sources before generating reports.

Media Sources

Primary Sources (Active)

Source	RSS URL	Status	Fetch Method
TechCrunch	https://techcrunch.com/feed/	✅ Active	urllib
The Verge

Secondary Sources (Configured, To Verify)

Source	Status	Note
Axios	⚠️ Configured	Needs verification
Bloomberg Tech

Data Collection Workflow

Step 1: Daily Data Fetch (Every Day at 00:00)

Command:
cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py daily

What it does:

1. Fetches RSS feeds from all 6 primary sources
2. Filters low-quality content (promotions, sports, lifestyle)
3. Deduplicates articles by title/URL similarity
4. Saves to data/articles_YYYY-MM-DD.json
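
The parsing half of the fetch step can be sketched as follows. This is a minimal illustration, not the code in scripts/generate-briefing.py: the function name parse_rss, the article dictionary keys, and the sample feed are all assumptions, and only standard RSS 2.0 item fields are read.

```python
import xml.etree.ElementTree as ET


def parse_rss(xml_text: str, source: str) -> list[dict]:
    """Extract title, link and publication date from RSS 2.0 <item> elements."""
    root = ET.fromstring(xml_text)
    articles = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if not title or not link:
            continue  # skip incomplete entries rather than guessing values
        articles.append({
            "source": source,
            "title": title,
            "link": link,
            "published": (item.findtext("pubDate") or "").strip(),
        })
    return articles


# Hypothetical feed used only to exercise the parser (no network access).
SAMPLE_FEED = """<rss version="2.0"><channel>
  <item><title>Example story</title><link>https://example.com/a</link>
        <pubDate>Mon, 09 Mar 2026 00:00:00 +0000</pubDate></item>
  <item><title></title><link>https://example.com/b</link></item>
</channel></rss>"""

articles = parse_rss(SAMPLE_FEED, "ExampleSource")
print(articles)
```

Entries without a title or link are dropped instead of being filled in, which keeps the sketch in line with the Data Integrity First principle.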

Cron Setup:
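# Add to crontab
crontab -e

# Add this line to run the fetch every day at 00:00:
0 0 * * * cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py daily >> /tmp/tech-weekly-cron.log 2>&1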

Step 2: Weekly Report Generation (Every Saturday 09:00)

Command:
cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py weekly

What it does:

1. Loads articles from the past 7 days
2. Aggregates similar stories across sources
3. Identifies hot news (covered by ≥2 media outlets)
4. Categorizes by company mentions
5. Generates the formatted report
6. Saves to /tmp/tech-weekly-briefing-YYYYMMDD.txt
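
Loading the past 7 days amounts to mapping a date window onto the daily file names. The sketch below assumes the window ends on the day the report runs and includes that day; the helper name weekly_article_files is hypothetical.

```python
from datetime import date, timedelta


def weekly_article_files(today: date, days: int = 7) -> list[str]:
    """Return the daily file names covering the `days` days ending on `today`."""
    return [
        f"data/articles_{(today - timedelta(days=offset)).isoformat()}.json"
        for offset in range(days - 1, -1, -1)  # oldest first
    ]


files = weekly_article_files(date(2026, 3, 14))
print(files[0], files[-1])
```

Days with no file on disk (for example after a failed fetch) would need to be skipped or reported by the caller rather than silently replaced.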

Cron Setup:

# Every Saturday at 09:00 Beijing Time
0 9 * * 6 cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py weekly >> /tmp/tech-weekly-cron.log 2>&1

Report Format Specification

Structure (4 Sections)

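📊 外媒科技周报 | YYYY-MM-DD

1️⃣ 概览 (Overview) - Chinese only
2️⃣ 🔥 热点新闻 (Hot News) - English titles, all source links
3️⃣ 🚗 Robotaxi Weekly - All autonomous driving news
4️⃣ [Inline Buttons] - Company categories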

Section 1: 概览 / Overview

Language: Chinese only
Content: One paragraph summarizing ALL hot news stories
Format:
📈 概览
本周扫描X家科技媒体，获取X篇文章，聚类为X条独特新闻。
X条热点新闻被≥2家媒体报道：[①新闻摘要；②新闻摘要；③新闻摘要；④新闻摘要]

Section 2: 🔥 热点新闻 / Hot News

Criteria: Stories covered by ≥2 media outlets
Language: English titles only (no Chinese translation in body)
Format:
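🔥 热点新闻（按媒体报道数倒序）

1️⃣ [English title]
📰 X家：
• Source1: https://link1
• Source2: https://link2
• Source3: https://link3

2️⃣ [English title]
📰 X家：
• Source1: https://link1
• Source2: https://link2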

Requirements:

- All source links must be clickable
- Listed by coverage count (descending)
- Maximum 20 hot news items
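
The criteria and requirements above can be sketched as a selection step over already-clustered stories. The cluster shape (a list of article dictionaries with title, source, and link keys) and the function name select_hot_news are assumptions made for illustration.

```python
def select_hot_news(clusters: list[list[dict]], min_sources: int = 2,
                    limit: int = 20) -> list[dict]:
    """Keep stories covered by >= min_sources outlets, most-covered first."""
    hot = []
    for cluster in clusters:
        # Count each outlet once per story, keeping its first link.
        links_by_source = {}
        for article in cluster:
            links_by_source.setdefault(article["source"], article["link"])
        if len(links_by_source) >= min_sources:
            hot.append({"title": cluster[0]["title"], "sources": links_by_source})
    hot.sort(key=lambda story: len(story["sources"]), reverse=True)
    return hot[:limit]


# Hypothetical clusters: A has 2 outlets, B has 1, C has 3.
clusters = [
    [{"title": "A", "source": "TechCrunch", "link": "https://example.com/tc-a"},
     {"title": "A", "source": "The Verge", "link": "https://example.com/v-a"}],
    [{"title": "B", "source": "TechCrunch", "link": "https://example.com/tc-b"}],
    [{"title": "C", "source": "TechCrunch", "link": "https://example.com/tc-c"},
     {"title": "C", "source": "The Verge", "link": "https://example.com/v-c"},
     {"title": "C", "source": "Axios", "link": "https://example.com/ax-c"}],
]
hot = select_hot_news(clusters)
print([(story["title"], len(story["sources"])) for story in hot])
```

Counting distinct sources rather than articles avoids promoting a story just because one outlet published it twice.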

Section 3: 🚗 Robotaxi Weekly

Scope: All autonomous driving news (not just ≥2 coverage)
Keywords: robotaxi, waymo, zoox, aurora, cruise, autonomous, self-driving
Format:
🚗 Robotaxi Weekly / 自动驾驶一周汇总

1. [Title]
📰 Source | 🔗 https://link

2. [Title]
📰 Source | 🔗 https://link
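
The keyword scope above could be applied with a matcher like the one below. It is a sketch under assumptions: the names ROBOTAXI_KEYWORDS and is_robotaxi_news are hypothetical, and matching at the start of a word (so that "robotaxis" still matches) is a design choice of this example, not a documented behavior of the script.

```python
import re

ROBOTAXI_KEYWORDS = ["robotaxi", "waymo", "zoox", "aurora", "cruise",
                     "autonomous", "self-driving"]

# Leading word boundary only, so plural and derived forms still match.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in ROBOTAXI_KEYWORDS) + ")",
    re.IGNORECASE,
)


def is_robotaxi_news(title: str) -> bool:
    """Return True when the title contains any robotaxi keyword."""
    return _PATTERN.search(title) is not None


print(is_robotaxi_news("Waymo expands robotaxis to a new city"))
print(is_robotaxi_news("Apple ships a new laptop"))
```

Generic words such as "aurora" and "cruise" can also match unrelated stories, so a review pass on this section may still be needed.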

Section 4: Inline Buttons

Layout:
[🔴 OpenAI (X篇)] [🟣 Anthropic (X篇)]
[🔵 Google (X篇)] [🍎 Apple (X篇)]
[🟢 NVIDIA (X篇)] [🚗 Waymo]
[📋 查看全部]

Companies Tracked:

- OpenAI, Anthropic, Google, Apple, Microsoft, Amazon, Meta
- Tesla, NVIDIA, Waymo, Zoox, Aurora, Cruise
- Nintendo, Sony, Netflix, Block, Robinhood
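
Button labels depend on per-company article counts. The sketch below shows one way to derive them; COMPANY_KEYWORDS is named in this document, but its structure here (company name mapped to lowercase keywords) and the helper names are assumptions, and only a few companies are listed for brevity.

```python
from collections import Counter

# Assumed shape: display name -> lowercase keywords to look for in titles.
COMPANY_KEYWORDS = {
    "OpenAI": ["openai"],
    "Anthropic": ["anthropic"],
    "Google": ["google"],
    "Waymo": ["waymo"],
}


def count_company_mentions(titles: list[str]) -> Counter:
    """Count how many titles mention each tracked company."""
    counts = Counter()
    for title in titles:
        lowered = title.lower()
        for company, keywords in COMPANY_KEYWORDS.items():
            if any(keyword in lowered for keyword in keywords):
                counts[company] += 1
    return counts


def button_labels(counts: Counter) -> list[str]:
    """Format one label per company, most-mentioned first."""
    return [f"{company} ({n}篇)" for company, n in counts.most_common()]


titles = [
    "Google and OpenAI announce competing models",
    "Google updates search ranking",
    "Waymo opens service in a new city",
]
counts = count_company_mentions(titles)
print(button_labels(counts))
```

An article that mentions several companies is counted once for each of them, which matches the idea of browsing by company.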

Content Validation Rules

Rule 1: No Fabricated Data

MUST:

- Execute fetch commands before generating reports
- Use actual fetched data only
- Mark missing data as [获取失败] if a command fails

NEVER:

- Fill in prices/percentages without executing commands
- Present cached data as real-time without a timestamp
- Guess or estimate any numeric values
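
One way to enforce Rule 1 in code is to wrap every fetch so that a failure yields the marker instead of a guessed value, and every result carries a timestamp. The wrapper name fetch_or_mark and the result dictionary are hypothetical; only the [获取失败] marker comes from the rule itself.

```python
from datetime import datetime, timezone

FETCH_FAILED = "[获取失败]"


def fetch_or_mark(fetch) -> dict:
    """Run `fetch`; return its value with a timestamp, or the failure marker."""
    fetched_at = datetime.now(timezone.utc).isoformat()
    try:
        return {"value": fetch(), "fetched_at": fetched_at, "ok": True}
    except Exception as exc:  # any failure is reported, never papered over
        return {"value": FETCH_FAILED, "fetched_at": fetched_at,
                "ok": False, "error": str(exc)}


def broken_fetch():
    """Stand-in for a fetch command that fails."""
    raise TimeoutError("feed timed out")


print(fetch_or_mark(lambda: 42)["value"])
print(fetch_or_mark(broken_fetch)["value"])
```

Because the timestamp is stored alongside the value, cached results can be shown with their age rather than passed off as real-time data.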

Rule 2: Accurate Source Attribution

MUST:

- Every news item must have source link(s)
- Multi-source stories must list ALL sources
- Use actual URLs from RSS, not generated links

Rule 3: Deduplication

Algorithm:

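1. Treat two articles as the same story if the Jaccard similarity of their title keywords is ≥ 18%
2. OR if their titles share an exact run of 3 or more consecutive words
3. Remove duplicates, keeping the earliest-published article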
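
The algorithm above can be sketched as follows. The function name is_same_news appears in this document, but this body is an assumption rather than the script's actual implementation: the stopword list, the tokenizer, and the use of sortable ISO 8601 strings in the published field are all illustrative choices.

```python
import re

# Assumed stopword list; the real keyword extraction may differ.
STOPWORDS = {"a", "an", "the", "to", "of", "in", "on", "for", "and", "with"}


def title_words(title: str) -> list[str]:
    """Lowercase the title and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", title.lower())


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B| (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def shares_word_run(a: list[str], b: list[str], n: int = 3) -> bool:
    """True when both word lists contain the same run of n consecutive words."""
    runs = {tuple(a[i:i + n]) for i in range(len(a) - n + 1)}
    return any(tuple(b[i:i + n]) in runs for i in range(len(b) - n + 1))


def is_same_news(title_a: str, title_b: str, threshold: float = 0.18) -> bool:
    words_a, words_b = title_words(title_a), title_words(title_b)
    keys_a = {w for w in words_a if w not in STOPWORDS}
    keys_b = {w for w in words_b if w not in STOPWORDS}
    return jaccard(keys_a, keys_b) >= threshold or shares_word_run(words_a, words_b)


def deduplicate(articles: list[dict]) -> list[dict]:
    """Keep the earliest-published article of each group of duplicates."""
    kept = []
    for article in sorted(articles, key=lambda a: a["published"]):
        if not any(is_same_news(article["title"], k["title"]) for k in kept):
            kept.append(article)
    return kept


articles = [
    {"title": "OpenAI launches GPT model aimed at enterprise developers",
     "published": "2026-03-09T08:00:00Z"},
    {"title": "OpenAI launches new GPT model for developers",
     "published": "2026-03-09T06:00:00Z"},
    {"title": "Nintendo announces Switch successor pricing",
     "published": "2026-03-09T09:00:00Z"},
]
kept = deduplicate(articles)
print([a["title"] for a in kept])
```

With a threshold as low as 18%, short titles that share a company name can be merged, so the threshold is the first thing to revisit if distinct stories disappear.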

Rule 4: Low-Quality Filtering

Filtered Content:

- Sports: "baseball game", "championship", "tournament"
- Promotions: "promo code", "% off", "coupon", "discount"
- Shopping: "mattress firm", "kitchenaid promo", "norton coupon"
- Lifestyle: "bird-watchers", "brew coffee", "sleep week deals"

Logged: All filtered items printed during daily fetch
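
A minimal sketch of this filter, using the phrases listed above, is shown below. Matching by case-insensitive substring on the title is an assumption, as are the names FILTER_KEYWORDS, low_quality_category, and filter_articles.

```python
FILTER_KEYWORDS = {
    "Sports": ["baseball game", "championship", "tournament"],
    "Promotions": ["promo code", "% off", "coupon", "discount"],
    "Shopping": ["mattress firm", "kitchenaid promo", "norton coupon"],
    "Lifestyle": ["bird-watchers", "brew coffee", "sleep week deals"],
}


def low_quality_category(title: str):
    """Return the first matching filter category, or None if the title is clean."""
    lowered = title.lower()
    for category, phrases in FILTER_KEYWORDS.items():
        if any(phrase in lowered for phrase in phrases):
            return category
    return None


def filter_articles(articles: list[dict]) -> list[dict]:
    """Drop low-quality articles and print each one that is filtered out."""
    kept = []
    for article in articles:
        category = low_quality_category(article["title"])
        if category is None:
            kept.append(article)
        else:
            print(f"Filtered [{category}]: {article['title']}")
    return kept


sample = [
    {"title": "NVIDIA unveils a new GPU"},
    {"title": "Get 20% off with this promo code"},
]
clean = filter_articles(sample)
```

Printing each filtered title keeps the daily fetch log auditable, so a phrase that removes real tech news can be spotted and adjusted.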

Manual Operations

Check Data Collection Status

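# List the collected article files
ls -la ~/.openclaw/workspace-group/skills/tech-weekly-briefing/data/

# Count today's articles (run from the skill directory)
python3 -c "import json; data=json.load(open('data/articles_$(date +%Y-%m-%d).json')); print(f'{len(data)} articles today')"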

Verify Media Source

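# Test RSS accessibility
curl -s https://techcrunch.com/feed/ | head -5
curl -s -A "Mozilla/5.0" https://www.theinformation.com/feed | head -5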

Force Re-fetch Today

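rm ~/.openclaw/workspace-group/skills/tech-weekly-briefing/data/articles_$(date +%Y-%m-%d).json
python3 scripts/generate-briefing.py daily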

Generate Test Report

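# Generate a report from existing data
python3 scripts/generate-briefing.py weekly

# View the output
cat /tmp/tech-weekly-briefing-$(date +%Y%m%d).txt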

Troubleshooting

The Information Returns 403

Cause: Python urllib blocked, curl works
Solution: Script automatically uses curl subprocess for The Information
Verify:
curl -s -A "Mozilla/5.0" https://www.theinformation.com/feed | head -10
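
The curl workaround might look like the sketch below. It is an assumption about how such a fallback could be written, not the script's actual code: here urllib is tried first and curl is used only after an HTTP 403, whereas the real script may call curl directly for The Information. The function names are hypothetical; the curl flags used (-s, -f, -A, --max-time) are standard curl options.

```python
import subprocess
import urllib.error
import urllib.request

USER_AGENT = "Mozilla/5.0"


def build_curl_command(url: str, timeout: int = 30) -> list[str]:
    """Build the curl invocation: silent, fail on HTTP errors, custom User-Agent."""
    return ["curl", "-s", "-f", "-A", USER_AGENT, "--max-time", str(timeout), url]


def fetch_feed(url: str, timeout: int = 30) -> str:
    """Fetch a feed with urllib, falling back to a curl subprocess on HTTP 403."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        if exc.code != 403:
            raise  # only the 403 case is handled by the fallback
    result = subprocess.run(build_curl_command(url, timeout),
                            capture_output=True, text=True, check=True)
    return result.stdout


command = build_curl_command("https://www.theinformation.com/feed")
print(" ".join(command))
```

Passing the command as a list avoids shell quoting issues, and check=True turns a failed curl call into an exception instead of an empty feed.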

No Articles Found

Check:

1. Data directory: ls data/
2. Last fetch: cat /tmp/tech-weekly-cron.log
3. RSS status: blogwatcher blogs

Duplicate Articles in Report

Cause: Similarity threshold set too high (duplicates are not merged) or too low (distinct stories are merged)
Adjust: Edit is_same_news() function in INLINECODE7

Missing Company in Buttons

Add Company:

1. Edit COMPANY_KEYWORDS in INLINECODE9
2. Re-run weekly report generation

File Structure

CODEBLOCK14

Dependencies

- Python 3.10+
- blogwatcher CLI for RSS monitoring
- curl for The Information feed
- Standard libraries: json, re, urllib, subprocess, datetime

Key Performance Indicators

Metric	Target
Daily fetch success rate	≥95% (6/6 sources)
Article deduplication accuracy

Version History

Version	Date	Changes
1.0.0	2026-03-09	Initial release with 6 sources, bilingual format, company buttons

Usage Examples

User Request: "生成科技周报"

CODEBLOCK15

User Request: "查看OpenAI新闻"

CODEBLOCK16

User Request: "添加新公司追踪"

CODEBLOCK17

Last Updated: 2026-03-09
Maintainer: OpenClaw Agent
Status: Production Ready

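Tech Weekly Briefing Skill

Generate structured weekly tech news briefings from major English-language technology media sources, with interactive navigation.

Core Principle

Data Integrity First: Never fabricate data. All market data and news must be fetched from actual sources before generating reports.

Media Sources

Primary Sources (Active)

Source	RSS URL	Status	Fetch Method
TechCrunch	https://techcrunch.com/feed/	✅ Active	urllib
The Verge

Secondary Sources (Configured, To Verify)

Source	Status	Note
Axios	⚠️ Configured	Needs verification
Bloomberg Tech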

Data Collection Workflow

Step 1: Daily Data Fetch (Every Day at 00:00)

Command:
cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py daily

What it does:

1. Fetches RSS feeds from all 6 primary sources
2. Filters low-quality content (promotions, sports, lifestyle)
3. Deduplicates articles by title/URL similarity
4. Saves to data/articles_YYYY-MM-DD.json

Cron Setup:
# Add to crontab
crontab -e

# Add this line to run the fetch every day at 00:00:
0 0 * * * cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py daily >> /tmp/tech-weekly-cron.log 2>&1

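Step 2: Weekly Report Generation (Every Saturday 09:00)

Command:
cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py weekly

What it does:

1. Loads articles from the past 7 days
2. Aggregates similar stories across sources
3. Identifies hot news (covered by ≥2 media outlets)
4. Categorizes by company mentions
5. Generates the formatted report
6. Saves to /tmp/tech-weekly-briefing-YYYYMMDD.txt

Cron Setup:
# Every Saturday at 09:00 Beijing Time
0 9 * * 6 cd ~/.openclaw/workspace-group/skills/tech-weekly-briefing && python3 scripts/generate-briefing.py weekly >> /tmp/tech-weekly-cron.log 2>&1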

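Report Format Specification

Structure (4 Sections)

📊 外媒科技周报 | YYYY-MM-DD

1️⃣ 概览 (Overview) - Chinese only
2️⃣ 🔥 热点新闻 (Hot News) - English titles, all source links
3️⃣ 🚗 Robotaxi Weekly - All autonomous driving news
4️⃣ [Inline Buttons] - Company categories

Section 1: 概览 / Overview

Language: Chinese only
Content: One paragraph summarizing ALL hot news stories
Format:

📈 概览
本周扫描X家科技媒体，获取X篇文章，聚类为X条独特新闻。
X条热点新闻被≥2家媒体报道：[①新闻摘要；②新闻摘要；③新闻摘要；④新闻摘要]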

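Section 2: 🔥 热点新闻 / Hot News

Criteria: Stories covered by ≥2 media outlets
Language: English titles only (no Chinese translation in body)
Format:

🔥 热点新闻（按媒体报道数倒序）

1️⃣ [English title]
📰 X家：
• Source1: https://link1
• Source2: https://link2
• Source3: https://link3

2️⃣ [English title]
📰 X家：
• Source1: https://link1
• Source2: https://link2

Requirements:

- All source links must be clickable
- Listed by coverage count (descending)
- Maximum 20 hot news items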

Section 3: 🚗 Robotaxi Weekly

Scope: All autonomous driving news (not just stories with ≥2 outlets)
Keywords: robotaxi, waymo, zoox, aurora, cruise, autonomous, self-driving
Format:

🚗 Robotaxi Weekly / 自动驾驶一周汇总

1. [Title]
📰 Source | 🔗 https://link

2. [Title]
📰 Source | 🔗 https://link

Section 4: Inline Buttons

Layout:

[🔴 OpenAI (X篇)] [🟣 Anthropic (X篇)]
[🔵 Google (X篇)] [🍎 Apple (X篇)]
[🟢 NVIDIA (X篇)] [🚗 Waymo]
[📋 查看全部]

Companies Tracked:

- OpenAI, Anthropic, Google, Apple, Microsoft, Amazon, Meta
- Tesla, NVIDIA, Waymo, Zoox, Aurora, Cruise
- Nintendo, Sony, Netflix, Block, Robinhood

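Content Validation Rules

Rule 1: No Fabricated Data

MUST:

- Execute fetch commands before generating reports
- Use actual fetched data only
- Mark missing data as [获取失败] if a command fails

NEVER:

- Fill in prices/percentages without executing commands
- Present cached data as real-time without a timestamp
- Guess or estimate any numeric values

Rule 2: Accurate Source Attribution

MUST:

- Every news item must have source link(s)
- Multi-source stories must list ALL sources
- Use actual URLs from RSS, not generated links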

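Rule 3: Deduplication

Algorithm:

1. Treat two articles as the same story if the Jaccard similarity of their title keywords is ≥ 18%
2. OR if their titles share an exact run of 3 or more consecutive words
3. Remove duplicates, keeping the earliest-published article

Rule 4: Low-Quality Filtering

Filtered Content:

- Sports: "baseball game", "championship", "tournament"
- Promotions: "promo code", "% off", "coupon", "discount"
- Shopping: "mattress firm", "kitchenaid promo", "norton coupon"
- Lifestyle: "bird-watchers", "brew coffee", "sleep week deals"

Logged: All filtered items are printed during the daily fetch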

Manual Operations

Check Data Collection Status

# List the collected article files
ls -la ~/.openclaw/workspace-group/skills/tech-weekly-briefing/data/

# Count today's articles (run from the skill directory)
python3 -c "import json; data=json.load(open('data/articles_$(date +%Y-%m-%d).json')); print(f'{len(data)} articles today')"

Verify Media Source

# Test RSS accessibility
curl -s https://techcrunch.com/feed/ | head -5
curl -s -A "Mozilla/5.0" https://www.theinformation.com/feed | head -5

Force Re-fetch Today

rm ~/.openclaw/workspace-group/skills/tech-weekly-briefing/data/articles_$(date +%Y-%m-%d).json
python3 scripts/generate-briefing.py daily

Generate Test Report

# Generate a report from existing data
python3 scripts/generate-briefing.py weekly

# View the output
cat /tmp/tech-weekly-briefing-$(date +%Y%m%d).txt

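Troubleshooting

The Information Returns 403

Cause: Python urllib blocked, curl works
Solution: Script automatically uses curl subprocess for The Information
Verify:
curl -s -A "Mozilla/5.0" https://www.theinformation.com/feed | head -10

No Articles Found

Check:

1. Data directory: ls data/
2. Last fetch: cat /tmp/tech-weekly-cron.log
3. RSS status: blogwatcher blogs

Duplicate Articles in Report

Cause: Similarity threshold set too high (duplicates are not merged) or too low (distinct stories are merged)
Adjust: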
