Hacker News Scraper
Scrape Hacker News with an Apify Actor through the Apify REST API.
Actor ID
0UDODOnpTkxY3Oc90
Prerequisites
- The APIFY_TOKEN environment variable must be set
- curl and jq must be installed
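The prerequisites above can be checked up front before any request is sent. The following is a minimal sketch; the function name `check_prereqs` and its messages are illustrative, not part of the original workflow.

```bash
# Return 0 when the token and both tools are available;
# otherwise report what is missing on stderr and return 1.
# (check_prereqs is a hypothetical helper name.)
check_prereqs() {
  local missing=0
  local tool
  if [ -z "${APIFY_TOKEN:-}" ]; then
    echo "APIFY_TOKEN is not set" >&2
    missing=1
  fi
  for tool in curl jq; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool is not installed" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_prereqs || exit 1` at the top of a script stops early with a readable message rather than failing inside a curl call.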
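Workflow
Step 1: Confirm parameters with the user
Ask the user what they want to scrape. Supported input fields:
- searchTerms (array of strings) - keywords to search for
- maxResults (integer) - maximum number of stories to return
- sortBy (string) - "points", "date", or "relevance"
- includeComments (boolean) - whether to include comment threads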
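Once the parameters are confirmed, the input JSON can be assembled with jq so that user-supplied search terms are quoted safely. This is a sketch under assumptions: `build_input` is a hypothetical helper, it accepts a single search term, and the defaults (30, "points", false) are illustrative choices rather than documented Actor defaults.

```bash
# Build the Actor input as compact JSON.
# Arguments: term [maxResults] [sortBy] [includeComments]
# (build_input and its defaults are assumptions for illustration.)
build_input() {
  jq -cn \
    --arg term "$1" \
    --argjson max "${2:-30}" \
    --arg sort "${3:-points}" \
    --argjson comments "${4:-false}" \
    '{searchTerms: [$term], maxResults: $max, sortBy: $sort, includeComments: $comments}'
}

# Example: pass the result to curl with  -d "$(build_input "rust" 30)"
```

Using `--arg` and `--argjson` keeps strings, numbers, and booleans correctly typed and avoids hand-escaping quotes inside the payload.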
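Step 2: Run the Actor
```bash
# Synchronous run: the request returns the dataset items when the run finishes.
# Replace TERM with the user's search keyword.
RESULT=$(curl -s -X POST "https://api.apify.com/v2/acts/0UDODOnpTkxY3Oc90/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"searchTerms": ["TERM"], "maxResults": 30}')
echo "$RESULT" | jq .
```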
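A synchronous call can return an error object instead of an array of items, so it may help to inspect the body before summarizing it. The sketch below assumes that a successful response is a JSON array and that errors arrive as an object with an `error.message` field; `check_response` is a hypothetical helper name.

```bash
# Read a response body from stdin. Print the item count for a JSON array;
# otherwise print the error message (if any) to stderr and return 1.
# (The error shape {"error": {"message": ...}} is an assumption.)
check_response() {
  local body
  body=$(cat)
  if printf '%s' "$body" | jq -e 'type == "array"' >/dev/null 2>&1; then
    printf '%s' "$body" | jq -r '"items: \(length)"'
    return 0
  fi
  printf '%s' "$body" \
    | jq -r '(try .error.message catch null) // "unexpected response"' >&2
  return 1
}
```

For example, `echo "$RESULT" | check_response || exit 1` stops the workflow before an error payload is presented as results.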
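Step 3: Poll and fetch (asynchronous mode)
```bash
# Start the run without waiting for it and capture the run ID.
RUN_ID=$(curl -s -X POST "https://api.apify.com/v2/acts/0UDODOnpTkxY3Oc90/runs?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"searchTerms": ["TERM"], "maxResults": 100}' | jq -r .data.id)
# Check the run status; repeat until the run has finished.
curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID?token=$APIFY_TOKEN" | jq -r .data.status
# Once the run has finished, fetch the dataset items.
curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID/dataset/items?token=$APIFY_TOKEN" | jq .
```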
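The status check in this step is normally repeated until the run reaches a terminal state. A minimal polling sketch follows; the helper names, the attempt limit, and the delay are illustrative, and the terminal status names (SUCCEEDED, FAILED, ABORTED, TIMED-OUT) are assumed from the Apify run lifecycle.

```bash
# Fetch the current status string for a run (network call).
get_status() {
  curl -s "https://api.apify.com/v2/actor-runs/$1?token=$APIFY_TOKEN" | jq -r .data.status
}

# Poll until the run reaches a terminal state or the attempt budget runs out.
# Arguments: run_id [max_attempts] [delay_seconds]
# Returns 0 on success, 1 on a failed run, 2 when polling gives up.
wait_for_run() {
  local run_id=$1 max_attempts=${2:-60} delay=${3:-5}
  local status attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    status=$(get_status "$run_id")
    case "$status" in
      SUCCEEDED) echo "$status"; return 0 ;;
      FAILED|ABORTED|TIMED-OUT) echo "$status"; return 1 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "GAVE_UP"
  return 2
}
```

With this in place, `wait_for_run "$RUN_ID" && curl -s ".../dataset/items?token=$APIFY_TOKEN"` fetches items only after a successful run. Keeping `get_status` separate makes the loop easy to exercise without network access.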
Step 4: Present results
Summarize the top stories by points, comment counts, domains, and trends. Offer JSON or CSV export.
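As one way to produce such a summary, the items can be ranked and exported to CSV with jq. The field names `points`, `title`, and `url` are assumptions about the dataset schema, which this document does not specify, and `top_by_points` is a hypothetical helper; adjust the names to match the actual items.

```bash
# Read a JSON array of stories from stdin and print the top N by points as CSV.
# Arguments: [n] (defaults to 5)
# (Field names points/title/url are assumed, not confirmed by the Actor docs.)
top_by_points() {
  jq -r --argjson n "${1:-5}" \
    'sort_by(.points) | reverse | .[:$n] | .[] | [.points, .title, .url] | @csv'
}
```

For example, `echo "$RESULT" | top_by_points 10 > top.csv` writes the ten highest-scoring stories to a file.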
Error Handling
- If APIFY_TOKEN is not set: export APIFY_TOKEN=your_token
- If the run fails: check the log endpoint
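For a failed run, the tail of the run log is usually the quickest diagnostic. The sketch below assumes the log is served at the `/log` path under the run resource; `log_url` and `show_run_log` are hypothetical helper names.

```bash
# Build the log URL for a run (assumed path: /v2/actor-runs/<runId>/log).
log_url() {
  printf 'https://api.apify.com/v2/actor-runs/%s/log' "$1"
}

# Print the last lines of a run's log. Arguments: run_id [line_count]
show_run_log() {
  curl -s "$(log_url "$1")?token=$APIFY_TOKEN" | tail -n "${2:-20}"
}
```

Calling `show_run_log "$RUN_ID"` after a FAILED status prints the last 20 log lines.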