Bluesky Scraper
Scrape Bluesky posts using an Apify Actor via the REST API.
Actor ID
WAJfBnZBYR9mJrk5d
Prerequisites
- APIFY_TOKEN environment variable must be set
- curl and jq must be available
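The prerequisites above can be checked before any request is made. The following is a minimal sketch; `check_prereqs` is a hypothetical helper name chosen for illustration, not part of Apify's tooling.

```shell
# Preflight check: verify the token and the required command-line tools.
# check_prereqs is a hypothetical helper; pass the tool names as arguments.
check_prereqs() {
  if [ -z "${APIFY_TOKEN:-}" ]; then
    echo "APIFY_TOKEN is not set" >&2
    return 1
  fi
  for tool in "$@"; do
    # command -v succeeds only if the tool can be found on PATH
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "required tool not found: $tool" >&2
      return 1
    fi
  done
  return 0
}

# Usage: check_prereqs curl jq || exit 1
```

Taking the tool names as arguments keeps the check reusable if the workflow later needs another command.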
Workflow
Step 1: Confirm search parameters with user
Ask what they want to search for. Supported input fields:
- searchTerms (array of strings) - keywords to search
- maxResults (integer) - max posts to return (default: 50)
- sortBy (string) - "relevance" or "latest"
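Once the user has confirmed the parameters, it may help to validate them and let jq build the JSON input, so that quotes or spaces in a search term cannot break the request body. This is a sketch under assumptions: `validate_input` and `build_payload` are hypothetical helper names, and the rule that maxResults must be a positive integer is an assumption, not something the Actor documents here.

```shell
# validate_input MAX_RESULTS SORT_BY
# Assumption: maxResults must be a positive integer without leading zeros.
validate_input() {
  case "$1" in
    ''|*[!0-9]*|0*)
      echo "maxResults must be a positive integer: $1" >&2
      return 1 ;;
  esac
  case "$2" in
    relevance|latest) ;;
    *)
      echo "sortBy must be \"relevance\" or \"latest\": $2" >&2
      return 1 ;;
  esac
}

# build_payload TERM MAX_RESULTS SORT_BY
# Prints a compact JSON object with a single search term.
build_payload() {
  validate_input "$2" "$3" || return 1
  # --arg passes a string, --argjson passes a JSON value (here a number)
  jq -cn --arg term "$1" --argjson max "$2" --arg sort "$3" \
    '{searchTerms: [$term], maxResults: $max, sortBy: $sort}'
}

# Usage: PAYLOAD=$(build_payload "open source" 50 latest) || exit 1
```

The resulting string can be passed to curl with `-d "$PAYLOAD"`.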
Step 2: Run the Actor (synchronous)
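bash
# Replace SEARCH_TERM with the user's keyword
RESULT=$(curl -s -X POST "https://api.apify.com/v2/acts/WAJfBnZBYR9mJrk5d/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"searchTerms": ["SEARCH_TERM"], "maxResults": 50, "sortBy": "relevance"}')
echo "$RESULT" | jq .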
For larger jobs (async):
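bash
# Start the run and keep its ID for polling
RUN_ID=$(curl -s -X POST "https://api.apify.com/v2/acts/WAJfBnZBYR9mJrk5d/runs?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"searchTerms": ["TERM"], "maxResults": 50}' | jq -r '.data.id')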
Step 3: Poll and fetch (if async)
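bash
STATUS=$(curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID?token=$APIFY_TOKEN" | jq -r '.data.status')
# Poll every 5 seconds until SUCCEEDED or FAILED
curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID/dataset/items?token=$APIFY_TOKEN" | jq .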
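The polling step can be wrapped in a loop. The sketch below assumes the run status is available at `.data.status` on the `actor-runs` endpoint, as this document's requests use it; `get_status`, `wait_for_run`, `POLL_INTERVAL`, and `MAX_POLLS` are hypothetical names introduced for illustration.

```shell
# get_status RUN_ID: prints the current status string of the run.
get_status() {
  curl -s "https://api.apify.com/v2/actor-runs/$1?token=$APIFY_TOKEN" \
    | jq -r '.data.status'
}

# wait_for_run RUN_ID
# Returns 0 on SUCCEEDED, 1 on FAILED, 2 if the poll cap is reached.
# POLL_INTERVAL (seconds, default 5) and MAX_POLLS (default 120) are
# hypothetical knobs, not Apify settings.
wait_for_run() {
  polls=0
  while [ "$polls" -lt "${MAX_POLLS:-120}" ]; do
    run_status=$(get_status "$1")
    case "$run_status" in
      SUCCEEDED) return 0 ;;
      FAILED) echo "run $1 failed" >&2; return 1 ;;
    esac
    polls=$((polls + 1))
    sleep "${POLL_INTERVAL:-5}"
  done
  echo "gave up waiting for run $1" >&2
  return 2
}

# Usage: wait_for_run "$RUN_ID" && fetch the dataset items
```

The poll cap is a design choice: it keeps the loop from running forever if the run ends in a status other than the two named above.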
Step 4: Present results
Summarize: total posts, top by engagement, common themes. Offer JSON/CSV export.
Error Handling
- If APIFY_TOKEN not set: export APIFY_TOKEN=your_token
- If run FAILS, fetch its log: curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID/log?token=$APIFY_TOKEN"
- Rate limited (429): wait 60s, retry
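The rate-limit rule can be expressed as a small retry wrapper. This is a sketch, not a definitive implementation: `retry_on_429`, `RETRY_WAIT`, and `MAX_ATTEMPTS` are hypothetical names, and the wrapped command is assumed to print only the HTTP status code on stdout (with curl, for example, by writing the body to a file with `-o` and printing the code with `-w '%{http_code}'`).

```shell
# retry_on_429 CMD [ARGS...]
# CMD must print an HTTP status code on stdout.
# Prints the final status code; returns 1 if every attempt was rate limited.
retry_on_429() {
  attempt=1
  while :; do
    code=$("$@")
    if [ "$code" != "429" ]; then
      echo "$code"
      return 0
    fi
    if [ "$attempt" -ge "${MAX_ATTEMPTS:-3}" ]; then
      echo "still rate limited after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    # Wait 60 seconds by default before the next attempt
    sleep "${RETRY_WAIT:-60}"
  done
}

# Usage:
# code=$(retry_on_429 curl -s -o response.json -w '%{http_code}' \
#   "https://api.apify.com/v2/actor-runs/$RUN_ID?token=$APIFY_TOKEN")
```

Capping the number of attempts keeps a persistent rate limit from blocking the workflow indefinitely.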
Bluesky Scraper
Scrape Bluesky posts through the REST API using an Apify Actor.
Actor ID
WAJfBnZBYR9mJrk5d
Prerequisites
- The APIFY_TOKEN environment variable must be set
- curl and jq must be installed
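Workflow
Step 1: Confirm search parameters with the user
Ask the user what they want to search for. Supported input fields:
- searchTerms (array of strings) - keywords to search for
- maxResults (integer) - maximum number of posts to return (default: 50)
- sortBy (string) - "relevance" or "latest"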
Step 2: Run the Actor (synchronous)
bash
# Replace SEARCH_TERM with the user's keyword
RESULT=$(curl -s -X POST "https://api.apify.com/v2/acts/WAJfBnZBYR9mJrk5d/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"searchTerms": ["SEARCH_TERM"], "maxResults": 50, "sortBy": "relevance"}')
echo "$RESULT" | jq .
For larger jobs (asynchronous):
bash
# Start the run and keep its ID for polling
RUN_ID=$(curl -s -X POST "https://api.apify.com/v2/acts/WAJfBnZBYR9mJrk5d/runs?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"searchTerms": ["TERM"], "maxResults": 50}' | jq -r '.data.id')
Step 3: Poll and fetch results (if asynchronous)
bash
STATUS=$(curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID?token=$APIFY_TOKEN" | jq -r '.data.status')
# Poll every 5 seconds until SUCCEEDED or FAILED
curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID/dataset/items?token=$APIFY_TOKEN" | jq .
Step 4: Present results
Summarize: total number of posts, posts with the highest engagement, common themes. Offer JSON/CSV export.
Error Handling
- If APIFY_TOKEN is not set: export APIFY_TOKEN=your_token
- If the run fails, fetch its log: curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID/log?token=$APIFY_TOKEN"
- If rate limited (429): wait 60 seconds, then retry