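Apify Skill

Run any Apify Actor through a standardized workflow: search → validate → execute → collect results.

Prerequisites

- `APIFY_TOKEN` env var, or a `config.json` with tokens (copy `config.json.example`)
- Python 3 with `requests` installed

Workflow

Step 1: Parse User Intent

Extract from the user's request:

- Platform/target (Instagram, TikTok, Reddit, etc.)
- What to scrape (posts, profiles, hashtags, comments, etc.)
- Targets (URLs, usernames, keywords)
- Quantity/filters (how many, time range, min likes, etc.)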
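
One way to hold the four extracted pieces is a small record type. The sketch below is only an illustration: the class name `ScrapeIntent` and its field names are hypothetical and are not part of the skill's scripts.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScrapeIntent:
    """Hypothetical container for what Step 1 extracts from a request."""
    platform: str                 # e.g. "instagram", "tiktok", "reddit"
    content_type: str             # e.g. "posts", "profiles", "comments"
    targets: List[str]            # URLs, usernames, or keywords
    limit: Optional[int] = None   # how many items the user asked for
    filters: dict = field(default_factory=dict)  # time range, min likes, ...


# "Get the last 3 posts from instagram.com/user" might parse to:
intent = ScrapeIntent(
    platform="instagram",
    content_type="posts",
    targets=["https://instagram.com/user/"],
    limit=3,
)
```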

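Step 2: Select Token

If the user specifies a token name or the task maps to a specific account, use that token. Otherwise use `default`.

A token can be provided via the following sources, checked in this order:

1. `--token` flag (highest priority)
2. `config.json` tokens map (selected by `--token-name`)
3. `APIFY_TOKEN` env var (fallback)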
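
The priority order above can be sketched as a small lookup function. This is a minimal illustration, not the runner's actual code: the function name `resolve_token` and the assumed `config.json` layout (`{"tokens": {"<name>": "<token>"}}`) are hypothetical, so check `config.json.example` for the real structure.

```python
import json
import os
from pathlib import Path
from typing import Optional


def resolve_token(cli_token: Optional[str] = None,
                  token_name: str = "default",
                  config_path: str = "config.json",
                  env: Optional[dict] = None) -> str:
    """Return the first token found, checking sources in priority order."""
    env = os.environ if env is None else env

    # 1. An explicit --token value always wins.
    if cli_token:
        return cli_token

    # 2. Look the name up in config.json.
    #    Assumed layout (hypothetical): {"tokens": {"default": "..."}}
    path = Path(config_path)
    if path.is_file():
        config = json.loads(path.read_text(encoding="utf-8"))
        token = config.get("tokens", {}).get(token_name)
        if token:
            return token

    # 3. Fall back to the APIFY_TOKEN environment variable.
    if env.get("APIFY_TOKEN"):
        return env["APIFY_TOKEN"]

    raise RuntimeError(
        "No Apify token found in --token, config.json, or APIFY_TOKEN"
    )
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.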

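Step 3: Search & Select Actor

Run the search script:

```bash
python3 scripts/search_actor.py instagram scraper --top 3
```

Output: ranked candidates with score, success rate, rating, and pricing model.

Quality filters (built into the script):

- notice = NONE (not deprecated)
- 30-day success rate ≥ 95%
- 30-day runs ≥ 1,000
- User rating ≥ 4.0

Pick the top-ranked candidate. If the user has a preference or prior experience with a specific Actor, skip the search.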
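
The script applies these filters itself, so the following is only a sketch of the selection logic for readers who want to see it spelled out. The dictionary keys (`notice`, `success_rate_30d`, `runs_30d`, `rating`, `score`) are hypothetical names chosen for the example, and the ranking score is assumed to be precomputed.

```python
from typing import List


def passes_quality_filters(actor: dict) -> bool:
    """Apply the four thresholds listed above to one candidate."""
    return (
        actor.get("notice") == "NONE"                    # not deprecated
        and actor.get("success_rate_30d", 0.0) >= 0.95   # 30-day success rate
        and actor.get("runs_30d", 0) >= 1000             # 30-day run count
        and actor.get("rating", 0.0) >= 4.0              # user rating
    )


def pick_candidates(actors: List[dict], top: int = 3) -> List[dict]:
    """Keep eligible Actors and return the best `top` by score."""
    eligible = [a for a in actors if passes_quality_filters(a)]
    eligible.sort(key=lambda a: a.get("score", 0.0), reverse=True)
    return eligible[:top]


candidates = pick_candidates([
    {"id": "a/good", "notice": "NONE", "success_rate_30d": 0.99,
     "runs_30d": 5000, "rating": 4.6, "score": 0.9},
    {"id": "b/flaky", "notice": "NONE", "success_rate_30d": 0.80,
     "runs_30d": 9000, "rating": 4.8, "score": 0.95},
    {"id": "c/ok", "notice": "NONE", "success_rate_30d": 0.96,
     "runs_30d": 1200, "rating": 4.1, "score": 0.7},
])
```

Here `b/flaky` has the highest score but is dropped because its success rate is below 95%.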

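Step 4: Get Actor Schema & Build run_input

Fetch the Actor's documentation:

```bash
webfetch https://apify.com/{actorid}.md
```

Read the input schema section. Construct the `run_input` JSON based on:

- The Actor's required/optional fields
- The user's targets and filters
- Sensible defaults from the documentation

Do NOT ask the user to write JSON. Build it from their natural-language request.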
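
As an illustration of building `run_input` from parsed targets, the sketch below uses the Instagram field names listed under Common Actor Patterns (`directUrls`, `resultsType`, `resultsLimit`). The helper name `build_instagram_input` and the username-to-URL rule are assumptions made for the example; the Actor's current schema remains the authority.

```python
import json
from typing import List


def build_instagram_input(targets: List[str],
                          results_type: str = "posts",
                          limit: int = 3) -> dict:
    """Turn URLs or usernames into a run_input dict (hypothetical helper)."""
    urls = []
    for target in targets:
        if target.startswith(("http://", "https://")):
            urls.append(target)  # already a URL, keep as-is
        else:
            # Assumed convention: bare usernames map to profile URLs.
            urls.append(f"https://instagram.com/{target.lstrip('@')}/")
    return {
        "directUrls": urls,
        "resultsType": results_type,
        "resultsLimit": limit,
    }


run_input = build_instagram_input(["@user"], results_type="posts", limit=3)
print(json.dumps(run_input))
```

The printed JSON string is what would be passed to the runner's `--input` flag.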

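Step 5: Probe Test (Top 1 → Top 2 → Top 3 fallback)

Test with minimal input before committing to a full run:

```bash
python3 scripts/apifyrunner.py {actorid} \
  --input '{...}' \
  --token {token} \
  --probe-only \
  --list-key {key}
```

The probe automatically uses the first 2 items from the list field.

Checks:

- Run starts successfully (no permission/billing errors)
- Run completes (no timeout/crash)
- Returns non-empty data

If the probe fails → try the next candidate Actor. If all 3 fail → report to the user with the Actor URLs for manual activation.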
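
The probe-and-fallback logic can be sketched as follows. This is not the runner's implementation: `make_probe_input` and `first_working_actor` are hypothetical names, and `run_probe` stands in for whatever actually starts the Actor (it is assumed to raise on start failures, timeouts, or crashes and to return the collected items otherwise).

```python
import copy
from typing import Callable, List, Optional


def make_probe_input(run_input: dict,
                     list_key: Optional[str],
                     probe_size: int = 2) -> dict:
    """Copy run_input, keeping only the first `probe_size` list items."""
    probe = copy.deepcopy(run_input)
    if list_key and isinstance(probe.get(list_key), list):
        probe[list_key] = probe[list_key][:probe_size]
    return probe


def first_working_actor(candidates: List[str],
                        run_probe: Callable[[str], list]) -> Optional[str]:
    """Try candidates in rank order; return the first that yields data."""
    for actor_id in candidates:
        try:
            items = run_probe(actor_id)
        except Exception:
            continue          # start failure, timeout, or crash: next Actor
        if items:             # non-empty data means the probe passed
            return actor_id
    return None               # all probes failed: report to the user
```

A `None` result corresponds to the "all 3 fail" case, where the Actor URLs are reported for manual activation.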

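Step 6: Full Execution

```bash
python3 scripts/apifyrunner.py {actorid} \
  --input '{...}' \
  --token {token} \
  --output /path/to/results.json \
  --list-key {key} \
  --batch-size 50 \
  --probe
```

Key flags:

| Flag | Purpose | Default |
| --- | --- | --- |
| `--list-key` | Field in `run_input` containing the list to batch | None (no batching) |
| `--batch-size` | | |

Batching rules:

- ≤ batch-size items → single run
- \> batch-size items → auto-split, 3 s pause between batches
- Each batch has an independent timeout (default 10 min)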
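
The batching rules can be sketched as below. The function names are hypothetical and `run_once` is a stand-in for a single Actor run that returns its items; the per-batch timeout is assumed to be enforced inside `run_once` and is not modelled here.

```python
import time
from typing import Callable, Iterator, List


def split_batches(items: list, batch_size: int) -> Iterator[list]:
    """Yield consecutive slices of at most `batch_size` items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def run_in_batches(run_input: dict,
                   list_key: str,
                   batch_size: int,
                   run_once: Callable[[dict], List[dict]],
                   pause_seconds: float = 3.0,
                   sleep: Callable[[float], None] = time.sleep) -> List[dict]:
    """Run once per batch, pausing between batches, and merge the results."""
    results: List[dict] = []
    batches = list(split_batches(run_input[list_key], batch_size))
    for index, batch in enumerate(batches):
        if index > 0:
            sleep(pause_seconds)  # pause only between batches
        # Same input, with the list field replaced by this batch.
        results.extend(run_once({**run_input, list_key: batch}))
    return results
```

With a list no longer than `batch_size`, `split_batches` yields a single slice, so the loop performs one run and never sleeps.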

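Step 7: Return Results

- Report total items collected
- Save raw JSON to the specified output path
- Summarize key stats (item count, batches, any failures)
- Let the caller handle filtering/reporting/delivery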
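
A minimal sketch of the save-and-summarize step is shown below. The function name and the keys of the returned summary are hypothetical, chosen only to mirror the stats listed above.

```python
import json
from pathlib import Path
from typing import List, Optional


def save_and_summarize(items: List[dict],
                       output_path: str,
                       batches: int = 1,
                       failures: Optional[List[str]] = None) -> dict:
    """Write raw items as JSON and return a small stats dictionary."""
    failures = failures or []
    Path(output_path).write_text(
        json.dumps(items, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
    return {
        "items": len(items),             # total items collected
        "batches": batches,              # number of runs performed
        "failed_batches": len(failures),
        "output": str(output_path),      # where the raw JSON was saved
    }
```

Filtering, reporting, and delivery are left to the caller, as the last bullet says.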

Common Actor Patterns

| Platform | Typical Actor | list_key | Example input |
| --- | --- | --- | --- |
| Instagram | `apify/instagram-scraper` | `directUrls` | `{"directUrls": ["https://instagram.com/user/"], "resultsType": "posts", "resultsLimit": 3}` |
| TikTok | `clockworks/tiktok-scraper` | `hashtags` | `{"hashtags": ["cooking"], "resultsPerPage": 50}` |
| Reddit | `trudax/reddit-scraper-lite` | `startUrls` | `{"startUrls": [{"url": "https://reddit.com/r/cooking/top/?t=month"}], "maxItems": 30}` |
| Twitter | `apidojo/tweet-scraper` | none | Check the .md page for the current schema |

These are starting points. Always verify against the Actor's .md page for the current schema.
