Firecrawl CLI

Use the firecrawl CLI to fetch and search the web. Firecrawl returns clean markdown optimized for LLM context windows, handles JavaScript rendering, bypasses common blocks, and provides structured data.

Installation

Check status, auth, and rate limits:

bash
firecrawl --status

Output when ready:

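🔥 firecrawl cli v1.0.2

● Authenticated via FIRECRAWL_API_KEY
Concurrency: 0/100 jobs (parallel scrape limit)
Credits: 500,000 remaining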

- Concurrency: Maximum number of parallel jobs. Run parallel operations close to this limit, but never above it.
- Credits: Remaining API credits. Each scrape or crawl consumes credits.
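To stay under the concurrency limit in scripts, the limit can be read from the status text. The following is a minimal sketch, not part of the CLI: `concurrency_limit` is a hypothetical helper, and it assumes the status output contains a line shaped like `Concurrency: 0/100 jobs`, which is taken from the sample output in this document and may differ between versions.

```bash
# Hypothetical helper: print the maximum parallel job count found in the
# text on stdin. Assumes a line of the form "Concurrency: <used>/<max> jobs".
concurrency_limit() {
  sed -n 's/^.*Concurrency: *[0-9][0-9]*\/\([0-9][0-9]*\).*$/\1/p' | head -n 1
}

# Usage (requires an installed, authenticated CLI):
#   firecrawl --status | concurrency_limit
```

If the line is missing or formatted differently, the helper prints nothing, so callers should fall back to a conservative default.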

If not installed: npm install -g firecrawl-cli

If the user is not logged in, always consult the installation rules in rules/install.md for more information.

Authentication

If not authenticated, run:

bash
firecrawl login --browser

The --browser flag automatically opens the browser for authentication without prompting.

Organization

Create a .firecrawl/ folder in the working directory, if it does not already exist, to store results. Add .firecrawl/ to the .gitignore file if it is not already listed. Always use -o to write output directly to a file, which avoids flooding the context:

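bash
# Search the web (most common operation)
firecrawl search "your query" -o .firecrawl/search-{query}.json

# Search with scraping enabled
firecrawl search "your query" --scrape -o .firecrawl/search-{query}-scraped.json

# Scrape a page
firecrawl scrape https://example.com -o .firecrawl/{site}-{path}.md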

Examples:

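.firecrawl/search-react_server_components.json
.firecrawl/search-ai_news-scraped.json
.firecrawl/docs.github.com-actions-overview.md
.firecrawl/firecrawl.dev.md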
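The folder setup and naming convention described in this section can be scripted. This is a sketch under stated assumptions: `ensure_firecrawl_dir` and `output_path_for_url` are hypothetical helper names, and the URL-to-filename mapping is inferred from the `{site}-{path}.md` pattern and the example filenames rather than defined by the CLI.

```bash
# Create the results folder and ignore it in git. Safe to run repeatedly.
ensure_firecrawl_dir() {
  mkdir -p .firecrawl
  touch .gitignore
  # -x: whole-line match, -F: fixed string, -q: quiet
  grep -qxF '.firecrawl/' .gitignore || printf '%s\n' '.firecrawl/' >> .gitignore
}

# Hypothetical helper: map a URL to .firecrawl/{site}-{path}.md
output_path_for_url() {
  url=${1#*://}        # drop the scheme
  url=${url%%[?#]*}    # drop query string and fragment
  url=${url%/}         # drop a trailing slash
  printf '.firecrawl/%s.md\n' "$(printf '%s' "$url" | tr '/' '-')"
}
```

A typical call would be `firecrawl scrape "$u" -o "$(output_path_for_url "$u")"` after running `ensure_firecrawl_dir` once.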

Commands

Search - Web search with optional scraping

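bash
# Basic search (human-readable output)
firecrawl search "your query" -o .firecrawl/search-query.txt

# JSON output (recommended for parsing)
firecrawl search "your query" -o .firecrawl/search-query.json --json

# Limit the number of results
firecrawl search "AI news" --limit 10 -o .firecrawl/search-ai-news.json --json

# Search specific sources
firecrawl search "tech startups" --sources news -o .firecrawl/search-news.json --json
firecrawl search "landscapes" --sources images -o .firecrawl/search-images.json --json
firecrawl search "machine learning" --sources web,news,images -o .firecrawl/search-ml.json --json

# Filter by category (GitHub repos, research papers, PDFs)
firecrawl search "python web scraping" --categories github -o .firecrawl/search-github.json --json
firecrawl search "transformer architecture" --categories research -o .firecrawl/search-research.json --json

# Time-based search
firecrawl search "AI announcements" --tbs qdr:d -o .firecrawl/search-today.json --json  # Past day
firecrawl search "tech news" --tbs qdr:w -o .firecrawl/search-week.json --json  # Past week

# Location-based search
firecrawl search "restaurants" --location "San Francisco,California,United States" -o .firecrawl/search-sf.json --json
firecrawl search "local news" --country DE -o .firecrawl/search-germany.json --json

# Search and scrape the result contents
firecrawl search "firecrawl tutorial" --scrape -o .firecrawl/search-scraped.json --json
firecrawl search "API documentation" --scrape --scrape-formats markdown,links -o .firecrawl/search-docs.json --json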

Search Options:

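| Option | Description |
| --- | --- |
| --limit <n> | Maximum results (default: 5, max: 100) |
| --sources <sources> | Sources to search, comma-separated (the examples use web, news, images) |
| --country | ISO country code (default: US) |
| --scrape | Enable scraping of search results |
| --scrape-formats | Scrape formats when --scrape is enabled (default: markdown) |
| -o, --output | Save to file |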

Scrape - Single page content extraction

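bash
# Basic scrape (markdown output)
firecrawl scrape https://example.com -o .firecrawl/example.md

# Get raw HTML
firecrawl scrape https://example.com --html -o .firecrawl/example.html

# Multiple formats (JSON output)
firecrawl scrape https://example.com --format markdown,links -o .firecrawl/example.json

# Main content only (removes navigation, footer, ads)
firecrawl scrape https://example.com --only-main-content -o .firecrawl/example.md

# Wait for JS rendering
firecrawl scrape https://spa-app.com --wait-for 3000 -o .firecrawl/spa.md

# Extract links only
firecrawl scrape https://example.com --format links -o .firecrawl/links.json

# Include/exclude specific HTML tags
firecrawl scrape https://example.com --include-tags article,main -o .firecrawl/article.md
firecrawl scrape https://example.com --exclude-tags nav,aside,.ad -o .firecrawl/clean.md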

Scrape Options:

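| Option | Description |
| --- | --- |
| -f, --format <formats> | Output format(s): markdown, html, rawHtml, links, screenshot, json |
| -H, --html | Shortcut for --format html |
| --only-main-content | Extract only the main content |
| --wait-for | Wait before scraping (for JS content) |
| --include-tags | Include only specific HTML tags |
| --exclude-tags | Exclude specific HTML tags |
| -o, --output | Save to file |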

Crawl - Crawl an entire website

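bash
# Start a crawl (returns a job ID)
firecrawl crawl https://example.com

# Wait for the crawl to complete
firecrawl crawl https://example.com --wait

# With a progress indicator
firecrawl crawl https://example.com --wait --progress

# Check crawl status
firecrawl crawl <job-id>

# Limit the number of pages
firecrawl crawl https://example.com --limit 100 --max-depth 3

# Crawl only the blog sections
firecrawl crawl https://example.com --include-paths /blog,/posts

# Exclude admin pages
firecrawl crawl https://example.com --exclude-paths /admin,/login

# Crawl with rate limiting
firecrawl crawl https://example.com --delay 1000 --max-concurrency 2

# Save the results
firecrawl crawl https://example.com --wait -o crawl-results.json --pretty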

Crawl Options:

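| Option | Description |
| --- | --- |
| --wait | Wait for crawl to complete |
| --progress | Show progress while waiting |
| --limit | Maximum number of pages to crawl |
| --max-depth | Maximum crawl depth |
| --include-paths | Crawl only matching paths |
| --exclude-paths | Skip matching paths |
| --delay | Delay between requests |
| --max-concurrency | Maximum concurrent requests |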

Map - Discover all URLs on a site

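bash
# List all URLs (one per line)
firecrawl map https://example.com -o .firecrawl/urls.txt

# Output as JSON
firecrawl map https://example.com --json -o .firecrawl/urls.json

# Search for specific URLs
firecrawl map https://example.com --search blog -o .firecrawl/blog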

Map Options:

Option	Description
INLINECODE32	Maximum URLs to discover
INLINECODE33

Credit Usage

CODEBLOCK9

Reading Scraped Files

NEVER read entire firecrawl output files at once unless explicitly asked - they can be 1000+ lines. Instead, use grep, head, or incremental reads:

CODEBLOCK10
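One way to follow this advice is to wrap the incremental reads in small functions. The sketch below uses only standard tools (wc, head, grep); `peek` and `find_in` are hypothetical helper names, and the default line counts are arbitrary choices.

```bash
# Hypothetical helper: report the file's total line count, then show only
# its first N lines (default 40).
peek() {
  file=$1
  lines=${2:-40}
  printf '%s: %s lines total\n' "$file" "$(wc -l < "$file" | tr -d ' ')"
  head -n "$lines" "$file"
}

# Hypothetical helper: case-insensitive search with line numbers, capped
# at N matches (default 20) so a broad pattern cannot flood the context.
find_in() {
  pattern=$1
  file=$2
  grep -n -i -- "$pattern" "$file" | head -n "${3:-20}"
}
```

For example, `peek .firecrawl/example.md 30` shows the size and the first 30 lines, and `find_in pricing .firecrawl/example.md` lists where a keyword occurs so that only the relevant region needs to be read.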

Parallelization

Run multiple scrapes in parallel using & and wait:

CODEBLOCK11

For many URLs, use xargs with -P for parallel execution:

CODEBLOCK12
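As an illustration of the xargs approach, the sketch below scrapes every URL read from stdin with a bounded number of parallel jobs. It relies only on `firecrawl scrape <url> -o <file>` as shown in this document. `scrape_all` is a hypothetical helper, and `FIRECRAWL_CMD` is an assumed override hook that is handy for dry runs; neither is part of the CLI.

```bash
# Hypothetical helper: scrape each URL on stdin, at most $1 at a time
# (default 4). Output files follow the .firecrawl/{site}-{path}.md pattern.
scrape_all() {
  max_jobs=${1:-4}
  mkdir -p .firecrawl
  # FIRECRAWL_CMD defaults to the real CLI; it is passed to the child
  # shells through the environment.
  FIRECRAWL_CMD=${FIRECRAWL_CMD:-firecrawl} xargs -P "$max_jobs" -I {} sh -c '
    url=$1
    name=${url#*://}; name=${name%/}
    name=$(printf "%s" "$name" | tr "/" "-")
    $FIRECRAWL_CMD scrape "$url" -o ".firecrawl/$name.md"
  ' _ {}
}
```

A possible workflow is to write a URL list with `firecrawl map https://example.com -o .firecrawl/urls.txt` and then run `scrape_all 10 < .firecrawl/urls.txt`, keeping the job count below the concurrency limit reported by `firecrawl --status`.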

Combining with Other Tools

CODEBLOCK13

Firecrawl CLI

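Use the firecrawl CLI to fetch and search the web. Firecrawl returns clean Markdown optimized for LLM context windows, handles JavaScript rendering, bypasses common blocks, and provides structured data.

Installation

Check status, authentication, and rate limits:

bash
firecrawl --status

Output when ready:

🔥 firecrawl cli v1.0.2

● Authenticated via FIRECRAWL_API_KEY
Concurrency: 0/100 jobs (parallel scrape limit)
Credits: 500,000 remaining

- Concurrency: Maximum number of parallel jobs. Run parallel operations close to this limit, but never above it.
- Credits: Remaining API credits. Each scrape or crawl consumes credits.

If not installed: npm install -g firecrawl-cli

If the user is not logged in, always consult the installation rules in rules/install.md for more information.

Authentication

If not authenticated, run:

bash
firecrawl login --browser

The --browser flag automatically opens the browser for authentication without prompting.

Organization

Create a .firecrawl/ folder in the working directory, if it does not already exist, to store results. Add .firecrawl/ to the .gitignore file if it is not already listed. Always use -o to write output directly to a file, which avoids flooding the context: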

bash

# Search the web (most common operation)
firecrawl search "your query" -o .firecrawl/search-{query}.json

# Search with scraping enabled
firecrawl search "your query" --scrape -o .firecrawl/search-{query}-scraped.json

# Scrape a page
firecrawl scrape https://example.com -o .firecrawl/{site}-{path}.md

Examples:

.firecrawl/search-react_server_components.json
.firecrawl/search-ai_news-scraped.json
.firecrawl/docs.github.com-actions-overview.md
.firecrawl/firecrawl.dev.md

Commands

Search - Web search with optional scraping

bash
# Basic search (human-readable output)
firecrawl search "your query" -o .firecrawl/search-query.txt

# JSON output (recommended for parsing)
firecrawl search "your query" -o .firecrawl/search-query.json --json

# Limit the number of results
firecrawl search "AI news" --limit 10 -o .firecrawl/search-ai-news.json --json

# Search specific sources
firecrawl search "tech startups" --sources news -o .firecrawl/search-news.json --json
firecrawl search "landscapes" --sources images -o .firecrawl/search-images.json --json
firecrawl search "machine learning" --sources web,news,images -o .firecrawl/search-ml.json --json

# Filter by category (GitHub repos, research papers, PDFs)
firecrawl search "python web scraping" --categories github -o .firecrawl/search-github.json --json
firecrawl search "transformer architecture" --categories research -o .firecrawl/search-research.json --json

# Time-based search
firecrawl search "AI announcements" --tbs qdr:d -o .firecrawl/search-today.json --json  # Past day
firecrawl search "tech news" --tbs qdr:w -o .firecrawl/search-week.json --json  # Past week

# Location-based search
firecrawl search "restaurants" --location "San Francisco,California,United States" -o .firecrawl/search-sf.json --json
firecrawl search "local news" --country DE -o .firecrawl/search-germany.json --json

# Search and scrape the result contents
firecrawl search "firecrawl tutorial" --scrape -o .firecrawl/search-scraped.json --json
firecrawl search "API documentation" --scrape --scrape-formats markdown,links -o .firecrawl/search-docs.json --json

Search Options:

| Option | Description |
| --- | --- |
| --limit <n> | Maximum results (default: 5, max: 100) |
| --sources <sources> | Sources to search, comma-separated (the examples use web, news, images) |
| --country | ISO country code (default: US) |
| --scrape | Enable scraping of search results |
| --scrape-formats | Scrape formats when --scrape is enabled (default: markdown) |
| -o, --output | Save to file |

Scrape - Single page content extraction

bash
# Basic scrape (markdown output)
firecrawl scrape https://example.com -o .firecrawl/example.md

# Get raw HTML
firecrawl scrape https://example.com --html -o .firecrawl/example.html

# Multiple formats (JSON output)
firecrawl scrape https://example.com --format markdown,links -o .firecrawl/example.json

# Main content only (removes navigation, footer, ads)
firecrawl scrape https://example.com --only-main-content -o .firecrawl/example.md

# Wait for JS rendering
firecrawl scrape https://spa-app.com --wait-for 3000 -o .firecrawl/spa.md

# Extract links only
firecrawl scrape https://example.com --format links -o .firecrawl/links.json

# Include/exclude specific HTML tags
firecrawl scrape https://example.com --include-tags article,main -o .firecrawl/article.md
firecrawl scrape https://example.com --exclude-tags nav,aside,.ad -o .firecrawl/clean.md

Scrape Options:

| Option | Description |
| --- | --- |
| -f, --format <formats> | Output format(s): markdown, html, rawHtml, links, screenshot, json |
| -H, --html | Shortcut for --format html |
| --only-main-content | Extract only the main content |
| --wait-for | Wait before scraping (for JS content) |
| --include-tags | Include only specific HTML tags |
| --exclude-tags | Exclude specific HTML tags |
| -o, --output | Save to file |
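
Crawl - Crawl an entire website

bash
# Start a crawl (returns a job ID)
firecrawl crawl https://example.com

# Wait for the crawl to complete
firecrawl crawl https://example.com --wait

# With a progress indicator
firecrawl crawl https://example.com --wait --progress

# Check crawl status
firecrawl crawl <job-id>

# Limit the number of pages
firecrawl crawl https://example.com --limit 100 --max-depth 3

# Crawl only the blog sections
firecrawl crawl https://example.com --include-paths /blog,/posts

# Exclude admin pages
firecrawl crawl https://example.com --exclude-paths /admin,/login

# Crawl with rate limiting
firecrawl crawl https://example.com --delay 1000 --max-concurrency 2

# Save the results
firecrawl crawl https://example.com --wait -o crawl-results.json --pretty

Crawl Options:

| Option | Description |
| --- | --- |
| --wait | Wait for crawl to complete |
| --progress | Show progress while waiting |
| --limit | Maximum number of pages to crawl |
| --max-depth | Maximum crawl depth |
| --include-paths | Crawl only matching paths |
| --exclude-paths | Skip matching paths |
| --delay | Delay between requests |
| --max-concurrency | Maximum concurrent requests |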

Map - Discover all URLs on a site

bash
# List all URLs (one per line)
firecrawl map https://example.com -o .firecrawl/urls.txt

# Output as JSON
firecrawl map https://example.com --json -o .firecrawl/urls.json

# Search for specific URLs
firecrawl map https://example.com --search blog -o .firecrawl/blog

