[ Tool · Data Engineering ]

Firecrawl

🔥 The API to search, scrape, and interact with the web for AI. Three integrated capabilities — Search, Scrape, Interact — exposed through one API. Open source under AGPL-3.0 and self-hostable via docker-compose, the engine also powers the firecrawl.dev cloud SaaS run by the same team.


What it does

Getting “clean, LLM-ready data” from the live web is a real bottleneck for AI agents and RAG pipelines. General-purpose scrapers leave you to handle JavaScript rendering, complex markup, robots.txt, and multi-step interactions yourself, and their output rarely arrives in a shape that an LLM can consume directly.

Firecrawl bundles that infrastructure into one API. Quoting firecrawl.dev: “the infrastructure layer that helps AI find, read, and act on the live web.” Output is returned as LLM-ready markdown or structured data from the start.

Key features — three integrated capabilities

  • Search — web search

    Run a query and get search results, with optional content extraction for each hit in the same call.

  • Scrape — page → clean data

    Extract a single URL into JSON, markdown, or branding formats. JavaScript rendering and complex markup are handled automatically.

  • Interact — page automation

    Automate clicks, typing, and navigation to reach content that static scraping cannot.

Additional endpoints include Agent (autonomous multi-source research), Crawl (multi-page extraction with depth and page limits), Map (discover indexed URLs on a site), and Batch Scrape (parallel processing of many URLs).
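
As a rough illustration of the Scrape capability, the sketch below builds (but does not send) a request for one page as markdown. The `/v2/scrape` path and the `url` and `formats` fields are assumptions modeled on the search call in the Usage section and are not confirmed on this page; the function name `build_scrape_request` is hypothetical. Check the API reference before relying on any of them.

```python
import json
import urllib.request

API_BASE = "https://api.firecrawl.dev"  # cloud host used in the Usage section


def build_scrape_request(url, api_key, formats=("markdown",)):
    """Build a POST request asking for one page in the given formats.

    Assumption: the endpoint path and the payload field names below follow
    the same pattern as the documented search call.
    """
    payload = {"url": url, "formats": list(formats)}
    return urllib.request.Request(
        f"{API_BASE}/v2/scrape",  # assumed path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the payload easy to inspect first.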

Cloud vs Open Source

| Aspect | Open Source (this repo) | Cloud (firecrawl.dev) |
| --- | --- | --- |
| Operator | You | Firecrawl team |
| License | AGPL-3.0 (SDKs / some UI = MIT) | SaaS terms |
| Extra features | Core engine | Additional cloud-only features (see README comparison) |
| Cost | Your infra cost | 1,000 credits/month free + paid plans |
| Data control | Full self-control | Routed through Firecrawl infrastructure |
| Best for | Strict data residency, cost or customization control | Fast start without infrastructure overhead |

SDKs

| Language | Install |
| --- | --- |
| Python | `pip install firecrawl-py` |
| Node.js | `npm install @mendable/firecrawl-js` |
| Java | JitPack via Gradle / Maven (`com.github.firecrawl:firecrawl-java-sdk:2.0`) |
| Elixir | `{:firecrawl, "~> 1.0"}` |
| Rust | `firecrawl = "2"` |

A community Go SDK is linked separately in the README.

Usage

Cloud (fastest start) — generate an API key at firecrawl.dev and call directly.

curl -X POST 'https://api.firecrawl.dev/v2/search' \
  -H 'Authorization: Bearer fc-YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{"query": "firecrawl", "limit": 5}'
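
The same call can be made from Python with only the standard library. This is a minimal sketch, not the official SDK: the endpoint, headers, and body mirror the curl command above, while the function names `build_search_request` and `send` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request

SEARCH_URL = "https://api.firecrawl.dev/v2/search"  # same endpoint as the curl call


def build_search_request(query, limit, api_key):
    """Build (but do not send) the POST request from the curl example."""
    body = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON body.

    The response schema is not described on this page, so the parsed
    JSON is returned unchanged.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `send(build_search_request("firecrawl", 5, "fc-YOUR_API_KEY"))` would reproduce the curl request once a real key is substituted.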

Self-host — use the docker-compose stack at the repo root.

git clone https://github.com/firecrawl/firecrawl
cd firecrawl
docker compose up

See SELF_HOST.md in the repo for environment setup and dependencies.

From Claude Code — use the Firecrawl MCP. Point it at a self-hosted instance via FIRECRAWL_API_URL to keep the cloud out of the loop entirely.
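
One way to picture the self-hosted setup is as an MCP server entry whose environment carries `FIRECRAWL_API_URL`. The sketch below generates such an entry as JSON. Several details are assumptions not stated on this page: the `mcpServers` layout, the `npx -y firecrawl-mcp` launch command, and the `http://localhost:3002` address for a local instance. The function name `firecrawl_mcp_config` is hypothetical. Compare the output against the Firecrawl MCP documentation before using it.

```python
import json


def firecrawl_mcp_config(api_url=None, api_key=None):
    """Return an MCP server entry for Firecrawl as a plain dict.

    Assumptions: the "mcpServers" layout and the npx launch command.
    Only the two environment variable names come from this page.
    """
    env = {}
    if api_url:
        env["FIRECRAWL_API_URL"] = api_url  # self-hosted instance
    if api_key:
        env["FIRECRAWL_API_KEY"] = api_key  # cloud key
    if not env:
        raise ValueError("provide api_url, api_key, or both")
    return {
        "mcpServers": {
            "firecrawl": {
                "command": "npx",  # assumed launcher
                "args": ["-y", "firecrawl-mcp"],  # assumed package name
                "env": env,
            }
        }
    }


# Assumed local address; adjust to wherever the docker compose stack listens.
config = firecrawl_mcp_config(api_url="http://localhost:3002")
print(json.dumps(config, indent=2))
```

Leaving `api_key` unset keeps the entry pointed only at the self-hosted instance, which matches the goal of keeping the cloud out of the loop.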

Notes

  • AGPL-3.0 has real obligations — review the copyleft terms before integrating the engine source into a commercial product. Simply calling the API as a client (via MCP or SDK) is generally unaffected.
  • SDKs and some UI components are MIT — explicit in the README. Client-side integration draws only the MIT-licensed parts.
  • robots.txt respected by default — README quote: “Firecrawl respects robots.txt by default,” and: “It is the sole responsibility of end users to respect websites’ policies when scraping.”
  • Adoption — firecrawl.dev cites over one million signups and customers including Apple, Canva, and Lovable.
  • Actively maintained — near-daily commits since the first commit in April 2024.

See also

same category · curated
[01]
[MCP] Firecrawl MCP · 🔥 Official Firecrawl MCP Server — Adds powerful web scraping and search to Cursor, Claude and any other LLM clients. Exposes 12+ tools spanning single-page scrape, batch processing, site crawl, search, structured extraction, autonomous research agent, and interactive page automation, returning clean LLM-ready markdown.
tool · claudekit.io / tools / firecrawl-mcp
[02]
[MCP] Perplexity MCP · The official MCP server implementation for the Perplexity API Platform. Provides AI assistants with real-time web search, reasoning, and research capabilities through Sonar models and the Search API.
tool · claudekit.io / tools / perplexity-mcp
[03]
[Skill] Graphify · Turn any folder of code, SQL schemas, R scripts, shell scripts, docs, papers, images, or videos into a queryable knowledge graph. App code + database schema + infrastructure in one graph.
tool · claudekit.io / tools / graphify

Frequently Asked Questions

What is Firecrawl?
Quoting the README: "The API to search, scrape, and interact with the web for AI." It is a full-stack backend service written in TypeScript, Python, Rust, and Java that powers the [firecrawl.dev](https://firecrawl.dev) cloud SaaS and is also open source under AGPL-3.0 for anyone to self-host.
Is it open source? What's the license?
Yes. It is published on GitHub, and the README states: "This project is primarily licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). The SDKs and some UI components are licensed under the MIT License." In short, the core engine is AGPL, while the SDKs and some UI components are MIT.
How does it relate to firecrawl.dev?
firecrawl.dev is the cloud SaaS run by the same Firecrawl team — a hosted version of this engine with additional cloud-only features (see the README's "Open Source vs Cloud" comparison). The free plan starts at 1,000 credits per month.
How do I self-host it?
Use the `docker-compose.yaml` in the repo root and follow the `SELF_HOST.md` guide. It runs as a containerized stack (with services like Redis as dependencies). It is not a single `docker run`, but it is lighter than a bare-metal deployment.
Which SDKs are available?
Officially supported: Python (`firecrawl-py`), Node.js (`@mendable/firecrawl-js`), Java (Gradle/Maven via JitPack), Elixir (`firecrawl`), and Rust (`firecrawl`). A community Go SDK is also linked in the README.
How do I use it from Claude Code?
Through the [Firecrawl MCP](/en/tools/firecrawl-mcp/). The MCP server can target either the cloud (`FIRECRAWL_API_KEY`) or a self-hosted instance (`FIRECRAWL_API_URL`), so you can use your own deployment from inside Claude as well.