Cloudflare /crawl: One API Call to Crawl an Entire Website

Niels, Co-founder
Published on March 11, 2026

At Emelia, our B2B prospecting tool, and Bridgers, our digital agency specializing in AI solutions, we build data pipelines that feed AI models every day. Web content extraction, prospect enrichment, automated competitive intelligence: web crawling sits at the core of our workflows. When Cloudflare drops an endpoint that can ingest an entire website in a single API call, it deserves a deep dive.

On March 10, 2026, Cloudflare launched /crawl, a new endpoint built into its Browser Rendering service. The announcement tweet from @CloudflareDev blew past 2 million impressions, 7,800 likes, and 8,600 bookmarks within 24 hours. The pitch is brutally simple: "One API call and an entire site crawled." No scripts. No browser management. Just the content in HTML, Markdown, or JSON.

How Does Cloudflare's /crawl API Work?

The system uses an asynchronous two-step process.

Step 1: Start the crawl. Send a POST request with a starting URL. The API immediately returns a job ID.

Step 2: Fetch results. Poll the API with that job ID using GET requests. Results stream in as pages are processed, with cursor-based pagination for large crawls.
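The two steps above can be sketched in Python. Only the endpoint paths shown in Cloudflare's curl examples are used; the cursor query-parameter name is an assumption for illustration, and the actual HTTP calls are left to your client of choice:

```python
API_BASE = "https://api.cloudflare.com/client/v4/accounts"


def start_crawl_request(account_id: str, url: str, limit: int = 10, formats=None):
    """Build the POST endpoint and JSON body that start a crawl job."""
    body = {"url": url, "limit": limit, "formats": formats or ["markdown"]}
    return f"{API_BASE}/{account_id}/browser-rendering/crawl", body


def poll_url(account_id: str, job_id: str, cursor: str = None) -> str:
    """Build the GET endpoint used to poll a job. The 'cursor' query
    parameter name is a guess standing in for whatever pagination token
    the API actually returns."""
    url = f"{API_BASE}/{account_id}/browser-rendering/crawl/{job_id}"
    return f"{url}?cursor={cursor}" if cursor else url
```

Poll until the job reports completion, carrying the pagination cursor forward between requests on large crawls.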

The crawler automatically discovers URLs from three sources: the starting URL, the site's sitemap, and links found on each page. It respects robots.txt by default and identifies itself as a bot, a point Kathy Liao, Product Manager at Cloudflare, emphasized repeatedly when facing community pushback.

Key Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| url | String | Starting URL (required) |
| limit | Number | Maximum pages to crawl (default: 10, max: 100,000) |
| depth | Number | Maximum crawl depth (max: 100,000) |
| formats | Array | Output formats: html, markdown, json |
| render | Boolean | Execute JavaScript (default: true) |
| source | String | URL discovery: all, sitemaps, links |
| maxAge | Number | Cache duration in seconds (max: 7 days) |
| includePatterns | Array | Wildcard patterns to filter included URLs |
| excludePatterns | Array | Wildcard patterns to exclude URLs |
| modifiedSince | Number | Unix timestamp; only crawl pages modified after this date |

The render: false option is a standout feature: it disables the headless browser and performs a simple HTTP fetch instead, making it significantly faster and cheaper. During the beta period, this mode is free.

Guide: Crawl a Website in One Line of Code

Here is how to launch a full crawl with curl:

```bash
# Step 1: Start the crawl
curl -X POST \
  "https://api.cloudflare.com/client/v4/accounts/{account_id}/browser-rendering/crawl" \
  -H "Authorization: Bearer {api_token}" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "limit": 50,
    "formats": ["markdown", "html"],
    "render": true
  }'

# Response: { "success": true, "result": "job-id-xxx" }

# Step 2: Fetch results
curl "https://api.cloudflare.com/client/v4/accounts/{account_id}/browser-rendering/crawl/job-id-xxx" \
  -H "Authorization: Bearer {api_token}"
```

Each page in the response includes the URL, title, status, and content in your requested formats. For dynamic sites built with React, Vue, or Angular, the render: true mode launches a real headless Chrome instance that executes JavaScript before extracting content.
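A typical post-processing step is to keep only the pages that crawled successfully. A minimal sketch, assuming per-page fields named url, status, and a content map keyed by format (the exact response schema may differ):

```python
def successful_markdown(pages: list) -> dict:
    """Map URL -> markdown for pages that crawled successfully.
    Field names are assumptions based on this article, not the
    authoritative response schema."""
    return {
        p["url"]: p["content"]["markdown"]
        for p in pages
        if p.get("status") == 200 and "markdown" in p.get("content", {})
    }


pages = [
    {"url": "https://example.com/", "title": "Home", "status": 200,
     "content": {"markdown": "# Home"}},
    {"url": "https://example.com/missing", "status": 404, "content": {}},
]
# Only the 200 page survives the filter.
```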

For structured JSON extraction, you can provide a prompt or a schema:

```json
{
  "url": "https://shop.example.com",
  "formats": ["json"],
  "jsonOptions": {
    "prompt": "Extract the product name, price, and description",
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "product",
        "schema": {
          "type": "object",
          "properties": {
            "name": { "type": "string" },
            "price": { "type": "number" },
            "description": { "type": "string" }
          }
        }
      }
    }
  }
}
```

This structured extraction uses Workers AI under the hood, which incurs additional costs.

Cloudflare Browser Rendering: Pricing and Limits

One of Cloudflare's strongest selling points is the price. Here is the full breakdown:

Free Plan (Workers Free)

| Feature | Limit |
| --- | --- |
| Browser time | 10 minutes per day |
| /crawl jobs per day | 5 |
| Max pages per crawl | 100 |
| REST API requests | 6 per minute |
| Concurrent browsers | 3 |

Paid Plan (Workers Paid, $5/month)

| Feature | Limit |
| --- | --- |
| Browser hours included | 10 hours/month |
| Extra browser time | $0.09/hour |
| REST API requests | 600 per minute |
| Concurrent browsers | 30 |
| Max pages per crawl | 100,000 |

The render: false mode (no JavaScript execution) is free during the beta and will later follow standard Workers pricing. Crawl jobs have a maximum runtime of 7 days, and results remain available for 14 days.

To put this in perspective: with the $5/month paid plan, you get 10 hours of browser rendering time included. If a 100-page crawl takes roughly 5 minutes of browser time, you can crawl approximately 12,000 pages per month for five dollars. Compare that to Firecrawl's Standard plan at $47/month for 100,000 pages, and the economics become compelling at scale.
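The arithmetic behind that estimate, as a quick sanity check:

```python
# Back-of-envelope check of the numbers above. The crawl speed
# (5 browser-minutes per 100 pages) is the article's rough assumption.
included_hours = 10            # browser hours in the $5/mo Workers Paid plan
minutes_per_100_pages = 5      # assumed browser time for a 100-page crawl

crawls_per_month = included_hours * 60 // minutes_per_100_pages
pages_per_month = crawls_per_month * 100
cost_per_1k_pages = 5 / (pages_per_month / 1000)  # roughly $0.42
```

At roughly $0.42 per thousand pages, the time-based billing undercuts per-page pricing by a wide margin, provided your pages render quickly.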

Cloudflare /crawl vs Firecrawl vs Crawl4AI: Full Comparison

The web crawling market for AI applications is heating up fast. Here is how Cloudflare stacks up against the competition.

| Feature | Cloudflare /crawl | Firecrawl | Crawl4AI | Jina Reader |
| --- | --- | --- | --- | --- |
| Entry price | Free ($5/mo for paid plan) | Free (500 pages), then $19/mo | Free (open source) | Free (20 req/min without key) |
| Volume pricing | $0.09/browser hour | $47/mo (100k pages), $599/mo (1M pages) | Free (self-hosted) | Token-based (from $0.01/1M tokens) |
| Multi-page crawl | Yes (up to 100,000 pages) | Yes | Yes | No (single page) |
| Crawl depth | Up to 100,000 levels | Configurable | Configurable | N/A |
| Output formats | HTML, Markdown, JSON | HTML, Markdown, JSON, Screenshot | HTML, Markdown, JSON | Markdown, HTML |
| JavaScript rendering | Yes (headless Chrome) | Yes | Yes (Playwright/Chromium) | Yes (Puppeteer) |
| Structured AI extraction | Yes (Workers AI) | Yes (LLM extract) | Yes (LLM strategies) | No |
| Respects robots.txt | Yes (by default) | Optional | Configurable | Yes |
| Concurrent requests | 30 (paid plan) | 5 to 150 depending on plan | Unlimited (self-hosted) | 2 to 500 depending on plan |
| Infrastructure | Serverless (Cloudflare edge) | Cloud SaaS | Self-hosted or Docker | Cloud SaaS |
| Open source | No | No | Yes (Apache 2.0) | Partially |

When to Choose Cloudflare /crawl

If you are already in the Cloudflare ecosystem (Workers, R2, KV), integration is seamless. The cost-per-page is unbeatable for high-volume crawls thanks to time-based billing instead of per-page pricing. The render: false mode, free during beta, is perfect for static sites.

When to Choose Firecrawl

Firecrawl excels in developer experience with polished SDKs and AI-oriented features (LLM extraction, screenshots, site mapping). If you need a plug-and-play tool and do not want to manage infrastructure, it is a strong choice. However, per-page costs add up quickly at scale.

When to Choose Crawl4AI

With over 61,000 GitHub stars, Crawl4AI is the pick for teams that want total control. Open source, self-hosted, no rate limits imposed. Ideal for AI training pipelines or research projects on tight budgets.

When to Choose Jina Reader

Jina Reader is perfect for single-page conversion to LLM-friendly formats. Prepend https://r.jina.ai/ to any URL and you get clean Markdown. No native multi-page crawl, but unmatched simplicity for basic use cases.

Extract Website Data for AI with Cloudflare

Cloudflare's timing is not accidental. Demand for structured web data to feed AI models is exploding.

The crawl-to-refer ratio (how many times an AI bot visits a site versus how many visitors it sends back) has reached staggering levels: 1,700:1 for OpenAI, 73,000:1 for Anthropic according to Cloudflare's own data. AI bots are consuming web content at an industrial scale, and developers need reliable tools to do the same.

RAG Pipelines (Retrieval-Augmented Generation)

The most obvious use case is building knowledge bases for RAG systems. With /crawl, you can ingest an entire product documentation site in Markdown, chunk it, vectorize it, and inject it into an index so your AI agents answer with precision.
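As a sketch of the chunking step, here is a naive heading-aware splitter for the Markdown output; production pipelines would chunk by token count and add overlap between chunks:

```python
import re


def chunk_markdown(md: str, max_chars: int = 800) -> list:
    """Naive chunker: split on top-level and second-level headings,
    then cap each section at max_chars characters."""
    sections = re.split(r"\n(?=#{1,2} )", md)
    chunks = []
    for sec in sections:
        for i in range(0, len(sec), max_chars):
            chunks.append(sec[i:i + max_chars].strip())
    return [c for c in chunks if c]
```

Each chunk then gets embedded and written to your vector index alongside its source URL.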

Automated Competitive Intelligence

Periodically crawl competitor websites to detect price changes, new products, or positioning shifts. The modifiedSince parameter lets you fetch only pages modified since your last crawl, enabling efficient differential crawls.

Large-Scale SEO Auditing

Extract all pages from a site to analyze title tags, meta descriptions, heading structure, internal links, and 404 errors. The JSON format with structured AI extraction delivers directly actionable data.
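A toy audit pass over the crawled HTML might look like this (regex-based for brevity; a real audit should use a proper HTML parser):

```python
import re


def audit_page(html: str) -> dict:
    """Toy SEO checks on a crawled page: title presence and length,
    and the number of h1 elements."""
    title = re.search(r"<title>(.*?)</title>", html, re.S | re.I)
    h1s = re.findall(r"<h1[^>]*>", html, re.I)
    return {
        "title": title.group(1).strip() if title else None,
        "title_too_long": bool(title) and len(title.group(1)) > 60,
        "h1_count": len(h1s),
    }
```

Run it over every page in the crawl results and you have the skeleton of a site-wide audit report.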

Real-World Use Cases for Cloudflare /crawl

Beyond theoretical scenarios, here are concrete use cases we are already seeing:

Content migration. Switching CMS platforms? Crawl the old site in Markdown, clean up the content, and import it into the new system. No more manual exports or unreliable plugins.

Compliance monitoring. Legal teams can automatically track legal notices, terms of service, and privacy policies across a portfolio of websites.

Training dataset construction. Machine Learning teams can build text corpora from public sources while respecting robots.txt, for fine-tuning specialized models.

Editorial content analysis. Marketing teams can analyze competitor content strategies: What topics do they cover? How frequently do they publish? What keywords are they targeting?

Knowledge base generation. Customer support teams can crawl their own documentation to build searchable knowledge bases. Feed the Markdown output into a vector database, connect it to a chatbot, and your support agents (human or AI) get instant access to every page of your docs.

Price monitoring at scale. E-commerce teams can track pricing across dozens of competitor sites. Use the JSON format with a prompt like "Extract product name and price" and get structured data ready for analysis, without writing custom parsers for each site.

The Cloudflare Irony: Selling the Lock and the Lockpick

The announcement sparked passionate reactions across the developer community: the company that built its reputation on anti-bot protection is now selling a crawling tool, and SRE engineers were quick to call out the contradiction.

A viral tweet from @TukiFromKL (496,000 impressions, 3,700 likes) called it the "biggest betrayal in tech this year." Kathy Liao's response from Cloudflare was immediate and unambiguous.

Cloudflare's position is clear: /crawl identifies as a bot, respects robots.txt, and does not bypass any anti-bot protections. If a site owner blocks bots, the crawl will fail. This is an approach that gives content owners control, unlike some crawlers that attempt to masquerade as human browsers.

Under the Hood: Technical Architecture

For developers who want to understand the internals, here are the key technical details.

The /crawl endpoint runs on Cloudflare's Browser Rendering infrastructure, which spins up headless Chrome instances across Cloudflare's global edge network. When you run a crawl with render: true, each page loads in a real browser instance, JavaScript executes, AJAX requests complete, and the final DOM is captured. This is what makes the tool capable of handling modern Single Page Applications (SPAs).

With render: false, the process is fundamentally different: Cloudflare performs a simple HTTP fetch via Workers, no browser involved. The result is raw HTML (no JavaScript rendering), but speed and cost are incomparable. This mode is ideal for documentation sites, static blogs, or any site that generates its HTML server-side.

The caching system is well designed. The maxAge parameter controls how long results are cached in R2 (Cloudflare's object storage). Matches are exact on URL. If you crawl the same site twice within the cache window, the second request is near-instant and consumes no browser time.

The modifiedSince parameter deserves special attention. It takes a Unix timestamp and only crawls pages modified after that date. Combined with caching, this enables extremely efficient differential crawls: one full initial pass, then incremental updates.
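A differential crawl then boils down to remembering when the last pass ran and passing that timestamp along. A minimal sketch:

```python
import time


def differential_payload(url: str, last_crawl_unix: int) -> dict:
    """Build a crawl request that only fetches pages modified since
    the last pass, via the modifiedSince parameter described above."""
    return {
        "url": url,
        "modifiedSince": last_crawl_unix,
        "formats": ["markdown"],
    }


# e.g. "everything changed in the last 7 days"
payload = differential_payload("https://example.com",
                               int(time.time()) - 7 * 24 * 3600)
```

Persist the timestamp of each run (in KV, R2, or a database) and feed it into the next one.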

Finally, the filtering patterns (includePatterns and excludePatterns) use wildcards with * (one segment) and ** (all segments). For example, to crawl only a site's documentation: includePatterns: ["/docs/**"] and excludePatterns: ["/docs/legacy/**"]. Exclude rules always take priority over include rules.
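The documented precedence (exclude beats include) is easy to mirror client-side when pre-filtering URL lists. This sketch uses Python's fnmatch, whose * crosses path segments, so it approximates ** rather than the single-segment *:

```python
from fnmatch import fnmatch


def allowed(path: str, include: list, exclude: list) -> bool:
    """Mirror the documented precedence: exclude rules always win
    over include rules. An empty include list allows everything
    not excluded."""
    if any(fnmatch(path, pat) for pat in exclude):
        return False
    return any(fnmatch(path, pat) for pat in include) if include else True
```

With include ["/docs/*"] and exclude ["/docs/legacy/*"], a legacy-docs URL is rejected even though the include pattern matches it.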

The crawler also supports authentication via custom headers, cookies, and HTTP Basic Auth, letting you crawl password-protected staging environments or authenticated sections of a site. You can set a custom userAgent string and use rejectResourceTypes to block images, media, or fonts for faster crawls.

What Cloudflare /crawl Does Not Do

For a complete picture, here are the current limitations:

No image extraction. The /crawl endpoint returns text content only (HTML, Markdown, JSON). For screenshots, you need the separate /screenshot endpoint.

No protection bypass. If a site uses CAPTCHAs, Bot Fight Mode, or Cloudflare challenges, the crawl will be blocked. This is by design.

Open beta. The API is in open beta. Bugs exist. Some developers report "Crawl job not found" errors immediately after creating a job.

Limited free tier. The 5-job-per-day and 100-page-per-job limits on the free plan are restrictive for production use. The $5/month paid plan is nearly essential.

Who Should Use Cloudflare /crawl?

This is for you if you are building data pipelines for AI, need to programmatically crawl entire sites, are already in the Cloudflare ecosystem, or are looking for a cheaper alternative to Firecrawl at scale.

Skip it if you need to bypass anti-bot protections (this is not the tool for that), only need single-page conversion (Jina Reader will be simpler), or need total control over infrastructure (self-hosted Crawl4AI will be a better fit).

How to Get Started with Cloudflare /crawl

Here are the steps to start using the API:

  1. Create a Cloudflare account at dash.cloudflare.com (free)

  2. Generate an API token with Browser Rendering permissions in your account settings

  3. Get your Account ID from the Workers dashboard

  4. Launch your first crawl using the curl request described above

  5. Upgrade to Workers Paid ($5/month) if you exceed free plan limits

The official documentation is available at developers.cloudflare.com/browser-rendering and covers all parameters, output formats, and advanced use cases.

The web is becoming an API for language models. Cloudflare, which handles over 20% of global web traffic, just built one of the most powerful taps to access it. And at $5 a month, that tap is open to everyone.
