AACFlow

Firecrawl

Scrape, search, crawl, map, and extract web data

Usage Instructions

Integrate Firecrawl into the workflow. Scrape pages, search the web, crawl entire sites, map URL structures, and extract structured data with AI.

Tools

firecrawl_scrape

Extract structured content from web pages with comprehensive metadata support. Converts content to markdown or HTML while capturing SEO metadata, Open Graph tags, and page information.

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `url` | string | Yes | The URL to scrape content from (e.g., "https://example.com/page") |
| `scrapeOptions` | json | No | No description |
| `apiKey` | string | Yes | No description |
| `pricing` | custom | No | No description |
| `metadata` | string | No | No description |
| `rateLimit` | string | No | No description |

Output

| Parameter | Type | Description |
| --- | --- | --- |
| `markdown` | string | Page content in markdown format |
| `html` | string | Raw HTML content of the page |
| `metadata` | object | Page metadata including SEO and Open Graph information |
| `title` | string | Page title |
| `description` | string | Page meta description |
| `language` | string | Page language code (e.g., "en") |
| `sourceURL` | string | Original source URL that was scraped |
| `statusCode` | number | HTTP status code of the response |
| `keywords` | string | Page meta keywords |
| `robots` | string | Robots meta directive (e.g., "follow, index") |
| `ogTitle` | string | Open Graph title |
| `ogDescription` | string | Open Graph description |
| `ogUrl` | string | Open Graph URL |
| `ogImage` | string | Open Graph image URL |
| `ogLocaleAlternate` | array | Alternate locale versions for Open Graph |
| `ogSiteName` | string | Open Graph site name |
| `error` | string | Error message if scrape failed |
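
As an illustration of how the output fields above might be consumed downstream, the sketch below collapses a scrape result into a short summary. It is not part of Firecrawl or AACFlow: `summarize_scrape` is a hypothetical helper, the sample dictionary is made up, and the code assumes that `title`, `statusCode`, and `error` are nested inside `metadata` (falling back to the top level if they are not).

```python
from typing import Any


def summarize_scrape(output: dict[str, Any]) -> dict[str, Any]:
    """Collapse a firecrawl_scrape output into a few fields of interest."""
    # Assumption: title, statusCode, and error sit inside `metadata`.
    # Fall back to the top level in case the output is flat.
    meta = output.get("metadata") or {}

    def pick(key: str) -> Any:
        return meta.get(key, output.get(key))

    status = pick("statusCode")
    error = pick("error")
    return {
        "title": pick("title"),
        "status": status,
        # Treat the scrape as usable only on a 2xx status with no error message.
        "ok": error is None and status is not None and 200 <= status < 300,
        "markdown_chars": len(output.get("markdown") or ""),
    }


# Made-up sample shaped like the output table above.
sample = {
    "markdown": "# Example Domain",
    "metadata": {
        "title": "Example Domain",
        "statusCode": 200,
        "sourceURL": "https://example.com/page",
    },
}
summary = summarize_scrape(sample)
print(summary)
```

Checking both `statusCode` and `error` before using `markdown` keeps a failed scrape from being passed along as empty content.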

firecrawl_search

Input

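| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `query` | string | Yes | No description |
| `apiKey` | string | Yes | No description |
| `pricing` | custom | No | No description |
| `metadata` | string | No | No description |
| `rateLimit` | string | No | No description |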

Output

| Parameter | Type | Description |
| --- | --- | --- |
| `data` | array | Search results data with scraped content and metadata |
| `title` | string | Search result title from search engine |
| `description` | string | Search result description/snippet from search engine |
| `url` | string | URL of the search result |
| `markdown` | string | Page content in markdown (when scrapeOptions.formats includes "markdown") |
| `html` | string | Processed HTML content (when scrapeOptions.formats includes "html") |
| `rawHtml` | string | Unprocessed raw HTML (when scrapeOptions.formats includes "rawHtml") |
| `links` | array | Links found on the page (when scrapeOptions.formats includes "links") |
| `screenshot` | string | Screenshot URL (expires after 24 hours, when scrapeOptions.formats includes "screenshot") |
| `metadata` | object | Metadata about the search result page |
| `title` | string | Page title |
| `description` | string | Page meta description |
| `sourceURL` | string | Original source URL |
| `statusCode` | number | HTTP status code |
| `error` | string | Error message if scrape failed |
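
Because content fields such as `markdown` appear only when the matching format was requested, code that reads search results should treat them as optional. A minimal sketch, assuming each entry of `data` is a dictionary keyed by the field names above; `collect_markdown` and the sample results are hypothetical:

```python
from typing import Any


def collect_markdown(output: dict[str, Any]) -> dict[str, str]:
    """Map each result URL to its markdown, skipping results without any."""
    pages: dict[str, str] = {}
    for result in output.get("data") or []:
        url = result.get("url")
        # `markdown` is present only when scrapeOptions.formats includes "markdown".
        markdown = result.get("markdown")
        if url and markdown:
            pages[url] = markdown
    return pages


# Made-up sample shaped like the output table above.
sample_output = {
    "data": [
        {"title": "Docs", "url": "https://example.com/docs", "markdown": "# Docs"},
        {"title": "Blog", "url": "https://example.com/blog"},  # no scraped content
    ]
}
pages = collect_markdown(sample_output)
print(sorted(pages))
```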

firecrawl_crawl

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `url` | string | Yes | The website URL to crawl (e.g., "https://example.com" or "https://docs.example.com/guide") |
| `limit` | number | No | No description |
| `maxDepth` | number | No | Maximum depth to crawl from the starting URL (e.g., 1, 2, 3). Controls how many levels deep to follow links |
| `formats` | json | No | Output formats for scraped content (e.g., ["markdown"], ["markdown", "html"], ["markdown", "links"]) |
| `excludePaths` | json | No | URL paths to exclude from crawling (e.g., ["/blog/", "/admin/", "/*.pdf"]) |
| `includePaths` | json | No | URL paths to include in crawling (e.g., ["/docs/", "/api/"]). Only these paths will be crawled |
| `onlyMainContent` | boolean | No | No description |
| `apiKey` | string | Yes | No description |
| `pricing` | custom | No | No description |
| `metadata` | string | No | No description |
| `rateLimit` | string | No | No description |

Output

| Parameter | Type | Description |
| --- | --- | --- |
| `pages` | array | Array of crawled pages with their content and metadata |
| `markdown` | string | Page content in markdown format |
| `html` | string | Processed HTML content of the page |
| `rawHtml` | string | Unprocessed raw HTML content |
| `links` | array | Array of links found on the page |
| `screenshot` | string | Screenshot URL (expires after 24 hours) |
| `metadata` | object | Page metadata from crawl operation |
| `title` | string | Page title |
| `description` | string | Page meta description |
| `language` | string | Page language code |
| `sourceURL` | string | Original source URL |
| `statusCode` | number | HTTP status code |
| `ogLocaleAlternate` | array | Alternate locale versions |
| `total` | number | Total number of pages found during crawl |
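
The crawl input parameters listed above combine a starting URL with optional depth, format, and path filters. The following sketch assembles such an input while omitting any optional field that was not set. `build_crawl_input` is a hypothetical helper, `"YOUR_API_KEY"` is a placeholder, and passing the `json`-typed parameters as Python lists is an assumption; the block may instead expect JSON strings.

```python
import json
from typing import Any, Optional


def build_crawl_input(
    url: str,
    api_key: str,
    limit: Optional[int] = None,
    max_depth: Optional[int] = None,
    formats: Optional[list[str]] = None,
    include_paths: Optional[list[str]] = None,
    exclude_paths: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Assemble a firecrawl_crawl input, leaving out optional fields that are unset."""
    if not url or not api_key:
        raise ValueError("url and apiKey are required")
    optional = {
        "limit": limit,
        "maxDepth": max_depth,
        "formats": formats,
        "includePaths": include_paths,
        "excludePaths": exclude_paths,
    }
    payload: dict[str, Any] = {"url": url, "apiKey": api_key}
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


crawl_input = build_crawl_input(
    "https://docs.example.com/guide",
    "YOUR_API_KEY",  # placeholder, not a real key
    limit=50,
    max_depth=2,
    formats=["markdown", "links"],
    include_paths=["/docs/"],
    exclude_paths=["/*.pdf"],
)
print(json.dumps(crawl_input, indent=2))
```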

firecrawl_map

Get a complete list of URLs from any website quickly and reliably. Useful for discovering all pages on a site without crawling them.

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `url` | string | Yes | The base URL to map and discover links from (e.g., "https://example.com") |
| `search` | string | No | Filter results by relevance to a search term (e.g., "blog") |
| `sitemap` | string | No | Controls sitemap usage: "skip", "include" (default), or "only" |
| `includeSubdomains` | boolean | No | No description |
| `ignoreQueryParameters` | boolean | No | No description |
| `limit` | number | No | Maximum number of links to return (e.g., 100, 1000, 5000). Max: 100,000, default: 5,000 |
| `timeout` | number | No | No description |
| `location` | json | No | No description |
| `apiKey` | string | Yes | No description |
| `pricing` | custom | No | No description |
| `metadata` | string | No | No description |
| `rateLimit` | string | No | No description |

Output

| Parameter | Type | Description |
| --- | --- | --- |
| `success` | boolean | Whether the mapping operation was successful |
| `links` | array | Array of discovered URLs from the website |
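
The `sitemap` and `limit` parameters above have documented allowed values and bounds, so checking them before a run can catch mistakes early. A minimal sketch of that check: `build_map_input` is a hypothetical helper, `"YOUR_API_KEY"` is a placeholder, and the lower bound of 1 on `limit` is an assumption, since only the maximum and the default are documented.

```python
from typing import Any, Optional

SITEMAP_MODES = ("skip", "include", "only")  # values listed for `sitemap`
DEFAULT_LIMIT = 5_000
MAX_LIMIT = 100_000


def build_map_input(
    url: str,
    api_key: str,
    search: Optional[str] = None,
    sitemap: str = "include",
    limit: int = DEFAULT_LIMIT,
) -> dict[str, Any]:
    """Assemble a firecrawl_map input after checking the documented constraints."""
    if sitemap not in SITEMAP_MODES:
        raise ValueError(f"sitemap must be one of {SITEMAP_MODES}, got {sitemap!r}")
    # The lower bound of 1 is an assumption; the table only states the maximum.
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}, got {limit}")
    payload: dict[str, Any] = {
        "url": url,
        "apiKey": api_key,
        "sitemap": sitemap,
        "limit": limit,
    }
    if search:
        payload["search"] = search
    return payload


map_input = build_map_input(
    "https://example.com", "YOUR_API_KEY", search="blog", limit=1000
)
print(map_input)
```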

firecrawl_extract

Extract structured data from entire webpages using natural language prompts and JSON schema. Powerful agentic feature for intelligent data extraction.

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `urls` | json | Yes | Array of URLs to extract data from (e.g., ["https://example.com/page1", "https://example.com/page2"] or ["https://example.com/*"]) |
| `prompt` | string | No | No description |
| `schema` | json | No | No description |
| `enableWebSearch` | boolean | No | No description |
| `ignoreSitemap` | boolean | No | No description |
| `includeSubdomains` | boolean | No | No description |
| `showSources` | boolean | No | No description |
| `ignoreInvalidURLs` | boolean | No | No description |
| `scrapeOptions` | json | No | No description |
| `apiKey` | string | Yes | No description |
| `pricing` | custom | No | No description |
| `metadata` | string | No | No description |
| `rateLimit` | string | No | No description |

Output

| Parameter | Type | Description |
| --- | --- | --- |
| `success` | boolean | Whether the extraction operation was successful |
| `data` | object | Extracted structured data according to the schema or prompt |
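
To make the pairing of `prompt` and `schema` concrete, the sketch below builds an extract input for a made-up product page. The schema contents, the prompt wording, the `check_extract_input` helper, and the `"YOUR_API_KEY"` placeholder are all illustrative assumptions; the schema follows general JSON Schema conventions rather than anything specific to this tool.

```python
import json
from typing import Any

# Hypothetical JSON schema describing the fields to pull from each page.
product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["name"],
}

extract_input: dict[str, Any] = {
    "urls": ["https://example.com/page1", "https://example.com/page2"],
    "prompt": "Extract the product name and price from each page.",
    "schema": product_schema,
    "apiKey": "YOUR_API_KEY",  # placeholder, not a real key
}


def check_extract_input(payload: dict[str, Any]) -> None:
    """Raise ValueError if a parameter marked as required is missing."""
    urls = payload.get("urls")
    if not isinstance(urls, list) or not urls:
        raise ValueError("urls must be a non-empty array")
    if not payload.get("apiKey"):
        raise ValueError("apiKey is required")


check_extract_input(extract_input)
# Serialize to confirm the whole input, schema included, is valid JSON.
print(json.dumps(extract_input, indent=2))
```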

firecrawl_agent

Autonomous web data extraction agent. Searches and gathers information based on natural language prompts without requiring specific URLs.

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `prompt` | string | Yes | No description |
| `urls` | json | No | Optional array of URLs to focus the agent on (e.g., ["https://example.com", "https://docs.example.com"]) |
| `schema` | json | No | No description |
| `maxCredits` | number | No | No description |
| `strictConstrainToURLs` | boolean | No | No description |
| `apiKey` | string | Yes | No description |

Output

| Parameter | Type | Description |
| --- | --- | --- |
| `success` | boolean | Whether the agent operation was successful |
| `status` | string | Current status of the agent job (processing, completed, failed) |
| `data` | object | Extracted data from the agent |
| `expiresAt` | string | Timestamp when the results expire (24 hours) |
| `sourcesobject`, corrected: `sources` | object | Array of source URLs used by the agent |
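
Since `status` can be processing, completed, or failed, downstream code needs to branch on it before reading `data`. One way to do that is sketched here; `agent_result` is a hypothetical helper, and how (or whether) the workflow re-checks a job that is still processing is not covered by this page.

```python
from typing import Any, Optional


def agent_result(output: dict[str, Any]) -> Optional[Any]:
    """Return extracted data once the job is completed, or None while it runs."""
    status = output.get("status")
    if status == "completed":
        return output.get("data")
    if status == "processing":
        return None  # not finished yet; the caller decides whether to check again
    if status == "failed":
        raise RuntimeError("agent job failed")
    raise ValueError(f"unexpected status: {status!r}")


# Made-up sample shaped like the output table above.
done = {
    "success": True,
    "status": "completed",
    "data": {"founder": "Jane Doe"},
    "sources": ["https://example.com"],
}
print(agent_result(done))
```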
