# Firecrawl

The Firecrawl gem provides a Ruby interface to the Firecrawl API, enabling you to scrape web pages, capture screenshots, and crawl entire websites. The API returns clean, structured content in formats such as Markdown and HTML, making it particularly useful for applications that need to process web content, including those that use Large Language Models for grounding or real-time information retrieval.
```ruby
require 'firecrawl'

Firecrawl.api_key ENV[ 'FIRECRAWL_API_KEY' ]

response = Firecrawl.scrape( 'https://example.com' )
if response.success?
  result = response.result
  puts result.metadata[ 'title' ]
  puts result.markdown
end
```

## Table of Contents
- Installation
- Quick Start
- Endpoints
  - Scrape
  - Batch Scrape
  - Map
  - Crawl
  - Extract
- Responses and Errors
- Connections
- License
## Installation

Add this line to your application's Gemfile:

```ruby
gem 'firecrawl'
```

Then execute:

```shell
bundle install
```

Or install it directly:

```shell
gem install firecrawl
```

## Quick Start
The simplest way to use the gem is through the module-level convenience methods. Set your API key once, then call any endpoint:
```ruby
require 'firecrawl'

Firecrawl.api_key ENV[ 'FIRECRAWL_API_KEY' ]

response = Firecrawl.scrape( 'https://example.com' )
if response.success?
  puts response.result.markdown
end
```

For more control, instantiate request objects directly. This allows you to configure options using a block-based DSL and reuse request instances:
```ruby
request = Firecrawl::ScrapeRequest.new( api_key: ENV[ 'FIRECRAWL_API_KEY' ] )

options = Firecrawl::ScrapeOptions.build do
  formats [ :markdown, :html, :screenshot ]
  only_main_content true
end

response = request.submit( 'https://example.com', options )
```

## Endpoints
### Scrape
The scrape endpoint fetches a single URL and returns the page content in one or more formats. You can optionally run browser actions before content is captured.
```ruby
options = Firecrawl::ScrapeOptions.build do
  formats [ :markdown, :screenshot ]
  only_main_content true
  screenshot do
    full_page true
  end
end

request = Firecrawl::ScrapeRequest.new( api_key: ENV[ 'FIRECRAWL_API_KEY' ] )
response = request.submit( 'https://example.com', options )
if response.success?
  result = response.result
  puts result.markdown
  puts result.screenshot_url
end
```

For complete documentation of all scrape options and response fields, see Scrape Documentation.
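Because request instances are reusable, one `ScrapeRequest` can serve many URLs. As a sketch, the `scrape_titles` helper below is our own, not part of the gem; it collects the page titles of whichever scrapes succeeded:

```ruby
# Hypothetical helper: scrape several URLs with one reusable request,
# returning the titles of the pages that were fetched successfully.
def scrape_titles( request, urls, options )
  urls.filter_map do | url |
    response = request.submit( url, options )
    response.success? ? response.result.metadata[ 'title' ] : nil
  end
end

# request = Firecrawl::ScrapeRequest.new( api_key: ENV[ 'FIRECRAWL_API_KEY' ] )
# options = Firecrawl::ScrapeOptions.build { formats [ :markdown ] }
# scrape_titles( request, [ 'https://example.com', 'https://example.org' ], options )
```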
### Batch Scrape
The batch scrape endpoint processes multiple URLs efficiently. It returns results asynchronously, so you poll for completion:
```ruby
request = Firecrawl::BatchScrapeRequest.new( api_key: ENV[ 'FIRECRAWL_API_KEY' ] )

urls = [ 'https://example.com', 'https://example.org' ]
options = Firecrawl::BatchScrapeOptions.build do
  formats [ :markdown ]
  only_main_content true
end

response = request.submit( urls, options )
while response.success?
  result = response.result
  result.each do | scrape_result |
    puts scrape_result.markdown
  end
  break unless result.scraping?
  sleep 1
  response = request.retrieve( result )
end
```

For complete documentation of all batch scrape options and response fields, see Batch Scrape Documentation.
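The poll loop above recurs for every asynchronous endpoint, so it can be worth factoring out. A minimal sketch; the `poll_for_results` helper is our own, not part of the gem, and `still_running` names the predicate the result exposes (`:scraping?` here, `:crawling?` for crawl, `:processing?` for extract):

```ruby
# Hypothetical helper: poll an asynchronous Firecrawl request until the
# result reports it is no longer running, returning the final response.
def poll_for_results( request, response, still_running:, interval: 1 )
  while response.success?
    result = response.result
    break unless result.public_send( still_running )
    sleep interval
    response = request.retrieve( result )
  end
  response
end

# response = request.submit( urls, options )
# response = poll_for_results( request, response, still_running: :scraping? )
```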
### Map
The map endpoint retrieves a site's URL structure without scraping content. This is useful for discovering pages before scraping:
```ruby
request = Firecrawl::MapRequest.new( api_key: ENV[ 'FIRECRAWL_API_KEY' ] )

options = Firecrawl::MapOptions.build do
  limit 100
  include_subdomains false
end

response = request.submit( 'https://example.com', options )
if response.success?
  response.result.each do | link |
    puts link.url
  end
end
```

For complete documentation of all map options and response fields, see Map Documentation.
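Map pairs naturally with batch scrape: discover a site's URLs first, then scrape only the ones you want. As a sketch, the `discovered_urls` helper and its `limit:` cap are our own, not part of the gem:

```ruby
# Hypothetical helper: map a site and return the first few discovered
# URLs, ready to hand to a BatchScrapeRequest.
def discovered_urls( map_request, url, options, limit: 10 )
  response = map_request.submit( url, options )
  return [] unless response.success?

  urls = []
  response.result.each { | link | urls << link.url }
  urls.first( limit )
end

# map_request = Firecrawl::MapRequest.new( api_key: ENV[ 'FIRECRAWL_API_KEY' ] )
# urls = discovered_urls( map_request, 'https://example.com', options )
# ...then submit urls to a BatchScrapeRequest as shown above.
```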
### Crawl
The crawl endpoint recursively scrapes an entire website. Like batch scrape, it returns results asynchronously:
```ruby
request = Firecrawl::CrawlRequest.new( api_key: ENV[ 'FIRECRAWL_API_KEY' ] )

options = Firecrawl::CrawlOptions.build do
  maximum_depth 2
  limit 50
  scrape_options do
    formats [ :markdown ]
    only_main_content true
  end
end

response = request.submit( 'https://example.com', options )
while response.success?
  result = response.result
  result.each do | scrape_result |
    puts scrape_result.metadata[ 'title' ]
  end
  break unless result.crawling?
  sleep 1
  response = request.retrieve( result )
end
```

For complete documentation of all crawl options and response fields, see Crawl Documentation.
### Extract

The extract endpoint uses an LLM to pull structured data from URLs. Provide a prompt and/or a JSON schema to define what data you want:
```ruby
request = Firecrawl::ExtractRequest.new( api_key: ENV[ 'FIRECRAWL_API_KEY' ] )

options = Firecrawl::ExtractOptions.build do
  prompt 'Extract the company name and description'
  schema( {
    type: 'object',
    properties: {
      name: { type: 'string' },
      description: { type: 'string' }
    }
  } )
end

response = request.submit( 'https://example.com', options )
while response.success?
  result = response.result
  break unless result.processing?
  sleep 2
  response = request.retrieve( result )
end

if result && result.completed?
  puts result.data
end
```

For complete documentation of all extract options and response fields, see Extract Documentation.
## Responses and Errors

All request methods return a `Faraday::Response` object. Check `response.success?` to determine whether the HTTP request succeeded. When it did, `response.result` contains the parsed result object specific to the endpoint.
```ruby
response = request.submit( url, options )
if response.success?
  result = response.result
  if result.success?
    # process result
  end
else
  error = response.result
  puts error.error_type        # :authentication_error, :rate_limit_error, etc.
  puts error.error_description # human-readable message
end
```

The gem maps HTTP status codes to error types:
| Status  | Error Type               | Description                               |
|---------|--------------------------|-------------------------------------------|
| 400     | `:invalid_request_error` | The request format or content was invalid |
| 401     | `:authentication_error`  | The API key is missing or invalid         |
| 402     | `:payment_required`      | The account requires payment              |
| 404     | `:not_found_error`       | The requested resource was not found      |
| 429     | `:rate_limit_error`      | The account has exceeded rate limits      |
| 500-505 | `:api_error`             | A server error occurred                   |
| 529     | `:overloaded_error`      | The service is temporarily overloaded     |
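Given that table, the transient failures (429, 5xx, 529) are reasonable candidates for retry with backoff. A sketch; the helper, the retryable set, and the doubling backoff schedule are our own choices, not part of the gem:

```ruby
# Error types (from the table above) that are worth retrying.
RETRYABLE_ERRORS = [ :rate_limit_error, :api_error, :overloaded_error ]

# Hypothetical helper: resubmit on transient errors, doubling the delay
# between attempts; returns the last response either way.
def submit_with_retries( request, url, options, tries: 3, base_delay: 1 )
  response = nil
  tries.times do | attempt |
    response = request.submit( url, options )
    break if response.success?
    break unless RETRYABLE_ERRORS.include?( response.result.error_type )
    sleep( base_delay * ( 2 ** attempt ) ) unless attempt == tries - 1
  end
  response
end
```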
## Connections

The gem uses Faraday for HTTP requests, which means you can customize the connection configuration. To use a custom connection:
```ruby
connection = Faraday.new do | faraday |
  faraday.request :json
  faraday.response :logger
  faraday.adapter :net_http
end

Firecrawl.connection connection
```

Or pass it directly to a request:

```ruby
request = Firecrawl::ScrapeRequest.new(
  api_key: ENV[ 'FIRECRAWL_API_KEY' ],
  connection: connection
)
```

## License
The gem is available as open source under the terms of the MIT License.