Find dead-links (broken links)
A dead link (broken link) is a link on a web page whose target cannot be reached. Dead links can hurt both SEO and security. This tool makes them easy to find and fix.
Installation
Install with Gem
CLI
gem install deadfinder
# https://rubygems.org/gems/deadfinder
Gemfile
gem 'deadfinder'
# and `bundle install`
Install with Homebrew
brew install deadfinder
# https://formulae.brew.sh/formula/deadfinder
Docker Image
docker pull ghcr.io/hahwul/deadfinder:latest
Using In
CLI
deadfinder sitemap https://www.hahwul.com/sitemap.xml
GitHub Action
steps:
  - name: Run DeadFinder
    uses: hahwul/deadfinder@1.8.0
    # or uses: hahwul/deadfinder@latest
    id: broken-link
    with:
      command: sitemap # url / file / sitemap
      target: https://www.hahwul.com/sitemap.xml
      # timeout: 10
      # concurrency: 50
      # silent: false
      # headers: "X-API-Key: 123444"
      # worker_headers: "User-Agent: Deadfinder Bot"
      # include30x: false
      # user_agent: "Apple"
      # proxy: "http://localhost:8070"
      # proxy_auth: "id:pw"
      # match:
      # ignore:
  - name: Output Handling
    run: echo '${{ steps.broken-link.outputs.output }}'
To open an issue automatically whenever a dead link is found, see the "Automating Dead Link Detection" article.
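As a rough illustration of what such automation might do with the result, the sketch below turns a DeadFinder result into a Markdown checklist suitable for an issue body. It assumes the action's `output` value is the same JSON mapping of target URL to dead links that the CLI writes; `issue_body` is a hypothetical helper, not part of DeadFinder, and the URLs are placeholders.

```ruby
require 'json'

# Build a Markdown issue body from a DeadFinder JSON result.
# Assumption: json_text has the shape { "target URL" => ["dead link", ...] }.
# `issue_body` is a hypothetical helper, not part of DeadFinder.
def issue_body(json_text)
  results = JSON.parse(json_text)
  return 'No dead links found.' if results.values.all?(&:empty?)

  # One heading per scanned page, one unchecked task per dead link.
  results.reject { |_target, links| links.empty? }.map do |target, links|
    items = links.map { |link| "- [ ] #{link}" }.join("\n")
    "### #{target}\n#{items}"
  end.join("\n\n")
end

# Placeholder data for demonstration only.
puts issue_body('{"https://example.com/": ["https://example.com/missing"]}')
```

The returned string could then be handed to whatever issue-creation step the workflow uses.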
Ruby Code
require 'deadfinder'
runner = DeadFinder::Runner.new
options = runner.default_options
options['concurrency'] = 30
DeadFinder.run_url('https://www.hahwul.com/cullinan/csrf/', options)
puts DeadFinder.output
# {"https://www.hahwul.com/cullinan/csrf/" => ["https://www.hahwul.com/tag/cullinan/"]}
For various examples and detailed usage, including sitemap, file, and other modes, please refer to the rubydoc and examples directory in the repository.
Usage
Commands:
deadfinder completion <SHELL> # Generate completion script for shell.
deadfinder file <FILE> # Scan the URLs from File. (e.g., deadfinder file urls.txt)
deadfinder help [COMMAND] # Describe available commands or one specific command
deadfinder pipe # Scan the URLs from STDIN. (e.g., cat urls.txt | deadfinder pipe)
deadfinder sitemap <SITEMAP-URL> # Scan the URLs from sitemap.
deadfinder url <URL> # Scan the Single URL.
deadfinder version # Show version.
Options:
-r, [--include30x], [--no-include30x], [--skip-include30x] # Include 30x redirections
# Default: false
-c, [--concurrency=N] # Number of concurrency
# Default: 50
-t, [--timeout=N] # Timeout in seconds
# Default: 10
-o, [--output=OUTPUT] # File to write result (e.g., json, yaml, csv)
-f, [--output-format=OUTPUT_FORMAT] # Output format
# Default: json
-H, [--headers=one two three] # Custom HTTP headers to send with initial request
[--worker-headers=one two three] # Custom HTTP headers to send with worker requests
[--user-agent=USER_AGENT] # User-Agent string to use for requests
# Default: Mozilla/5.0 (compatible; DeadFinder/1.8.0;)
-p, [--proxy=PROXY] # Proxy server to use for requests
[--proxy-auth=PROXY_AUTH] # Proxy server authentication credentials
-m, [--match=MATCH] # Match the URL with the given pattern
-i, [--ignore=IGNORE] # Ignore the URL with the given pattern
-s, [--silent], [--no-silent], [--skip-silent] # Silent mode
# Default: false
-v, [--verbose], [--no-verbose], [--skip-verbose] # Verbose mode
# Default: false
[--debug], [--no-debug], [--skip-debug] # Debug mode
# Default: false
[--limit=N] # Limit the number of URLs to scan
# Default: 0
[--coverage], [--no-coverage], [--skip-coverage] # Enable coverage tracking and reporting
# Default: false
Modes
# Scan the URLs from STDIN (multiple URLs)
cat urls.txt | deadfinder pipe
# Scan the URLs from File. (multiple URLs)
deadfinder file urls.txt
# Scan the Single URL.
deadfinder url https://www.hahwul.com
# Scan the URLs from sitemap. (multiple URLs)
deadfinder sitemap https://www.hahwul.com/sitemap.xml
JSON Handling
deadfinder sitemap https://www.hahwul.com/sitemap.xml \
-o output.json
cat output.json | jq
{
  "Target URL": [
    "DeadLink URL",
    "DeadLink URL",
    "DeadLink URL"
  ]
}
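Because the result maps each scanned page to its dead links, the same dead URL can appear under several pages. The following sketch, which is not part of DeadFinder, inverts that mapping so each dead link lists the pages that reference it. The helper name `pages_by_dead_link` and the sample URLs are illustrative assumptions.

```ruby
require 'json'

# Invert a DeadFinder result of the form { target => [dead links] }
# into { dead link => [targets that reference it] }.
# `pages_by_dead_link` is a hypothetical helper, not part of DeadFinder.
def pages_by_dead_link(json_text)
  inverted = Hash.new { |hash, key| hash[key] = [] }
  JSON.parse(json_text).each do |target, dead_links|
    dead_links.each { |dead_link| inverted[dead_link] << target }
  end
  inverted
end

# Placeholder data in the same shape as output.json.
sample = <<~JSON
  {
    "https://example.com/a": ["https://example.com/gone"],
    "https://example.com/b": ["https://example.com/gone"]
  }
JSON

pages_by_dead_link(sample).each do |dead_link, pages|
  puts "#{dead_link} is linked from #{pages.join(', ')}"
end
```

To run it against a real scan, pass `File.read('output.json')` instead of the sample string.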
With the --coverage flag:
deadfinder sitemap https://www.hahwul.com/sitemap.xml --coverage -o output.json
{
  "dead_links": {
    "Target URL": [
      "DeadLink URL 1",
      "DeadLink URL 2",
      "DeadLink URL 3",
      "DeadLink URL 4",
      "DeadLink URL 5",
      "DeadLink URL 6",
      "DeadLink URL 7"
    ]
  },
  "coverage": {
    "targets": {
      "Target URL": {
        "total_tested": 14,
        "dead_links": 7,
        "coverage_percentage": 50.0
      }
    },
    "summary": {
      "total_tested": 14,
      "total_dead": 7,
      "overall_coverage_percentage": 50.0
    }
  }
}
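A coverage report in this shape can be summarized with a few lines of Ruby. The sketch below relies only on the key names shown in the sample above; the helper names `dead_links_found?` and `coverage_lines` and the sample values are assumptions made for illustration.

```ruby
require 'json'

# True when the report's summary counts at least one dead link.
# Key names are taken from the sample coverage report.
def dead_links_found?(report)
  report.fetch('coverage').fetch('summary').fetch('total_dead').positive?
end

# One human-readable line per target from the "coverage.targets" section.
def coverage_lines(report)
  report.fetch('coverage').fetch('targets').map do |target, stats|
    "#{target}: #{stats.fetch('dead_links')} dead of #{stats.fetch('total_tested')} tested"
  end
end

# Placeholder report for demonstration only.
sample = JSON.parse(<<~JSON)
  {
    "dead_links": { "https://example.com/": ["https://example.com/missing"] },
    "coverage": {
      "targets": {
        "https://example.com/": { "total_tested": 2, "dead_links": 1, "coverage_percentage": 50.0 }
      },
      "summary": { "total_tested": 2, "total_dead": 1, "overall_coverage_percentage": 50.0 }
    }
  }
JSON

puts coverage_lines(sample)
warn 'Dead links found' if dead_links_found?(sample)
```

In a CI script, the `dead_links_found?` check could be followed by `exit 1` so the job fails when any dead link is reported.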
SBOM (Software Bill of Materials)
DeadFinder generates an SBOM automatically in CycloneDX format. When a release is published, a bom.xml file is generated and attached as a release asset.
The SBOM includes:
- All runtime and development dependencies
- Component versions and licenses
- Package URLs (purl) for traceability
- SHA-256 hashes for integrity verification
Manual SBOM Generation
You can manually generate an SBOM for development purposes:
# Install dependencies
bundle install
# Generate SBOM
bundle exec cyclonedx-ruby -p .
# SBOM will be created as bom.xml
The generated SBOM follows the CycloneDX 1.1 specification and can be used with various security scanning and compliance tools.
Contributions Welcome!
We welcome contributions from everyone! If you have an idea for an improvement or want to report a bug:
- Fork the repository.
- Create a new branch for your feature or bug fix (e.g., feature/awesome-feature or bugfix/annoying-bug).
- Make your changes.
- Commit your changes with a clear commit message.
- Push to the branch.
- Submit a Pull Request (PR) to our main branch.
We'll review your PR as soon as possible. Thank you for contributing to our project!