referer-parser
referer-parser is a database for extracting marketing attribution data (such as search terms) from referer URLs, inspired by the ua-parser project (an equivalent library for user agent parsing).
The referer-parser project also contains multiple libraries for working with the referer-parser database in different languages.
referer-parser is a core component of Snowplow, the open-source web-scale analytics platform powered by Hadoop and Redshift.
Note that we always use the original HTTP misspelling of 'referer' (and thus 'referal') in this project - never 'referrer'.
Database
The database is available in YAML and JSON format.
The latest database is always available on this URL:
https://s3-eu-west-1.amazonaws.com/snowplow-hosted-assets/third-party/referer-parser/referers-latest.yaml https://s3-eu-west-1.amazonaws.com/snowplow-hosted-assets/third-party/referer-parser/referers-latest.json
The database is updated every day. Each new version of the database is also uploaded with a timestamp:
https://s3-eu-west-1.amazonaws.com/snowplow-hosted-assets/third-party/referer-parser/referers-YYYYMMDD.yaml https://s3-eu-west-1.amazonaws.com/snowplow-hosted-assets/third-party/referer-parser/referers-YYYYMMDD.json
Example:
https://s3-eu-west-1.amazonaws.com/snowplow-hosted-assets/third-party/referer-parser/referers-20200331.yaml https://s3-eu-west-1.amazonaws.com/snowplow-hosted-assets/third-party/referer-parser/referers-20200331.json
Language-specific repositories
- Scala: https://github.com/snowplow-referer-parser/scala-referer-parser
- Java: https://github.com/snowplow-referer-parser/java-referer-parser
- Ruby: https://github.com/snowplow-referer-parser/ruby-referer-parser
- Python: https://github.com/snowplow-referer-parser/python-referer-parser
- NodeJS: https://github.com/snowplow-referer-parser/nodejs-referer-parser
- .NET: https://github.com/snowplow-referer-parser/dotnet-referer-parser
- PHP: https://github.com/snowplow-referer-parser/php-referer-parser
- Golang: https://github.com/snowplow-referer-parser/golang-referer-parser
- Erlang: https://github.com/silviucpp/refererparser
referers.yml
referer-parser identifies whether a URL is a known referer or not by checking it against the referers.yml
file; the intention is that this YAML file is reusable as-is by every language-specific implementation of referer-parser.
The file is broken out into sections for the different mediums that we support:
-
unknown
for when we know the source, but not the medium -
email
for webmail providers -
social
for social media services -
search
for search engines
Then within each section, we list each known provider (aka source
) by name, and then which domains each provider uses. For search engines, we also list the parameters used in the search engine URL to identify the search term
. For example:
Google: # Name of search engine referer
parameters:
- 'q' # First parameter used by Google
- 'p' # Alternative parameter used by Google
domains:
- google.co.uk # One domain used by Google
- google.com # Another domain used by Google
- ...
The number of referers and the domains they use is constantly growing - we need to keep referers.yml
up-to-date, and hope that the community will help!
Contributing
We welcome contributions to referer-parser:
-
New search engines and other referers - if you notice a search engine, social network or other site missing from
referers.yml
, please fork the repo, add the missing entry and submit a pull request - Ports of referer-parser to other languages - we welcome ports of referer-parser to new programming languages (e.g. Lua, Go, Haskell, C)
- Bug fixes, feature requests etc - much appreciated!
Please sign the Snowplow CLA before making pull requests.
Support
General support for referer-parser
is handled by Snowplow Analytics team on discourse.
Copyright and license
referers.yml
is based on Piwik's SearchEngines.php
and Socials.php
, copyright 2012 Matthieu Aubry and available under the GNU General Public License v3.