EmailData

This project is a compilation of datasets related to emails.

  • Disposable emails
  • Disposable domains
  • Free email services

The data is compiled from several different sources, so thank you all for making this data available.

Installation

Ruby

Add this line to your application's Gemfile:

gem "email_data"

And then execute:

$ bundle install

Or install it yourself as:

$ gem install email_data

Usage

require "email_data"

# <Pathname /> instance pointing to the data directory.
EmailData.data_dir

# List of disposable domains. Punycode is expanded into ASCII domains.
EmailData.disposable_domains

# List of disposable emails. Some services use free email like Gmail to create
# disposable emails.
EmailData.disposable_emails

# List of free email services.
EmailData.free_email_domains
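
As a minimal sketch of how these lists might be used, the helper below classifies an address by its domain. The method name `classify_email` and the sample sets are hypothetical and not part of the gem. In an application you would pass `EmailData.disposable_domains` and `EmailData.free_email_domains` instead of the stand-in data; the sketch assumes both return collections of lowercase domain strings that respond to `include?`.

```ruby
require "set"

# Hypothetical helper (not part of email_data): classifies an address by
# looking up its domain in the given lists.
def classify_email(email, disposable_domains:, free_email_domains:)
  # Everything after the first "@" is treated as the domain.
  domain = email.to_s.strip.downcase.split("@", 2)[1]
  return :invalid if domain.nil? || domain.empty?
  return :disposable if disposable_domains.include?(domain)
  return :free if free_email_domains.include?(domain)

  :other
end

# Stand-in data so the sketch runs without the gem installed.
DISPOSABLE = Set["disposable.example"]
FREE = Set["freemail.example"]

puts classify_email(
  "Someone@Disposable.Example",
  disposable_domains: DISPOSABLE,
  free_email_domains: FREE
)
# prints "disposable"
```

Wrapping the lists in a `Set` keeps each lookup cheap when many addresses are checked; whether that matters depends on how large the lists are in your setup.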

Data sources

By default, the Ruby gem loads data from the filesystem. You may want to load this data from the database instead. email_data supports ActiveRecord out of the box. To use the ActiveRecord adapter, you must load email_data/source/active_record.rb. You can do that with Bundler's require option.

gem "email_data", require: "email_data/source/active_record"

Then, you need to assign the new data source.

EmailData.source = EmailData::Source::ActiveRecord

If you need to configure a different database connection than the one defined by ActiveRecord::Base, use EmailData::Source::ActiveRecord::ApplicationRecord for that.

Creating the tables

To create the tables, use the migration code below (adjust it if you use a database other than PostgreSQL or don't want to use citext).

class SetupEmailData < ActiveRecord::Migration[6.1]
  def change
    enable_extension "citext"

    create_table :tlds do |t|
      t.citext :name, null: false
    end

    add_index :tlds, :name, unique: true

    create_table :country_tlds do |t|
      t.citext :name, null: false
    end

    add_index :country_tlds, :name, unique: true

    create_table :disposable_emails do |t|
      t.citext :name, null: false
    end

    add_index :disposable_emails, :name, unique: true

    create_table :disposable_domains do |t|
      t.citext :name, null: false
    end

    add_index :disposable_domains, :name, unique: true

    create_table :free_email_domains do |t|
      t.citext :name, null: false
    end

    add_index :free_email_domains, :name, unique: true
  end
end

Loading the data

With PostgreSQL, you load the data using the COPY command. First, you'll need to discover where your gems are being installed. Use gem list for that.

$ gem list email_data -d

*** LOCAL GEMS ***

email_data (1601479967, 1601260789)
    Author: Nando Vieira
    Homepage: https://github.com/fnando/email_data
    License: MIT
    Installed at (1601479967): /usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0
    This project is a compilation of datasets related to emails.
    Includes disposable emails, disposable domains, and free email
    services.

Then you can load each dataset using COPY:

COPY tlds (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/tlds.txt';
COPY country_tlds (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/country_tlds.txt';
COPY disposable_emails (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/disposable_emails.txt';
COPY disposable_domains (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/disposable_domains.txt';
COPY free_email_domains (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/free_email_domains.txt';
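
Since the five statements differ only in the table name, they can also be generated. The sketch below is one way to do that; the helper name `copy_statements` and the placeholder path are hypothetical, and the table list is taken from the statements above. It only builds SQL strings, which you would then run yourself (for example in psql).

```ruby
require "pathname"

# Table names taken from the COPY statements above.
TABLES = %w[
  tlds
  country_tlds
  disposable_emails
  disposable_domains
  free_email_domains
].freeze

# Hypothetical helper: builds one COPY statement per table, assuming each
# table is loaded from "<data_dir>/<table>.txt".
def copy_statements(data_dir)
  dir = Pathname(data_dir)

  TABLES.map do |table|
    "COPY #{table} (name) FROM '#{dir.join("#{table}.txt")}';"
  end
end

# Placeholder path; substitute the directory reported by `gem list`.
puts copy_statements("/path/to/gems/email_data-VERSION/data")
```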

Alternatively, you could create a migration that executes the same command. Because the migration runs Ruby code, you can replace the steps to find the gem path with EmailData.data_dir.

class LoadEmailData < ActiveRecord::Migration[6.1]
  # Use `up` instead of `change`: a raw COPY cannot be reversed automatically.
  def up
    copy = lambda do |table_name|
      connection = ActiveRecord::Base.connection
      # Build the full file name as a string before joining it to the
      # data directory.
      file_path = EmailData.data_dir.join("#{table_name}.txt")

      connection.execute <<~PG
        COPY #{table_name} (name)
        FROM '#{file_path}'
        (FORMAT CSV)
      PG
    end

    copy.call(:tlds)
    copy.call(:country_tlds)
    copy.call(:disposable_emails)
    copy.call(:disposable_domains)
    copy.call(:free_email_domains)
  end
end

Node.js

$ yarn add @fnando/email_data

or

$ npm install @fnando/email_data

Usage

const disposableEmails = require("@fnando/email_data/data/json/disposable_emails.json");
const disposableDomains = require("@fnando/email_data/data/json/disposable_domains.json");
const freeEmailDomains = require("@fnando/email_data/data/json/free_email_domains.json");

Dataset

The dataset is updated automatically. If you have any manual entries you would like to add, please make a pull request against the files data/manual/*.txt.

  • data/manual/disposable_domains.txt: only domains from disposable servers must go here.
  • data/manual/disposable_emails.txt: only normalized email addresses that use free email services must go here. E.g. d.i.s.p.o.s.a.b.l.e+1234@gmail.com must be added as disposable@gmail.com.
  • data/manual/free_email_domains.txt: only free email services must go here. These are services that allow anyone to create an email account, even if it's just a trial without credit cards.
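
The normalization described for disposable_emails.txt can be sketched as follows. The helper `normalize_email` is hypothetical and not part of the gem. It applies the Gmail-style rule from the example above (drop dots and any "+tag" suffix from the local part) to every address, which is an assumption: other providers may treat dots as significant, so check the provider's rules before relying on it.

```ruby
# Hypothetical helper (not part of email_data): normalizes an address the way
# the example above describes, by lowercasing it and removing dots and any
# "+tag" suffix from the local part. Returns nil for malformed input.
def normalize_email(email)
  local, domain = email.to_s.strip.downcase.split("@", 2)
  return nil if local.nil? || domain.nil? || domain.empty?

  # Keep only the part before the first "+", then drop the dots.
  local = local.split("+", 2).first.to_s.delete(".")
  return nil if local.empty?

  "#{local}@#{domain}"
end

puts normalize_email("d.i.s.p.o.s.a.b.l.e+1234@gmail.com")
# prints "disposable@gmail.com"
```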

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake test to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/fnando/email_data. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the code of conduct.

License

The gem is available as open source under the terms of the MIT License.

Code of Conduct

Everyone interacting in the EmailData project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.