Group By Match Type
A Ruby gem for identifying and grouping CSV records based on matching email or phone number columns.
Installation
Install the gem by running:
gem install group_by_match_type
Or add this line to your application's Gemfile:
gem 'group_by_match_type'
And then execute:
bundle install
Usage
group_by_match_type INPUT_FILE MATCHING_TYPE [OUTPUT_FILE]
Available matching types:
-
same_email
: Groups records with matching email addresses -
same_phone
: Groups records with matching phone numbers -
same_email_or_phone
: Groups records that share either email or phone number
Examples:
# Match by email, default output
group_by_match_type contacts.csv same_email
# Match by phone, default output
group_by_match_type contacts.csv same_phone
# Match by either email or phone, default output
group_by_match_type contacts.csv same_email_or_phone
# Specify a custom output file location
group_by_match_type contacts.csv same_email ~/Downloads/my_grouped_contacts.csv
Output
The gem creates a new CSV file with all the original columns plus a new "group_id" column at the end. Records that are considered to be the same person based on the provided matching_type will have the same group_id. If you specify an OUTPUT_FILE
, the grouped CSV will be written to that location; otherwise, it will be written as *_grouped.csv
next to your input file.
Development
After checking out the repo, run bin/setup
to install dependencies. Then, run rake spec
to run the tests. You can also run bin/console
for an interactive prompt that will allow you to experiment.
Contributing
Bug reports and pull requests are welcome on GitHub. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the code of conduct.
License
The gem is available as open source under the terms of the MIT License.