A Ruby library for parsing email signatures.
 Project Readme

EmailSignatureParser

A Ruby gem for parsing email signatures. The gem locates the signature using the sender's name (if available) or email address, then tries to extract as much information from it as it can.

Prerequisites

This library uses ruby_postal, which in turn depends on the libpostal C library, so libpostal must be installed first. Make sure you have the following build prerequisites:

On Ubuntu/Debian

sudo apt-get install curl autoconf automake libtool pkg-config

On CentOS/RHEL

sudo yum install curl autoconf automake libtool pkgconfig

On macOS

brew install curl autoconf automake libtool pkg-config

Installing libpostal

git clone https://github.com/openvenues/libpostal
cd libpostal
./bootstrap.sh
./configure --datadir=[...some dir with a few GB of space...]
make
sudo make install

# On Linux it's probably a good idea to run
sudo ldconfig
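
Once libpostal is built, it can help to confirm that Ruby can actually load it before moving on. The following is a minimal sketch, assuming the ruby_postal gem is already installed; libpostal_available? is a hypothetical helper name used for illustration and is not part of this library.

```ruby
# Hypothetical helper: returns true if ruby_postal (and the libpostal C
# library behind it) can be loaded and used, false otherwise.
def libpostal_available?
  require 'ruby_postal/parser'
  # parse_address returns a list of labelled address components
  !Postal::Parser.parse_address('125 Alhambra Circle, Coral Gables, FL 33134').empty?
rescue LoadError, StandardError
  # LoadError covers a missing gem or shared library; it is not a
  # StandardError, so both are listed explicitly.
  false
end

puts(libpostal_available? ? 'libpostal is ready' : 'libpostal could not be loaded')
```

If this prints the failure message on Linux, running sudo ldconfig and retrying is a reasonable first step.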

Installation

Add this line to your application's Gemfile:

gem 'email_signature_parser'

And then execute:

bundle install

Or install it yourself as:

gem install email_signature_parser

Usage

To extract information from an email signature, you can parse an .eml file, the HTML body of an email, or its plain text:

require 'email_signature_parser'

result = EmailSignatureParser.from_file('/path/to/email.eml')
result = EmailSignatureParser.from_html('John Doe <jdoe@email.com>', email_body_html)
result = EmailSignatureParser.from_text('John Doe <jdoe@email.com>', email_text)

Each call returns a hash with whatever could be extracted from the signature:

{
  "name": "John Doe",
  "email_address": "jdoe@testcompany.com",
  "address": "Alhambra Circle Street, 125, Coral Gables, FL, 33134 USA",
  "phones": [
    {
      "type": "Mobile",
      "phone_number": "+1 5056223073",
      "country": "US/CA"
    },
  ],
  "links": {
    "social_media": {
      "linkedin": "https://www.linkedin.com/company/testcompany/"
    },
    "other": [
    ]
  },
  "job_title": {
    "titles": ["Sales Marketing VP"],
    "acronyms": ["CEO"]
  },
  "text": "Text of the signature",
  "company_name": "TestCompany Ltd"
}
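
Because any field may be missing when the parser cannot find it, code that consumes the result should tolerate absent keys. The sketch below works on a literal copy of the sample hash above rather than calling the gem. It assumes string keys (the gem may return symbol keys instead, so check before reusing it), and contact_summary is a hypothetical helper, not part of the library.

```ruby
# Sample result copied from the hash shown above. String keys are an
# assumption; adjust if the gem returns symbol keys.
result = {
  'name' => 'John Doe',
  'email_address' => 'jdoe@testcompany.com',
  'phones' => [
    { 'type' => 'Mobile', 'phone_number' => '+1 5056223073', 'country' => 'US/CA' }
  ],
  'links' => {
    'social_media' => { 'linkedin' => 'https://www.linkedin.com/company/testcompany/' },
    'other' => []
  },
  'job_title' => { 'titles' => ['Sales Marketing VP'], 'acronyms' => ['CEO'] },
  'company_name' => 'TestCompany Ltd'
}

# Hypothetical helper: builds a one-line contact summary, skipping any
# field that was not extracted.
def contact_summary(result)
  title  = result.dig('job_title', 'titles')&.first
  mobile = (result['phones'] || []).find { |phone| phone['type'] == 'Mobile' }
  [
    result['name'],
    title,
    result['company_name'],
    mobile && mobile['phone_number']
  ].compact.join(' | ')
end

puts contact_summary(result)
```

Using dig and compact keeps the helper from raising when job_title or phones is absent from a sparse result.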

Enron Data

I've tested this library using, among other things, the Enron email dataset, which is publicly available for download. Running rake process_enron_data[input_path,output_path] will process all emails and generate JSON files (each with a copy of the original email) for all signatures found.
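
After a run, a quick way to judge extraction quality is to count how often each field was filled in. The following is a rough sketch under a stated assumption: that each generated file contains a single JSON object whose top-level keys resemble the result hash shown under Usage. The actual layout written by the rake task may differ, and field_coverage is a hypothetical helper, not part of the library.

```ruby
require 'json'

# Hypothetical helper: counts how often each top-level field is present
# and non-empty across the JSON files under output_path.
# Assumption: one JSON object per file, shaped like the Usage example.
def field_coverage(output_path)
  counts = Hash.new(0)
  Dir.glob(File.join(output_path, '**', '*.json')).each do |file|
    begin
      data = JSON.parse(File.read(file))
    rescue JSON::ParserError
      next # skip files that are not valid JSON
    end
    next unless data.is_a?(Hash)

    data.each do |key, value|
      empty = value.nil? || (value.respond_to?(:empty?) && value.empty?)
      counts[key] += 1 unless empty
    end
  end
  counts
end

# Example usage (replace the path with the output_path given to rake):
# field_coverage('/path/to/output').each { |field, n| puts "#{field}: #{n}" }
```

Fields with unexpectedly low counts are good candidates for spot-checking against the original emails.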