Project

epitome

0.01
No commit activity in last 3 years
No release in over 3 years
An implementation of the LexRank algorithm, which summarizes a corpus of text documents.
 Dependencies

Development

~> 1.9
~> 10.0

Runtime

 Project Readme

Epitome

A small gem to make your text shorter. It's an implementation of the LexRank algorithm. LexRank is designed to be used on a collection of texts, but it works the same way on a single text.

Installation

Add this line to your application's Gemfile:

gem 'epitome'

And then execute:

$ bundle

Or install it yourself as:

$ gem install epitome

Usage

First, create some documents:

document_one = Epitome::Document.new("The cat likes catnip. He rolls and rolls")
document_two = Epitome::Document.new("The cat plays in front of the dog. The dog is placid.")

Then, organize your documents into a corpus:

document_collection = [document_one, document_two]
@corpus = Epitome::Corpus.new(document_collection)

Finally, output the summary:

@corpus.summary(3)

This returns a nice, short text.

Options

Summary options

You can pass options to set the length of the expected summary and the similarity threshold:

@corpus.summary(5, 0.2)

The length is the number of sentences in the final output.

The threshold is a value between 0.1 and 0.3; 0.2 is considered to give the best results and is therefore the default.
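
To make the role of the threshold concrete, here is a minimal sketch of thresholded LexRank in plain Ruby. It is not the gem's internal code: the helper names (tokenize, cosine_similarity, lexrank_scores, summarize), the raw term-count similarity, the damping factor of 0.85 and the fixed iteration count are all assumptions made for illustration. Two sentences are linked when their cosine similarity exceeds the threshold, and a PageRank-style iteration then scores each sentence by how central it is in that graph.

```ruby
# Illustrative sketch only; all names here are hypothetical, not Epitome's API.

# Split a sentence into lowercase word tokens.
def tokenize(sentence)
  sentence.downcase.scan(/[a-z']+/)
end

# Cosine similarity between two sentences, using raw term counts.
def cosine_similarity(a, b)
  counts_a = tokenize(a).tally
  counts_b = tokenize(b).tally
  dot = counts_a.sum { |word, n| n * counts_b.fetch(word, 0) }
  norm_a = Math.sqrt(counts_a.values.sum { |n| n * n })
  norm_b = Math.sqrt(counts_b.values.sum { |n| n * n })
  return 0.0 if norm_a.zero? || norm_b.zero?
  dot / (norm_a * norm_b)
end

# Score sentences by centrality in the thresholded similarity graph.
def lexrank_scores(sentences, threshold = 0.2, damping = 0.85, iterations = 50)
  n = sentences.size
  return [] if n.zero?
  # Link two sentences when their similarity exceeds the threshold.
  adjacency = Array.new(n) do |i|
    Array.new(n) do |j|
      cosine_similarity(sentences[i], sentences[j]) > threshold ? 1.0 : 0.0
    end
  end
  degrees = adjacency.map(&:sum)
  scores = Array.new(n, 1.0 / n)
  # Power iteration: each sentence shares its score among its neighbours.
  iterations.times do
    scores = Array.new(n) do |i|
      incoming = (0...n).sum do |j|
        degrees[j].zero? ? 0.0 : adjacency[j][i] * scores[j] / degrees[j]
      end
      (1 - damping) / n + damping * incoming
    end
  end
  scores
end

# Return the `length` highest-scoring sentences in their original order.
def summarize(sentences, length = 3, threshold = 0.2)
  scores = lexrank_scores(sentences, threshold)
  top = sentences.each_index.max_by(length) { |i| scores[i] }
  top.sort.map { |i| sentences[i] }
end
```

A lower threshold links more sentence pairs and makes the scores more uniform; a higher one keeps only strong links, so fewer sentences stand out as central.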

Stopword option

When creating the corpus, you can set the language of the stopword list to be used:

@corpus = Epitome::Corpus.new(document_collection, "fr")

The default value is English ("en"). You can find more about the stopword filter here.
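
As a rough illustration of what a stopword filter does before sentences are compared, here is a minimal sketch in plain Ruby. The STOPWORDS table and the remove_stopwords helper are hypothetical and deliberately tiny; the filter the gem relies on ships its own, much larger lists.

```ruby
# Illustrative sketch only; not the stopword filter used by the gem.

# Tiny example lists keyed by language code.
STOPWORDS = {
  "en" => %w[the a an is in of and he],
  "fr" => %w[le la les un une est et de]
}.freeze

# Return the words of `text` with stopwords for `lang` removed.
def remove_stopwords(text, lang = "en")
  stop = STOPWORDS.fetch(lang) { raise ArgumentError, "unknown language: #{lang}" }
  text.downcase.scan(/\p{L}+/).reject { |word| stop.include?(word) }
end
```

Removing very common words keeps them from inflating the similarity between sentences that share little actual content.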

Contributing

  1. Fork it ( https://github.com/[my-github-username]/hemingway/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request