Project

quesadilla

Repository is archived
No commit activity in last 3 years
No release in over 3 years
Entity-style text parsing
 Project Readme

Quesadilla


Entity-style text parsing. Quesadilla was extracted from Cheddar.

See the Cheddar text guide for more information about how to type entities.

Quesadilla's API is fully documented. Read the online documentation.

Installation

Add this line to your application's Gemfile:

gem 'quesadilla'

And then execute:

$ bundle

Or install it yourself as:

$ gem install quesadilla

Usage

To extract entities from text, call extract:

Quesadilla.extract('Some #awesome text')
# => {
#   display_text: "Some #awesome text",
#   display_html: "Some <a href=\"#hashtag-awesome\" class=\"tag\">#awesome</a> text",
#   entities: [
#     {
#       type: "hashtag",
#       text: "#awesome",
#       display_text: "#awesome",
#       indices: [5, 13],
#       hashtag: "awesome",
#       display_indices: [5, 13]
#     }
#   ]
# }
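
The indices in the result can be used to slice an entity back out of the text. The sketch below is only an illustration: it copies the relevant part of the sample result above into a literal hash instead of calling the gem, and the names SAMPLE_EXTRACTION and entity_substrings are hypothetical. It also assumes the second index is exclusive, which is inferred from the sample output ([5, 13] covers the 8 characters of "#awesome") rather than from Quesadilla's documentation.

```ruby
# Copied from the sample output above; not produced by calling the gem here.
SAMPLE_EXTRACTION = {
  display_text: 'Some #awesome text',
  entities: [
    {
      type: 'hashtag',
      text: '#awesome',
      display_text: '#awesome',
      indices: [5, 13],
      hashtag: 'awesome',
      display_indices: [5, 13]
    }
  ]
}.freeze

# Hypothetical helper: returns the substring of display_text covered by each
# entity, treating display_indices as [start, exclusive_end].
def entity_substrings(extraction)
  extraction[:entities].map do |entity|
    start, finish = entity[:display_indices]
    extraction[:display_text][start...finish]
  end
end

entity_substrings(SAMPLE_EXTRACTION) # => ["#awesome"]
```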

Configuring

Quesadilla supports extracting various span-level Markdown features as well as automatically detecting links and GitHub-style named emoji. Here is the list of options you can pass when extracting:

Option | Description | Default
:markdown | All Markdown parsing | true
:markdown_code | Markdown code tags | true
:markdown_links | Markdown links (including <http://soff.es> style links) | true
:markdown_triple_emphasis | Markdown bold italic | true
:markdown_double_emphasis | Markdown bold | true
:markdown_emphasis | Markdown italic | true
:markdown_strikethrough | Markdown Extra strikethrough | true
:hashtags | Hashtags | true
:hashtags_validator | Callable object to validate hashtags | nil
:autolinks | Automatically detect links | true
:emoji | GitHub-style named emoji | true
:users | User mentions | false
:user_validator | Callable object to validate usernames | nil
:html | Generate HTML representations for entities and the entire string | true

Everything is enabled by default except user mentions. If you don't want to extract Markdown, call the extractor like this:

Quesadilla.extract('Some text', markdown: false)

You can also just disable strikethrough and still extract the rest of the Markdown entities if you want:

Quesadilla.extract('Some text', markdown_strikethrough: false)
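
Several options can be combined in one hash. The following is a minimal sketch, assuming the option names listed in the table above behave as described there; the helper names hashtags_only_options and extract_hashtags_only are hypothetical and not part of Quesadilla.

```ruby
# Hypothetical helper: builds an options hash that keeps hashtag extraction
# and switches off the other entity types listed in the options table.
# A fresh hash is returned on every call so callers can modify it freely.
def hashtags_only_options
  {
    markdown: false,   # disables all Markdown parsing
    autolinks: false,  # no automatic link detection
    emoji: false,      # no GitHub-style named emoji
    users: false,      # already the default, stated here for clarity
    hashtags: true
  }
end

# Hypothetical wrapper; calling it requires the quesadilla gem to be loaded.
def extract_hashtags_only(text)
  Quesadilla.extract(text, hashtags_only_options)
end
```

With the gem loaded, extract_hashtags_only('Some #awesome text') would be expected to report only the hashtag entity.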

Customizing HTML

If you want to change the generated HTML, you can create a custom renderer:

class CustomRenderer < Quesadilla::HTMLRenderer
  def hashtag(display_text, hashtag)
    %Q{<a href="http://example.com/tags/#{hashtag}" class="tag">#{display_text}</a>}
  end
end

extraction = Quesadilla.extract('Some #awesome text', html_renderer: CustomRenderer)
extraction[:display_html] #=> 'Some <a href="http://example.com/tags/awesome" class="tag">#awesome</a> text'

Take a look at Quesadilla::HTMLRenderer for more details on creating a custom renderer.
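
If hashtags may contain non-ASCII characters, it can be worth URL-encoding the tag before placing it in the href. Below is a hedged sketch of one way to do that; the module name EncodedHashtagLinks is hypothetical, and the method signature is taken from the CustomRenderer example above. The display text is passed through unchanged because this README does not say whether Quesadilla escapes it beforehand.

```ruby
require 'uri'

# Hypothetical mixin: overrides the hashtag hook shown above so the tag is
# URL-encoded inside the link target.
module EncodedHashtagLinks
  def hashtag(display_text, hashtag)
    encoded = URI.encode_www_form_component(hashtag)
    %(<a href="http://example.com/tags/#{encoded}" class="tag">#{display_text}</a>)
  end
end

# Usage with Quesadilla loaded (not executed here):
#
#   class EncodedRenderer < Quesadilla::HTMLRenderer
#     include EncodedHashtagLinks
#   end
#
#   Quesadilla.extract('Some #awesome text', html_renderer: EncodedRenderer)
```

Keeping the method in a module means it can be exercised on its own without loading the gem.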

Users

To enable user mention extraction, pass users: true as an option. You can optionally pass a callable object to validate a username. Here's a quick example:

validator = lambda do |username|
  User.where('LOWER(username) = ?', username.downcase).first.try(:id)
end

extraction = Quesadilla.extract('Real @soffes and fake @nobody', users: true, user_validator: validator)

Assuming there is a user named soffes in your database, this extracts @soffes. Assuming there isn't a user named nobody, @nobody remains plain text. The validator can use whatever lookup you like; Quesadilla makes no assumptions about your user system.
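
The validator does not have to touch a database. As a minimal sketch, the lambda below checks usernames against an in-memory hash; KNOWN_USERS and IN_MEMORY_VALIDATOR are hypothetical names, and returning the user's id (or nil for unknown users) simply mirrors the database example above.

```ruby
# Hypothetical in-memory user table: lowercase username => id.
KNOWN_USERS = { 'soffes' => 1 }.freeze

# Returns the user's id when the username is known, nil otherwise.
# The lookup is case-insensitive, like the LOWER(username) query above.
IN_MEMORY_VALIDATOR = lambda do |username|
  KNOWN_USERS[username.downcase]
end

IN_MEMORY_VALIDATOR.call('Soffes') # => 1
IN_MEMORY_VALIDATOR.call('nobody') # => nil

# With the gem loaded (not executed here):
#   Quesadilla.extract('Real @soffes and fake @nobody',
#                      users: true, user_validator: IN_MEMORY_VALIDATOR)
```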

Note

User and hashtag detection use the twitter-text gem. This imposes some limits you may not expect: for example, usernames can't be longer than 20 characters and hashtags can't contain certain characters.

Supported Ruby Versions

Quesadilla is tested under Ruby 1.9.2, 1.9.3, and 2.0.0, JRuby 1.7.2 (1.9 mode), and Rubinius 2.0.0 (1.9 mode).

Contributing

See the contributing guide.