Project

scripsi

0.0
No commit activity in last 3 years
No release in over 3 years
a flexible text-searching library built on top of redis
 Dependencies

Development

>= 2.1.1
 Project Readme

Scripsi

A flexible text-searching library built on top of Redis.

Sorted suffix indexing

Sorted suffix indexing lets you search for any substring within a set of documents. First, index a collection of documents, each under its own id.

require 'scripsi'
Scripsi.connect  # connect to a running redis server

ssi = Scripsi::SortedSuffixIndexer.new "myindexer"
ssi.index('1',"Epistulam ad te scripsi.")
ssi.index('2',"I've written you a letter.")
ssi.index('3',"Quisnam Tusculo epistulam me misit?")
ssi.index('4',"Who in Tusculum would've sent me a letter?")

You can then search for any substring, and the indexer will return the ids of the documents where that substring appears.

ssi = Scripsi.indexer "myindexer"
ssi.search("te")        # => ["1","2","4"]
ssi.search("Tuscul")    # => ["3","4"]
ssi.search("Tusculu")   # => ["4"]
ssi.search("you a le")  # => ["2"]
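
To make the idea behind the name concrete, here is a minimal in-memory sketch of sorted suffix indexing in plain Ruby. It is an illustration only and does not show how Scripsi stores its data in Redis: the class TinySuffixIndex and its internals are hypothetical. The sketch stores every suffix of every document, keeps the suffixes sorted, and answers a query by binary-searching for the first suffix that is not smaller than the search term, then collecting the suffixes that start with it.

```ruby
# Illustrative only: a hypothetical in-memory stand-in,
# NOT Scripsi's Redis-backed implementation.
class TinySuffixIndex
  Entry = Struct.new(:suffix, :doc, :start)

  def initialize
    @entries = []
    @dirty = false
  end

  # Store every suffix of +text+ with its document id and offset.
  def index(id, text)
    text.length.times { |i| @entries << Entry.new(text[i..-1], id, i) }
    @dirty = true
  end

  # Sorted ids of all documents containing +term+, without duplicates.
  def search(term)
    matching_entries(term).map(&:doc).uniq.sort
  end

  private

  def matching_entries(term)
    return [] if term.empty?
    if @dirty
      @entries.sort_by!(&:suffix)  # re-sort lazily after new documents
      @dirty = false
    end
    # All suffixes that begin with +term+ sit next to each other in the
    # sorted list, starting at the first suffix >= term.
    first = @entries.bsearch_index { |e| e.suffix >= term }
    return [] if first.nil?
    @entries[first..-1].take_while { |e| e.suffix.start_with?(term) }
  end
end

idx = TinySuffixIndex.new
idx.index('1', "Epistulam ad te scripsi.")
idx.index('2', "I've written you a letter.")
idx.index('4', "Who in Tusculum would've sent me a letter?")
idx.search("te")        # => ["1", "2", "4"]
idx.search("you a le")  # => ["2"]
```

Storing every suffix costs memory quadratic in document length, which is acceptable for a sketch but is one reason a real indexer would choose a more compact representation.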

To get more information about a match, use the matches method:

match = ssi.matches("you a le").first
match.doc    # => "2"
match.start  # => 13
match.end    # => 21

ssi.documents[match.doc][match.start...match.end]  # => "you a le"
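
Because the end offset is exclusive, the start and end values can be used directly in Ruby ranges. As a sketch of one possible use, the helper below shows a match with a few characters of context on either side. The snippet method is hypothetical and not part of Scripsi; it works on any string and pair of offsets, so it runs without a Redis server.

```ruby
# Hypothetical helper, not part of Scripsi: show the text between
# +start+ and +stop+ (exclusive) in brackets, with up to +pad+
# characters of surrounding context.
def snippet(text, start, stop, pad = 5)
  from = [start - pad, 0].max           # clamp at the start of the text
  to   = [stop + pad, text.length].min  # clamp at the end of the text
  "#{text[from...start]}[#{text[start...stop]}]#{text[stop...to]}"
end

doc = "I've written you a letter."
snippet(doc, 13, 21)  # => "tten [you a le]tter."
```

With a real match you would pass ssi.documents[match.doc], match.start and match.end instead of the literal values.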

You can also retrieve the stored documents efficiently:

ssi.documents  # lazy list of documents
ssi.documents['3']  # document with id '3'