vecsearch

The project is in a healthy, maintained state
All-in-one simple vector search class for Ruby.
Dependencies

Runtime

  • >= 0
  • >= 0
Project Readme

Vecsearch

Vecsearch is an all-in-one semantic search library for Ruby. It computes embeddings with a 4-bit (Q4_1) quantization of gte-tiny, run through bert.cpp (a GGML implementation of BERT, called via FFI), and searches them with an in-process FAISS index.
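
Under the hood the flow is straightforward: each document is turned into a fixed-size embedding and added to a flat FAISS index, and a query is embedded the same way and matched against it. The sketch below illustrates that flow using the faiss gem directly; the embed_with_bert_cpp helper is a hypothetical stand-in for the bert.cpp FFI call (and assumes gte-tiny's 384-dimensional output), not Vecsearch's actual internals.

require 'faiss'

# Hypothetical stand-in for the bert.cpp FFI embedding call.
# Real embeddings come from the quantized gte-tiny checkpoint.
def embed_with_bert_cpp(text)
  Array.new(384) { rand } # placeholder 384-dimensional vector
end

docs  = ["hello", "behold, a non-sequitur", "how's it goin'"]
index = Faiss::IndexFlatL2.new(384)

# Index phase: embed every document and add the vectors to the flat index.
index.add(docs.map { |d| embed_with_bert_cpp(d) })

# Query phase: embed the query and take the 2 nearest neighbours.
_distances, ids = index.search([embed_with_bert_cpp("hey there")], 2)
puts ids.to_a.first.map { |i| docs[i] }.inspect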

Vecsearch bundles pre-built dynamic libraries for libbert and libggml, as well as a quantized model checkpoint for gte-tiny (total size: 14MB).

Currently only ARM64 macOS is supported, purely because I haven't bothered to build other dylibs yet. There is nothing difficult about this.

Usage

gem 'vecsearch'

Vecsearch.new('hello', 'goodbye').nearest('howdy') #=> 'hello'

require 'vecsearch'

vs = Vecsearch.new
vs << "hello"
vs << "behold, a non-sequitur"
vs << "how's it goin'"

puts(vs.query("hey there", 2).inspect)
# ["hello", "how's it goin'"]

Bugs

Haha. Yes.

Performance

All of these are haphazardly measured on my 2021 M1 MacBook Pro; a rough way to reproduce them is sketched after the list.

  • Embedding a 1-token document: 1.2ms
  • Embedding a 512-token document: 72ms
  • Adding a document to the database: negligible
  • Querying an empty database: negligible
  • Querying a database with 1000 entries: negligible (plus time to embed query)
  • Querying a database with 10000 entries: 300μs (plus time to embed query)
  • Querying a database with 100000 entries: 3.2ms (plus time to embed query)
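
If you want to sanity-check these numbers on your own machine, a rough harness like the one below (just Ruby's built-in Benchmark module and the public API shown above) is enough; the absolute figures will obviously vary with hardware.

require 'benchmark'
require 'vecsearch'

vs = Vecsearch.new
1_000.times { |i| vs << "document number #{i}" }

# Time one query against the 1000-entry index; at this size the FAISS
# lookup is negligible and embedding the query text dominates.
seconds = Benchmark.realtime { vs.query("hey there", 2) }
puts "query: #{(seconds * 1000).round(2)}ms"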

Limitations / TODO

  • Trying to embed a document over 512 tokens long segfaults.
  • I haven't got the mean-pooling part of gte-tiny working. It seems to work well enough without it, but we should do that (see the sketch after this list) and assert that our embeddings approximately match the canonical model's.
  • Batching looks unimplemented in bert.cpp; it would be nice for prefilling the index.
  • Add more builds for platforms other than darwin/arm64.
  • Probably add a way to fetch an unquantized model, maybe other models entirely?
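
For reference, mean pooling itself is just an average of the per-token vectors, skipping padding; a plain-Ruby sketch is below (the inputs are hypothetical, since bert.cpp's actual output layout may differ).

# token_embeddings: one embedding vector per input token (e.g. 384 floats each)
# attention_mask:   1 for real tokens, 0 for padding
def mean_pool(token_embeddings, attention_mask)
  dim    = token_embeddings.first.length
  summed = Array.new(dim, 0.0)
  count  = 0
  token_embeddings.zip(attention_mask) do |vec, keep|
    next if keep.zero?
    dim.times { |i| summed[i] += vec[i] }
    count += 1
  end
  summed.map { |x| x / count }
end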