llama_cpp.rb


llama_cpp.rb provides Ruby bindings for llama.cpp.

This gem is still under development and may undergo many changes in the future.

Installation

Install the gem and add to the application's Gemfile by executing:

$ bundle add llama_cpp

If bundler is not being used to manage dependencies, install the gem by executing:

$ gem install llama_cpp

There are several installation options:

# use OpenBLAS
$ gem install llama_cpp -- --with-openblas

# use cuBLAS
$ gem install llama_cpp -- --with-cublas

These options are defined in extconf.rb with the with_config method.
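As a rough sketch, such flags are typically read in an extconf.rb via the with_config helper from Ruby's standard mkmf library. The flag handling below is illustrative only, not the gem's actual build script:

```ruby
# Illustrative sketch of branching on --with-* flags in an extconf.rb
# using Ruby's standard mkmf helpers. llama_cpp.rb's real extconf.rb
# may differ in its exact flags and compiler options.
require 'mkmf'

if with_config('openblas')
  # `gem install llama_cpp -- --with-openblas` makes this branch run.
  $CFLAGS << ' -DGGML_USE_OPENBLAS'
  $libs = append_library($libs, 'openblas')
end

if with_config('cublas')
  # `gem install llama_cpp -- --with-cublas` makes this branch run.
  $CFLAGS << ' -DGGML_USE_CUBLAS'
  $libs = append_library($libs, 'cublas')
end
```

Without any `--with-*` arguments, with_config simply returns its default and the extra libraries are not linked.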

Usage

Prepare a quantized model by referring to the usage section of the llama.cpp README. For example, preparing a quantized model based on open_llama_7b is as follows:

$ cd ~/
$ brew install git-lfs
$ git lfs install
$ git clone https://github.com/ggerganov/llama.cpp.git
$ cd llama.cpp
$ python3 -m pip install -r requirements.txt
$ cd models
$ git clone https://huggingface.co/openlm-research/open_llama_7b
$ cd ../
$ python3 convert.py models/open_llama_7b
$ make
$ ./quantize ./models/open_llama_7b/ggml-model-f16.gguf ./models/open_llama_7b/ggml-model-q4_0.bin q4_0
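The final quantize step packs the fp16 weights into 4-bit blocks. As a toy illustration of the general idea behind block-wise 4-bit quantization (this is a simplified sketch, not llama.cpp's actual q4_0 format):

```ruby
# Toy block-wise 4-bit quantization: each block of weights is reduced
# to one float scale plus small integers in the range -8..7.
# Simplified illustration only, NOT llama.cpp's actual q4_0 layout.
def quantize_block(weights)
  scale = weights.map(&:abs).max / 7.0
  return [1.0, weights.map { 0 }] if scale.zero?
  quants = weights.map { |w| (w / scale).round.clamp(-8, 7) }
  [scale, quants]
end

def dequantize_block(scale, quants)
  quants.map { |q| q * scale }
end

block = [0.12, -0.5, 0.33, 0.9]
scale, quants = quantize_block(block)
puts dequantize_block(scale, quants).inspect  # approximates the original block
```

Because each weight is stored in 4 bits plus one shared scale per block, the quantized file is roughly a quarter the size of the fp16 model, at the cost of a small rounding error per weight.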

An example of Ruby code that generates text with the quantized model is as follows:

require 'llama_cpp'

# Load the quantized model created in the previous step.
model_params = LLaMACpp::ModelParams.new
model = LLaMACpp::Model.new(model_path: '/home/user/llama.cpp/models/open_llama_7b/ggml-model-q4_0.bin', params: model_params)

# Create an inference context with a fixed seed for reproducible output.
context_params = LLaMACpp::ContextParams.new
context_params.seed = 42
context = LLaMACpp::Context.new(model: model, params: context_params)

# Generate a continuation of the prompt.
puts LLaMACpp.generate(context, 'Hello, World.')

Examples

There is a sample program in the examples directory that allows interactive, ChatGPT-like communication.

$ git clone https://github.com/yoshoku/llama_cpp.rb.git
$ cd llama_cpp.rb/examples
$ bundle install
$ ruby chat.rb --model /home/user/llama.cpp/models/open_llama_7b/ggml-model-q4_0.bin --seed 2023
...
User: Who is the originator of the Ruby programming language?
Bob: The originator of the Ruby programming language is Mr. Yukihiro Matsumoto.
User:

Japanese chat is also possible using the Vicuna model published on Hugging Face.

$ wget https://huggingface.co/CRD716/ggml-vicuna-1.1-quantized/resolve/main/ggml-vicuna-7b-1.1-q4_0.bin
$ ruby chat.rb --model ggml-vicuna-7b-1.1-q4_0.bin --file prompt_jp.txt

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/yoshoku/llama_cpp.rb. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the code of conduct.

License

The gem is available as open source under the terms of the MIT License.

Code of Conduct

Everyone interacting in the LlamaCpp project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.