Project

lex-ollama

0.0
No release in over 3 years
Connects LegionIO to Ollama local LLM server
Dependencies

Runtime

>= 2.0
Project Readme

lex-ollama

Ollama integration for LegionIO. Connects LegionIO to a local Ollama LLM server for text generation, chat completions, embeddings, and model management.

Installation

gem install lex-ollama

Functions

Completions

  • generate - Generate a text completion (POST /api/generate)
  • generate_stream - Stream a text completion with per-chunk callbacks

Chat

  • chat - Generate a chat completion with message history and tool support (POST /api/chat)
  • chat_stream - Stream a chat completion with per-chunk callbacks

Models

  • create_model - Create a model from another model, GGUF, or safetensors (POST /api/create)
  • list_models - List locally available models (GET /api/tags)
  • show_model - Show model details, template, parameters, license (POST /api/show)
  • copy_model - Copy a model to a new name (POST /api/copy)
  • delete_model - Delete a model and its data (DELETE /api/delete)
  • pull_model - Download a model from the Ollama library (POST /api/pull)
  • push_model - Upload a model to the Ollama library (POST /api/push)
  • list_running - List models currently loaded in memory (GET /api/ps)

Embeddings

  • embed - Generate embeddings from a model (POST /api/embed)

Blobs

  • check_blob - Check if a blob exists on the server (HEAD /api/blobs/:digest)
  • push_blob - Upload a binary blob to the server (POST /api/blobs/:digest)

Version

  • server_version - Retrieve the Ollama server version (GET /api/version)

Standalone Client

client = Legion::Extensions::Ollama::Client.new
# or with custom host
client = Legion::Extensions::Ollama::Client.new(host: 'http://remote:11434')

# Chat
result = client.chat(model: 'llama3.2', messages: [{ role: 'user', content: 'Hello!' }])

# Generate
result = client.generate(model: 'llama3.2', prompt: 'Why is the sky blue?')

# Embeddings
result = client.embed(model: 'all-minilm', input: 'Some text to embed')

# List models
result = client.list_models

# Streaming generate
client.generate_stream(model: 'llama3.2', prompt: 'Tell me a story') do |event|
  case event[:type]
  when :delta then print event[:text]
  when :done  then puts "\nDone!"
  end
end

# Streaming chat
client.chat_stream(model: 'llama3.2', messages: [{ role: 'user', content: 'Hello!' }]) do |event|
  print event[:text] if event[:type] == :delta
end

All API calls include automatic retry with exponential backoff on connection failures and timeouts.
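The gem's exact retry policy is not documented here; a minimal sketch of exponential backoff on connection failures, with illustrative attempt counts, delays, and error classes (not the gem's actual values), could look like:

```ruby
require 'timeout'

# Retry a block with exponential backoff. The max attempts, base delay,
# and rescued error classes below are illustrative assumptions.
def with_retries(max_attempts: 3, base_delay: 0.5)
  attempts = 0
  begin
    yield
  rescue Errno::ECONNREFUSED, Timeout::Error
    attempts += 1
    raise if attempts >= max_attempts
    sleep(base_delay * (2**(attempts - 1))) # 0.5s, 1s, 2s, ...
    retry
  end
end

# Usage sketch:
# with_retries { client.generate(model: 'llama3.2', prompt: 'Hello') }
```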

Generate and chat responses include standardized usage data under the :usage key:

result = client.generate(model: 'llama3.2', prompt: 'Hello')
result[:usage]  # => { input_tokens: 1, output_tokens: 5, total_duration: ..., ... }
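Ollama's raw API reports durations in nanoseconds; assuming that convention carries through to total_duration here (worth verifying against the gem), output throughput can be derived from the usage hash. The helper name is illustrative:

```ruby
# Derive output tokens per second from a usage hash, assuming
# total_duration is in nanoseconds as in Ollama's raw API responses.
def output_tokens_per_second(usage)
  seconds = usage[:total_duration] / 1_000_000_000.0
  usage[:output_tokens] / seconds
end

usage = { input_tokens: 1, output_tokens: 50, total_duration: 2_000_000_000 }
output_tokens_per_second(usage)  # => 25.0
```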

Requirements

  • Ruby >= 3.4
  • LegionIO framework
  • Ollama running locally or on a remote host

License

MIT