
Durable-LLM

Durable-LLM is a Ruby gem providing a unified interface for interacting with multiple Large Language Model APIs. It simplifies the integration of AI capabilities into Ruby applications by offering a consistent way to access various LLM providers.

Installation

Add this line to your application's Gemfile:

gem 'durable-llm'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install durable-llm

Quick Start

Simple Completion

require 'durable-llm'

# Quick and simple - just provide an API key and model
client = Durable::Llm.new(:openai, api_key: 'your-api-key', model: 'gpt-4')
response = client.complete('What is the capital of France?')
puts response # => "The capital of France is Paris."

Using Global Configuration

require 'durable-llm'

# Configure once, use everywhere
Durable::Llm.configure do |config|
  config.openai.api_key = 'your-openai-api-key'
  config.anthropic.api_key = 'your-anthropic-api-key'
end

# Create clients without passing API keys
client = Durable::Llm.new(:openai, model: 'gpt-4')
response = client.complete('Hello, world!')
puts response

Chat Conversations

client = Durable::Llm.new(:openai, model: 'gpt-4')

response = client.chat(
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is Ruby?' }
  ]
)
puts response.choices.first.message.content

Streaming Responses

client = Durable::Llm.new(:openai, model: 'gpt-4')

client.stream(messages: [{ role: 'user', content: 'Count to 10' }]) do |chunk|
  content = chunk.dig('choices', 0, 'delta', 'content')
  print content if content
end
puts # New line after streaming

Embeddings

client = Durable::Llm.new(:openai)

response = client.embed(
  model: 'text-embedding-ada-002',
  input: 'Ruby is a dynamic programming language'
)

embedding = response.data.first.embedding
puts "Vector dimensions: #{embedding.length}"

Features

  • Unified Interface: Consistent API across 15+ LLM providers
  • Multiple Providers: OpenAI, Anthropic, Google, Cohere, Mistral, and more
  • Streaming Support: Real-time streaming responses
  • Embeddings: Generate text embeddings for semantic search
  • Configuration Management: Flexible configuration via environment variables or code
  • Error Handling: Comprehensive error types for precise handling (see the sketch after this list)
  • CLI Tool: Command-line interface for quick testing and exploration
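
A minimal sketch of the error-handling feature, using the error classes that appear in the examples later in this README (Durable::Llm::RateLimitError and Durable::Llm::APIError):

require 'durable-llm'

client = Durable::Llm.new(:openai, model: 'gpt-4')

begin
  puts client.complete('Hello!')
rescue Durable::Llm::RateLimitError
  # Back off briefly when the provider throttles the request, then retry
  sleep 5
  retry
rescue Durable::Llm::APIError => e
  # Broader provider-side failures surface here
  warn "API error: #{e.message}"
end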

Supported Providers

Durable-LLM supports the following LLM providers:

  • OpenAI - GPT-3.5, GPT-4, GPT-4 Turbo, and embeddings
  • Anthropic - Claude 3 (Opus, Sonnet, Haiku)
  • Google - Gemini models
  • Cohere - Command and embedding models
  • Mistral AI - Mistral models
  • Groq - Fast inference with various models
  • Fireworks AI - High-performance model serving
  • Together AI - Open-source model hosting
  • DeepSeek - DeepSeek models
  • OpenRouter - Access to multiple models via single API
  • Perplexity - Perplexity models
  • xAI - Grok models
  • Azure OpenAI - Microsoft Azure-hosted OpenAI models
  • HuggingFace - Open-source models via Hugging Face
  • OpenCode - Code-specialized models

Configuration

Environment Variables

Set API keys using environment variables with the DLLM__ prefix:

export DLLM__OPENAI__API_KEY=your-openai-key
export DLLM__ANTHROPIC__API_KEY=your-anthropic-key
export DLLM__GOOGLE__API_KEY=your-google-key
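
With these variables set, a client should pick up its key automatically, so no api_key argument is needed; for example:

require 'durable-llm'

# Uses the DLLM__OPENAI__API_KEY value from the environment
client = Durable::Llm.new(:openai, model: 'gpt-4')
puts client.complete('Hello!')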

Programmatic Configuration

Configure API keys and settings in your code:

# Environment variables (recommended)
# export DLLM__OPENAI__API_KEY=your-openai-key
# export DLLM__ANTHROPIC__API_KEY=your-anthropic-key

# Or configure programmatically
Durable::Llm.configure do |config|
  config.openai.api_key = 'sk-...'
  config.anthropic.api_key = 'sk-ant-...'
  config.google.api_key = 'your-google-key'
  config.cohere.api_key = 'your-cohere-key'
end

# Create clients with automatic configuration
client = Durable::Llm.new(:openai)  # Uses configured API key
client = Durable::Llm.new(:anthropic, model: 'claude-3-sonnet-20240229')

Per-Client Configuration

Pass configuration directly when creating a client:

client = Durable::Llm.new(
  :openai,
  api_key: 'your-api-key',
  model: 'gpt-4',
  timeout: 120
)

API Reference

Client Methods

new(provider, options = {})

Creates a new LLM client for the specified provider.

Parameters:

  • provider (Symbol) - Provider name (:openai, :anthropic, etc.)
  • options (Hash) - Configuration options
    • :model - Default model to use
    • :api_key - API key for authentication
    • Other provider-specific options

Returns: Durable::Llm::Client instance
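
Example:

client = Durable::Llm.new(
  :anthropic,
  api_key: 'your-api-key',
  model: 'claude-3-sonnet-20240229'
)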

complete(text, opts = {})

Performs a simple text completion with minimal configuration.

Parameters:

  • text (String) - Input text to complete
  • opts (Hash) - Additional options (reserved for future use)

Returns: String with the completion text

Note: The older method name quick_complete is still supported as an alias for backward compatibility.

Example:

client = Durable::Llm.new(:openai, model: 'gpt-4')
response = client.complete('Explain quantum computing in one sentence')
puts response
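
# The older quick_complete alias behaves identically:
response = client.quick_complete('Explain quantum computing in one sentence')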

completion(params = {})

Performs a completion request with full control over parameters.

Parameters:

  • params (Hash) - Completion parameters
    • :model - Model to use (overrides default)
    • :messages - Array of message hashes with :role and :content
    • :temperature - Sampling temperature (0.0-2.0)
    • :max_tokens - Maximum tokens to generate
    • Other provider-specific parameters

Returns: Response object with completion data

Example:

response = client.completion(
  messages: [
    { role: 'system', content: 'You are a helpful coding assistant.' },
    { role: 'user', content: 'Write a Ruby method to reverse a string' }
  ],
  temperature: 0.7,
  max_tokens: 500
)

chat(params = {})

Alias for completion - performs a chat completion request.
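
Example:

response = client.chat(
  messages: [{ role: 'user', content: 'What is Ruby?' }]
)
puts response.choices.first.message.content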

stream(params = {}, &block)

Performs a streaming completion request, yielding chunks as they arrive.

Parameters:

  • params (Hash) - Same as completion
  • block - Block to process each chunk

Example:

client.stream(messages: [{ role: 'user', content: 'Write a story' }]) do |chunk|
  content = chunk.dig('choices', 0, 'delta', 'content')
  print content if content
end

embed(params = {})

Generates embeddings for the given text.

Parameters:

  • params (Hash) - Embedding parameters
    • :model - Embedding model to use
    • :input - Text or array of texts to embed

Returns: Response object with embedding vectors

Example:

response = client.embed(
  model: 'text-embedding-ada-002',
  input: ['First text', 'Second text']
)

embeddings = response.data.map(&:embedding)

CLI Tool

Durable-LLM includes a command-line tool (dllm) for quick interactions:

# One-shot completion
$ dllm prompt "What is Ruby?" -m gpt-3.5-turbo

# Interactive chat
$ dllm chat -m gpt-4

# List available models
$ dllm models

# Manage conversations
$ dllm conversations

Advanced Usage

Fluent API with Method Chaining

client = Durable::Llm.new(:openai, model: 'gpt-3.5-turbo')

# Chain configuration methods for cleaner code
result = client
  .with_model('gpt-4')
  .with_temperature(0.7)
  .with_max_tokens(500)
  .complete('Write a haiku about Ruby')

puts result

Response Helpers

Extract information from responses easily:

require 'durable-llm'

response = client.chat(messages: [{ role: 'user', content: 'Hello!' }])

# Extract content directly
content = Durable::Llm::ResponseHelpers.extract_content(response)

# Get token usage
tokens = Durable::Llm::ResponseHelpers.token_usage(response)
puts "Used #{tokens[:total_tokens]} tokens"

# Check why the response finished
reason = Durable::Llm::ResponseHelpers.finish_reason(response)
puts "Finished: #{reason}"

# Estimate cost
cost = Durable::Llm::ResponseHelpers.estimate_cost(response)
puts "Estimated cost: $#{cost}"

Provider Utilities

Discover and compare providers:

# Find which provider supports a model
provider = Durable::Llm::ProviderUtilities.provider_for_model('gpt-4')
# => :openai

# List all available providers
providers = Durable::Llm::ProviderUtilities.available_providers
# => [:openai, :anthropic, :google, ...]

# Check provider capabilities
Durable::Llm::ProviderUtilities.supports_capability?(:openai, :streaming)
# => true

# Get all provider info
info = Durable::Llm::ProviderUtilities.all_provider_info

Fallback Chains for Resilience

Create robust systems with automatic fallback:

# Execute with fallback providers
result = Durable::Llm::ProviderUtilities.complete_with_fallback(
  'What is Ruby?',
  providers: [:openai, :anthropic, :google],
  model_map: {
    openai: 'gpt-4',
    anthropic: 'claude-3-opus-20240229',
    google: 'gemini-pro'
  }
)

puts result

Global Convenience Functions

Quick access without module qualification:

require 'durable-llm'

# Quick client creation
client = DLLM(:openai, model: 'gpt-4')

# One-liner completions
result = LlmComplete('Hello!', model: 'gpt-4')

# Configure globally
LlmConfigure do |config|
  config.openai.api_key = 'sk-...'
end

# List models
models = LlmModels(:openai)

Custom Timeout

client = Durable::Llm.new(
  :openai,
  model: 'gpt-4',
  timeout: 120  # 2 minutes
)

Provider-Specific Options

Some providers support additional options:

# Azure OpenAI with custom endpoint
client = Durable::Llm.new(
  :azureopenai,
  api_key: 'your-key',
  endpoint: 'https://your-resource.openai.azure.com',
  api_version: '2024-02-15-preview'
)

Model Discovery

# Get list of available models for a provider
models = Durable::Llm.models(:openai)
puts models.inspect

# Or using provider utilities
models = Durable::Llm::ProviderUtilities.models_for_provider(:anthropic)

Cloning Clients with Different Settings

base_client = Durable::Llm.new(:openai, model: 'gpt-3.5-turbo')

# Create variant clients for different use cases
fast_client = base_client.clone_with(model: 'gpt-3.5-turbo')
powerful_client = base_client.clone_with(model: 'gpt-4')

# Use them for different tasks
summary = fast_client.complete('Summarize: ...')
analysis = powerful_client.complete('Analyze in depth: ...')

Practical Examples

Building a Simple Chatbot

require 'durable-llm'

client = Durable::Llm.new(:openai, model: 'gpt-4')
conversation = [{ role: 'system', content: 'You are a helpful assistant.' }]

loop do
  print "You: "
  user_input = gets.chomp
  break if user_input.downcase == 'exit'

  conversation << { role: 'user', content: user_input }

  response = client.chat(messages: conversation)
  assistant_message = Durable::Llm::ResponseHelpers.extract_content(response)

  conversation << { role: 'assistant', content: assistant_message }
  puts "Assistant: #{assistant_message}"
end

Multi-Provider Text Analysis

require 'durable-llm'

text = "Ruby is a dynamic, open source programming language with a focus on simplicity and productivity."

providers = {
  openai: 'gpt-4',
  anthropic: 'claude-3-opus-20240229',
  google: 'gemini-pro'
}

results = providers.map do |provider, model|
  client = Durable::Llm.new(provider, model: model)
  response = client.complete("Summarize this in 5 words: #{text}")

  { provider: provider, model: model, summary: response }
end

results.each do |result|
  puts "#{result[:provider]} (#{result[:model]}): #{result[:summary]}"
end

Batch Processing with Progress Tracking

require 'durable-llm'

client = Durable::Llm.new(:openai, model: 'gpt-3.5-turbo')

texts = [
  "Ruby is elegant",
  "Python is versatile",
  "JavaScript is ubiquitous"
]

results = texts.map.with_index do |text, i|
  puts "Processing #{i + 1}/#{texts.length}..."

  response = client.chat(
    messages: [{ role: 'user', content: "Expand on: #{text}" }]
  )

  content = Durable::Llm::ResponseHelpers.extract_content(response)
  tokens = Durable::Llm::ResponseHelpers.token_usage(response)

  {
    input: text,
    output: content,
    tokens: tokens[:total_tokens]
  }
end

total_tokens = results.sum { |r| r[:tokens] }
puts "\nProcessed #{results.length} texts using #{total_tokens} tokens"

Sentiment Analysis with Error Handling

require 'durable-llm'

def analyze_sentiment(text)
  client = Durable::Llm.new(:openai, model: 'gpt-4')

  prompt = <<~PROMPT
    Analyze the sentiment of this text and respond with only one word:
    positive, negative, or neutral.

    Text: #{text}
  PROMPT

  response = client.complete(prompt)
  response.strip.downcase
rescue Durable::Llm::RateLimitError
  puts "Rate limited, waiting..."
  sleep 5
  retry
rescue Durable::Llm::APIError => e
  puts "API error: #{e.message}"
  'unknown'
end

texts = [
  "I love this product!",
  "This is terrible.",
  "It's okay, I guess."
]

texts.each do |text|
  sentiment = analyze_sentiment(text)
  puts "\"#{text}\" -> #{sentiment}"
end

Code Generation Assistant

require 'durable-llm'

client = Durable::Llm.new(:openai, model: 'gpt-4')

# Pass the client in explicitly; top-level locals aren't visible inside a def
def generate_code(client, description)
  prompt = <<~PROMPT
    Generate Ruby code for: #{description}

    Provide only the code, no explanations.
  PROMPT

  client
    .with_temperature(0.3)  # Lower temperature for more deterministic code
    .with_max_tokens(500)
    .complete(prompt)
end

# Example usage
code = generate_code(client, "a method that reverses a string")
puts code

Streaming Real-Time Translation

require 'durable-llm'

client = Durable::Llm.new(:openai, model: 'gpt-4')

def translate_streaming(client, text, target_language)
  messages = [{
    role: 'user',
    content: "Translate to #{target_language}: #{text}"
  }]

  print "Translation: "
  client.stream(messages: messages) do |chunk|
    content = chunk.dig('choices', 0, 'delta', 'content')
    print content if content
  end
  puts
end

translate_streaming(client, "Hello, how are you?", "Spanish")
translate_streaming(client, "The weather is nice today.", "French")

Acknowledgements

Thank you to the lite-llm and llm.datasette.io projects for their hard work, which was invaluable to this project. The dllm command line tool is patterned after the llm tool, though not as full-featured (yet).

The streaming jsonl code is from the ruby-openai repo; many thanks for their hard work.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/durableprogramming/durable-llm.

License

The gem is available as open source under the terms of the MIT License.