ElevenlabsClient

A comprehensive Ruby client library for the ElevenLabs API, supporting voice synthesis, dubbing, dialogue generation, sound effects, AI music composition, voice transformation, speech transcription, audio isolation, and advanced audio processing features.

See Architecture Documentation for details.

Features

🎙️ Core Audio Features

  • Text-to-Speech - Convert text to natural-sounding speech with timestamps
  • Speech-to-Speech - Transform audio from one voice to another (Voice Changer)
  • Speech-to-Text - Transcribe audio and video files with advanced features
  • Text-to-Dialogue - Multi-speaker conversations and dialogue generation
  • Voice Design - Create custom voices from text descriptions
  • Voice Management - Create, edit, and manage individual voices
  • Audio Isolation - Remove background noise from audio files
  • Forced Alignment - Get precise timing information for audio transcripts

🎬 Content Creation

  • Dubbing - Create dubbed versions of audio/video content
  • Sound Generation - AI-generated sound effects and ambient audio
  • Music Generation - AI-powered music composition and streaming
  • Audio Native - Create embeddable audio players for websites

🤖 Agents Platform (Conversational AI)

  • Agents - Create and manage AI conversational agents
  • Conversations - Handle real-time conversations and chat interactions
  • Knowledge Base - Upload and manage documents for agent knowledge
  • Tools - Define and manage tools that agents can use
  • Tests - Create and run tests for agent performance
  • Outbound Calling - Make automated phone calls with agents
  • Batch Calling - Execute large-scale calling campaigns
  • Phone Numbers - Manage phone numbers for voice agents
  • Widgets - Create embeddable chat widgets for websites
  • LLM Usage - Monitor and analyze language model usage
  • MCP Servers - Manage Model Context Protocol servers

📊 Admin & Management APIs

  • History - Manage and analyze your generated audio history
  • Usage - Monitor character usage and analytics
  • User - Access account information and subscription details
  • Voice Library - Browse and manage community shared voices
  • Models - List available models and their capabilities
  • Samples - Delete voice samples for content moderation
  • Service Accounts - Monitor service accounts and API keys
  • Webhooks - Monitor workspace webhooks and their health
  • Workspace Management - Manage workspace groups, invites, members, and resources
  • Pronunciation Dictionaries - Custom pronunciation rules

🔧 Technical Features

  • WebSocket Streaming - Real-time audio streaming with low latency
  • Multiple Output Formats - Support for various audio formats
  • Flexible Configuration - Environment-based and programmatic configuration
  • Comprehensive Error Handling - Detailed error messages and status codes
  • Well-tested - Extensive test coverage with integration tests

Installation

Add this line to your application's Gemfile:

gem 'elevenlabs_client'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install elevenlabs_client

Quick Start

Configuration

Rails Applications (Recommended)

Create config/initializers/elevenlabs_client.rb:

ElevenlabsClient::Settings.configure do |config|
  config.properties = {
    elevenlabs_base_uri: ENV["ELEVENLABS_BASE_URL"],
    elevenlabs_api_key: ENV["ELEVENLABS_API_KEY"]
  }
end

Set your environment variables:

export ELEVENLABS_API_KEY="your_api_key_here"
export ELEVENLABS_BASE_URL="https://api.elevenlabs.io"  # Optional, defaults to official API

Direct Configuration

# Global configuration (recommended)
ElevenlabsClient.configure do |config|
  config.api_key = "your_api_key_here"
  config.base_url = "https://api.elevenlabs.io"
  config.timeout = 30
  config.retry_count = 3
end

# Use globally configured client
client = ElevenlabsClient.client

# Or pass directly to client instance
client = ElevenlabsClient.new(
  api_key: "your_api_key_here",
  base_url: "https://api.elevenlabs.io",
  timeout: 60
)

# Legacy Settings support (still works)
ElevenlabsClient.configure do |config|
  config.properties = {
    elevenlabs_base_uri: "https://api.elevenlabs.io",
    elevenlabs_api_key: "your_api_key_here"
  }
end

Basic Usage

# Initialize client (uses configured settings)
client = ElevenlabsClient.new

# Text-to-Speech
audio_data = client.text_to_speech.convert("21m00Tcm4TlvDq8ikWAM", "Hello, world!")
File.open("hello.mp3", "wb") { |f| f.write(audio_data) }

# Dubbing
File.open("video.mp4", "rb") do |file|
  result = client.dubs.create(
    file_io: file,
    filename: "video.mp4",
    target_languages: ["es", "fr", "de"]
  )
end

# Dialogue Generation
dialogue = [
  { text: "Hello, how are you?", voice_id: "voice_1" },
  { text: "I'm doing great, thanks!", voice_id: "voice_2" }
]
audio_data = client.text_to_dialogue.convert(dialogue)

# Sound Generation
audio_data = client.sound_generation.generate("Ocean waves crashing on rocks")

# Voice Design
design_result = client.text_to_voice.design("Warm, professional female voice")
generated_voice_id = design_result["previews"].first["generated_voice_id"]

# Stream the voice preview
client.text_to_voice.stream_preview(generated_voice_id) do |chunk|
  puts "Received preview chunk: #{chunk.bytesize} bytes"
end

voice_result = client.text_to_voice.create(
  "Professional Voice",
  "Warm, professional female voice",
  generated_voice_id
)

# List Available Models
models = client.models.list
cheapest_model = models["models"].min_by { |m| m["token_cost_factor"] }
puts "Lowest-cost model: #{cheapest_model['name']}"

# Voice Management
voices = client.voices.list
puts "Total voices: #{voices['voices'].length}"

# Create custom voice from audio samples
File.open("sample1.mp3", "rb") do |sample|
  voice = client.voices.create("My Voice", [sample], description: "Custom narrator voice")
  puts "Created voice: #{voice['voice_id']}"
end

# Admin APIs - Account Management
user_info = client.user.get_user
puts "Account: #{user_info['subscription']['tier']} (#{user_info['subscription']['status']})"
puts "Usage: #{user_info['subscription']['character_count']} / #{user_info['subscription']['character_limit']}"

# Usage Analytics
usage_stats = client.usage.get_character_stats(
  start_unix: (Time.now.to_i - 7 * 24 * 60 * 60) * 1000,
  end_unix: Time.now.to_i * 1000,
  breakdown_type: "voice"
)
puts "7-day usage: #{usage_stats['usage']['All'].sum} characters"

# History Management
history = client.history.list(page_size: 10)
puts "Recent history: #{history['history'].length} items"

# Voice Library
voices = client.voice_library.get_shared_voices(category: "professional", page_size: 5)
puts "Professional voices available: #{voices['voices'].length}"

# Admin Samples Management
client.samples.delete_sample(voice_id: "voice_id", sample_id: "sample_id")
puts "Sample deleted successfully"

# Service Accounts Monitoring
accounts = client.service_accounts.get_service_accounts
puts "Service accounts: #{accounts['service-accounts'].length}"

# Webhooks Management
webhooks = client.webhooks.list_webhooks(include_usages: true)
puts "Active webhooks: #{webhooks['webhooks'].length}"

# Music Generation
music_data = client.music.compose(
  prompt: "Upbeat electronic dance track with synthesizers",
  music_length_ms: 30000
)
File.open("generated_music.mp3", "wb") { |f| f.write(music_data) }

# Speech-to-Speech (Voice Changer)
File.open("input_audio.mp3", "rb") do |audio_file|
  converted_audio = client.speech_to_speech.convert(
    "target_voice_id", 
    audio_file, 
    "input_audio.mp3",
    remove_background_noise: true
  )
  File.open("converted_audio.mp3", "wb") { |f| f.write(converted_audio) }
end

# Speech-to-Text Transcription
File.open("audio.mp3", "rb") do |audio_file|
  transcription = client.speech_to_text.create(
    "scribe_v1",
    file: audio_file,
    filename: "audio.mp3",
    diarize: true,
    timestamps_granularity: "word"
  )
  puts "Transcribed: #{transcription['text']}"
  
  # Get the transcript later
  transcript = client.speech_to_text.get_transcript(transcription['transcription_id'])
  
  # Delete when no longer needed
  client.speech_to_text.delete_transcript(transcription['transcription_id'])
end

# Audio Isolation (Background Noise Removal)
File.open("noisy_audio.mp3", "rb") do |audio_file|
  clean_audio = client.audio_isolation.isolate(audio_file, "noisy_audio.mp3")
  File.open("clean_audio.mp3", "wb") { |f| f.write(clean_audio) }
end

# Audio Native (Embeddable Player)
File.open("article.html", "rb") do |html_file|
  project = client.audio_native.create(
    "My Article",
    file: html_file,
    filename: "article.html",
    voice_id: "voice_id",
    auto_convert: true
  )
  puts "Player HTML: #{project['html_snippet']}"
end

# Forced Alignment
File.open("speech.wav", "rb") do |audio_file|
  alignment = client.forced_alignment.create(
    audio_file,
    "speech.wav",
    "Hello world, this is a test transcript"
  )
  
  alignment['words'].each do |word|
    puts "#{word['text']}: #{word['start']}s - #{word['end']}s"
  end
end

# Streaming Text-to-Speech
client.text_to_speech_stream.stream("voice_id", "Streaming text") do |chunk|
  # Process audio chunk in real-time
  puts "Received #{chunk.bytesize} bytes"
end

API Documentation

Detailed documentation for each endpoint group is linked in the table below.

Available Endpoints

Endpoint Description Documentation
client.dubs.* Audio/video dubbing DUBBING.md
client.text_to_speech.* Text-to-speech conversion TEXT_TO_SPEECH.md
client.text_to_speech_stream.* Streaming TTS TEXT_TO_SPEECH_STREAMING.md
client.text_to_dialogue.* Dialogue generation TEXT_TO_DIALOGUE.md
client.sound_generation.* Sound effect generation SOUND_GENERATION.md
client.music.* AI music composition and streaming MUSIC.md
client.text_to_voice.* Voice design and creation TEXT_TO_VOICE.md
client.voices.* Voice management (CRUD) VOICES.md
client.speech_to_speech.* Voice changer and audio transformation SPEECH_TO_SPEECH.md
client.speech_to_text.* Audio/video transcription SPEECH_TO_TEXT.md
client.audio_isolation.* Background noise removal AUDIO_ISOLATION.md
client.audio_native.* Embeddable audio players AUDIO_NATIVE.md
client.forced_alignment.* Audio-text timing alignment FORCED_ALIGNMENT.md
client.user.* User account and subscription information USER.md
client.usage.* Character usage analytics and monitoring USAGE.md
client.history.* Generated audio history management HISTORY.md
client.voice_library.* Community voice browsing and management VOICE_LIBRARY.md
client.models.* Model information and capabilities MODELS.md
client.workspace_groups.* Workspace user groups management WORKSPACE_GROUPS.md
client.workspace_invites.* Workspace invites management WORKSPACE_INVITES.md
client.workspace_members.* Workspace member management WORKSPACE_MEMBERS.md
client.workspace_resources.* Workspace resource sharing WORKSPACE_RESOURCES.md
client.pronunciation_dictionaries.* Manage pronunciation dictionaries PRONUNCIATION_DICTIONARIES.md
client.samples.* Voice sample deletion and content moderation SAMPLES.md
client.service_accounts.* Service account monitoring and management SERVICE_ACCOUNTS.md
client.webhooks.* Workspace webhook monitoring and health analysis WEBHOOKS.md

Configuration Options

Configuration Precedence

  1. Explicit parameters (highest priority)
  2. Settings.properties (configured via initializer)
  3. Environment variables (lowest priority; see the sketch below)
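
A minimal sketch of how this precedence plays out (all key values below are placeholders):

# Environment variable (lowest priority)
ENV["ELEVENLABS_API_KEY"] = "key_from_env"

# Settings.properties via initializer (middle priority)
ElevenlabsClient::Settings.configure do |config|
  config.properties = { elevenlabs_api_key: "key_from_initializer" }
end

# Explicit parameter (highest priority) wins over both
client = ElevenlabsClient.new(api_key: "key_from_argument")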

Environment Variables

  • ELEVENLABS_API_KEY - Your ElevenLabs API key (required)
  • ELEVENLABS_BASE_URL - API base URL (optional, defaults to https://api.elevenlabs.io)

Custom Environment Variable Names

client = ElevenlabsClient.new(
  api_key_env: "CUSTOM_API_KEY_VAR",
  base_url_env: "CUSTOM_BASE_URL_VAR"
)

Error Handling

The client provides specific exception types for different error conditions:

begin
  result = client.text_to_speech.convert(voice_id, text)
rescue ElevenlabsClient::AuthenticationError
  puts "Invalid API key"
rescue ElevenlabsClient::RateLimitError
  puts "Rate limit exceeded"
rescue ElevenlabsClient::ValidationError => e
  puts "Invalid parameters: #{e.message}"
rescue ElevenlabsClient::APIError => e
  puts "API error: #{e.message}"
end

Exception Types

  • AuthenticationError - Invalid API key or authentication failure
  • RateLimitError - Rate limit exceeded
  • ValidationError - Invalid request parameters
  • NotFoundError - Resource not found (e.g., voice ID, transcript ID); see the sketch below
  • BadRequestError - Bad request with invalid parameters
  • UnprocessableEntityError - Request cannot be processed (e.g., invalid file format)
  • APIError - General API errors
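
A minimal sketch of handling the remaining exception types, reusing the transcript lookup from the Quick Start (the transcript ID is a placeholder):

begin
  transcript = client.speech_to_text.get_transcript("missing_transcript_id")
rescue ElevenlabsClient::NotFoundError
  puts "Transcript not found"
rescue ElevenlabsClient::BadRequestError => e
  puts "Bad request: #{e.message}"
rescue ElevenlabsClient::UnprocessableEntityError => e
  puts "Request could not be processed: #{e.message}"
end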

Rails Integration

The gem is designed to work seamlessly with Rails applications. See the examples directory for complete controller implementations and the Rails initializer example for configuration setup.
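
As an illustration, a minimal sketch of a Rails controller built on the Quick Start calls (the controller name and route are hypothetical):

class TextToSpeechController < ApplicationController
  # POST /text_to_speech with params[:voice_id] and params[:text]
  def create
    client = ElevenlabsClient.new  # uses the configured settings
    audio_data = client.text_to_speech.convert(params[:voice_id], params[:text])

    # Return the generated MP3 as a download
    send_data audio_data,
              type: "audio/mpeg",
              filename: "speech.mp3",
              disposition: "attachment"
  rescue ElevenlabsClient::APIError => e
    render json: { error: e.message }, status: :bad_gateway
  end
end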

Development

After checking out the repo, run:

bin/setup          # Install dependencies
bundle exec rspec  # Run tests

Available Rake Tasks

# Testing
rake spec                    # Run all tests (default)
rake test:unit              # Run unit tests only
rake test:integration       # Run integration tests only

# Security
rake dev:security           # Run security checks
rake dev:audit              # Run bundler-audit

# Development
rake dev:test               # Run all tests
rake dev:coverage           # Run tests with coverage
rake release:prepare        # Run full CI suite locally

Continuous Integration

This gem uses GitHub Actions for CI/CD with the following checks:

  • Tests: Runs on Ruby 3.0, 3.1, 3.2, and 3.3
  • Security: bundler-audit for dependency vulnerability scanning
  • Build: Verifies gem can be built and installed

All checks must pass before merging pull requests.

To install this gem onto your local machine:

bundle exec rake install

To release a new version:

  1. Update the version number in version.rb
  2. Update CHANGELOG.md
  3. Run bundle exec rake release:prepare to verify tests and security checks pass
  4. Run bundle exec rake release

Testing

The gem includes comprehensive test coverage with RSpec:

# Run all tests
bundle exec rspec

# Run specific test files
bundle exec rspec spec/elevenlabs_client/endpoints/
bundle exec rspec spec/elevenlabs_client/client
bundle exec rspec spec/integration/

# Run with documentation format
bundle exec rspec --format documentation

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/yourusername/elevenlabs_client.

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

License

The gem is available as open source under the terms of the MIT License.

Changelog

See CHANGELOG.md for a detailed list of changes and version history.

Support


Made with ❤️ for the Ruby community