ElevenlabsClient
A comprehensive Ruby client library for the ElevenLabs API, supporting voice synthesis, dubbing, dialogue generation, sound effects, AI music composition, voice transformation, speech transcription, audio isolation, and advanced audio processing features.
See Architecture Documentation for details.
Features
🎙️ Core Audio Features
- Text-to-Speech - Convert text to natural-sounding speech with timestamps
- Speech-to-Speech - Transform audio from one voice to another (Voice Changer)
- Speech-to-Text - Transcribe audio and video files with advanced features
- Text-to-Dialogue - Multi-speaker conversations and dialogue generation
- Voice Design - Create custom voices from text descriptions
- Voice Management - Create, edit, and manage individual voices
- Audio Isolation - Remove background noise from audio files
- Forced Alignment - Get precise timing information for audio transcripts
🎬 Content Creation
- Dubbing - Create dubbed versions of audio/video content
- Sound Generation - AI-generated sound effects and ambient audio
- Music Generation - AI-powered music composition and streaming
- Audio Native - Create embeddable audio players for websites
🤖 Agents Platform (Conversational AI)
- Agents - Create and manage AI conversational agents
- Conversations - Handle real-time conversations and chat interactions
- Knowledge Base - Upload and manage documents for agent knowledge
- Tools - Define and manage tools that agents can use
- Tests - Create and run tests for agent performance
- Outbound Calling - Make automated phone calls with agents
- Batch Calling - Execute large-scale calling campaigns
- Phone Numbers - Manage phone numbers for voice agents
- Widgets - Create embeddable chat widgets for websites
- LLM Usage - Monitor and analyze language model usage
- MCP Servers - Manage Model Context Protocol servers
📊 Admin & Management APIs
- History - Manage and analyze your generated audio history
- Usage - Monitor character usage and analytics
- User - Access account information and subscription details
- Voice Library - Browse and manage community shared voices
- Models - List available models and their capabilities
- Samples - Delete voice samples for content moderation
- Service Accounts - Monitor service accounts and API keys
- Webhooks - Monitor workspace webhooks and their health
- Workspace Management - Manage workspace groups, invites, members, and resources
- Pronunciation Dictionaries - Custom pronunciation rules
🔧 Technical Features
- WebSocket Streaming - Real-time audio streaming with low latency
- Multiple Output Formats - Support for various audio formats
- Flexible Configuration - Environment-based and programmatic configuration
- Comprehensive Error Handling - Detailed error messages and status codes
- Well-tested - Extensive test coverage with integration tests
Installation
Add this line to your application's Gemfile:
gem 'elevenlabs_client'
And then execute:
$ bundle install
Or install it yourself as:
$ gem install elevenlabs_client
Quick Start
Configuration
Rails Applications (Recommended)
Create config/initializers/elevenlabs_client.rb:
ElevenlabsClient::Settings.configure do |config|
  config.properties = {
    elevenlabs_base_uri: ENV["ELEVENLABS_BASE_URL"],
    elevenlabs_api_key: ENV["ELEVENLABS_API_KEY"]
  }
end
Set your environment variables:
export ELEVENLABS_API_KEY="your_api_key_here"
export ELEVENLABS_BASE_URL="https://api.elevenlabs.io" # Optional, defaults to official API
Direct Configuration
# Global configuration (recommended)
ElevenlabsClient.configure do |config|
  config.api_key = "your_api_key_here"
  config.base_url = "https://api.elevenlabs.io"
  config.timeout = 30
  config.retry_count = 3
end
# Use globally configured client
client = ElevenlabsClient.client
# Or pass directly to client instance
client = ElevenlabsClient.new(
  api_key: "your_api_key_here",
  base_url: "https://api.elevenlabs.io",
  timeout: 60
)
# Legacy Settings support (still works)
ElevenlabsClient.configure do |config|
  config.properties = {
    elevenlabs_base_uri: "https://api.elevenlabs.io",
    elevenlabs_api_key: "your_api_key_here"
  }
end
Basic Usage
# Initialize client (uses configured settings)
client = ElevenlabsClient.new
# Text-to-Speech
audio_data = client.text_to_speech.convert("21m00Tcm4TlvDq8ikWAM", "Hello, world!")
File.open("hello.mp3", "wb") { |f| f.write(audio_data) }
# Dubbing
File.open("video.mp4", "rb") do |file|
  result = client.dubs.create(
    file_io: file,
    filename: "video.mp4",
    target_languages: ["es", "fr", "de"]
  )
end
# Dialogue Generation
dialogue = [
  { text: "Hello, how are you?", voice_id: "voice_1" },
  { text: "I'm doing great, thanks!", voice_id: "voice_2" }
]
audio_data = client.text_to_dialogue.convert(dialogue)
# Sound Generation
audio_data = client.sound_generation.generate("Ocean waves crashing on rocks")
# Voice Design
design_result = client.text_to_voice.design("Warm, professional female voice")
generated_voice_id = design_result["previews"].first["generated_voice_id"]
# Stream the voice preview
client.text_to_voice.stream_preview(generated_voice_id) do |chunk|
  puts "Received preview chunk: #{chunk.bytesize} bytes"
end
voice_result = client.text_to_voice.create(
  "Professional Voice",
  "Warm, professional female voice",
  generated_voice_id
)
# List Available Models
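models = client.models.list
# token_cost_factor reflects relative cost, so the minimum is the cheapest model (not necessarily the fastest)
cheapest_model = models["models"].min_by { |m| m["token_cost_factor"] }
puts "Lowest-cost model: #{cheapest_model['name']}"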
# Voice Management
voices = client.voices.list
puts "Total voices: #{voices['voices'].length}"
# Create custom voice from audio samples
File.open("sample1.mp3", "rb") do |sample|
  voice = client.voices.create("My Voice", [sample], description: "Custom narrator voice")
  puts "Created voice: #{voice['voice_id']}"
end
# Admin APIs - Account Management
user_info = client.user.get_user
puts "Account: #{user_info['subscription']['tier']} (#{user_info['subscription']['status']})"
puts "Usage: #{user_info['subscription']['character_count']} / #{user_info['subscription']['character_limit']}"
# Usage Analytics
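seven_days = 7 * 24 * 60 * 60 # in seconds; 7.days only exists when ActiveSupport is loaded
usage_stats = client.usage.get_character_stats(
  start_unix: (Time.now - seven_days).to_i * 1000, # Unix time in milliseconds
  end_unix: Time.now.to_i * 1000,
  breakdown_type: "voice"
)
puts "7-day usage: #{usage_stats['usage']['All'].sum} characters"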
# History Management
history = client.history.list(page_size: 10)
puts "Recent history: #{history['history'].length} items"
# Voice Library
voices = client.voice_library.get_shared_voices(category: "professional", page_size: 5)
puts "Professional voices available: #{voices['voices'].length}"
# Admin Samples Management
client.samples.delete_sample(voice_id: "voice_id", sample_id: "sample_id")
puts "Sample deleted successfully"
# Service Accounts Monitoring
accounts = client.service_accounts.get_service_accounts
puts "Service accounts: #{accounts['service-accounts'].length}"
# Webhooks Management
webhooks = client.webhooks.list_webhooks(include_usages: true)
puts "Active webhooks: #{webhooks['webhooks'].length}"
# Music Generation
music_data = client.music.compose(
  prompt: "Upbeat electronic dance track with synthesizers",
  music_length_ms: 30000
)
File.open("generated_music.mp3", "wb") { |f| f.write(music_data) }
# Speech-to-Speech (Voice Changer)
File.open("input_audio.mp3", "rb") do |audio_file|
  converted_audio = client.speech_to_speech.convert(
    "target_voice_id",
    audio_file,
    "input_audio.mp3",
    remove_background_noise: true
  )
  File.open("converted_audio.mp3", "wb") { |f| f.write(converted_audio) }
end
# Speech-to-Text Transcription
File.open("audio.mp3", "rb") do |audio_file|
  transcription = client.speech_to_text.create(
    "scribe_v1",
    file: audio_file,
    filename: "audio.mp3",
    diarize: true,
    timestamps_granularity: "word"
  )
  puts "Transcribed: #{transcription['text']}"
  # Get the transcript later
  transcript = client.speech_to_text.get_transcript(transcription['transcription_id'])
  # Delete when no longer needed
  client.speech_to_text.delete_transcript(transcription['transcription_id'])
end
# Audio Isolation (Background Noise Removal)
File.open("noisy_audio.mp3", "rb") do |audio_file|
  clean_audio = client.audio_isolation.isolate(audio_file, "noisy_audio.mp3")
  File.open("clean_audio.mp3", "wb") { |f| f.write(clean_audio) }
end
# Audio Native (Embeddable Player)
File.open("article.html", "rb") do |html_file|
  project = client.audio_native.create(
    "My Article",
    file: html_file,
    filename: "article.html",
    voice_id: "voice_id",
    auto_convert: true
  )
  puts "Player HTML: #{project['html_snippet']}"
end
# Forced Alignment
File.open("speech.wav", "rb") do |audio_file|
  alignment = client.forced_alignment.create(
    audio_file,
    "speech.wav",
    "Hello world, this is a test transcript"
  )
  alignment['words'].each do |word|
    puts "#{word['text']}: #{word['start']}s - #{word['end']}s"
  end
end
# Streaming Text-to-Speech
client.text_to_speech_stream.stream("voice_id", "Streaming text") do |chunk|
  # Process audio chunk in real-time
  puts "Received #{chunk.bytesize} bytes"
end
API Documentation
Core APIs
🎙️ Core Audio APIs
- Text-to-Speech API - Convert text to natural speech
- Text-to-Speech Streaming API - Real-time audio streaming
- Text-to-Speech with Timestamps - Speech synthesis with precise timing
- Speech-to-Speech API - Transform audio from one voice to another
- Speech-to-Text API - Transcribe audio and video files
- Text-to-Dialogue API - Multi-speaker conversations
- Text-to-Dialogue Streaming - Real-time dialogue generation
- Voice Design API - Design and create custom voices from text descriptions
- Voice Management API - Manage individual voices (CRUD operations)
- Audio Isolation API - Remove background noise from audio
- Forced Alignment API - Get precise timing information for transcripts
🎬 Content Creation APIs
- Dubbing API - Create dubbed versions of audio/video content
- Sound Generation API - AI-generated sound effects and ambient audio
- Music Generation API - AI-powered music composition and streaming
- Audio Native API - Create embeddable audio players for websites
🤖 Agents Platform APIs (Conversational AI)
- Agents Platform Overview - Complete conversational AI platform
- Agents API - Create and manage AI conversational agents
- Conversations API - Handle real-time conversations and chat interactions
- Knowledge Base API - Upload and manage documents for agent knowledge
- Tools API - Define and manage tools that agents can use
- Tests API - Create and run tests for agent performance
- Test Invocations API - Execute and monitor test runs
- Outbound Calling API - Make automated phone calls with agents
- Batch Calling API - Execute large-scale calling campaigns
- Phone Numbers API - Manage phone numbers for voice agents
- Widgets API - Create embeddable chat widgets for websites
- LLM Usage API - Monitor and analyze language model usage
- MCP Servers API - Manage Model Context Protocol servers
- Workspace API - Manage agent platform workspace settings
📊 Admin & Management APIs
- Admin APIs Overview - Complete administrative functionality
- User Management - Account information and subscription details
- Usage Analytics - Character usage monitoring and analytics
- History Management - Generated audio history management
- Voice Library - Community voice browsing and management
- Models API - List available models and capabilities
- Samples Management - Delete voice samples for content moderation
- Service Accounts - Monitor and manage service accounts
- Service Account API Keys - Manage API keys for service accounts
- Webhooks Management - Monitor workspace webhooks and their health
- Workspace Webhooks - Configure and manage workspace-level webhooks
- Workspace Groups - Manage user groups and members
- Workspace Invites - Invite users and revoke invitations
- Workspace Members - Update member attributes and roles
- Workspace Resources - Share/unshare resources across the workspace
- Pronunciation Dictionaries - Create, manage and download pronunciation dictionaries
🔧 Advanced Features
- WebSocket Streaming - Real-time audio streaming with WebSockets
Available Endpoints
| Endpoint | Description | Documentation |
|---|---|---|
| client.dubs.* | Audio/video dubbing | DUBBING.md |
| client.text_to_speech.* | Text-to-speech conversion | TEXT_TO_SPEECH.md |
| client.text_to_speech_stream.* | Streaming TTS | TEXT_TO_SPEECH_STREAMING.md |
| client.text_to_dialogue.* | Dialogue generation | TEXT_TO_DIALOGUE.md |
| client.sound_generation.* | Sound effect generation | SOUND_GENERATION.md |
| client.music.* | AI music composition and streaming | MUSIC.md |
| client.text_to_voice.* | Voice design and creation | TEXT_TO_VOICE.md |
| client.voices.* | Voice management (CRUD) | VOICES.md |
| client.speech_to_speech.* | Voice changer and audio transformation | SPEECH_TO_SPEECH.md |
| client.speech_to_text.* | Audio/video transcription | SPEECH_TO_TEXT.md |
| client.audio_isolation.* | Background noise removal | AUDIO_ISOLATION.md |
| client.audio_native.* | Embeddable audio players | AUDIO_NATIVE.md |
| client.forced_alignment.* | Audio-text timing alignment | FORCED_ALIGNMENT.md |
| client.user.* | User account and subscription information | USER.md |
| client.usage.* | Character usage analytics and monitoring | USAGE.md |
| client.history.* | Generated audio history management | HISTORY.md |
| client.voice_library.* | Community voice browsing and management | VOICE_LIBRARY.md |
| client.models.* | Model information and capabilities | MODELS.md |
| client.workspace_groups.* | Workspace user groups management | WORKSPACE_GROUPS.md |
| client.workspace_invites.* | Workspace invites management | WORKSPACE_INVITES.md |
| client.workspace_members.* | Workspace member management | WORKSPACE_MEMBERS.md |
| client.workspace_resources.* | Workspace resource sharing | WORKSPACE_RESOURCES.md |
| client.pronunciation_dictionaries.* | Manage pronunciation dictionaries | PRONUNCIATION_DICTIONARIES.md |
| client.samples.* | Voice sample deletion and content moderation | SAMPLES.md |
| client.service_accounts.* | Service account monitoring and management | SERVICE_ACCOUNTS.md |
| client.webhooks.* | Workspace webhook monitoring and health analysis | WEBHOOKS.md |
Configuration Options
Configuration Precedence
- Explicit parameters (highest priority)
- Settings.properties (configured via initializer)
- Environment variables (lowest priority)
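The precedence order above can be illustrated with a minimal sketch. The `resolve_setting` helper is hypothetical and is not part of the gem; plain hashes stand in for `Settings.properties` and `ENV`, and treating empty strings as unset is an assumption.

```ruby
# Hypothetical helper (not part of the gem): return the first usable value,
# checking sources in the documented order of precedence.
def resolve_setting(explicit:, properties:, property_key:, env:, env_key:, default: nil)
  candidates = [explicit, properties[property_key], env[env_key], default]
  # Assumption: nil and empty strings both count as "not set".
  candidates.find { |value| !value.nil? && !value.to_s.empty? }
end

# Plain hashes stand in for Settings.properties and ENV so the example is deterministic.
properties = { elevenlabs_api_key: "key_from_initializer" }
env = { "ELEVENLABS_API_KEY" => "key_from_env" }

api_key = resolve_setting(
  explicit: nil,
  properties: properties,
  property_key: :elevenlabs_api_key,
  env: env,
  env_key: "ELEVENLABS_API_KEY"
)
puts api_key # prints "key_from_initializer" because no explicit value was passed

base_url = resolve_setting(
  explicit: nil,
  properties: properties,
  property_key: :elevenlabs_base_uri,
  env: env,
  env_key: "ELEVENLABS_BASE_URL",
  default: "https://api.elevenlabs.io"
)
puts base_url # prints "https://api.elevenlabs.io", the documented default
```

Passing an explicit value (for example `explicit: "key_from_argument"`) would win over both the initializer and the environment.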
Environment Variables
- ELEVENLABS_API_KEY - Your ElevenLabs API key (required)
- ELEVENLABS_BASE_URL - API base URL (optional, defaults to https://api.elevenlabs.io)
Custom Environment Variable Names
client = ElevenlabsClient.new(
  api_key_env: "CUSTOM_API_KEY_VAR",
  base_url_env: "CUSTOM_BASE_URL_VAR"
)
Error Handling
The client provides specific exception types for different error conditions:
begin
  result = client.text_to_speech.convert(voice_id, text)
rescue ElevenlabsClient::AuthenticationError
  puts "Invalid API key"
rescue ElevenlabsClient::RateLimitError
  puts "Rate limit exceeded"
rescue ElevenlabsClient::ValidationError => e
  puts "Invalid parameters: #{e.message}"
rescue ElevenlabsClient::APIError => e
  puts "API error: #{e.message}"
end
Exception Types
- AuthenticationError - Invalid API key or authentication failure
- RateLimitError - Rate limit exceeded
- ValidationError - Invalid request parameters
- NotFoundError - Resource not found (e.g., voice ID, transcript ID)
- BadRequestError - Bad request with invalid parameters
- UnprocessableEntityError - Request cannot be processed (e.g., invalid file format)
- APIError - General API errors
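Because some of these errors are transient (a rate limit, for instance), an application may want to retry selected exception types with a growing delay. The sketch below is one possible approach, not something the gem provides: `with_retries` and its parameters are hypothetical, and the client's own `retry_count` setting may already cover part of this.

```ruby
# Hypothetical application-level helper (not part of the gem): re-run the block
# when one of the listed exception classes is raised, doubling the delay each time.
def with_retries(retry_on:, attempts: 3, base_delay: 0.5, sleeper: ->(seconds) { sleep(seconds) })
  tries = 0
  begin
    tries += 1
    yield
  rescue *retry_on
    # Give up and re-raise the original error once the attempt budget is spent.
    raise if tries >= attempts
    sleeper.call(base_delay * (2**(tries - 1))) # 0.5s, 1s, 2s, ...
    retry
  end
end
```

With the gem loaded, a call might look like `with_retries(retry_on: [ElevenlabsClient::RateLimitError]) { client.text_to_speech.convert(voice_id, text) }`. Exception types that are not listed in `retry_on` propagate immediately.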
Rails Integration
The gem is designed to work seamlessly with Rails applications. See the examples directory for complete controller implementations and the Rails initializer example for configuration setup:
- Core Controllers
  - DubsController - Complete dubbing workflow
  - TextToSpeechController - TTS with error handling
  - StreamingAudioController - Real-time streaming
  - TextToDialogueController - Dialogue generation
  - SoundGenerationController - Sound effects
  - MusicController - AI music composition and streaming
  - TextToVoiceController - Voice design and creation
  - VoicesController - Voice management (CRUD operations)
  - SpeechToSpeechController - Voice changer and audio transformation
  - SpeechToTextController - Audio/video transcription with advanced features
  - AudioIsolationController - Background noise removal and audio cleanup
  - AudioNativeController - Embeddable audio players for websites
  - ForcedAlignmentController - Audio-text timing alignment and subtitle generation
- Admin Controllers - Complete administrative functionality:
  - Admin::HistoryController - Generated audio history management and analytics
  - Admin::UsageController - Character usage monitoring and analytics
  - Admin::UserController - User account and subscription management
  - Admin::VoiceLibraryController - Community voice browsing and management
  - Admin::ModelsController - Model information and selection guidance
  - Admin::SamplesController - Voice sample deletion and content moderation
  - Admin::ServiceAccountsController - Service account monitoring and analytics
  - Admin::ServiceAccountApiKeysController - API key management for service accounts
  - Admin::WebhooksController - Workspace webhook monitoring and health analysis
  - Admin::WorkspaceWebhooksController - Workspace-level webhook configuration
  - Admin::WorkspaceGroupsController - User group management and permissions
  - Admin::WorkspaceInvitesController - Workspace invitation management
  - Admin::WorkspaceMembersController - Workspace member management and roles
  - Admin::WorkspaceResourcesController - Resource sharing and permissions
  - Admin::PronunciationDictionariesController - Custom pronunciation management
- Agents Platform Controllers - Conversational AI functionality:
  - AgentsPlatform::AgentsController - AI agent creation and management
  - AgentsPlatform::ConversationsController - Real-time conversation handling
  - AgentsPlatform::KnowledgeBaseController - Document and knowledge management
  - AgentsPlatform::ToolsController - Agent tool management and configuration
  - AgentsPlatform::TestsController - Agent testing and validation
  - AgentsPlatform::TestInvocationsController - Test execution and monitoring
  - AgentsPlatform::OutboundCallingController - Automated phone call management
  - AgentsPlatform::BatchCallingController - Large-scale calling campaigns
  - AgentsPlatform::PhoneNumbersController - Phone number management for agents
  - AgentsPlatform::WidgetsController - Embeddable chat widget management
  - AgentsPlatform::LlmUsageController - Language model usage analytics
  - AgentsPlatform::McpServersController - Model Context Protocol server management
  - AgentsPlatform::WorkspaceController - Agent platform workspace settings
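A controller that wraps the client usually has to decide which HTTP status to return when a call fails. The sketch below shows one way to centralize that decision in plain Ruby. `ERROR_STATUS` and `status_for` are hypothetical names, the status codes are an illustrative choice rather than anything the gem prescribes, and the example controllers may handle this differently.

```ruby
# Hypothetical mapping (an assumption, not part of the gem): the HTTP status your
# own controller might return for each exception type listed under "Exception Types".
ERROR_STATUS = {
  "BadRequestError" => 400,
  "NotFoundError" => 404,
  "ValidationError" => 422,
  "UnprocessableEntityError" => 422,
  "RateLimitError" => 429,
  "AuthenticationError" => 502 # the server's API key is wrong, not the caller's request
}.freeze

# Looks up the status by the last segment of the exception's class name,
# so ElevenlabsClient::NotFoundError resolves through the "NotFoundError" key.
# Anything unrecognized falls back to 500.
def status_for(error)
  ERROR_STATUS.fetch(error.class.name.to_s.split("::").last, 500)
end
```

Inside a Rails action, a rescue clause for the gem's exception types could then end with `render json: { error: e.message }, status: status_for(e)`.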
Development
After checking out the repo, run:
bin/setup # Install dependencies
bundle exec rspec # Run tests
Available Rake Tasks
# Testing
rake spec # Run all tests (default)
rake test:unit # Run unit tests only
rake test:integration # Run integration tests only
# Security
rake dev:security # Run security checks
rake dev:audit # Run bundler-audit
# Development
rake dev:test # Run all tests
rake dev:coverage # Run tests with coverage
rake release:prepare # Run full CI suite locally
Continuous Integration
This gem uses GitHub Actions for CI/CD with the following checks:
- Tests: Runs on Ruby 3.0, 3.1, 3.2, and 3.3
- Security: bundler-audit for dependency vulnerability scanning
- Build: Verifies gem can be built and installed
All checks must pass before merging pull requests.
To install this gem onto your local machine:
bundle exec rake install
To release a new version:
- Update the version number in version.rb
- Update CHANGELOG.md
- Run bundle exec rake release:prepare to verify tests and security checks pass
- Run bundle exec rake release
Testing
The gem includes comprehensive test coverage with RSpec:
# Run all tests
bundle exec rspec
# Run specific test files
bundle exec rspec spec/elevenlabs_client/endpoints/
bundle exec rspec spec/elevenlabs_client/client
bundle exec rspec spec/integration/
# Run with documentation format
bundle exec rspec --format documentation
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/yourusername/elevenlabs_client.
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request
License
The gem is available as open source under the terms of the MIT License.
Changelog
See CHANGELOG.md for a detailed list of changes and version history.
Support
- 📖 Documentation: API Documentation
- 🐛 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
Made with ❤️ for the Ruby community