Elevenlabs Ruby Gem
A Ruby client for the ElevenLabs Text-to-Speech API.
This gem provides an easy-to-use interface for:
- Listing available voices
- Fetching details about a voice
- Creating a custom voice (with uploaded sample files)
- Editing an existing voice
- Deleting a voice
- Converting text to speech and retrieving the generated audio
- Designing a voice based on a text description
- Streaming text-to-speech audio
- Generating music
- Generating sound effects
All requests are handled via Faraday.
Table of Contents
- Features
- Installation
- Usage
- Basic Example
- Rails Integration
- Store API Key in Rails Credentials
- Rails Initializer
- Controller Example
- Endpoints
- Error Handling
- Development
- Contributing
- License
Features
- Simple and intuitive API client for ElevenLabs.
- Multipart file uploads for training custom voices.
- Voice design via text prompts to generate voice previews.
- Automatic authentication via API key configuration.
- Error handling with custom exceptions.
- Rails integration support (including credentials storage).
Installation
Add the gem to your Gemfile:
gem "elevenlabs"
Then run:
bundle install
Or install it directly using:
gem install elevenlabs
Usage
Basic Example (Standalone Ruby)
require "elevenlabs"
# 1. Configure the gem globally (Optional)
Elevenlabs.configure do |config|
  config.api_key = "YOUR_API_KEY"
end
# 2. Initialize a client (will use configured API key)
client = Elevenlabs::Client.new
# 3. List available voices
voices = client.list_voices
puts voices # JSON response with voices
# 4. Convert text to speech
voice_id = "YOUR_VOICE_ID"
text = "Hello from Elevenlabs!"
audio_data = client.text_to_speech(voice_id, text)
# 5. Save the audio file
File.open("output.mp3", "wb") { |f| f.write(audio_data) }
puts "Audio file saved to output.mp3"
# 6. Design a voice with a text prompt
response = client.design_voice(
  "A deep, resonant male voice with a British accent, suitable for storytelling",
  output_format: "mp3_44100_192",
  model_id: "eleven_multilingual_ttv_v2",
  text: "In a land far away, where the mountains meet the sky, a great adventure began. Brave heroes embarked on a quest to find the lost artifact, facing challenges and forging bonds that would last a lifetime. Their journey took them through enchanted forests, across raging rivers, and into the heart of ancient ruins.",
  auto_generate_text: false,
  loudness: 0.5,
  seed: 12345,
  guidance_scale: 5.0,
  stream_previews: false
)
# 7. Save voice preview audio
require "base64"
response["previews"].each_with_index do |preview, index|
  audio_data = Base64.decode64(preview["audio_base_64"])
  File.open("preview_#{index}.mp3", "wb") { |f| f.write(audio_data) }
  puts "Saved preview #{index + 1} to preview_#{index}.mp3"
end
Note: You can override the configured API key for an individual client:
client = Elevenlabs::Client.new(api_key: "DIFFERENT_API_KEY")
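Step 7 above can be wrapped in a small helper. The following is a minimal sketch, assuming the response shape shown in the example (a "previews" array whose items carry "audio_base_64"). The name save_previews and its dir: parameter are hypothetical and not part of the gem.

```ruby
require "fileutils"

# Hypothetical helper, not part of the gem: decodes each preview in a
# design_voice-style response and writes it to dir as preview_<n>.mp3.
# Returns the list of file paths written.
def save_previews(response, dir: ".")
  FileUtils.mkdir_p(dir)
  response.fetch("previews").each_with_index.map do |preview, index|
    path = File.join(dir, "preview_#{index}.mp3")
    # unpack1("m") is the dependency-free equivalent of Base64.decode64
    File.binwrite(path, preview.fetch("audio_base_64").unpack1("m"))
    path
  end
end

# Usage (assumes `response` came from client.design_voice):
#   paths = save_previews(response, dir: "previews")
```

Using fetch instead of [] makes a response without the expected keys fail with a KeyError rather than a less obvious NoMethodError on nil.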
Rails Integration
Store API Key in Rails Credentials
- Open your encrypted credentials:
EDITOR=vim rails credentials:edit
- Add the ElevenLabs API key:
eleven_labs:
  api_key: YOUR_SECURE_KEY
- Save and exit. Rails will securely encrypt your API key.
Rails Initializer
Create an initializer file: config/initializers/elevenlabs.rb
# config/initializers/elevenlabs.rb
require "elevenlabs"
Rails.application.config.to_prepare do
  Elevenlabs.configure do |config|
    config.api_key = Rails.application.credentials.dig(:eleven_labs, :api_key)
  end
end
Now you can simply call:
client = Elevenlabs::Client.new
without manually providing an API key.
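In environments where encrypted credentials are not available (CI, containers), a fallback to an environment variable may be convenient. This is a minimal sketch: resolve_api_key and the variable name ELEVENLABS_API_KEY are assumptions for illustration, not names defined by the gem.

```ruby
# Hypothetical helper: prefer the credentials value, fall back to an
# environment variable, and fail loudly if neither is set.
def resolve_api_key(credentials_key, env: ENV, var: "ELEVENLABS_API_KEY")
  key = [credentials_key, env[var]].find { |value| value && !value.strip.empty? }
  key || raise(KeyError, "ElevenLabs API key missing: set credentials or #{var}")
end

# In the initializer this could be used as:
#   config.api_key = resolve_api_key(
#     Rails.application.credentials.dig(:eleven_labs, :api_key)
#   )
```

Raising at boot time surfaces a missing key immediately instead of at the first API call.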
Controller Example
class AudioController < ApplicationController
  def generate
    client = Elevenlabs::Client.new
    voice_id = params[:voice_id]
    text = params[:text]
    begin
      audio_data = client.text_to_speech(voice_id, text)
      send_data audio_data, type: "audio/mpeg", disposition: "attachment", filename: "output.mp3"
    rescue Elevenlabs::APIError => e
      render json: { error: e.message }, status: :bad_request
    end
  end
end
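The controller above answers every failure with :bad_request. The sketch below maps the exception classes listed under Error Handling to more specific HTTP status codes. The mapping and the status_for helper are assumptions for illustration. Classes are looked up by name, so the snippet loads without the gem.

```ruby
# Hypothetical mapping from the gem's exception names to HTTP status codes.
# AuthenticationError is deliberately left to the default: a bad server-side
# API key is not the caller's fault.
ERROR_STATUSES = {
  "Elevenlabs::BadRequestError"          => 400,
  "Elevenlabs::NotFoundError"            => 404,
  "Elevenlabs::UnprocessableEntityError" => 422
}.freeze

# Returns the status for an error, defaulting to 502 for anything unlisted
# (an assumption: treat other API failures as upstream errors).
def status_for(error)
  ERROR_STATUSES.fetch(error.class.name, 502)
end

# In the controller:
#   rescue Elevenlabs::APIError => e
#     render json: { error: e.message }, status: status_for(e)
```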
Endpoints
- List Voices
client.list_voices
# => { "voices" => [...] }
- List Models
client.list_models
# => [...]
- Get Voice Details
client.get_voice("VOICE_ID")
# => { "voice_id" => "...", "name" => "...", ... }
- Create a Custom Voice
sample_files = [File.open("sample1.mp3", "rb")]
client.create_voice("Custom Voice", sample_files, description: "My custom AI voice")
# => JSON response with new voice details
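The example above leaves the sample file handles open. Below is a minimal sketch of a block helper that closes them once the upload returns. The name with_sample_files is hypothetical, and the sketch assumes create_voice accepts an array of open File objects as shown above.

```ruby
# Hypothetical helper: opens each path in binary mode, yields the handles,
# and closes them afterwards even if the block raises.
def with_sample_files(paths)
  files = []
  paths.each { |path| files << File.open(path, "rb") }
  yield files
ensure
  files.each(&:close)
end

# Usage:
#   with_sample_files(["sample1.mp3"]) do |files|
#     client.create_voice("Custom Voice", files, description: "My custom AI voice")
#   end
```

The helper returns whatever the block returns, so the API response is still available to the caller.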
- Check if a Voice is Banned
sample_files = [File.open("trump.mp3", "rb")]
client.create_voice("Donald Trump", sample_files, description: "My Trump Voice")
# => {"voice_id"=>"<RETURNED_VOICE_ID>", "requires_verification"=>false}
trump = "<RETURNED_VOICE_ID>"
client.banned?(trump)
# => true
- Edit a Voice
client.edit_voice("VOICE_ID", name: "Updated Voice Name")
# => JSON response with updated details
- Delete a Voice
client.delete_voice("VOICE_ID")
# => JSON response acknowledging deletion
- Convert Text to Speech
audio_data = client.text_to_speech("VOICE_ID", "Hello world!")
File.open("output.mp3", "wb") { |f| f.write(audio_data) }
- Stream Text to Speech
Stream from terminal:
# Mac: Install sox
brew install sox
# Linux: Install sox
sudo apt install sox
IO.popen("play -t mp3 -", "wb") do |audio_pipe| # Notice "wb" (write binary)
  client.text_to_speech_stream("VOICE_ID", "Some text to stream back in chunks") do |chunk|
    audio_pipe.write(chunk.b) # Ensure chunk is written as binary
  end
end
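Streaming to a file instead of a player follows the same pattern. This is a minimal sketch, assuming text_to_speech_stream yields String chunks as shown above. The helper name save_stream is hypothetical.

```ruby
# Hypothetical helper: opens path for binary writing and yields a writer
# lambda; each call appends one chunk. Returns the total bytes written.
def save_stream(path)
  total = 0
  File.open(path, "wb") do |file|
    yield ->(chunk) { total += file.write(chunk.b) }
  end
  total
end

# Usage:
#   bytes = save_stream("speech.mp3") do |write|
#     client.text_to_speech_stream("VOICE_ID", "Some text") { |chunk| write.call(chunk) }
#   end
```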
- Create a Voice from a Design
Once you’ve generated a voice design using client.design_voice, you can turn it into a permanent voice in your account by passing its generated_voice_id to client.create_from_generated_voice.
Step 1: Design a voice (returns previews + generated_voice_id)
design_response = client.design_voice(
  "A warm, friendly female voice with a slight Australian accent",
  model_id: "eleven_multilingual_ttv_v2",
  text: "Welcome to our podcast, where every story is an adventure, taking you on a journey through fascinating worlds, inspiring voices, and unforgettable moments.",
  auto_generate_text: false
)
# Three previews are returned; this example uses the first one to create a voice.
generated_voice_id = design_response["previews"].first["generated_voice_id"]
# Step 2: Create the permanent voice
create_response = client.create_from_generated_voice(
  "Friendly Aussie",
  "A warm, friendly Australian-accented voice for podcasts",
  generated_voice_id
)
voice_id = create_response["voice_id"] # This is the ID you can use for TTS
# Step 3: Use the new voice for TTS
audio_data = client.text_to_speech(voice_id, "This is my new permanent designed voice.")
File.open("friendly_aussie.mp3", "wb") { |f| f.write(audio_data) }
Important notes:
- Always store the voice_id returned by create_from_generated_voice. This is the permanent identifier for TTS.
- Designed voices cannot be used for TTS until they are created in your account.
- If the voice is not immediately available for TTS, wait a few seconds or check its status via client.get_voice(voice_id) until it’s "active".
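The "wait a few seconds or check its status" advice can be sketched as a generic polling helper. The name wait_until and its parameters are hypothetical. Which field of the get_voice response signals readiness is not documented here, so the readiness check is left to the caller's block.

```ruby
# Hypothetical polling helper: calls the block every `interval` seconds until
# it returns a truthy value or `timeout` seconds have passed.
# Returns the truthy value, or nil on timeout.
def wait_until(timeout: 30, interval: 2)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    # Give up if the next sleep would run past the deadline.
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) + interval > deadline
    sleep interval
  end
end

# Usage (assumption: get_voice raises NotFoundError until the voice exists):
#   voice = wait_until(timeout: 20) do
#     begin
#       client.get_voice(voice_id)
#     rescue Elevenlabs::NotFoundError
#       nil
#     end
#   end
```

A monotonic clock is used so the timeout is not affected by system clock adjustments.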
- Create a multi-speaker dialogue
inputs = [
  { text: "It smells like updog in here", voice_id: "TX3LPaxmHKxFdv7VOQHJ" },
  { text: "What's updog?", voice_id: "RILOU7YmBhvwJGDGjNmP" },
  { text: "Not much, you?", voice_id: "TX3LPaxmHKxFdv7VOQHJ" }
]
audio_data = client.text_to_dialogue(inputs)
File.open("what's updog.mp3", "wb") { |f| f.write(audio_data) }
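For longer scripts, repeating voice IDs by hand gets error-prone. The sketch below builds the same inputs array from speaker/line pairs. dialogue_inputs and the speaker labels are hypothetical; only the { text:, voice_id: } shape comes from the example above.

```ruby
# Hypothetical helper: turns [speaker, line] pairs into the array of
# { text:, voice_id: } hashes that text_to_dialogue is shown taking above.
# An unknown speaker raises KeyError instead of sending a nil voice_id.
def dialogue_inputs(script, voices)
  script.map do |speaker, line|
    { text: line, voice_id: voices.fetch(speaker) }
  end
end

voices = { alice: "TX3LPaxmHKxFdv7VOQHJ", bob: "RILOU7YmBhvwJGDGjNmP" }
script = [
  [:alice, "It smells like updog in here"],
  [:bob,   "What's updog?"],
  [:alice, "Not much, you?"]
]
inputs = dialogue_inputs(script, voices)
# inputs can then be passed to client.text_to_dialogue(inputs)
```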
- Generate Music from prompt
audio = client.compose_music(prompt: "Lo-fi hip hop beat", music_length_ms: 30000)
File.binwrite("lofi.mp3", audio)
- Stream Music Generated from prompt
File.open("epic_stream.mp3", "wb") do |f|
  client.compose_music_stream(prompt: "Epic orchestral build", music_length_ms: 60000) do |chunk|
    f.write(chunk)
  end
end
- Generate Music with Detailed Metadata (metadata + audio) from prompt
result = client.compose_music_detailed(prompt: "Jazz piano trio", music_length_ms: 20000)
puts result # raw multipart data (needs parsing)
- Create a music composition plan from prompt
plan = client.create_music_plan(prompt: "Upbeat pop song with verse and chorus", music_length_ms: 60000)
puts plan[:sections]
- Create sound effects from a prompt
Basic Usage: Simple Prompt
Generate a sound effect with only a text prompt, using the default settings: output_format: "mp3_44100_128", duration_seconds: nil (auto-detected), prompt_influence: 0.3.
audio_data = client.sound_generation("Futuristic laser blast in a space battle")
# Save the audio to a file
File.open("laser_blast.mp3", "wb") { |f| f.write(audio_data) }
Advanced Usage: Custom Duration, Influence, and Format
Specify duration_seconds, prompt_influence, and output_format for precise control over the sound effect.
# Generate a roaring dragon sound with specific settings
audio_data = client.sound_generation(
  "Roaring dragon in a fantasy cave",
  duration_seconds: 3.0,
  prompt_influence: 0.7, # Higher influence for closer adherence to the prompt
  output_format: "mp3_22050_32"
)
# Save the audio to a file
File.open("dragon_roar.mp3", "wb") { |f| f.write(audio_data) }
Looping Sound Effect
Create a looping sound effect for continuous playback, such as background ambiance in a video game.
# Generate a looping ambient sound for a haunted forest
audio_data = client.sound_generation(
  "Eerie wind and distant owl hoots in a haunted forest",
  loop: true,
  duration_seconds: 10.0,
  prompt_influence: 0.5,
  output_format: "mp3_22050_32"
)
# Save the audio to a file
File.open("haunted_forest_loop.mp3", "wb") { |f| f.write(audio_data) }
For more details, see the ElevenLabs Sound Generation API documentation.
Error Handling
When the API returns an error, the gem raises specific exceptions:
Exception | Meaning |
---|---|
Elevenlabs::BadRequestError | Invalid request parameters |
Elevenlabs::AuthenticationError | Invalid API key |
Elevenlabs::NotFoundError | Resource (voice) not found |
Elevenlabs::UnprocessableEntityError | Unprocessable entity (e.g., invalid input format) |
Elevenlabs::APIError | General API failure |
Example:
begin
  client.design_voice("Short description") # Too short, will raise error
rescue Elevenlabs::UnprocessableEntityError => e
  puts "Validation error: #{e.message}"
rescue Elevenlabs::AuthenticationError => e
  puts "Invalid API key: #{e.message}"
rescue Elevenlabs::NotFoundError => e
  puts "Voice not found: #{e.message}"
rescue Elevenlabs::APIError => e
  puts "General error: #{e.message}"
end
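Some failures are transient, and retrying with a delay may be enough. The sketch below is a generic retry helper with exponential backoff. with_retries and its parameters are hypothetical, and which error classes count as transient is an assumption left to the caller. If the specific errors above inherit from Elevenlabs::APIError, retrying on APIError would also retry validation failures, which is rarely useful.

```ruby
# Hypothetical retry helper: runs the block, retrying on the given error
# classes with exponential backoff. After `attempts` tries the last error
# is re-raised.
def with_retries(on:, attempts: 3, base_delay: 0.5)
  tries = 0
  begin
    tries += 1
    yield
  rescue *Array(on)
    raise if tries >= attempts
    sleep(base_delay * (2**(tries - 1))) # 0.5s, 1s, 2s, ... by default
    retry
  end
end

# Usage (the choice of error class is the caller's assumption):
#   audio = with_retries(on: [SomeTransientError], attempts: 4) do
#     client.text_to_speech("VOICE_ID", "Hello world!")
#   end
```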
Development
Clone this repository:
git clone https://github.com/your-username/elevenlabs.git
cd elevenlabs
Install dependencies:
bundle install
Build the gem:
gem build elevenlabs.gemspec
Install the gem locally:
gem install ./elevenlabs-0.0.8.gem
Contributing
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (git checkout -b feature/my-new-feature)
- Commit your changes (git commit -am 'Add new feature')
- Push to your branch (git push origin feature/my-new-feature)
- Create a Pull Request describing your changes
For bug reports, please open an issue with details.
License
This project is licensed under the MIT License. See the LICENSE file for details.
⭐ Thank you for using the Elevenlabs Ruby Gem!
If you have any questions or suggestions, feel free to open an issue or submit a Pull Request!