Elevenlabs Ruby Gem

A Ruby client for the ElevenLabs Text-to-Speech API.
This gem provides an easy-to-use interface for:

Listing available voices
Fetching details about a voice
Creating a custom voice (with uploaded sample files)
Editing an existing voice
Deleting a voice
Converting text to speech and retrieving the generated audio
Designing a voice based on a text description
Streaming text-to-speech audio
Generating music
Generating sound effects

All requests are handled via Faraday.

Features
Installation
Usage
- Basic Example
- Rails Integration
  - Store API Key in Rails Credentials
  - Rails Initializer
  - Controller Example
Endpoints
Error Handling
Development
Contributing
License

Features

Simple and intuitive API client for ElevenLabs.
Multipart file uploads for training custom voices.
Voice design via text prompts to generate voice previews.
Automatic authentication via API key configuration.
Error handling with custom exceptions.
Rails integration support (including credentials storage).

Installation

Add the gem to your Gemfile:

gem "elevenlabs"

Then run:

bundle install

Or install it directly using:

gem install elevenlabs

Usage

Basic Example (Standalone Ruby)

require "elevenlabs"

# 1. Configure the gem globally (Optional)
Elevenlabs.configure do |config|
  config.api_key = "YOUR_API_KEY"
end

# 2. Initialize a client (will use configured API key)
client = Elevenlabs::Client.new

# 3. List available voices
voices = client.list_voices
puts voices # JSON response with voices

# 4. Convert text to speech
voice_id = "YOUR_VOICE_ID"
text = "Hello from Elevenlabs!"
audio_data = client.text_to_speech(voice_id, text)

# 5. Save the audio file
File.open("output.mp3", "wb") { |f| f.write(audio_data) }
puts "Audio file saved to output.mp3"

# 6. Design a voice with a text prompt
response = client.design_voice(
  "A deep, resonant male voice with a British accent, suitable for storytelling",
  output_format: "mp3_44100_192",
  model_id: "eleven_multilingual_ttv_v2",
  text: "In a land far away, where the mountains meet the sky, a great adventure began. Brave heroes embarked on a quest to find the lost artifact, facing challenges and forging bonds that would last a lifetime. Their journey took them through enchanted forests, across raging rivers, and into the heart of ancient ruins.",
  auto_generate_text: false,
  loudness: 0.5,
  seed: 12345,
  guidance_scale: 5.0,
  stream_previews: false
)

# 7. Save voice preview audio
require "base64"
response["previews"].each_with_index do |preview, index|
  audio_data = Base64.decode64(preview["audio_base_64"])
  File.open("preview_#{index}.mp3", "wb") { |f| f.write(audio_data) }
  puts "Saved preview #{index + 1} to preview_#{index}.mp3"
end

Note: You can override the configured API key for an individual client:

client = Elevenlabs::Client.new(api_key: "DIFFERENT_API_KEY")
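
In a standalone script, one way to keep the key out of source control is to read it from an environment variable. The sketch below is only an illustration: fetch_api_key is a hypothetical helper, and the variable name ELEVENLABS_API_KEY is an assumption, not something the gem is documented to read on its own.

```ruby
# Hypothetical helper: reads the API key from an environment-like hash.
# The variable name ELEVENLABS_API_KEY is an assumption for this sketch.
def fetch_api_key(env = ENV, name: "ELEVENLABS_API_KEY")
  key = env[name].to_s.strip
  raise ArgumentError, "#{name} is not set" if key.empty?

  key
end

# Usage with the gem (not executed here):
#   client = Elevenlabs::Client.new(api_key: fetch_api_key)
```

Failing early with a clear message tends to be easier to debug than an authentication error returned later by the API.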

Rails Integration

Store API Key in Rails Credentials

Open your encrypted credentials:

EDITOR=vim rails credentials:edit

Add the ElevenLabs API key:

eleven_labs:
  api_key: YOUR_SECURE_KEY

Save and exit. Rails will securely encrypt your API key.

Rails Initializer

Create an initializer file: config/initializers/elevenlabs.rb

# config/initializers/elevenlabs.rb
require "elevenlabs"

Rails.application.config.to_prepare do
  Elevenlabs.configure do |config|
    config.api_key = Rails.application.credentials.dig(:eleven_labs, :api_key)
  end
end

Now you can simply call:

client = Elevenlabs::Client.new

without manually providing an API key.

Controller Example

class AudioController < ApplicationController
  def generate
    client = Elevenlabs::Client.new
    voice_id = params[:voice_id]
    text = params[:text]

    begin
      audio_data = client.text_to_speech(voice_id, text)
      send_data audio_data, type: "audio/mpeg", disposition: "attachment", filename: "output.mp3"
    rescue Elevenlabs::APIError => e
      render json: { error: e.message }, status: :bad_request
    end
  end
end

Endpoints

List Voices

client.list_voices
# => { "voices" => [...] }

List Models

client.list_models
# => [...]

Get Voice Details

client.get_voice("VOICE_ID")
# => { "voice_id" => "...", "name" => "...", ... }

Create a Custom Voice

sample_files = [File.open("sample1.mp3", "rb")]
client.create_voice("Custom Voice", sample_files, description: "My custom AI voice")
# => JSON response with new voice details
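
The File.open call above leaves the file handle open after the upload. A minimal sketch of a wrapper that opens the samples, yields them, and closes them afterwards; with_sample_files is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper: opens each path in binary mode, yields the handles,
# and closes them even if the block raises.
def with_sample_files(paths)
  files = []
  paths.each { |path| files << File.open(path, "rb") }
  yield files
ensure
  files.each { |f| f.close unless f.closed? }
end

# Usage with the gem (not executed here):
#   with_sample_files(["sample1.mp3"]) do |files|
#     client.create_voice("Custom Voice", files, description: "My custom AI voice")
#   end
```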

Check if a Voice is Banned

sample_files = [File.open("trump.mp3", "rb")]
client.create_voice("Donald Trump", sample_files, description: "My Trump Voice")
# => {"voice_id"=>"<RETURNED_VOICE_ID>", "requires_verification"=>false}
trump = "<RETURNED_VOICE_ID>"
client.banned?(trump)
# => true

Edit a Voice

client.edit_voice("VOICE_ID", name: "Updated Voice Name")
# => JSON response with updated details

Delete a Voice

client.delete_voice("VOICE_ID")
# => JSON response acknowledging deletion

Convert Text to Speech

audio_data = client.text_to_speech("VOICE_ID", "Hello world!")
File.open("output.mp3", "wb") { |f| f.write(audio_data) }

Stream Text to Speech

Stream from terminal:

# Mac: Install sox
brew install sox
# Linux: Install sox
sudo apt install sox

IO.popen("play -t mp3 -", "wb") do |audio_pipe| # Notice "wb" (write binary)
  client.text_to_speech_stream("VOICE_ID", "Some text to stream back in chunks") do |chunk|
    audio_pipe.write(chunk.b) # Ensure chunk is written as binary
  end
end

Create a Voice from a Design

Once you’ve generated a voice design using client.design_voice, you can turn it into a permanent voice in your account by passing its generated_voice_id to client.create_from_generated_voice.

# Step 1: Design a voice (returns previews + generated_voice_id)

design_response = client.design_voice(
  "A warm, friendly female voice with a slight Australian accent",
  model_id: "eleven_multilingual_ttv_v2",
  text: "Welcome to our podcast, where every story is an adventure, taking you on a journey through fascinating worlds, inspiring voices, and unforgettable moments.",
  auto_generate_text: false
)

# Three previews are returned; this example uses the first one.
generated_voice_id = design_response["previews"].first["generated_voice_id"]

# Step 2: Create the permanent voice
create_response = client.create_from_generated_voice(
  "Friendly Aussie",
  "A warm, friendly Australian-accented voice for podcasts",
  generated_voice_id
)

voice_id = create_response["voice_id"] # This is the ID you can use for TTS

# Step 3: Use the new voice for TTS
audio_data = client.text_to_speech(voice_id, "This is my new permanent designed voice.")
File.open("friendly_aussie.mp3", "wb") { |f| f.write(audio_data) }

Important notes:

Always store the voice_id returned by create_from_generated_voice. This is the permanent identifier for TTS.

Designed voices cannot be used for TTS until they are created in your account.

If the voice is not immediately available for TTS, wait a few seconds or check its status via client.get_voice(voice_id) until it’s "active".
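
The waiting step in the last note can be sketched as a small polling loop. wait_until is a hypothetical helper, not part of the gem. The commented usage treats a successful get_voice call as "available", which is an assumption; the response fields that indicate readiness are not shown in this README, so adapt the check to the actual response.

```ruby
# Hypothetical helper: calls the block until it returns a truthy value,
# sleeping between attempts. Returns that value, or nil if every attempt fails.
def wait_until(attempts: 5, delay: 2.0)
  attempts.times do |i|
    result = yield
    return result if result

    sleep(delay) if i < attempts - 1
  end
  nil
end

# Usage with the gem (not executed here):
#   voice = wait_until do
#     begin
#       client.get_voice(voice_id)
#     rescue Elevenlabs::NotFoundError
#       nil # not visible yet, try again
#     end
#   end
```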

Create a multi-speaker dialogue

inputs = [
  { text: "It smells like updog in here", voice_id: "TX3LPaxmHKxFdv7VOQHJ" },
  { text: "What's updog?", voice_id: "RILOU7YmBhvwJGDGjNmP" },
  { text: "Not much, you?", voice_id: "TX3LPaxmHKxFdv7VOQHJ" }
]

audio_data = client.text_to_dialogue(inputs)
File.open("what's updog.mp3", "wb") { |f| f.write(audio_data) }
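
For longer scripts, writing each hash by hand gets repetitive. A minimal sketch that builds the same kind of inputs array from speaker/line pairs; build_dialogue_inputs and the speaker names :alice and :bob are hypothetical and only serve the illustration.

```ruby
# Hypothetical helper: turns [speaker, line] pairs into the array of
# { text:, voice_id: } hashes passed to text_to_dialogue above.
def build_dialogue_inputs(script, voices)
  script.map do |speaker, line|
    voice_id = voices.fetch(speaker) { raise KeyError, "no voice for #{speaker}" }
    { text: line, voice_id: voice_id }
  end
end

# Speaker names are illustrative; the voice IDs are the ones used above.
voices = { alice: "TX3LPaxmHKxFdv7VOQHJ", bob: "RILOU7YmBhvwJGDGjNmP" }
script = [
  [:alice, "It smells like updog in here"],
  [:bob,   "What's updog?"],
  [:alice, "Not much, you?"]
]
inputs = build_dialogue_inputs(script, voices)
```

The resulting inputs can be passed to client.text_to_dialogue exactly like the hand-written array.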

Generate Music from prompt

audio = client.compose_music(prompt: "Lo-fi hip hop beat", music_length_ms: 30000)
File.binwrite("lofi.mp3", audio)

Stream Music Generated from prompt

File.open("epic_stream.mp3", "wb") do |f|
  client.compose_music_stream(prompt: "Epic orchestral build", music_length_ms: 60000) do |chunk|
    f.write(chunk)
  end
end

Generate Music with Detailed Metadata (metadata + audio) from prompt

result = client.compose_music_detailed(prompt: "Jazz piano trio", music_length_ms: 20000)
puts result # raw multipart data (needs parsing)
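
The parsing mentioned in the comment could look roughly like the sketch below. It is a simplified splitter, not a full multipart parser: split_multipart is a hypothetical helper, and it assumes the result is a String, that the body starts with the boundary line, that lines end in CRLF, and that the boundary text never occurs inside the audio bytes.

```ruby
# Hypothetical helper: splits a raw multipart body into
# { headers: {...}, body: "..." } hashes. Simplified; see assumptions above.
def split_multipart(raw)
  body = raw.b # work on bytes so audio data is never treated as text
  delimiter = body[/\A(?:\r\n)*(--[^\r\n]+)/, 1]
  raise ArgumentError, "no multipart boundary found" unless delimiter

  body.split(delimiter).filter_map do |segment|
    # Skip the empty preamble and the closing "--" marker.
    next if segment.empty? || segment.start_with?("--")

    head, content = segment.sub(/\A\r\n/, "").split("\r\n\r\n", 2)
    next unless content

    headers = head.split("\r\n").to_h do |line|
      name, value = line.split(":", 2)
      [name.strip.downcase, value.to_s.strip]
    end
    { headers: headers, body: content.sub(/\r\n\z/, "") }
  end
end

# Usage with the gem (not executed here):
#   parts = split_multipart(result)
#   parts.each { |part| puts part[:headers].inspect }
```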

Create a music composition plan from prompt

plan = client.create_music_plan(prompt: "Upbeat pop song with verse and chorus", music_length_ms: 60000)
puts plan[:sections]

Create sound effects from a prompt

Basic Usage: Simple Prompt

Generate a sound effect with only a text prompt, using the default settings: output_format: "mp3_44100_128", duration_seconds: nil (auto-detected), and prompt_influence: 0.3.

audio_data = client.sound_generation("Futuristic laser blast in a space battle")

# Save the audio to a file
File.open("laser_blast.mp3", "wb") { |f| f.write(audio_data) }

Advanced Usage: Custom Duration, Influence, and Format

Specify duration_seconds, prompt_influence, and output_format for precise control over the sound effect.

# Generate a roaring dragon sound with specific settings
audio_data = client.sound_generation(
  "Roaring dragon in a fantasy cave",
  duration_seconds: 3.0,
  prompt_influence: 0.7, # Higher influence for closer adherence to the prompt
  output_format: "mp3_22050_32"
)

# Save the audio to a file
File.open("dragon_roar.mp3", "wb") { |f| f.write(audio_data) }

Looping Sound Effect

Create a looping sound effect for continuous playback, such as background ambiance in a video game.

# Generate a looping ambient sound for a haunted forest
audio_data = client.sound_generation(
  "Eerie wind and distant owl hoots in a haunted forest",
  loop: true,
  duration_seconds: 10.0,
  prompt_influence: 0.5,
  output_format: "mp3_22050_32"
)
# Save the audio to a file
File.open("haunted_forest_loop.mp3", "wb") { |f| f.write(audio_data) }

For more details, see the ElevenLabs Sound Generation API documentation.

Error Handling

When the API returns an error, the gem raises specific exceptions:

Exception	Meaning
`Elevenlabs::BadRequestError`	Invalid request parameters
`Elevenlabs::AuthenticationError`	Invalid API key
`Elevenlabs::NotFoundError`	Resource (voice) not found
`Elevenlabs::UnprocessableEntityError`	Unprocessable entity (e.g., invalid input format)
`Elevenlabs::APIError`	General API failure

Example:

begin
  client.design_voice("Short description") # Too short, will raise error
rescue Elevenlabs::UnprocessableEntityError => e
  puts "Validation error: #{e.message}"
rescue Elevenlabs::AuthenticationError => e
  puts "Invalid API key: #{e.message}"
rescue Elevenlabs::NotFoundError => e
  puts "Voice not found: #{e.message}"
rescue Elevenlabs::APIError => e
  puts "General error: #{e.message}"
end
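
For transient failures, a caller may want to retry instead of giving up on the first exception. A minimal sketch with exponential backoff follows; with_retries is a hypothetical helper, not part of the gem. If the specific exceptions subclass Elevenlabs::APIError, as the rescue order above suggests, retrying on APIError would also retry authentication and validation failures, so choose the error list with care.

```ruby
# Hypothetical helper: retries the block when it raises one of the given
# error classes, doubling the wait between attempts.
def with_retries(errors:, attempts: 3, base_delay: 1.0)
  tries = 0
  begin
    tries += 1
    yield
  rescue *errors
    raise if tries >= attempts # out of attempts: re-raise the last error

    sleep(base_delay * (2**(tries - 1)))
    retry
  end
end

# Usage with the gem (not executed here):
#   audio = with_retries(errors: [Elevenlabs::APIError]) do
#     client.text_to_speech("VOICE_ID", "Hello world!")
#   end
```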

Development

Clone this repository:

git clone https://github.com/your-username/elevenlabs.git
cd elevenlabs

Install dependencies:

bundle install

Build the gem:

gem build elevenlabs.gemspec

Install the gem locally:

gem install ./elevenlabs-0.0.8.gem

Contributing

Contributions are welcome! Please follow these steps:

Fork the repository
Create a feature branch (git checkout -b feature/my-new-feature)
Commit your changes (git commit -am 'Add new feature')
Push to your branch (git push origin feature/my-new-feature)
Create a Pull Request describing your changes

For bug reports, please open an issue with details.

License

This project is licensed under the MIT License. See the LICENSE file for details.

⭐ Thank you for using the Elevenlabs Ruby Gem!
If you have any questions or suggestions, feel free to open an issue or submit a Pull Request!
