rb-edge-tts
Microsoft Edge Online Text-to-Speech Service (Ruby Gem).
This is a Ruby implementation of the Microsoft Edge Online Text-to-Speech (TTS) service. It provides a simple interface for using high-quality Edge TTS voices directly within your Ruby applications or from the command line.
This project is a port of the Python project edge-tts.
Features
- High-Quality Voices: Direct access to Microsoft Edge's online neural voices.
- Multi-Language Support: Supports 100+ languages and 400+ voices.
- Highly Customizable: Adjustable rate, volume, and pitch.
- Subtitle Generation: Supports generating subtitles in SRT format.
- Command-Line Tools: Includes rb-edge-tts and rb-edge-playback CLI tools.
- Lightweight: Removed unnecessary async dependencies, using standard Ruby libraries and efficient WebSocket handling.
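The rate, volume, and pitch adjustments listed above appear in this README's examples as signed strings such as `+20%`, `+10%`, and `+5Hz`. The following is a minimal sketch of checking such values before sending a request. The patterns and the `valid_prosody?` helper are assumptions inferred from those examples, not part of the gem's API, and the service may accept other forms.

```ruby
# Patterns inferred from the README examples (assumption, not the gem's own rules).
PERCENT = /\A[+-]\d+%\z/   # e.g. "+20%", "-5%"
HERTZ   = /\A[+-]\d+Hz\z/  # e.g. "+5Hz", "-10Hz"

# Hypothetical helper: true when all three values match the signed formats.
def valid_prosody?(rate: '+0%', volume: '+0%', pitch: '+0Hz')
  PERCENT.match?(rate) && PERCENT.match?(volume) && HERTZ.match?(pitch)
end

puts valid_prosody?(rate: '+20%', volume: '+10%', pitch: '+5Hz')
puts valid_prosody?(rate: '20%') # missing sign
```

Checking the strings locally gives a clearer error than a failed request would.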
Installation
Add this line to your application's Gemfile:
gem 'rb-edge-tts'
And then execute:
$ bundle install
Or install it yourself as:
$ gem install rb-edge-tts
Quick Start
Command Line Usage
rb-edge-tts provides a command-line interface for generating audio, writing subtitles, and listing voices.
Basic Usage: Generate MP3
$ rb-edge-tts --text 'Hello, world!' --write-media hello.mp3
Generate Audio and Subtitles
$ rb-edge-tts --text 'Hello, world!' --write-media hello.mp3 --write-subtitles hello.srt
Use a Specific Voice
$ rb-edge-tts --text 'Hello, world!' --voice en-GB-SoniaNeural --write-media hello.mp3
$ rb-edge-tts -f test.txt --write-media test.mp3 -v de-DE-AmalaNeural
Adjust Parameters (Rate, Volume, Pitch)
$ rb-edge-tts --rate=+20% --volume=+10% --pitch=+5Hz --text 'Hello, world!' --write-media output.mp3
Instant Playback
Use the rb-edge-playback command to play the generated speech immediately (requires mpv to be installed):
$ rb-edge-playback --text 'Hello, world!'
List All Available Voices
$ rb-edge-tts --list-voices
Ruby Library Usage
Basic Example
require 'rb_edge_tts'
# Use default voice
communicate = RbEdgeTTS::Communicate.new('Hello, world!')
communicate.save("output.mp3")
Advanced Example: Custom Parameters
require 'rb_edge_tts'
communicate = RbEdgeTTS::Communicate.new(
  'Hello, world!',
  "en-US-AriaNeural",
  rate: "+10%",   # Speed
  volume: "+20%", # Volume
  pitch: "+5Hz"   # Pitch
)
communicate.save("output.mp3")
Generating Subtitles
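For orientation, an SRT file is a sequence of cues, each made of a running index, a time range written as `HH:MM:SS,mmm --> HH:MM:SS,mmm`, and the cue text, with a blank line between cues. The sketch below renders that layout from plain millisecond offsets. The `Cue` struct, `srt_timestamp`, and `to_srt` are hypothetical names for illustration only. They are not the gem's SubMaker implementation, and the timing units the gem uses internally may differ.

```ruby
# Hypothetical cue record; the gem's own chunk objects may look different.
Cue = Struct.new(:start_ms, :end_ms, :text)

# Format a millisecond offset as an SRT timestamp (HH:MM:SS,mmm).
def srt_timestamp(ms)
  total_seconds, millis = ms.divmod(1000)
  total_minutes, seconds = total_seconds.divmod(60)
  hours, minutes = total_minutes.divmod(60)
  format('%02d:%02d:%02d,%03d', hours, minutes, seconds, millis)
end

# Render cues as SRT text: index, time range, cue text, blank line between cues.
def to_srt(cues)
  cues.each_with_index.map do |cue, i|
    "#{i + 1}\n#{srt_timestamp(cue.start_ms)} --> #{srt_timestamp(cue.end_ms)}\n#{cue.text}\n"
  end.join("\n")
end

cues = [Cue.new(0, 450, 'Hello,'), Cue.new(450, 1200, 'world!')]
puts to_srt(cues)
```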
require 'rb_edge_tts'
communicate = RbEdgeTTS::Communicate.new('Hello, world!')
submaker = RbEdgeTTS::SubMaker.new
File.open("output.mp3", "wb") do |file|
  communicate.stream do |chunk|
    if chunk.type == "audio"
      file.write(chunk.data)
    elsif %w[WordBoundary SentenceBoundary].include?(chunk.type)
      submaker.feed(chunk)
    end
  end
end
File.write("output.srt", submaker.to_srt)
Streaming
require 'rb_edge_tts'
communicate = RbEdgeTTS::Communicate.new('Hello, world!')
communicate.stream do |chunk|
  if chunk.type == "audio"
    # Process audio data chunk (chunk.data)
    print "."
  end
end
Voice Management
You can use VoicesManager to find and filter available voices.
require 'rb_edge_tts'
# Get all available voices
voices = RbEdgeTTS::VoicesManager.create
# Find all Chinese (Simplified) Female voices
chinese_female_voices = voices.find(locale: "zh-CN", gender: "Female")
chinese_female_voices.each do |voice|
  puts "#{voice.short_name}: #{voice.friendly_name}"
end
Development Guide
This section is for developers who want to contribute to rb-edge-tts or build from source.
Requirements
- Ruby 3.0 or higher
Local Setup
- Clone the repository
  git clone https://github.com/ZPVIP/rb-edge-tts.git
  cd rb-edge-tts
- Install dependencies
  bundle install
- Run tests
  We use rspec for testing.
  bundle exec rspec
Local Build and Install
If you modified the code and want to test it locally:
- Build the Gem
  gem build rb-edge-tts.gemspec
  This will generate an rb-edge-tts-<version>.gem file.
- Install the Gem
  gem install ./rb-edge-tts-<version>.gem
- Verify Installation
  rb-edge-tts --version
Publishing
To publish a new version to RubyGems (requires permissions):
- Update the version number in lib/rb_edge_tts/version.rb.
- Update CHANGELOG.md.
- Build and push:
  gem build rb-edge-tts.gemspec
  gem push rb-edge-tts-<version>.gem
Dependencies
- eventmachine: For WebSocket event loop.
- faye-websocket: WebSocket client implementation.
- json: JSON data processing.
- terminal-table: For formatting CLI output.
- net/http: For fetching voice lists (Standard Library).
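To illustrate how the json dependency relates to voice lists, here is a minimal sketch that parses a voice list payload and filters it by locale and gender using only the standard library. The inline payload is sample data, the field names (`ShortName`, `Locale`, `Gender`) are an assumption about the shape of the service response, and `filter_voices` is a hypothetical helper rather than the gem's VoicesManager.

```ruby
require 'json'

# Sample payload for illustration; the field names are assumed, not confirmed by this README.
SAMPLE_VOICES = <<~JSON
  [
    {"ShortName": "zh-CN-XiaoxiaoNeural", "Locale": "zh-CN", "Gender": "Female"},
    {"ShortName": "zh-CN-YunxiNeural",    "Locale": "zh-CN", "Gender": "Male"},
    {"ShortName": "en-GB-SoniaNeural",    "Locale": "en-GB", "Gender": "Female"}
  ]
JSON

# Hypothetical helper: keep voices whose fields match every key/value in criteria.
def filter_voices(voices, criteria)
  voices.select { |voice| criteria.all? { |key, value| voice[key] == value } }
end

voices = JSON.parse(SAMPLE_VOICES)
matches = filter_voices(voices, 'Locale' => 'zh-CN', 'Gender' => 'Female')
matches.each { |voice| puts voice['ShortName'] }
```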
License
This project is licensed under the GNU Lesser General Public License v3.0 (LGPLv3).
The lib/rb_edge_tts/srt_composer.rb file is licensed under the MIT License.
Acknowledgments
This project is a Ruby port of the Python project edge-tts. Thanks to rany2 for developing the original version, making high-quality TTS conversion possible via the Edge interface.
For questions regarding Python implementation details, please refer to the original project.