Project

att_speech

A Ruby library for consuming v3 of the AT&T Speech API for speech-to-text and text-to-speech. It accepts .wav and certain other audio files and returns the spoken words as a text string. It can also take a text string or the contents of a .txt file and return a string of bytes from which a .wav file of the spoken text can be created.
 Dependencies

Development

>= 1.0.0
>= 1.8.4
>= 3.12
>= 2.8.0
>= 0.7

Runtime

>= 0.11.1
~> 0.8.1
>= 1.2.0
 Project Readme

att_speech

A Ruby library for consuming the AT&T Speech API for speech to text. API details may be found here.

Installation

gem install att_speech

Usage

require 'att_speech'

att_speech = ATTSpeech.new({ :api_key    => ENV['ATT_SPEECH_KEY'],
                             :secret_key => ENV['ATT_SPEECH_SECRET'],
                             :scope      => 'SPEECH' })

# Read the audio file contents in binary mode so the bytes are not altered
file_contents = File.binread(File.join(File.expand_path(File.dirname(File.dirname(__FILE__))), 'bostonSeltics.wav'))

# Blocking operation; the second argument is the content type of the audio
p att_speech.speech_to_text(file_contents, 'audio/wav')

# Non-blocking operation with a future, if you have a longer file that requires more processing time
sleep 2
future = att_speech.future(:speech_to_text, file_contents, 'audio/wav')
p future.value

# Non-blocking operation that will call a block when the transcription is returned
# Note: Remember, this is a concurrent operation, so don't pass self and avoid mutable objects in the block
# from the calling context. It is better to have discrete actions contained in the block, such as inserting
# into a datastore
sleep 2
supervisor = ATTSpeech.supervise({ :api_key    => ENV['ATT_SPEECH_KEY'],
                                   :secret_key => ENV['ATT_SPEECH_SECRET'],
                                   :scope      => 'SPEECH' })
future = supervisor.future.speech_to_text(file_contents)
# do other stuff here
sleep 5
transcription = future.value # returns immediately if the operation is complete, otherwise blocks until the value is ready

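# Write the returned audio bytes to a timestamped .wav file in the examples directory
def write_wav_file(audio_bytes)
  file_name = "ret_audio-#{Time.now.strftime('%Y%m%d-%H%M%S')}.wav"
  full_file_name = File.expand_path(File.join(File.dirname(File.dirname(__FILE__)), 'examples', file_name))
  # Open in binary mode so the audio bytes are written unmodified; the block form closes the file
  File.open(full_file_name, 'wb') { |audio_file| audio_file << audio_bytes }
end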

att_text = ATTSpeech.new({ :api_key    => ENV['ATT_SPEECH_KEY'],
                           :secret_key => ENV['ATT_SPEECH_SECRET'],
                           :scope      => 'TTS' })

# Read the text file contents
tfp = File.expand_path(File.join(File.dirname(File.dirname(__FILE__)), 'examples', 'helloWorld.txt'))
txt_contents = File.read(tfp)

audio = att_text.text_to_speech(txt_contents)
write_wav_file(audio)

# Non-blocking operation with a future, if you have a longer file that requires more processing time
sleep 2
future = att_text.future(:text_to_speech, "This is a hello world.", 'text/plain')
write_wav_file(future.value)
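
Since text_to_speech returns raw bytes, it can be useful to confirm that the response looks like WAV audio before saving it. The following is a minimal sketch, not part of att_speech: the helper names wav_bytes? and write_wav_checked are hypothetical, and it assumes the service returned a standard RIFF/WAVE container (bytes 0-3 are "RIFF" and bytes 8-11 are "WAVE").

```ruby
require 'tmpdir'

# Hypothetical helper (not part of att_speech): true when the bytes
# begin with a RIFF/WAVE container header.
def wav_bytes?(bytes)
  data = bytes.to_s.b # compare raw bytes, ignoring any encoding tag
  data.bytesize >= 12 &&
    data.byteslice(0, 4) == 'RIFF'.b &&
    data.byteslice(8, 4) == 'WAVE'.b
end

# Hypothetical helper: write the audio in binary mode, but only when it
# passes the header check above. Returns the path that was written.
def write_wav_checked(audio_bytes, path)
  raise ArgumentError, 'response does not look like WAV audio' unless wav_bytes?(audio_bytes)

  File.binwrite(path, audio_bytes)
  path
end

# Possible usage with the client shown earlier (requires valid credentials):
#   audio = att_text.text_to_speech("This is a hello world.")
#   write_wav_checked(audio, File.join(Dir.tmpdir, 'hello.wav'))
```

The check only inspects the container header, so it catches error pages or empty responses but does not validate the audio data itself.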

Copyright

Copyright (c) 2013 Jason Goecke. Copyright (c) 2014 Ben Klang. See LICENSE.txt for further details.