Project

houdah

0.0
No commit activity in last 3 years
No release in over 3 years
Ruby lib for interacting with a Hadoop JobTracker / TaskTrackers
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
 Dependencies

Runtime

>= 1.4.4
>= 0.5.0
 Project Readme

houdah – Ruby lib for interacting with a Hadoop JobTracker / TaskTrackers¶ ↑

Houdah is a Ruby thrift lib for interfacing with a Hadoop jobtracker.

To install:

gem install houdah

Starting thrift server¶ ↑

Houdah expects that you have Cloudera’s Hue installed and working. As a part of the Hue installation, you will compile and install a thrift interface that runs when the jobtracker is running. To be clear, you do not have to run Hue in order for Houdah to work - you just need to make sure that the org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin is working.

Houdah has been tested with Hue 1.1.0. If you have it working with a later version of Hue, patches are welcome.

Basic Usage¶ ↑

require 'rubygems'
require 'houdah'

# args are host, optional port (9290 default), optional username (default "houdah"), and optional timeout
client = Houdah::Client.new "192.168.1.1"

# get list of running jobs
client.jobs

# get a list of failed jobs
client.jobs(:failed)

# get jobtracker name
client.name

# get percent mapped / reduced for all running jobs
client.jobs.each { |j| puts "#{j.percent_mapped} mapped #{j.percent_reduced} reduced" }

# get percent done for the most recent job
client.jobs.first.percent_done

# print out info about each tracker
client.trackers.each { |t| puts "#{t.host}: #{t.failureCount} failures, #{t.tasks.length} active tasks" }

# get number of active / blacklisted trackers
client.status.numActiveTrackers
client.status.numBlacklistedTrackers

# get number of excluded nodes
client.status.numExcludedNodes

# get the mapred.local.dir config variable for the most recent job
puts client.jobs.first.config['mapred.local.dir']

# get the hive query for the hive job running
puts client.jobs.first.config['hive.query.string']

# kill all jobs
client.jobs.each { |j| j.kill! }

# close the client connection
client.close

# Quick and dirty way to get the number of killed jobs.  The run class method takes care of closing the client.
puts Houdah::Client.run("192.168.1.1") { |h| h.jobs(:killed).length }

Bugs / Contact¶ ↑

Bugs and patches welcome at github.com/livingsocial/houdah.