Project

ganapati

0.01
No commit activity in last 3 years
No release in over 3 years
Hadoop HDFS Thrift interface for Ruby
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
 Dependencies

Development

>= 0.10.0

Runtime

>= 0.5.0
 Project Readme

ganapati – Hadoop HDFS Thrift interface for Ruby¶ ↑

Note - this project has been abandoned by LivingSocial at this point in time. Feel free to fork if you wish to continue development. We have switched to using github.com/kzk/webhdfs for our projects.¶ ↑

Ganapati is a Ruby thrift lib for interfacing with Hadoop’s distributed file system, HDFS. It also includes a few command line client utilities.

To install:

gem install ganapati

Starting thrift server¶ ↑

Documentation in Hadoop for the thrift interface to HDFS is crap. It can be found here.

As a much simpler and safer way of auto compiling and then starting the thrift interface, use the provided script:

bin/hdfs_thrift_server <port>

This will start a thrift server on the given port (after compiling the server code provided in the Hadoop distribution).

Basic Usage¶ ↑

require 'rubygems'
require 'ganapati'

# args are host, port, and optional timeout
client = Ganapati::Client.new 'localhost', 1234

# copy a file to hdfs
client.put("/some/file", "/some/hadoop/path")

# get a file from hadoop
client.get("/some/hadoop/path", "/local/path")

# Create a file
f = client.create("/home/someuser/afile.txt")
f.write("this is some text")
# Always, always close the file
f.close 

# Create a file with code block
client.create("/home/someuser/afile.txt") { |f|
  f.write("this is some text")
}

# Open a file for reading and read it
client.open('/home/someuser/afile.txt') { |f| 
  puts f.read 
  # or read for specific length from start
  puts f.read(0, 4)
}

# read a file line by line
client.readlines('/home/someuser/afile.txt') { |line|
  puts line
}

# Open a file for appending and append to it
client.append('/home/someuser/afile.txt') { |f| 
  f.write "new data" 
}	  

## Common file methods are available (chown, chmod, mkdir, stat, etc).  Examples:
# move a file
client.mv "/home/someuser/afile.txt", "/home/someuser/test.txt"

# remove a file
client.rm "/home/someuser/test.txt"

# test for file existance
client.exists? "/home/someuser/test.txt"

# get a list of all files
client.ls "/home"

client.close

# Quick and dirty way to print remote file.  The run class method takes care of closing the client.
puts Ganapati::Client.run('localhost', 1234) { |c| c.open('/home/someuser/afile.txt') { |f| f.read } }

Command Line Utilities¶ ↑

There are a few utility programs included in the bin directory. hls provides a way to see the contents of HDFS (recursively and verbosely with appropriate command line options):

./bin/hls hdfs://host:port/tmp

hcp provides a way to copy to/from/between HDFS servers:

./bin/hcp hdfs://host:port/some/path/to/file ./file
./bin/hcp ./file hdfs://host:port/some/path/to/file
./bin/hcp hdfs://anotherhost:port/some/path/to/file hdfs://host:port/some/path/to/file