Project

rake_hdfs

A Rake DSL for Hadoop.
 Dependencies

Development

~> 1.9
~> 10.0

Runtime

~> 0.8.0
 Project Readme

RakeHdfs

This gem lets Rake tasks operate on files and directories in the HDFS file system. It is built on the webhdfs gem.
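
Because the gem sits on top of webhdfs, each operation ends up as an HTTP request against the NameNode's WebHDFS REST endpoint. The sketch below only illustrates the shape of such a URL; `webhdfs_url` is a hypothetical helper written for this example and is not part of rake_hdfs or webhdfs, and the host, port, path, and user are the sample values from the Usage section.

```ruby
require "uri"

# Hypothetical helper (not part of rake_hdfs or webhdfs): builds the
# WebHDFS REST URL for an operation on a given HDFS path.
def webhdfs_url(host, port, path, op, user)
  # WebHDFS takes the operation and the acting user as query parameters.
  query = URI.encode_www_form("op" => op, "user.name" => user)
  URI::HTTP.build(host: host, port: port,
                  path: "/webhdfs/v1#{path}", query: query).to_s
end

# A directory listing request, roughly what a call such as hls would need.
puts webhdfs_url("localhost", 50070, "/user/chenkovsky.chen",
                 "LISTSTATUS", "chenkovsky")
```

In practice the webhdfs client builds and sends these requests itself, so a Rakefile never needs to construct the URL by hand.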

Installation

Add this line to your application's Gemfile:

gem 'rake_hdfs'

And then execute:

$ bundle

Or install it yourself as:

$ gem install rake_hdfs

Usage

require "webhdfs/fileutils"
require "rake_hdfs"
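# Base HDFS directory used by the tasks below.
$dir = "/user/chenkovsky.chen"

# Point webhdfs at the NameNode's WebHDFS endpoint: host, port, user, doas.
WebHDFS::FileUtils.set_server("localhost", 50070, "chenkovsky", nil)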

hdirectory "#{$dir}/tmp_dir"
desc "test hdfs rake"
hfile "#{$dir}/tmp_dir/tmp.txt" => ["#{$dir}/tmp_dir"] do
  raise "tmp file should not exist." if hexist? "#{$dir}/tmp_dir/tmp.txt"
  files = hls $dir
  puts files
  dir_mtime = hmtime $dir
  puts dir_mtime
  hcopy_from_local "tmp.txt", "#{$dir}/tmp_dir/tmp.txt"
  hcopy_from_local_via_stream "tmp.txt", "#{$dir}/tmp_dir/tmp2.txt"

  hcopy_to_local "#{$dir}/tmp_dir/tmp.txt", "tmp3.txt"

  happend("#{$dir}/tmp_dir/tmp2.txt", "hahaha")

  hmkdir "#{$dir}/tmp2_dir"

  raise "tmp2_dir should exist." unless hexist? "#{$dir}/tmp2_dir"

  hrm "#{$dir}/tmp2_dir"
  hmkdir "#{$dir}/tmp3_dir"
  hcopy_from_local "tmp.txt", "#{$dir}/tmp3_dir/tmp.txt"

  hrmr "#{$dir}/tmp3_dir"

  hrename "#{$dir}/tmp_dir/tmp2.txt", "#{$dir}/tmp_dir/tmp4.txt"

  hchmod 0755, "#{$dir}/tmp_dir/tmp4.txt"

  puts hstat("#{$dir}/tmp_dir/tmp4.txt")
  raise "not correct uptodate" unless huptodate? "#{$dir}/tmp_dir/tmp4.txt", ["#{$dir}/tmp_dir/tmp.txt"]
end
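
The final check in the example relies on comparing modification times. Here is a minimal sketch of that comparison, assuming huptodate? follows the same rule as Ruby's FileUtils.uptodate? (the target must exist and be strictly newer than every existing dependency). The helper name `newer_than_all?` is hypothetical and works on plain Time values rather than HDFS paths, with nil standing in for a missing file.

```ruby
# Hypothetical helper (not part of rake_hdfs): decides whether a target is
# up to date given its mtime and the mtimes of its dependencies.
# A nil mtime stands for a file that does not exist.
def newer_than_all?(target_mtime, source_mtimes)
  # A missing target is never up to date.
  return false if target_mtime.nil?

  # Missing sources are ignored; existing ones must be strictly older.
  source_mtimes.all? { |mtime| mtime.nil? || target_mtime > mtime }
end

target = Time.at(200)
puts newer_than_all?(target, [Time.at(100), nil])  # older source and a missing one
puts newer_than_all?(target, [Time.at(300)])       # source modified after the target
```

With real HDFS paths, the times would come from calls such as hmtime, as used earlier in the example.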

Contributing

  1. Fork it ( https://github.com/chenkovsky/rake_hdfs/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request