0.0
Repository is archived
No commit activity in last 3 years
No release in over 3 years
Simple command line utility to pick a random sample of lines out of the given file(s)
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
 Dependencies

Development

~> 1.3
>= 0
 Project Readme

samplelines


Deprecated and archived. Turns out there's a standard shuf command in unix. Now I feel dumb.


A simple command line tool to take samples lines from a file or set of files.


samplelines [options] [filename(s) and/or STDIN and/or STDERR]
    -v, --version      print version info
    -h, --help         print usage information
    -p, --percent      set % chance of a line being output [trumps use of -1]
    -1, --one-in       compute odds of outputing a line as 1 out of every N
    -m, --max          return at most this many lines
    -t, --time         stop running after N seconds
    

Either -p or -1 is required. Filenames (or the special strings STDIN and STDERR) will be processed in the order you specify them.

As a convenience, samplelines will automatically deal with gzipped files that end in .gz

Note that there is no guarantee of output. If you give a low percentage and a small file, there's a chance nothing will be randomly chosen.

Examples

# Pick about 10% of the lines from bigfile.txt
samplelines -p 10 bigfile.txt

# Ditto, but get a maximum of 100 lines
samplelines -p 10 -m 100 bigfile.txt

# This time, get about one out of every 500 lines
samplelines -1 500 -m 100 bigfile.txt

# Get about 2% of the lines from a set of gzipped files
samplelines -p 2 *.txt.gz

# Ditto, but don't get more than 10_000 lines or
#  run for more than five seconds
samplelines -p 2 -t 5 -m 10000 *.txt.gz

Installation

Add this line to your application's Gemfile:

gem 'samplelines'

And then execute:

$ bundle

Or install it yourself as:

$ gem install samplelines

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request