Streaming CSV processor with row-by-row transforms, validations, column plucking, filtering, writing, error recovery, and automatic delimiter detection.

philiprehberger-csv_kit


Streaming CSV processor with type coercion, validation, writing, and error recovery.

Requirements

  • Ruby >= 3.1

Installation

Add to your Gemfile:

gem 'philiprehberger-csv_kit'

Then run:

bundle install

Or install directly:

gem install philiprehberger-csv_kit

Usage

require 'philiprehberger/csv_kit'

Quick Load

rows = Philiprehberger::CsvKit.to_hashes('data.csv')
# => [{name: "Alice", age: "30"}, ...]
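
To make the return shape concrete, here is a rough stdlib-only equivalent. It assumes, based on the example output above, that the first line is treated as headers, header names become symbols, and values stay strings. This is a sketch using Ruby's bundled csv library, not the gem's implementation; load_hashes is a hypothetical helper name.

```ruby
require 'csv'

# Hypothetical helper (not part of the gem): parse CSV text into an
# array of hashes with symbol keys and untouched string values.
def load_hashes(csv_text)
  CSV.parse(csv_text, headers: true, header_converters: :symbol).map(&:to_h)
end

sample = "name,age\nAlice,30\nBob,25\n"
rows = load_hashes(sample)

rows.first[:name] # => "Alice"
rows.first[:age]  # => "30" (still a string; no coercion happens here)
```

Because values arrive as strings, numeric comparisons need an explicit conversion such as to_i.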

Pluck Columns

names = Philiprehberger::CsvKit.pluck('data.csv', :name, :city)
# => [{name: "Alice", city: "Berlin"}, ...]

Filter Rows

csv_string = Philiprehberger::CsvKit.filter('data.csv') do |row|
  row[:age].to_i >= 30
end
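
The filter call above can be pictured with the standard library alone. The sketch below assumes the returned CSV string keeps the header line followed by only the rows for which the block is truthy; that detail is an assumption, and filter_csv is a hypothetical helper, not the gem's code.

```ruby
require 'csv'

# Hypothetical helper (not part of the gem): keep the header line and
# every row for which the block returns a truthy value.
def filter_csv(csv_text)
  table = CSV.parse(csv_text, headers: true, header_converters: :symbol)
  CSV.generate do |out|
    out << table.headers.map(&:to_s)
    table.each { |row| out << row.fields if yield(row) }
  end
end

sample = "name,age\nAlice,30\nBob,25\n"
filtered = filter_csv(sample) { |row| row[:age].to_i >= 30 }
# filtered => "name,age\nAlice,30\n"
```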

Streaming Processor

rows = Philiprehberger::CsvKit.process('data.csv') do |p|
  p.transform(:age) { |v| v.to_i }
  p.validate(:age) { |v| v.to_i.positive? }
  p.reject { |row| row[:city] == 'Unknown' }
  p.each { |row| puts row[:name] }
end
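
As a mental model for the DSL above, the following stdlib-only loop applies the same four steps to one row at a time. The step order (transform, validate, reject, each) and the behavior that invalid rows are silently dropped are assumptions drawn from the example and the API descriptions, not confirmed details of the gem.

```ruby
require 'csv'
require 'stringio'

io = StringIO.new("name,age,city\nAlice,30,Berlin\nBob,-1,Paris\nCara,41,Unknown\n")
kept = []

# CSV#each reads one row at a time from the IO, so memory use stays flat.
CSV.new(io, headers: true, header_converters: :symbol).each do |csv_row|
  row = csv_row.to_h
  row[:age] = row[:age].to_i         # transform: coerce age to Integer
  next unless row[:age].positive?    # validate: drop rows with a non-positive age
  next if row[:city] == 'Unknown'    # reject: drop rows matching the predicate
  kept << row                        # each: hand the surviving row to the caller
end

kept.map { |r| r[:name] } # => ["Alice"]
```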

Writing CSV

writer = Philiprehberger::CsvKit::Writer.new(headers: [:name, :age])
csv_string = writer.write([{ name: "Alice", age: 30 }, { name: "Bob", age: 25 }])

# Write to a file
File.open('output.csv', 'w') do |f|
  writer.write_to([{ name: "Alice", age: 30 }], f)
end
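
For comparison, a minimal stdlib sketch of the same idea: a fixed header list decides both the header line and the column order, regardless of key order in each hash. It assumes the output starts with a header row; write_rows is a hypothetical helper, not the gem's Writer.

```ruby
require 'csv'
require 'stringio'

# Hypothetical helper (not part of the gem): write a header line, then one
# line per hash, picking values in header order.
def write_rows(rows, io, headers:)
  csv = CSV.new(io)
  csv << headers.map(&:to_s)
  rows.each { |row| csv << row.values_at(*headers) }
  io
end

out = StringIO.new
write_rows([{ age: 30, name: 'Alice' }], out, headers: %i[name age])
out.string # => "name,age\nAlice,30\n"
```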

Error Recovery

rows = Philiprehberger::CsvKit.process('data.csv') do |p|
  p.on_error { |row, err| :skip }  # or :abort
  p.transform(:age) { |v| Integer(v) }
end
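
The :skip and :abort return values can be read as a per-row decision: drop the failing row and continue, or re-raise and stop. The loop below sketches that contract with the standard library only. It is an interpretation of the example above, not the gem's internals, and the handler lambda and the errors array are illustrative names.

```ruby
require 'csv'

handler = ->(_row, _err) { :skip } # return :abort to stop at the first failure
rows = []
errors = []

data = "name,age\nAlice,30\nBob,n/a\n"
CSV.parse(data, headers: true, header_converters: :symbol).each do |csv_row|
  row = csv_row.to_h
  begin
    row[:age] = Integer(row[:age]) # raises ArgumentError for "n/a"
    rows << row
  rescue ArgumentError => e
    errors << e
    raise if handler.call(row, e) == :abort
  end
end

rows.length   # => 1
errors.length # => 1
```

Using Integer(v) rather than v.to_i is what makes bad input visible: to_i would quietly turn "n/a" into 0.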

Max Errors

processor = Philiprehberger::CsvKit::Processor.new('data.csv')
processor.max_errors(10)
processor.on_error { |row, err| :skip }
processor.transform(:age) { |v| Integer(v) }

begin
  processor.run
rescue Philiprehberger::CsvKit::Error
  puts processor.errors.length  # collected errors
end

Column Aliasing

rows = Philiprehberger::CsvKit.process('data.csv') do |p|
  p.rename(:raw_col, :clean_col)
end

Row Callbacks

rows = Philiprehberger::CsvKit.process('data.csv') do |p|
  p.after_each { |row| puts row.to_h }
end

Delimiter Detection

delimiter = Philiprehberger::CsvKit::Detector.detect('data.tsv')
# => "\t"
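
One common way to detect a delimiter is to count candidate characters in the first line and pick the most frequent. The sketch below shows that heuristic; it is not necessarily how Detector.detect works, and CANDIDATES and guess_delimiter are hypothetical names. It is deliberately naive: it ignores quoted fields and looks at a single line.

```ruby
# Candidate delimiters, in tie-break order (comma wins a tie).
CANDIDATES = [",", "\t", ";", "|"].freeze

# Hypothetical helper (not part of the gem): return the candidate that
# occurs most often in the first line of the text.
def guess_delimiter(text)
  first_line = text.lines.first.to_s
  CANDIDATES.max_by { |delim| first_line.count(delim) }
end

guess_delimiter("name\tage\nAlice\t30\n") # => "\t"
guess_delimiter("name;age\nAlice;30\n")   # => ";"
```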

API

| Method / Class                     | Description                                     |
|------------------------------------|-------------------------------------------------|
| CsvKit.to_hashes(path)             | Load CSV into array of symbolized hashes        |
| CsvKit.pluck(path, *keys)          | Extract specific columns                        |
| CsvKit.filter(path, &block)        | Filter rows, return CSV string                  |
| CsvKit.process(path_or_io, &block) | Streaming DSL with transforms and validations   |
| Processor#headers(*names)          | Override header names                           |
| Processor#transform(key, &block)   | Register column transform                       |
| Processor#validate(key, &block)    | Register column validation (skip invalid)       |
| Processor#reject(&block)           | Reject rows matching predicate                  |
| Processor#each(&block)             | Callback for each processed row                 |
| Processor#on_error(&block)         | Per-row error handler (return :skip or :abort)  |
| Processor#max_errors(n)            | Stop after N errors                             |
| Processor#errors                   | Collected errors from last run                  |
| Processor#rename(from, to)         | Rename column during processing                 |
| Processor#after_each(&block)       | Callback after each row is fully processed      |
| Writer.new(headers:)               | Create a CSV writer with given headers          |
| Writer#write(rows)                 | Generate CSV string from rows                   |
| Writer#write_to(rows, io)          | Write CSV to an IO object                       |
| Detector.detect(path_or_io)        | Auto-detect CSV delimiter                       |
| Row#[](key)                        | Access value by symbol key                      |
| Row#to_h                           | Convert row to plain hash                       |

Development

bundle install
bundle exec rspec      # Run tests
bundle exec rubocop    # Check code style

License

MIT