The project is in a healthy, maintained state
Extends Roo with SmarterCSV integration for robust and faster CSV parsing
 Dependencies

Development

>= 1.7
>= 0
>= 5.19.0
>= 10.0
>= 3.0

Runtime

>= 2.0.0, < 4
>= 1.15.0
 Project Readme

roo-smarter_csv


roo-smarter_csv replaces Roo's CSV backend with SmarterCSV while keeping the Roo spreadsheet API.

What it does

  • Uses SmarterCSV for parsing CSV input
  • Uses SmarterCSV defaults unless overridden by Roo compatibility behavior or explicit options

SmarterCSV Benefits

Performance

Speedup vs Roo::CSV with SmarterCSV 1.17.1

File                              Speedup
PEOPLE_IMPORT_B.csv               2.98x
uscities.csv                      4.22x
uszips.csv                        4.45x
worldcities.csv                   4.58x
embedded_newlines_60k.csv         3.84x
heavy_quoting_60k.csv             3.42x
many_empty_fields_60k.csv         3.36x
sample_100k.csv                   3.17x
sensor_data_50krows_50cols.csv    3.23x
tab_separated_60k.tsv             3.14x
utf8_multibyte_60k.csv            3.17x

Roo API

  • Keeps Roo's spreadsheet-style API:
    • cell
    • celltype
    • row
    • column
    • each
    • parse
    • first_row / last_row
    • first_column / last_column
  • Preserves Roo's single-sheet CSV behavior
  • Supports Roo's Roo::Spreadsheet.open(...) entry point
  • Supports CSV export through Roo's existing to_csv

Installation

Add to your Gemfile:

gem "roo-smarter_csv"

Then run:

bundle install

Activation

require "roo-smarter_csv"

spreadsheet = Roo::Spreadsheet.open("data.csv")

require "roo-smarter_csv" automatically loads both roo and smarter_csv and registers Roo::SmarterCSV as Roo's CSV handler.

Supported behavior

roo-smarter_csv reads the full CSV input and exposes it through Roo's spreadsheet abstraction.

It supports:

  • local files
  • StringIO / stream input
  • Roo's Roo::Spreadsheet.open(...)
  • CSV files with a UTF-8 BOM
  • tab-delimited input via col_sep: "\t"
  • SmarterCSV type conversion
  • warnings emitted by SmarterCSV
  • Roo's to_csv export for the in-memory spreadsheet representation
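To illustrate why BOM support matters, here is a minimal sketch using only Ruby's standard library. It does not show how roo-smarter_csv handles the BOM internally; the helper names `first_header` and `bom_demo` are hypothetical. It only demonstrates what happens to the first header when a UTF-8 BOM is not stripped.

```ruby
require "tempfile"

# Hypothetical helper: read the first header name from a CSV file
# opened with the given mode string.
def first_header(path, mode)
  File.open(path, mode) { |io| io.readline.chomp.split(",").first }
end

# Writes a small CSV that starts with a UTF-8 BOM, then reads the
# first header with and without BOM stripping.
def bom_demo
  Tempfile.create(["bom_demo", ".csv"]) do |file|
    file.binmode
    file.write("\xEF\xBB\xBFname,age\nJohn,30\n".b)
    file.flush
    [
      first_header(file.path, "r:utf-8"),     # BOM stays glued to the header
      first_header(file.path, "r:bom|utf-8")  # Ruby strips the BOM on open
    ]
  end
end

raw, clean = bom_demo
```

Without BOM handling, the first header is "\uFEFFname" rather than "name", so lookups by header name would silently miss the first column.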

Architecture note

SmarterCSV is used as the parser, but Roo remains the public model.

That means:

  • SmarterCSV row hashes are an internal parsing representation
  • Roo still stores data in its coordinate-based cell grid
  • Roo's public API remains spreadsheet-like
  • hash-based rows are only an intermediate step for parser-to-grid conversion
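The parser-to-grid step can be pictured with a small pure-Ruby sketch. This is not the gem's implementation: `hashes_to_grid` is a hypothetical helper, and it assumes that the header occupies row 1 and that the grid is keyed by 1-based [row, column] pairs.

```ruby
# Hypothetical sketch: turn SmarterCSV-style row hashes into a
# coordinate-based grid keyed by [row, column] (both 1-based).
def hashes_to_grid(rows)
  headers = rows.flat_map(&:keys).uniq
  grid = {}

  # Assumption: the header line is stored as row 1.
  headers.each_with_index { |header, col| grid[[1, col + 1]] = header.to_s }

  # Data rows follow, starting at row 2.
  rows.each_with_index do |row, row_index|
    headers.each_with_index do |header, col|
      grid[[row_index + 2, col + 1]] = row[header]
    end
  end

  grid
end

grid = hashes_to_grid([{ name: "John", age: 30 }, { name: "Jane", age: 25 }])
grid[[2, 1]]  # => "John"
grid[[3, 2]]  # => 25
```

Once the grid is filled, the hashes are no longer needed, which is why the public API can stay coordinate-based.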

Options

  • SmarterCSV options are handled as nested options, e.g. options = { smarter_csv: {} }
  • roo-smarter_csv defaults the SmarterCSV option remove_empty_hashes to false, so that it is compatible with Roo.
  • roo-smarter_csv honors some of Roo's csv_options (col_sep, row_sep, quote_char, and encoding), but passing these under smarter_csv is preferred.

Option precedence

roo-smarter_csv understands two option namespaces:

1. SmarterCSV options

Primary namespace:

smarter_csv: {
  col_sep: ";",
  row_sep: "\n",
  quote_char: '"',
  encoding: "utf-8"
}

2. Roo compatibility options

Roo already uses:

csv_options: {
  col_sep: ";",
  row_sep: "\n",
  quote_char: '"',
  encoding: "utf-8"
}

Only these four keys are copied from csv_options into the effective SmarterCSV options:

  • col_sep
  • row_sep
  • quote_char
  • encoding

Precedence rules

  1. Start with SmarterCSV defaults.
  2. Apply roo-smarter_csv compatibility overrides.
  3. Copy supported keys from csv_options into the SmarterCSV options.
  4. Apply smarter_csv on top.
  5. If the same key exists in both places, smarter_csv wins.
  6. Conflicts emit a warning.

Keys outside that set of four (col_sep, row_sep, quote_char, encoding) are not bridged from csv_options, and no other Roo options are treated as CSV parser settings.
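The precedence rules can be sketched as a chain of hash merges. This is an illustration only, not the gem's code: `effective_options`, `BRIDGED_KEYS`, and `COMPAT_OVERRIDES` are hypothetical names, and the sketch assumes a conflict means the same key appears in both namespaces with different values.

```ruby
# Hypothetical constants mirroring the rules described above.
BRIDGED_KEYS     = %i[col_sep row_sep quote_char encoding].freeze
COMPAT_OVERRIDES = { remove_empty_hashes: false }.freeze

# Builds the effective parser options from the two namespaces.
def effective_options(defaults, csv_options: {}, smarter_csv: {})
  # Step 3: only the four supported keys are copied from csv_options.
  bridged = csv_options.slice(*BRIDGED_KEYS)

  # Step 6 (assumed meaning): warn when both namespaces set a key differently.
  bridged.each do |key, value|
    next unless smarter_csv.key?(key) && smarter_csv[key] != value
    warn "#{key} is set in both csv_options and smarter_csv; using smarter_csv"
  end

  # Steps 1, 2, 3, 4: later merges win, so smarter_csv has the last word.
  defaults.merge(COMPAT_OVERRIDES).merge(bridged).merge(smarter_csv)
end

effective_options(
  { col_sep: :auto, remove_empty_hashes: true },
  csv_options: { col_sep: ";", headers: true },
  smarter_csv: { col_sep: "\t" }
)
# => { col_sep: "\t", remove_empty_hashes: false }
```

The unsupported `headers` key in csv_options is dropped rather than forwarded, matching the rule that only four keys are bridged.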

Examples

Only Roo options

Roo::Spreadsheet.open("data.tsv", csv_options: { col_sep: "\t" })

Only SmarterCSV options

Roo::Spreadsheet.open("data.csv", smarter_csv: { col_sep: ";" })

Both, with conflict

Roo::Spreadsheet.open(
  "data.csv",
  csv_options: { col_sep: ";" },
  smarter_csv: { col_sep: "\t" }
)

In this case, smarter_csv[:col_sep] wins and a warning is emitted.

SmarterCSV defaults

When you do not pass any options, roo-smarter_csv starts from SmarterCSV defaults and then applies one compatibility override for Roo:

  • remove_empty_hashes: false

That override is intentional. Roo expects blank rows to remain addressable in the spreadsheet model, so roo-smarter_csv disables SmarterCSV's default behavior of dropping fully empty row hashes.

Some important effective defaults are therefore:

  • col_sep: :auto — auto-detects the separator
  • row_sep: :auto — auto-detects line endings
  • quote_char: '"'
  • downcase_header: true
  • strings_as_keys: false
  • convert_values_to_numeric: true
  • remove_empty_hashes: false (roo-smarter_csv sets this for Roo compatibility so blank rows remain addressable through the spreadsheet API)
  • headers_in_file: true

This means common CSV files work without extra configuration, and SmarterCSV can infer separators and convert numeric values automatically while still preserving Roo-compatible blank rows.
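A small pure-Ruby sketch shows why dropping empty row hashes would break row addressing. The helper `row_numbers` is hypothetical, and the sketch assumes data rows start at spreadsheet row 2, after a header row.

```ruby
# Hypothetical helper: assign spreadsheet row numbers to parsed rows,
# assuming row 1 is the header and data starts at row 2.
def row_numbers(rows)
  rows.each_with_index.map { |row, index| [index + 2, row] }.to_h
end

# Three data lines in the file; the middle one is blank.
parsed = [{ name: "John" }, {}, { name: "Jane" }]

kept    = row_numbers(parsed)                  # blank row preserved
dropped = row_numbers(parsed.reject(&:empty?)) # blank row removed

kept[4]     # => { name: "Jane" }, matching line 4 of the file
dropped[3]  # => { name: "Jane" }, shifted up by one
dropped[4]  # => nil
```

With the blank row kept, row numbers in the spreadsheet line up with line numbers in the file.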

Default behavior examples

Auto-detected separator

spreadsheet = Roo::Spreadsheet.open("data.csv")

No col_sep is needed for normal comma-separated CSV files.
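As a rough intuition for separator auto-detection, one simple heuristic is to count candidate separators in the header line and pick the most frequent. This is a hypothetical sketch, not SmarterCSV's actual algorithm; `CANDIDATE_SEPARATORS` and `guess_col_sep` are illustrative names.

```ruby
# Hypothetical candidate list; the real parser may consider other characters.
CANDIDATE_SEPARATORS = [",", "\t", ";", ":", "|"].freeze

# Picks the candidate that occurs most often in the first line.
# On a tie (including no matches at all), the earliest candidate wins.
def guess_col_sep(first_line)
  CANDIDATE_SEPARATORS.max_by { |sep| first_line.count(sep) }
end

guess_col_sep("name,age,email")     # => ","
guess_col_sep("name\tage\temail")   # => "\t"
guess_col_sep("name;age;email")     # => ";"
```

A heuristic like this can misfire when a header contains many punctuation characters, which is one reason to pass col_sep explicitly for unusual files.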

Automatic numeric conversion

spreadsheet.cell(2, 2)   # => 30
spreadsheet.cell(2, 4)   # => 1.5
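The kind of conversion behind those values can be approximated in plain Ruby. This is a sketch under assumptions, not SmarterCSV's implementation: `to_numeric` is a hypothetical helper, and the real parser's rules for edge cases (whitespace, leading zeros, exponents) may differ.

```ruby
# Hypothetical helper: try Integer first, then Float, else keep the string.
def to_numeric(value)
  return value unless value.is_a?(String)

  Integer(value, 10, exception: false) ||
    Float(value, exception: false) ||
    value
end

to_numeric("30")    # => 30
to_numeric("1.5")   # => 1.5
to_numeric("John")  # => "John"
```

Strings that do not parse as numbers pass through unchanged, so text columns are unaffected.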

Headers and keys

SmarterCSV downcases headers by default and returns symbol keys:

SmarterCSV.process(StringIO.new("Name,Email\nJohn,john@example.com\n")).first
# => { name: "John", email: "john@example.com" }

If you want string keys instead, SmarterCSV supports:

SmarterCSV.process(
  StringIO.new("Name,Email\nJohn,john@example.com\n"),
  strings_as_keys: true
).first
# => { "name" => "John", "email" => "john@example.com" }

In roo-smarter_csv, those row hashes are used internally to populate Roo's spreadsheet grid. The public Roo methods still behave like spreadsheet methods.

Examples

Basic Roo usage

require "roo"
require "roo-smarter_csv"

csv = Roo::Spreadsheet.open("people.csv")

csv.cell(2, 1)      # => "John"
csv.cell(2, 2)      # => 30
csv.row(2)          # => ["John", 30, "john@example.com", 50000]
csv.first_row       # => 1
csv.last_row        # => 4

TSV example

csv = Roo::Spreadsheet.open(
  "people.tsv",
  extension: :csv,
  csv_options: { col_sep: "\t" }
)

Explicit SmarterCSV options

csv = Roo::Spreadsheet.open(
  "data.csv",
  smarter_csv: {
    col_sep: ";",
    quote_char: '"'
  }
)

Development

bundle install
bundle exec rspec

Reporting Bugs / Feature Requests

Please open an Issue on GitHub if you have feedback, new feature requests, or want to report a bug. Thank you!

For reporting issues, please:

  • include a small sample CSV file
  • open a pull request adding a test that demonstrates the issue
  • mention your versions of SmarterCSV, Ruby, and Rails

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Added some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new pull request

License

MIT