Fight CSV!
It's 2011, and parsing CSV with Ruby still sucks? Enter FightCSV! It will take the cumbersome out of your CSV parsing, while keeping the awesome! Want some taste of that juicy fresh? Check out this example:
Suppose you have a CSV file called log_entries.csv that looks like this:
Date,Person,Client/Project,Minutes,Tags,Billable
2011-08-15,John Doe,handsomelabs,60,blogpost,no
2011-08-15,Max Powers,beerbrewing,60,meeting,yes
2011-08-15,Tyler Durden,babysitting,180,"concepting, research",yes
2011-08-15,Hulk Hero,gardening,60,"meeting, research",no
2011-08-15,John Doe,handsomelabs,60,coding,yes
2011-08-08,John Doe,handsomelabs,60,"blabla, meeting",yes
Schema
Now you can define a class representing a row of the file. You only need
to include FightCSV::Record.
    class LogEntry
      include FightCSV::Record
    end

But of course you want the values from each row to behave like proper
Ruby objects. This can be easily achieved by defining a schema in the
LogEntry class:
    class LogEntry
      include FightCSV::Record
      schema do
        field "Name"
        field "Client/Project", {
          identifier: :project
        }
      end
    end

Now the LogEntry objects will have a name method corresponding to
the column called "Name" and a project method corresponding to the
column called "Client/Project".
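To make the header-to-method mapping concrete, here is a minimal sketch of the same idea built only on Ruby's standard csv library, not on FightCSV itself. The IDENTIFIERS table and the Entry struct are hypothetical names made up for this illustration; FightCSV's internals may well work differently.

```ruby
require "csv"

# Hypothetical mapping from column header to method name,
# mirroring `field "Client/Project", identifier: :project`.
IDENTIFIERS = {
  "Client/Project" => :project,
  "Billable"       => :billable
}.freeze

csv = <<LOG
Date,Person,Client/Project,Minutes,Tags,Billable
2011-08-15,John Doe,handsomelabs,60,blogpost,no
2011-08-15,Max Powers,beerbrewing,60,meeting,yes
LOG

# A Struct gives us one reader method per identifier.
Entry = Struct.new(*IDENTIFIERS.values)

# Look each configured header up in the parsed row and fill the struct.
entries = CSV.parse(csv, headers: true).map do |row|
  Entry.new(*IDENTIFIERS.keys.map { |header| row[header] })
end

puts entries.first.project   # value of the "Client/Project" column
puts entries.first.billable  # still a plain string at this point
```

Note that the values are still raw strings here, which is exactly the gap converters are meant to close.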
Sometimes you want to adjust not only the field names but also the values. For this, FightCSV offers converters. The "Billable" column seems to represent boolean values, so let's tackle that:
    class LogEntry
      include FightCSV::Record
      schema do
        field "Name"
        field "Client/Project", {
          identifier: :project
        }
        field "Billable", {
          converter: ->(string) { string == "yes" ? true : false }
        }
      end
    end

Often when converting something, we assume that it has a certain format.
The "Date" column for example should always be of the format
/\d{2}\.\d{2}\.\d{4}/. A validation can easily be added to a column
with FightCSV:
    require "date" # Date.parse lives in the stdlib date library

    class LogEntry
      include FightCSV::Record
      schema do
        field "Name"
        field "Client/Project", {
          identifier: :project
        }
        field "Billable", {
          converter: ->(string) { string == "yes" ? true : false }
        }
        field "Date", {
          validate: /\d{2}\.\d{2}\.\d{4}/,
          converter: ->(string) { Date.parse(string) }
        }
      end
    end

The complete schema:
    require "date" # Date.parse lives in the stdlib date library

    class LogEntry
      include FightCSV::Record
      schema do
        field "Name"
        field "Client/Project", {
          identifier: :project
        }
        field "Billable", {
          converter: ->(string) { string == "yes" ? true : false }
        }
        field "Date", {
          validate: /\d{2}\.\d{2}\.\d{4}/,
          converter: ->(string) { Date.parse(string) }
        }
        field "Tags", {
          # strip removes the space left after each comma ("meeting, research")
          converter: ->(string) { string.split(",").map(&:strip) }
        }
        field "Minutes", {
          validate: /\d+/,
          converter: ->(string) { string.to_i }
        }
      end
    end

Parsing CSV
With the schema definition you're finally able to parse some CSV. There are two possible ways of doing this:
- LogEntry.records will return an array with all rows mapped to instances of LogEntry.
- LogEntry.import will return an enumerator which passes the same LogEntry instance on every iteration, with only the row swapped out:
      LogEntry.import(csv).map(&:minutes).reduce(:+) #=> 480
Because only one LogEntry instance is alive at a time, this keeps memory usage low on big CSV documents.
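The difference between the two approaches can be sketched with Ruby's standard csv library alone. This is only an illustration of eager versus streaming iteration, not FightCSV's implementation: the LOG_CSV constant and the variable names are made up here, and unlike LogEntry.import the stdlib creates a fresh row object per iteration instead of reusing one instance.

```ruby
require "csv"
require "stringio"

LOG_CSV = <<LOG
Date,Person,Client/Project,Minutes,Tags,Billable
2011-08-15,John Doe,handsomelabs,60,blogpost,no
2011-08-15,Max Powers,beerbrewing,60,meeting,yes
2011-08-15,Tyler Durden,babysitting,180,"concepting, research",yes
2011-08-15,Hulk Hero,gardening,60,"meeting, research",no
2011-08-15,John Doe,handsomelabs,60,coding,yes
2011-08-08,John Doe,handsomelabs,60,"blabla, meeting",yes
LOG

# Eager: the whole document is parsed and every row is kept in memory
# at once (comparable to LogEntry.records).
all_rows = CSV.parse(LOG_CSV, headers: true)
eager_total = all_rows.map { |row| row["Minutes"].to_i }.reduce(:+)

# Streaming: rows are read one at a time from the IO, so each row can be
# garbage collected once its block call returns (the idea behind
# LogEntry.import).
streamed_total = 0
CSV.new(StringIO.new(LOG_CSV), headers: true).each do |row|
  streamed_total += row["Minutes"].to_i
end

puts eager_total
puts streamed_total
```

Both totals are the same; only the peak number of rows held in memory differs.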
CSV without a header
Sometimes you may want to parse CSV without a header. Instead of passing the column's name to field, you pass the column's number, counting from the left.
Consider the following CSV:
Ruby,object oriented
Scheme,functional
Now you can define a ProgrammingLanguage class like this:
    class ProgrammingLanguage
      include FightCSV::Record
      schema do
        csv_options = { header: false }
        field 1, identifier: :name
        field 2, identifier: :main_paradigm
      end
    end

See the examples section for executable versions of these examples.
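For readers who want to see what addressing columns by position amounts to, here is a small sketch using only Ruby's standard csv library. COLUMNS and Language are hypothetical names for this illustration, and the 1-based numbering is taken from the field 1 / field 2 calls in the ProgrammingLanguage class; how FightCSV resolves the numbers internally is not shown here.

```ruby
require "csv"

csv = "Ruby,object oriented\nScheme,functional\n"

# Hypothetical mapping from 1-based column number to method name,
# mirroring `field 1, identifier: :name`.
COLUMNS = { 1 => :name, 2 => :main_paradigm }.freeze

# One reader method per identifier, in column order.
Language = Struct.new(*COLUMNS.values)

# Without headers, CSV.parse returns plain arrays, so a column number
# translates to an array index by subtracting one.
languages = CSV.parse(csv).map do |row|
  Language.new(*COLUMNS.keys.map { |number| row[number - 1] })
end

languages.each { |lang| puts "#{lang.name}: #{lang.main_paradigm}" }
```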
CSV format
Use the col_sep, row_sep and quote_char csv_options to customize the CSV format. Consider the following CSV document:
Germany EUR/`United States` USD
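The three option names match options of Ruby's standard csv library. As a point of reference, this sketch shows how the stdlib splits the document above when given those options. Whether FightCSV hands them straight through to the stdlib is an assumption, not something this README states.

```ruby
require "csv"

document = "Germany EUR/`United States` USD"

# col_sep:    fields are separated by a single space
# row_sep:    rows are separated by a slash instead of a newline
# quote_char: backticks protect fields that contain the separator
rows = CSV.parse(document, col_sep: " ", row_sep: "/", quote_char: "`")

rows.each { |name, currency| puts "#{name} uses #{currency}" }
```

The backticks are what keep "United States" together as one field even though it contains the column separator.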
You can customize the CSV format like so:
    class Country
      include FightCSV::Record
      schema do
        csv_options = { col_sep: " ", row_sep: "/", quote_char: "`" }
        field 1, identifier: :name
        field 2, identifier: :currency
      end
    end

Contributing to fight_csv
- Check out the latest master to make sure the feature hasn't been implemented or the bug hasn't been fixed yet
- Check out the issue tracker to make sure someone hasn't already requested it and/or contributed it
- Fork the project
- Commit and push until you are happy with your contribution
- Make sure to add tests for it. This is important so I don't break it in a future version unintentionally.
- Please try not to mess with the Rakefile, version, or history. If you want to have your own version, or it is otherwise necessary, that is fine, but please isolate it to its own commit so I can cherry-pick around it.
Compatibility
Only compatible with Ruby 1.9.
Author(s)
Copyright
Copyright (c) 2011 Railslove. See LICENSE.txt for further details.