Speaky CSV
CSV imports and exports for ActiveRecord.
Speaky CSV features:
- An easy to use API,
- Speedy stream processing using enumerators.
Installation
Add this line to your application's Gemfile:
gem 'speaky_csv'
And then execute:
$ bundle
Or install it yourself as:
$ gem install speaky_csv
Usage
Let's say you build software for your local public library and there exists a Book model:
# in app/models/book.rb
class Book < ActiveRecord::Base
# ...
endSpeaky can be used to import and export book records using csv files. The definition of the csv format could look like this:
# in app/csvs/book_csv.rb
class BookCsv < SpeakyCsv::Base
define_csv_fields do |config|
config.field :id, :author
end
endThis defines a CSV format that looks like this:
id,author
3,Stevenson
19,Melville
1,Macaulay
Exporting
Creating a csv file from records in a database can be done with the exporter:
# in app/csvs/book_csv.rb
class BookCsv < SpeakyCsv::Base
define_csv_fields do |config|
config.field :id, :author, :_destroy
end
end
books = [
Book.create!(author: 'Stevenson'),
Book.create!(author: 'Melville'),
Book.create!(author: 'Macaulay'),
]
exporter = BookCsv.exporter books
io = StringIO.new
exporter.each { |row| io.write row }io will have the following contents:
id,author,_destroy
2,Stevenson,false
3,Melville,false
4,Macaulay,false
With Associations
Associations can also be exported.
# in app/csvs/book_csv.rb
class BookCsv < SpeakyCsv::Base
define_csv_fields do |config|
config.field :id, :author
config.belongs_to :publisher do |p|
p.field :id, :name
end
config.has_many :reviews do |r|
r.field :id, :tomatoes, :publication
end
end
endThis defines a CSV format that looks like this:
id,author,publisher_id,publisher_name
3,Stevenson,22,Blam Ltd
19,Melville,,
1,Macaulay,83,NY Tiempo,reviews_0_id,8,reviews_0_tomatoes,50,review_0_publication,Daily
Since a book only ever has one publisher, these can get dedicated
columns with headers (the publisher_id and publisher_name columns).
Reviews are more tricky because there can be several that need to be
serialized to a single csv row. Speaky CSV uses a convention similar to
how rails and rack deal with query parameters for things like multi
select form inputs.
Importing
Now lets import some books. Speaky will expect an import to have an initial header row, and each subsequent row to represent a user record. Columns can be in any order.
Let's create a book by importing a csv.
csv_io = StringIO.new <<-CSV
id,author
,Sneed
CSVNotice the empty id column. This tells Speaky that the operation is a create. The file can be imported like this:
importer = BookCsv.active_record_importer csv_io, Book
importer.each { |book| book.save }
Book.last.author == 'Sneed' # => trueThis importer is an active record importer, which means that #each
will return active record instances. There is also an attribute importer
that will return hashes of attribute name => values. See the rdoc for
more info on that.
Update
Let's change the author value:
csv_io = StringIO.new <<-CSV
id,author
1,Simon Sneed
CSVNow there is an id value in the csv. Having an id value will cause Speaky to find the record with the given id and update it.
importer = BookCsv.active_record_importer csv_io, Book
importer.each { |book| book.save }
expect(Book.last.author).to eq 'Simon Sneed'If a record with the given id isn't found, the importer will return a nil for that row instead of an active record and add a message a log file:
csv_io = StringIO.new <<-CSV
id,author
234,I dont exist
CSV
importer = BookCsv.active_record_importer csv_io, Book
importer.to_a # => [nil]
importer.log # => '...[row 1] record not found with primary key: "234"....'For more info on the log file see below.
Destroy
To destroy the record, we'll need to change the csv format to add a
_destroy field. If this column contains a true value like: 'true' or
'1', the record will be marked for destruction.
Marking an active record for destruction is documented here: http://api.rubyonrails.org/v4.2.0/classes/ActiveRecord/AutosaveAssociation.html#method-i-marked_for_destruction-3F
# in app/csvs/book_csv.rb
class BookCsv < SpeakyCsv::Base
define_csv_fields do |config|
config.field :id, :author, :_destroy
end
end
csv_io = StringIO.new <<-CSV
id,_destroy
1,true
CSV
importer = BookCsv.active_record_importer csv_io, Book
book = importer.to_a.first
if book.marked_for_destruction?
book.destroy
endWith Associations
Speaky uses the active record accepts_nested_attributes_for feature to
deal with importing association data.
For example, if a belongs_to association is configured:
# in app/csvs/book_csv.rb
class BookCsv < SpeakyCsv::Base
define_csv_fields do |config|
config.field :id, :author
config.belongs_to :publisher do |p|
p.field :id, :name
end
end
endAnd the csv file being imported is this:
id,author,publisher_id,publisher_name
3,Stevenson,22,Blam Ltd
Then speaky will find a Booking record with id 3 and call:
booking.publisher_attributes = {id: '22', name: 'Blam Ltd'}For a has_many association, if the configuration looked like this:
# in app/csvs/book_csv.rb
class BookCsv < SpeakyCsv::Base
define_csv_fields do |config|
config.field :id, :author
config.has_many :reviews do |r|
r.field :id, :tomatoes, :publication
end
end
endAnd an import csv looked like this:
id,author,publisher_id,publisher_name
1,Macaulay,83,NY Tiempo,reviews_0_id,8,reviews_0_tomatoes,50,review_0_publication,Daily
The speaky will find a Booking record with id 1 and call:
booking.reviews_attributes = [{id: '8', tomatoes: '50', publication: 'Daily'}]Log Messages
Importers and exporters use a Logger instance to write messages during
processing. The default logger writes to a string that can be retrieved
by the #log method. A custom Logger can be set by the #logger=
method.
See Logger in the ruby stdlib for more details.
Best Practices
- Configure speaky with
idand_destroyfields for active record models - For associations, use
nested_attributes_forand addidand_destroyfields - Use optimistic locking and configure a
lock_versionfield - Consider building a draft or preview feature for importing which
doesn't persist the record by calling
savebut instead reports what the changes would be usingActiveModel::Dirty
TODO
- export only fields
- configurable id field (key off an
external_idfor example) - export validations
- attr import validations
- active record import validations
-
has_oneassociations - required fields (make
lock_versionrequired for example) - transformations for values via accessors on class
- public stable api for csv format definition
- assign attrs one at a time so they don't all fail together
- decide what empty cells mean
- figure out why SpeakyCsv is a class and not a module
Contributing
- Fork it ( http://github.com/ajh/speaky_csv/fork )
- Create your feature branch (
git checkout -b my-new-feature) - Commit your changes (
git commit -am 'Add some feature') - Push to the branch (
git push origin my-new-feature) - Create new Pull Request