Project

duckdb-csv

0.0
No release in over 3 years
This module provides CSV table adapter for duckdb. You can access CSV like as duckdb table by using this module.
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
 Dependencies

Runtime

>= 3.2.0
>= 1.4.4.0
 Project Readme

ruby-duckdb-csv

Description

This gem duckdb-csv provides a CSV table adapter for ruby-duckdb. You can query Ruby's CSV objects using SQL through DuckDB by using this gem. This gem is a sample implementation of a DuckDB::TableFunction.

Requirement

How to install

gem install duckdb-csv

Or add the following line to your Gemfile:

gem 'duckdb-csv'

Usage

Basic usage

require 'duckdb'
require 'duckdb/csv/table_adapter'

db = DuckDB::Database.open
con = db.connect

csv_io = StringIO.new("id,name,age\n1,Alice,30\n2,Bob,25\n3,Charlie,35")
csv = CSV.new(csv_io, headers: true)

DuckDB::CSV::TableAdapter.register!

con.execute('SET threads=1') # currently the CSV table adapter is not thread-safe, so we set threads to 1
con.expose_as_table(csv, 'csv_table')
result = con.query('SELECT * FROM csv_table()')
result.each do |row|
  puts row.inspect
end
# => ["1", "Alice", "30"]
# => ["2", "Bob", "25"]
# => ["3", "Charlie", "35"]

Specifying column types

By default, all columns are treated as VARCHAR. You can specify column types using the columns option.

require 'duckdb'
require 'duckdb/csv/table_adapter'

db = DuckDB::Database.open
con = db.connect

csv_io = StringIO.new(<<~CSV.strip)
  id,name,age,height,birthday,created_at
  1,Alice,30,1.65,1990-01-02,2023-01-01T10:11:12
  2,Bob,25,1.80,1995-05-15,2024-02-03T11:12:13
  3,Charlie,35,1.75,1985-10-30,2025-04-05T12:13:14
CSV
csv = CSV.new(csv_io, headers: true)

DuckDB::CSV::TableAdapter.register!

con.execute('SET threads=1') # currently the CSV table adapter is not thread-safe, so we set threads to 1
con.expose_as_table(
  csv, 'csv_table',
  columns: {
    'id' => :integer,
    'name' => :varchar,
    'age' => :integer,
    'height' => :float,
    'birthday' => :date,
    'created_at' => :timestamp
  }
)

result = con.query('SELECT * FROM csv_table()')
result.each do |row|
  puts row.inspect
end
# => [1, "Alice", 30, 1.65, #<Date: 1990-01-02>, 2023-01-01 10:11:12 +0900]
# => [2, "Bob", 25, 1.80, #<Date: 1995-05-15>, 2024-02-03 11:12:13 +0900]
# => [3, "Charlie", 35, 1.75, #<Date: 1985-10-30>, 2025-04-05 12:13:14 +0900]

Headerless CSV

When CSV has no headers, columns are automatically named col1, col2, col3, etc.

csv_io = StringIO.new("1,Alice,30\n2,Bob,25\n3,Charlie,35")
csv = CSV.new(csv_io, headers: false)

DuckDB::CSV::TableAdapter.register!

con.execute('SET threads=1') # currently the CSV table adapter is not thread-safe, so we set threads to 1
con.expose_as_table(csv, 'csv_table')
result = con.query('SELECT col1, col2, col3 FROM csv_table()')
result.each do |row|
  puts row.inspect
end
# => ["1", "Alice", "30"]
# => ["2", "Bob", "25"]
# => ["3", "Charlie", "35"]

License

MIT License.