SeaDuck

Apache Iceberg for Ruby, powered by libduckdb

Installation

First, install libduckdb. For Homebrew, use:

brew install duckdb

Then add this line to your application’s Gemfile:

gem "seaduck"

Getting Started

Create a client for an Iceberg catalog

catalog = SeaDuck::S3TablesCatalog.new(arn: "arn:aws:s3tables:...")

Note: SeaDuck requires a default namespace (main unless overridden) and creates it if it does not exist. Pass default_namespace to use a different one.

Create a table

catalog.sql("CREATE TABLE events (id bigint, name text)")

Load data from a file

catalog.sql("COPY events FROM 'data.csv'")

You can also load data directly from other data sources

catalog.attach("blog", "postgres://localhost:5432/blog")
catalog.sql("INSERT INTO events SELECT * FROM blog.ahoy_events")

Query the data

catalog.sql("SELECT COUNT(*) FROM events").to_a

Namespaces

List namespaces

catalog.list_namespaces

Create a namespace

catalog.create_namespace("main")

Check if a namespace exists

catalog.namespace_exists?("main")

Drop a namespace

catalog.drop_namespace("main")
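
The existence check and the create call above can be combined into an idempotent setup step. The following is a minimal sketch that relies only on namespace_exists? and create_namespace as shown above. The ensure_namespace helper and the in-memory FakeCatalog class are hypothetical names, included only so the snippet runs without a real catalog.

```ruby
require "set"

# Hypothetical in-memory stand-in for a catalog, used only so this sketch
# runs without AWS credentials. It is not part of SeaDuck.
class FakeCatalog
  def initialize
    @namespaces = Set.new
  end

  def namespace_exists?(name)
    @namespaces.include?(name)
  end

  def create_namespace(name)
    @namespaces << name
  end
end

# Hypothetical helper: create the namespace only if it is missing.
# Returns true if the namespace was created, false if it already existed.
def ensure_namespace(catalog, name)
  return false if catalog.namespace_exists?(name)

  catalog.create_namespace(name)
  true
end

catalog = FakeCatalog.new
first = ensure_namespace(catalog, "analytics")   # creates the namespace
second = ensure_namespace(catalog, "analytics")  # already present, no-op
```

With a real catalog, the same helper should work unchanged as long as the object responds to the two methods used here.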

Tables

List tables

catalog.list_tables

Check if a table exists

catalog.table_exists?("events")

Drop a table

catalog.drop_table("events")

Snapshots

Get snapshots for a table

catalog.snapshots("events")

Query the data at a specific snapshot version or time

catalog.sql("SELECT * FROM events AT (VERSION => ?)", [3])
# or
catalog.sql("SELECT * FROM events AT (TIMESTAMP => ?)", [Date.today - 7])
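
If both forms are needed in the same code path, the query and its parameters can be built in one place. This is a sketch under the assumption that the AT clause is used exactly as shown above. The time_travel_query helper is a hypothetical name, and it expects a table name that has already been quoted (for example with quote_identifier).

```ruby
require "date"

# Hypothetical helper (not part of SeaDuck): returns the SQL string and the
# parameter list for a time-travel query using the AT clause shown above.
# Exactly one of version: or timestamp: must be given.
def time_travel_query(quoted_table, version: nil, timestamp: nil)
  if version.nil? == timestamp.nil?
    raise ArgumentError, "pass exactly one of version: or timestamp:"
  end

  if version
    ["SELECT * FROM #{quoted_table} AT (VERSION => ?)", [version]]
  else
    ["SELECT * FROM #{quoted_table} AT (TIMESTAMP => ?)", [timestamp]]
  end
end

sql, params = time_travel_query('"events"', version: 3)
# With a real catalog: catalog.sql(sql, params)
```

Keeping the version or timestamp in the parameter list, rather than in the SQL string, follows the same pattern as the examples above.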

SQL Safety

Use parameterized queries when possible

catalog.sql("SELECT * FROM events WHERE id = ?", [1])
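
The same approach extends to a variable number of values. The sketch below generates one placeholder per value. It assumes that sql accepts several positional parameters in the array, which the single-parameter example above suggests but does not confirm. The events_by_ids_query helper is a hypothetical name.

```ruby
# Hypothetical helper (not part of SeaDuck): builds a query with one "?"
# placeholder per id, so values are never interpolated into the SQL string.
def events_by_ids_query(ids)
  raise ArgumentError, "ids must not be empty" if ids.empty?

  placeholders = Array.new(ids.size, "?").join(", ")
  ["SELECT * FROM events WHERE id IN (#{placeholders})", ids]
end

sql, params = events_by_ids_query([1, 2, 3])
# With a real catalog: catalog.sql(sql, params)
```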

For places that do not support parameters, use quote or quote_identifier

quoted_table = catalog.quote_identifier("events")
quoted_file = catalog.quote("path/to/data.csv")
catalog.sql("COPY #{quoted_table} FROM #{quoted_file}")
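
To illustrate why quoting matters, the sketch below applies the standard SQL escaping rules: identifiers are wrapped in double quotes and string values in single quotes, with any embedded quote character doubled. These helpers are hypothetical and are not SeaDuck's implementation; in real code, prefer quote and quote_identifier.

```ruby
# Illustrative only: these helpers follow the standard SQL escaping rules
# and are not SeaDuck's implementation.

# Wrap an identifier in double quotes, doubling any embedded double quote.
def sketch_quote_identifier(name)
  '"' + name.to_s.gsub('"', '""') + '"'
end

# Wrap a string value in single quotes, doubling any embedded single quote.
def sketch_quote_string(value)
  "'" + value.to_s.gsub("'", "''") + "'"
end

table = sketch_quote_identifier('my "events"')
file = sketch_quote_string("it's.csv")
sql = "COPY #{table} FROM #{file}"
```

Without the doubling, a quote character inside the table name or file path would end the quoted section early and let the rest of the value be parsed as SQL.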

History

View the changelog

Contributing

Everyone is encouraged to help improve this project.

To get started with development:

git clone https://github.com/ankane/seaduck.git
cd seaduck
bundle install

# REST catalog
docker compose up
bundle exec rake test:rest

# S3 Tables catalog
bundle exec rake test:s3tables

# Glue catalog
bundle exec rake test:glue