Easy management of database dumps for dev env

DataKeeper

DataKeeper is a tool for Rails apps using PostgreSQL. It creates dumps of your production database to be used later for local development.

It automates the process of creating and storing them on the server, and applying them locally afterwards.

It supports full dumps, as well as partial dumps of specific tables or even specific rows (you provide a SQL select). With partial dumps, note that you'll need to handle possible issues around foreign keys and other constraints yourself.

Installation

Add this line to your application's Gemfile:

gem 'data_keeper'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install data_keeper

Usage

Configure the storage to use to save the generated dumps.

You can use a local storage, a simple option which stores the dumps on the same server that runs the code, in a path of your choosing (it must be writable by the user running this code in production). You also configure how to reach that server from your local machine (currently only scp is supported), in order to download these dumps later. Example:

DataKeeper.storage = DataKeeper::LocalStorage.new(
  local_store_dir: "/users/fredy/backups/...",
  remote_access: {
    host: "10.10.10.10",
    port: "22",
    user: "user"
  }
)

There's also support for storing the dumps in S3, using DataKeeper::S3Storage as in this example:

# Explicit require is necessary
require 'data_keeper/s3_storage'

DataKeeper.storage = DataKeeper::S3Storage.new(
  bucket: 'bucket-name',
  store_dir: 'dumps/',
  acl: "private",
  remote_access: {
    access_key_id: Rails.application.credentials.access_key_id,
    secret_access_key: Rails.application.credentials.secret_access_key,
    region: 'eu-central-1'
  }
)

Other storages can be implemented. A storage can be any object that responds to these two methods:

  • #save(file, filename, dump_name), where file is a File object and filename and dump_name are strings. This method should save the given dump file in the store.

  • #retrieve(dump_name) { |file| (...) }, which should retrieve the latest stored dump with the given dump_name. It should yield the given block passing the File or Tempfile object pointing to the retrieved dump file in the local filesystem, which is expected to be cleaned up on block termination.
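
As an illustration of the two methods above, here is a minimal sketch of a custom storage that keeps dumps in a plain directory. The class name DirectoryStorage, its constructor argument and the per-dump subdirectory layout are all hypothetical and not part of the gem. Picking the "latest" dump by file modification time is likewise an assumption made for this example.

```ruby
require "fileutils"
require "tempfile"
require "tmpdir"

# Hypothetical storage (not shipped with the gem): keeps dumps in a directory
# on the local filesystem, one subdirectory per dump name.
class DirectoryStorage
  def initialize(dir)
    @dir = dir
    FileUtils.mkdir_p(@dir)
  end

  # Copies the given dump file into the subdirectory for dump_name.
  def save(file, filename, dump_name)
    target_dir = File.join(@dir, dump_name.to_s)
    FileUtils.mkdir_p(target_dir)
    FileUtils.cp(file.path, File.join(target_dir, filename))
  end

  # Yields a temporary copy of the most recently modified dump for dump_name.
  # The temporary file is closed and deleted when the block returns.
  def retrieve(dump_name)
    candidates = Dir.glob(File.join(@dir, dump_name.to_s, "*"))
    latest = candidates.max_by { |path| File.mtime(path) }
    raise ArgumentError, "no dump stored for #{dump_name}" unless latest

    Tempfile.create(["dump", File.extname(latest)]) do |tmp|
      tmp.binmode
      IO.copy_stream(latest, tmp)
      tmp.flush
      tmp.rewind
      yield tmp
    end
  end
end
```

An object like this could then be assigned with DataKeeper.storage = DirectoryStorage.new(...), in the same way as the built-in storages shown earlier.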

Then, declare some dumps to work with:

# Dump the whole database
DataKeeper.define_dump(:whole_database, :full)

# Dump only selected tables, and a custom SQL
DataKeeper.define_dump(:config) do |d|
  # Specific tables, all rows
  d.table "products"
  d.table "traits"

  # Only some rows in the "vouchers" table. MAKE SURE your sql returns only columns from the target table!
  d.sql(:vouchers, :used_vouchers) { Voucher.joins(cart: :order).where(orders: {status: "sent"}).to_sql }
  
  # Possible additional code to run after applying the dump locally
  d.on_after_load do
    User.create! email: "test@gmail.com", password: "password"
  end
end

Now, in production, you'll have to run DataKeeper.create_dump!("config"), passing in the name of the dump you defined before. This creates the dump file on the server where the code runs and stores it in the configured storage.

If you always want an up-to-date dump, you'll need to call this periodically, for example once per day.

Finally, to apply the dump locally, you can use the rake task:

bin/rake data_keeper:pull[config]

This will download the latest version available of the "config" dump, and apply it locally, destroying anything in your current database. It will give you an error if you try to run this in a production environment.

Note that when using raw SQL, your statement is expected to return all columns of the configured table, in the default order (select *). Internally, this uses pg's COPY from/to for the full table.

Docker

If you're running pg under Docker, you can also configure DataKeeper with the Docker pg settings, and this gem will then use the binaries inside that Docker container (psql, pg_restore, etc.).

Because the commands are then run from within the Docker container, the port may differ from the one the Rails app uses, so you'll need to configure how to reach pg from inside the container:

DataKeeper.docker_config = {
  instance_name: "pg_my_app",
  pg_host: "localhost",
  pg_port: "5432",
  pg_user: "myapp",
  pg_password: "myapp"
}

If the host, or the user and password, are the same as the ones used by the Rails app, you can omit them. If you configure this docker_config hash, DataKeeper will use docker exec to run the pg commands.

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake test to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/rogercampos/data_keeper.

License

The gem is available as open source under the terms of the MIT License.