A bulk deletion tool that deletes records and their dependencies without instantiation

bulk_dependency_eraser

Delete records in bulk, and their dependencies, without instantiation or callbacks.

Install (add to Gemfile)

gem 'bulk_dependency_eraser'

WARNINGS!

ActiveRecord::InvalidForeignKey

To make mass deletion efficient, we suppress ActiveRecord::InvalidForeignKey errors. It is up to you to ensure that the dependency trees in your models are set up properly, so that no orphaned records are left behind. You can disable this suppression, but you may then run into deletion-order issues.

  • see :enable_invalid_foreign_key_detection option

Rollbacks

  • In v1.X, all deletions and nullifications ran in their own transaction blocks, but this appeared to cause table locking issues. Transaction blocks are no longer used for this reason, so rollbacks are no longer supported by default.
  • You can still enable rollbacks by passing in these two wrapper options:
opts = {
  db_delete_all_wrapper: ->(deleter, block) {
    ActiveRecord::Base.transaction do
      begin
        block.call # execute deletions
      rescue BulkDependencyEraser::Errors::DeleterError => e
        deleter.report_error(
          <<~STRING.strip
          Issue attempting to delete klass '#{e.deleting_klass_name}'
            => #{e.original_error_klass.name}: #{e.message}
          STRING
        )
        raise ActiveRecord::Rollback
      end
    end
  },
  db_nullify_all_wrapper: ->(nullifier, block) {
    ActiveRecord::Base.transaction do
      begin
        block.call # execute nullifications
      rescue StandardError => e
        nullifier.report_error(
          <<~STRING.strip
          Issue attempting to nullify klass '#{e.nullifying_klass_name}' on column(s) '#{e.nullifying_columns}'
            => #{e.original_error_klass.name}: #{e.message}
          STRING
        )
        raise ActiveRecord::Rollback
      end
    end
  }
}
BulkDependencyEraser::Manager.new(query: User.where(id: [...]), opts:).execute

Example 1:

# Delete all queried users and their dependencies.
query = User.where(id: [...])
manager = BulkDependencyEraser::Manager.new(query:)
manager.execute #=> true/false, depending on if successful.

Example 2:

# To see the dependency tree actualized as ids mapped by class name
query = User.where(id: [...])
manager = BulkDependencyEraser::Manager.new(query:)
manager.build #=> true/false, depending on if successful.

# To see the Class/ID deletion data
puts manager.deletion_list

# To see the Class/Column/ID data
# - It would nullify those columns for those classes on those IDs.
puts manager.nullification_list

# If there are any errors encountered, the deletion/nullification will not take place.
# You can see any errors here:
puts manager.errors

Data structure requirements

  • Requires all query and dependency tables to have an 'id' column.
  • This logic also requires that Rails model association scopes take no parameters.
    • We would need to instantiate the records to resolve those.
    • If you have to have association scopes with instance-level parameters, see the :instantiate_if_assoc_scope_with_arity option documentation.
  • If any of these requirements are not met, an error will be reported and the deletion/nullification will not take effect.
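
To make the association-scope requirement concrete, here is a minimal sketch in plain Ruby. It assumes the distinction comes down to the scope lambda's arity, which the option name :instantiate_if_assoc_scope_with_arity hints at. The helper name and the constants are hypothetical, and the gem's own check may differ.

```ruby
# A scope with no parameters can be merged into a query as-is.
STATIC_SCOPE = -> { :only_active_rows }

# A scope that takes the owning record needs an instance to evaluate.
INSTANCE_SCOPE = ->(owner) { owner }

# Hypothetical helper: true when the scope can be resolved without
# loading any parent records.
def resolvable_without_instantiation?(scope)
  scope.nil? || scope.arity.zero?
end

puts resolvable_without_instantiation?(nil)            # => true
puts resolvable_without_instantiation?(STATIC_SCOPE)   # => true
puts resolvable_without_instantiation?(INSTANCE_SCOPE) # => false
```

In a Rails model, the second kind corresponds to an association declared with a lambda that receives the owner record.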

Options - Passing Them In

# pass options as :opts keyword arg
# - also valid for any other BulkDependencyEraser classes
opts = {<...>}
manager = BulkDependencyEraser::Manager.new(query:, opts:)

Additional Options:

Option: Ignore Tables

# Ignore tables
# - will still go through those tables to locate dependencies
opts: { ignore_tables: [User.table_name, <other_table_name>, ...] }
# - Those ignored table build results will be accessible via the following.
#   - Useful for handling those deletions with your own logic.
manager.ignore_table_deletion_list
manager.ignore_table_nullification_list

# You can delete/nullify these ignored tables manually:
deleter = BulkDependencyEraser::Deleter.new(
  class_names_and_ids: manager.ignore_table_deletion_list,
  opts:
)
deleter.execute
nullifier = BulkDependencyEraser::Nullifier.new(
  class_names_columns_and_ids: manager.ignore_table_nullification_list,
  opts:
)
nullifier.execute

Option: Ignore Tables and Their Dependencies

# Ignore tables and their dependencies
# - will NOT go through those tables to locate dependencies
# - this option will not populate the 'ignore_table_nullification_list', 'ignore_table_deletion_list' lists
#   - We don't parse them, so they are not added
opts: { ignore_tables_and_dependencies: [<table_name>, ...] }

Option: Ignore Classes and Their Dependencies

# Ignore class names and their dependencies
# - will NOT go through those tables to locate dependencies
# - this option will not populate the 'ignore_table_nullification_list', 'ignore_table_deletion_list' lists
#   - We don't parse them, so they are not added
opts: { ignore_klass_names_and_dependencies: [<class_name>, ...] }

Option: Enable 'ActiveRecord::InvalidForeignKey' errors

# During mass, unordered deletions, sometimes 'ActiveRecord::InvalidForeignKey' errors would be raised.
# - We can't guarantee deletion order. 
# - Currently we delete in order of leaf to root klasses (Last In, First Out)
# - Ordering with self-referential associations or circular-model dependencies is problematic
#   - only in terms of ordering though. We can easily parse circular or self-referential dependencies
#
# Therefore we use 'ActiveRecord::Base.connection.disable_referential_integrity' for deletions.
# However, you can turn that suppression off by passing this value in options:
opts: { enable_invalid_foreign_key_detection: true }
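
The leaf-to-root ordering described above can be sketched in plain Ruby. This is only an illustration of the Last In, First Out idea, not the gem's implementation. The dependency map and both method names are hypothetical.

```ruby
# Hypothetical dependency map: class name => classes that depend on it.
DEPENDENCIES = {
  "User"    => ["Post", "Profile"],
  "Post"    => ["Comment"],
  "Profile" => [],
  "Comment" => []
}.freeze

# Walk the tree from the root, recording each class the first time it is seen.
# The `seen` guard keeps circular or self-referential graphs from looping forever.
def discovery_order(root, deps, seen = [])
  return seen if seen.include?(root)

  seen << root
  deps.fetch(root, []).each { |child| discovery_order(child, deps, seen) }
  seen
end

# Last discovered, first deleted.
def deletion_order(root, deps)
  discovery_order(root, deps).reverse
end

puts deletion_order("User", DEPENDENCIES).inspect
# => ["Profile", "Comment", "Post", "User"]

# A circular graph still parses, but no order satisfies both foreign keys.
puts deletion_order("A", { "A" => ["B"], "B" => ["A"] }).inspect
# => ["B", "A"]
```

The second call shows why parsing circular dependencies is easy while ordering them is not: the traversal terminates, yet whichever class is deleted first is still referenced by the other.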

Option: Destroy 'restrict_with_error' or 'restrict_with_exception' dependencies

# To delete associations with dependency values 'restrict_with_error' or 'restrict_with_exception',
# use the following option:
# - otherwise an error will be reported and deletions/nullifications will not occur
opts: { force_destroy_restricted: true }

Option: Database Wrappers

You can wrap your database calls using the following options.

# You can pass your own procs if you wish to use different database call wrappers.
# By default, the database reading will be done through the :reading role
DATABASE_READ_WRAPPER = ->(block) do
  ActiveRecord::Base.connected_to(role: :reading) do
    block.call
  end
end

opts: { db_read_wrapper: DATABASE_READ_WRAPPER }

# By default, the database deletion and nullification will be done through the :writing role
# You can override each wrapper individually.
DATABASE_WRITE_WRAPPER = ->(block) do
  ActiveRecord::Base.connected_to(role: :writing) do
    block.call
  end
end

# Deletion wrapper
opts: { db_delete_wrapper: DATABASE_WRITE_WRAPPER }

# Column Nullification wrapper
opts: { db_nullify_wrapper: DATABASE_WRITE_WRAPPER }
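
A wrapper is just a lambda that receives the work as a callable and decides how to run it. The sketch below shows that shape in plain Ruby, without Rails. `run_with_wrapper`, `LOG`, and `LOGGING_WRAPPER` are hypothetical names, and whether the gem uses the wrapper's return value is an assumption.

```ruby
# Hypothetical stand-in for the gem's internals: hand a unit of work to a wrapper.
def run_with_wrapper(wrapper, &work)
  wrapper.call(work)
end

LOG = []

# Same shape as DATABASE_READ_WRAPPER: a lambda that receives a callable block.
LOGGING_WRAPPER = ->(block) do
  LOG << :before
  result = block.call # the wrapped database work runs here
  LOG << :after
  result              # pass the block's result back to the caller
end

value = run_with_wrapper(LOGGING_WRAPPER) { 21 * 2 }
puts value       # => 42
puts LOG.inspect # => [:before, :after]
```

Returning the block's result from a custom wrapper is a safe default, since the caller may depend on it.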

Option: Batching (reading, deleting, nullifying)

# You can pass custom batch limits.

# Reading default: 10,000
# Deleting default: 300
# Nullification default: 300
opts: { batch_size: <Integer> } # will be applied to all actions (reading/deleting/nullifying)

opts: { read_batch_size: <Integer> }    # will be applied to reading (and will override :batch_size for reading)

opts: { delete_batch_size: <Integer> }  # will be applied to deleting (and will override :batch_size for deleting)

opts: { nullify_batch_size: <Integer> } # will be applied to nullifying (and will override :batch_size for nullifying)

opts: { disable_batching: <Boolean> } #  Disable all batching in reading/deleting/nullifying
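
The precedence between these options can be sketched as follows. The defaults come from the comments above. `effective_batch_size` is a hypothetical helper, and treating :disable_batching as winning over an explicit size is an assumption.

```ruby
# Defaults as documented above.
DEFAULT_BATCH_SIZES = { read: 10_000, delete: 300, nullify: 300 }.freeze

# Hypothetical helper. Returns nil when batching is disabled.
# `action` is :read, :delete, or :nullify.
def effective_batch_size(action, opts = {})
  return nil if opts[:disable_batching]

  # Action-specific size first, then the shared :batch_size, then the default.
  opts[:"#{action}_batch_size"] || opts[:batch_size] || DEFAULT_BATCH_SIZES.fetch(action)
end

puts effective_batch_size(:read)                                      # => 10000
puts effective_batch_size(:delete, batch_size: 50)                    # => 50
puts effective_batch_size(:read, batch_size: 50, read_batch_size: 75) # => 75
puts effective_batch_size(:nullify, disable_batching: true).inspect   # => nil
```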

Option: Batching (reading) without ordering

# Sometimes you have a table so big that you can't afford to order it before reading it.
# These options will disable any batch ordering and simply trust that the order in which the DB returns the records is consistent.
# - This can leave behind orphaned records. Use at your own risk!
opts: { disable_batch_ordering: true } # Will remove batch ordering for all (overrides 'disable_batch_ordering_for_klasses').
opts: { disable_batch_ordering_for_klasses: [<String>, ...] } # Will only remove batch ordering for the contained classes

Option: Deleting with PG system column CTID

# PostgreSQL databases support a CTID system column. It represents the physical location of the tuple in its table.
# - They are subject to change and not reliable long-term.
# - However, using them in addition to the ID column can result in a 10x deletion speed increase.
#   - Since we're still deleting by ID, there's no chance that the wrong record will be deleted. Just a small chance we may miss a record deletion.
# - It's your responsibility to know what you're doing if you enable this option!
opts: { use_pg_system_column_ctid: true }
# IF partitioning is implemented, you MUST use this option (delete_ctids_by_partions) as well!
# - to do otherwise would result in unintended records being deleted. CTIDs are not unique across partitions!
# - somewhat experimental! We've had to do some mocks, due to using SQLite3 for testing, but it should be valid.
opts: { delete_ctids_by_partions: true, use_pg_system_column_ctid: true }

# WARNING: Write your own tests that cover usage of CTID deletion!

Option: Query Scoping

# Can add additional query filters/scopes.
# - Useful when taking advantage of indexes.
# - Once a scope is applied and returns an ActiveRecord::Relation, no further scopes are applied.
# - Scopes are lambdas that take an ActiveRecord::Relation param. Each returns either nil (scope is ignored and the next scope is checked) or an ActiveRecord::Relation (scope applied).
# Global Scopes (across reading, deleting, nullifying)
opts: {
  # 4th priority (applied to all queries)
  proc_scopes: ->(query) { query.where(tenant_id: "...") },
  # 3rd priority (applied when klass name matches query's klass)
  proc_scopes_per_class_name: {
    <klass name>: ->(query) { query.where(tenant_id: "...") }
  }
}

# Reading Only
opts: {
  # 2nd priority
  reading_proc_scopes: ->(query) { query.where(tenant_id: "...") },
  # 1st priority
  reading_proc_scopes_per_class_name: {
    <klass name>: ->(query) { query.where(tenant_id: "...") }
  }
}

# Deleting Only
opts: {
  # 2nd priority
  deletion_proc_scopes: ->(query) { query.where(tenant_id: "...") },
  # 1st priority
  deletion_proc_scopes_per_class_name: {
    <klass name>: ->(query) { query.where(tenant_id: "...") }
  }
}

# Nullification Only
opts: {
  # 2nd priority
  nullification_proc_scopes: ->(query) { query.where(tenant_id: "...") },
  # 1st priority
  nullification_proc_scopes_per_class_name: {
    <klass name>: ->(query) { query.where(tenant_id: "...") }
  }
}
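
The priority rules above can be sketched in plain Ruby. This is an illustration under stated assumptions, not the gem's code. `FakeRelation` stands in for ActiveRecord::Relation, `apply_first_matching_scope` is a hypothetical resolver, and keying the per-class hashes by String class name is this sketch's own choice.

```ruby
# Minimal stand-in for ActiveRecord::Relation so the sketch runs without Rails.
FakeRelation = Struct.new(:klass_name, :filters) do
  def where(conditions)
    FakeRelation.new(klass_name, filters + [conditions])
  end
end

# Hypothetical resolver. `prefix` is "reading", "deletion", or "nullification".
def apply_first_matching_scope(query, opts, prefix)
  candidates = [
    opts.dig(:"#{prefix}_proc_scopes_per_class_name", query.klass_name), # 1st priority
    opts[:"#{prefix}_proc_scopes"],                                      # 2nd priority
    opts.dig(:proc_scopes_per_class_name, query.klass_name),             # 3rd priority
    opts[:proc_scopes]                                                   # 4th priority
  ]

  candidates.compact.each do |scope|
    scoped = scope.call(query)
    return scoped unless scoped.nil? # first scope that returns a relation wins
  end

  query # no scope applied
end

opts = {
  proc_scopes: ->(query) { query.where(tenant_id: 1) },
  reading_proc_scopes_per_class_name: {
    "User" => ->(query) { query.where(active: true) },
    "Post" => ->(_query) { nil } # returns nil, so the resolver falls through
  }
}

users = apply_first_matching_scope(FakeRelation.new("User", []), opts, "reading")
posts = apply_first_matching_scope(FakeRelation.new("Post", []), opts, "reading")

puts users.filters.inspect # only the per-class reading scope was applied
puts posts.filters.inspect # fell through to the global scope
```

Only one scope is ever applied per query, so a higher-priority scope must include any global filter it still needs.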

TODO: Option: Instantiation

  • This feature is currently in development.
# You have an association with instance-level parameters in its association scope.
# - You can utilize the :instantiate_if_assoc_scope_with_arity option to have this gem
#   instantiate those parent records to resolve and pluck the IDs of those associations
# - It will not have the same dependency tree parsing speed that you've come to know and love
opts: { instantiate_if_assoc_scope_with_arity: true }

# You can also set the batching, default 500, for those record instantiations
opts: {
  instantiate_if_assoc_scope_with_arity: true,
  batching_size_limit: 500
}