0.0
The project is in a healthy, maintained state
Replicates PostgreSQL's default hash partitioning calculations. Specifically targets the logic within `hashint8extended` (for bigint) and `hashint4extended` (for integer/smallint) from PostgreSQL's `src/backend/access/hash/hashfunc.c`.
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
 Dependencies
 Project Readme

pg_hash_func

CI Gem Version

This gem allows your application to calculate the target partition for integer keys in PostgreSQL hash-partitioned tables without querying the database. Typically, when querying a parent partitioned table, PostgreSQL consults catalog tables to route the query, incurring network latency and lookup costs. By replicating PostgreSQL's native hashing logic, this gem allows your application to bypass the parent table and directly derive the partition name/index, leading to significant performance gains in read-heavy, latency-sensitive workloads.

This gem replicates the hashing logic PostgreSQL's hashint8extended (for bigint) and hashint4extended (for integer and smallint) in src/backend/access/hash/hashfunc.c

Supported Types:

  • bigint (int8): Use PgHashFunc.calculate_partition_index_bigint.
  • integer (int4) and smallint (int2): Use PgHashFunc.calculate_partition_index_int4. (PostgreSQL uses the same underlying hash function for both int4 and int2.)

Limitations:

  • Only replicates the default hash partitioning strategy.
  • Only supports integer-based keys (bigint, integer, smallint).
  • Does not support hashing other data types (text, dates, floats, etc.).
  • Does not support other partitioning strategies (list, range).
  • Assumes PostgreSQL's standard internal seed and magic constants by default.

Compatibility:

  • Ruby >= 3.0.0
  • PostgreSQL >= 11 (tested up to 16)

Simple benchmark here

Installation

Add this line to your application's Gemfile:

gem 'pg_hash_func'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install pg_hash_func

Usage

Example 1: Partitioning by bigint (e.g., User ID)

TABLE_PREFIX_BIGINT = "events"
USER_ID = 123_456_789_012_345 # bigint value
NUM_PARTITIONS_BIGINT = 16
index_bigint = PgHashFunc.calculate_partition_index_bigint(
  value: USER_ID,
  num_partitions: NUM_PARTITIONS_BIGINT
)

# Construct the partition table name
partition_name_bigint = [TABLE_PREFIX_BIGINT, index_bigint].join("_")

puts "User #{USER_ID} (bigint) belongs to partition: #{partition_name_bigint}"
# => User 123456789012345 (bigint) belongs to partition: events_14

Example 2: Partitioning by integer (e.g., Tenant ID)

TABLE_PREFIX_INT = "tenant_data"
TENANT_ID = 987_654 # An integer value (fits in int4/int2)
NUM_PARTITIONS_INT = 32

# Calculate the index using the int4 function
# This also works correctly if TENANT_ID was a smallint
index_int = PgHashFunc.calculate_partition_index_int4(
  value: TENANT_ID,
  num_partitions: NUM_PARTITIONS_INT
)

partition_name_int = [TABLE_PREFIX_INT, index_int].join("_")

puts "Tenant 987654 (int) belongs to partition: tenant_data_28"
# => "tenant_data_22"

Example 3: Two-Level Partitioning (bigint then int4)

TABLE_PREFIX_MULTI = "user_settings"
CUSTOMER_ID = 555_444_333_222_111 # bigint
SETTING_TYPE = 101                # integer
NUM_PARTITIONS_L1 = 64            # For CUSTOMER_ID
NUM_PARTITIONS_L2 = 8             # For SETTING_TYPE

# Calculate index for each level separately using the correct function
index_l1 = PgHashFunc.calculate_partition_index_bigint(value: CUSTOMER_ID, num_partitions: NUM_PARTITIONS_L1)
index_l2 = PgHashFunc.calculate_partition_index_int4(value: SETTING_TYPE, num_partitions: NUM_PARTITIONS_L2)

partition_name_multi = [TABLE_PREFIX_MULTI, index_l1, index_l2].join("_")

puts "Settings for Customer=#{CUSTOMER_ID}, Type=#{SETTING_TYPE} belong to: #{partition_name_multi}"
# => Settings for Customer=555444333222111, Type=101 belong to: user_settings_44_0

Raw Hash Function

Access the underlying PostgreSQL hashint8extended function directly. Primarily useful for debugging or specific integration scenarios.

USER_ID = 123_456_789_012_345 # From Example 1

raw_hash_bigint = PgHashFunc.hashint8extended(value: USER_ID)

puts "Raw bigint hash for #{USER_ID}: #{raw_hash_bigint}"
# => Raw bigint hash for 123456789012345: 1245190300417211467

Development

After checking out the repo, run bin/setup to install dependencies. Then, run bundle exec rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install.

Releasing

  1. Update the VERSION constant in lib/pg_hash_func/version.rb.
  2. Commit the changes.
  3. Run the release script, providing the version number:
    scripts/release.sh <VERSION>
    # e.g., scripts/release.sh 0.1.0
    This script will:
    • Build the gem.
    • Push the gem to RubyGems.
    • Create a git tag (e.g., v0.1.0).
    • Push the tag to GitHub.
    • Clean up the local gem file.
  4. Create a release on GitHub using the tag created.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/shayonj/pg_hash_func.

License

The gem is available as open source under the terms of the MIT License.

Code of Conduct

Everyone interacting in the PgHashFunc project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.