A blazing-fast Ruby extension for generating perceptual hashes and comparing them using Hamming distance.

phash_native

A minimal, native Ruby extension for computing perceptual hashes (pHash) from image files. Built entirely in C using stb_image.h, with no dependencies beyond the standard C library. Fast, simple, and does exactly one thing right.

Why this exists

There are existing Ruby gems that attempt to do perceptual hashing (like phashion), but:

  • They're old and abandoned.
  • They rely on broken or outdated C++ libraries.
  • They fail to compile on modern systems.
  • They depend on external libraries like FFTW or libjpeg, which makes them fragile and prone to drifting out of compatibility over time.

I needed something that worked. Something that could:

  • Read an image.
  • Compute a stable 64-bit perceptual hash based on DCT.
  • Compare hashes efficiently with Hamming distance.
  • Compile without screwing around with a million packages.

So I wrote it.

Installation

gem install phash_native

Or in your Gemfile:

gem "phash_native"

Usage

require "phash_native"

hash = PhashNative.compute("some/image.png")
# => 64-bit integer, e.g. 0x4f393c7331c7e7cc

# Compare two images
a = PhashNative.compute("image1.png")
b = PhashNative.compute("image2.png")
PhashNative.hamming(a, b)
# => integer between 0 and 64
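
For readers who want to see what the distance means, here is a pure-Ruby reference for the Hamming distance between two 64-bit hashes. It is only a sketch: `hamming_reference` is a hypothetical helper that is not part of phash_native, and the native `PhashNative.hamming` is the one to use in real code.

```ruby
# Hypothetical pure-Ruby reference, not part of phash_native.
MASK_64 = 0xFFFF_FFFF_FFFF_FFFF

def hamming_reference(a, b)
  # XOR leaves a 1 in every bit position where the two hashes differ.
  # Masking to 64 bits keeps the result non-negative before counting.
  ((a ^ b) & MASK_64).to_s(2).count("1")
end

a = 0x4f393c7331c7e7cc
b = a ^ 0b1011                 # same hash with three low bits flipped

puts hamming_reference(a, a)   # prints 0
puts hamming_reference(a, b)   # prints 3
```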

Interpreting results

The closer two hashes are (i.e. the smaller the Hamming distance), the more visually similar the images are. A Hamming distance of 0 means they're identical in structure. Under 10 is usually a strong match. Over 20? Probably different content.
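
The rules of thumb above could be turned into a small helper like the one below. This is a sketch, not part of the gem: `classify_distance`, the constant names, and the `:uncertain` label for the 10 to 20 range are assumptions made for illustration, and the cutoffs should be tuned against your own images.

```ruby
# Hypothetical helper; the thresholds mirror the rules of thumb in this README.
STRONG_MATCH_BELOW = 10   # distances under this are usually a strong match
DIFFERENT_ABOVE    = 20   # distances over this are probably different content

def classify_distance(distance)
  if distance.zero?
    :identical            # same hash, so identical in structure
  elsif distance < STRONG_MATCH_BELOW
    :strong_match
  elsif distance > DIFFERENT_ABOVE
    :probably_different
  else
    :uncertain            # assumed label for the range this README does not describe
  end
end

[0, 4, 15, 30].each do |d|
  puts "#{d} => #{classify_distance(d)}"
end
```

In practice the argument would come from `PhashNative.hamming(a, b)`.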

Performance

The native code uses:

  • A real 2D DCT on a 32×32 grayscale sample
  • A simple median threshold over the top-left 8×8 DCT block
  • Fast bitwise Hamming comparison (uses GCC intrinsics if available)

This means it runs fast — fast enough for batch processing thousands of frames without sweating.
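
To make the steps above concrete, here is a slow pure-Ruby sketch of the same idea: a 2D DCT over a 32×32 grayscale grid, then a median threshold over the top-left 8×8 block. It is an illustration under assumptions, not the gem's C code: the helper names are hypothetical, the DCT is left unnormalized, and the bit order is arbitrary, so the resulting value will not necessarily match what `PhashNative.compute` returns.

```ruby
# Illustrative sketch only; names, scaling, and bit order are assumptions.
BLOCK = 8

# Unnormalized 1D DCT-II of an array of numbers.
def dct_1d(values)
  n = values.length
  Array.new(n) do |k|
    sum = 0.0
    values.each_with_index do |x, i|
      sum += x * Math.cos(Math::PI * (2 * i + 1) * k / (2.0 * n))
    end
    sum
  end
end

# 2D DCT: transform every row, then every column.
def dct_2d(pixels)
  rows = pixels.map { |row| dct_1d(row) }
  rows.transpose.map { |col| dct_1d(col) }.transpose
end

# pixels is a 32x32 array of grayscale values (0..255).
def sketch_phash(pixels)
  coeffs = dct_2d(pixels)
  # Keep the 64 lowest-frequency coefficients (top-left 8x8 block).
  block = coeffs.first(BLOCK).flat_map { |row| row.first(BLOCK) }
  sorted = block.sort
  median = (sorted[31] + sorted[32]) / 2.0
  # One bit per coefficient: 1 if it is above the median, 0 otherwise.
  hash = 0
  block.each_with_index { |c, i| hash |= (1 << i) if c > median }
  hash
end

# Synthetic test pattern standing in for a decoded, downscaled image.
pixels = Array.new(32) { |y| Array.new(32) { |x| (x * 7 + y * 13) % 256 } }
puts format("0x%016x", sketch_phash(pixels))
```

Because the threshold is the median of 64 values, at most half of the bits end up set, which keeps the hash roughly balanced between ones and zeros.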

Limitations

  • Only supports images readable by stb_image.h (PNG, JPEG, BMP, and a few others). But what more could you need?
  • Always converts to grayscale and downscales to 32×32 internally
  • No audio or video support — this is just for still images
  • No SIMD or threading — yet. PRs welcome.

Yes, you could write your own. But now you don’t have to.