rzstd
Ractor-safe Zstandard bindings for Ruby with persistent contexts.
rzstd provides Zstd frame compress/decompress at module level and a
stateful Dictionary class for dict-bound compression. Internally it
holds onto ZSTD_CCtx / ZSTD_DCtx state across calls instead of
allocating fresh ~256 KB contexts every time, which is what makes it
viable for small-message workloads where the upstream zstd-ruby gem
loses to LZ4 purely on context-allocation overhead.
API mirrors rlz4 0.2.x:
require "rzstd"
# Module-level frame compression
ct = RZstd.compress("the quick brown fox", level: 3) # level: kwarg, default 3
RZstd.decompress(ct) # => "the quick brown fox"
# Negative levels enable Zstd's fast strategy (trades ratio for speed).
# Supported range: -131072..22. Typical useful range: -7..19.
RZstd.compress(payload, level: -3) # fast strategy, low ratio
RZstd.compress(payload, level: 19) # high ratio, slow
# Dict-bound compression
dict = RZstd::Dictionary.new(File.binread("schema.dict"), level: -3)
dict.id # => u32 Dict_ID
dict.size # => byte length
ct = dict.compress("payload that shares the schema")
dict.decompress(ct) # => "payload that shares the schema"
# Dictionary training from sample payloads (wraps ZDICT_trainFromBuffer).
# Gather representative messages, then train a dictionary once and reuse
# it on both peers. Small-message workloads benefit the most.
samples = 1000.times.map { generate_sample_message }
dict_bytes = RZstd::Dictionary.train(samples, capacity: 64 * 1024)
dict = RZstd::Dictionary.new(dict_bytes)

Dictionary IDs
Dictionary#id returns a u32 following the Zstandard spec's
Dictionary_ID semantics:
- ZDICT-format dicts (the output of Dictionary.train, or any bytes
  starting with the ZDICT magic 0xEC30A437 LE): the id is read straight
  out of header bytes [4..7]. This is the same id zstd writes into
  every compressed frame header via ZSTD_c_dictIDFlag (on by default),
  so Dictionary#id and the on-wire frame Dictionary_ID always agree.
  Receivers can therefore route incoming frames to the right dictionary
  purely by parsing the frame header, with no side channel required.
- Raw-content dicts (opaque bytes with no ZDICT header): the spec
  requires the on-wire frame Dictionary_ID to be 0, so rzstd
  synthesises a local id from sha256(bytes) mapped into the public
  range 32_768..(2**31 - 1), avoiding both reserved ranges (0..32_767,
  reserved for a future registrar, and >= 2**31). This id is useful as
  an in-process handle; it is not on the wire, so peers that need to
  agree on raw-content dicts must share them out-of-band.
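The header layouts described above can be read with a few lines of plain Ruby. The following is a minimal sketch, not rzstd's implementation: the helper names zdict_id and frame_dict_id are hypothetical, and the frame parsing assumes the standard Zstandard frame layout (magic, Frame_Header_Descriptor, optional Window_Descriptor, then a Dictionary_ID of 0, 1, 2 or 4 bytes). Skippable frames are not handled.

```ruby
# Sketch only: pure-Ruby header parsing, independent of rzstd itself.
ZDICT_MAGIC      = 0xEC30A437 # dictionary magic, stored little-endian
ZSTD_FRAME_MAGIC = 0xFD2FB528 # frame magic, stored little-endian

# Hypothetical helper: Dictionary_ID of a ZDICT-format dictionary,
# or nil when the bytes carry no ZDICT header (raw-content dict).
def zdict_id(dict_bytes)
  return nil if dict_bytes.bytesize < 8

  magic, id = dict_bytes.byteslice(0, 8).unpack("VV")
  magic == ZDICT_MAGIC ? id : nil
end

# Hypothetical helper: Dictionary_ID recorded in a Zstandard frame header.
# Returns 0 when the frame records no id, nil when this is not a zstd frame.
def frame_dict_id(frame)
  return nil if frame.bytesize < 5

  magic, descriptor = frame.byteslice(0, 5).unpack("VC")
  return nil unless magic == ZSTD_FRAME_MAGIC

  did_size = [0, 1, 2, 4][descriptor & 0x03] # Dictionary_ID_flag (bits 0-1)
  single_segment = descriptor[5] == 1        # Single_Segment_flag (bit 5)
  offset = single_segment ? 5 : 6            # skip Window_Descriptor if present
  field = frame.byteslice(offset, did_size)
  return nil if field.nil? || field.bytesize < did_size

  # Assemble the little-endian integer byte by byte.
  field.bytes.each_with_index.sum { |byte, i| byte << (8 * i) }
end
```

A receiver could keep a hash of dictionaries keyed by id and look up frame_dict_id(frame) before decompressing. A result of 0 means the frame names no dictionary, which is also what frames compressed with a raw-content dict look like on the wire.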
Public constants RZstd::Dictionary::USER_DICT_ID_MIN /
USER_DICT_ID_MAX / USER_DICT_ID_SIZE expose the same range
for callers that generate their own ids.
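For callers that generate their own ids, one way to land inside the unreserved range is to fold a digest into it. This is a sketch under stated assumptions: the constants below are local stand-ins with assumed values (32_768 and 2**31 - 1), the helper synth_dict_id is hypothetical, and the exact mapping rzstd applies to sha256(bytes) is not documented here, so the ids it produces will not necessarily match Dictionary#id.

```ruby
require "digest/sha2"

# Assumed values mirroring the range 32_768..(2**31 - 1); these are local
# stand-ins, not the gem's own constants.
USER_DICT_ID_MIN  = 32_768
USER_DICT_ID_MAX  = 2**31 - 1
USER_DICT_ID_SIZE = USER_DICT_ID_MAX - USER_DICT_ID_MIN + 1

# Hypothetical helper: derive a stable id from the dictionary bytes.
# The modulo fold is an illustrative choice, not rzstd's actual mapping.
def synth_dict_id(bytes)
  prefix = Digest::SHA256.digest(bytes).unpack1("Q>") # first 8 digest bytes
  USER_DICT_ID_MIN + (prefix % USER_DICT_ID_SIZE)
end
```

Because the id depends only on the bytes, two processes that load the same raw-content dictionary derive the same handle, although the id still never appears in a frame.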
Decoding with the wrong dictionary is caught by the content checksum
that the encoder enables: a peer using the wrong dictionary gets an
RZstd::DecompressError instead of corrupt bytes.
Ractor safety
The extension is marked Ractor-safe. Dictionary instances are
shareable. Module-level RZstd.compress / RZstd.decompress use a
single global CCtx / DCtx behind a Mutex, which serializes
calls across Ractors — if you need parallel throughput, give each
Ractor its own Dictionary (each one owns its own per-instance
contexts).