Swhid
A Ruby library and CLI for generating and parsing SoftWare Hash IDentifiers (SWHIDs).
SWHIDs are persistent, intrinsic identifiers for software artifacts such as files, directories, commits, releases, and snapshots. They are content-based identifiers that use Merkle DAGs for tamper-proof identification with built-in integrity verification.
This implementation follows the official SWHID specification v1.2 (ISO/IEC 18670:2025).
Features
- Generate SWHIDs for all object types:
- Content (cnt) - files and blobs
- Directory (dir) - directory trees
- Revision (rev) - commits
- Release (rel) - tags and releases
- Snapshot (snp) - repository snapshots
- Parse and validate SWHID strings
- Support for qualifiers (origin, visit, anchor, path, lines, bytes)
- Command-line interface for easy integration
- Git-compatible hash computation
- Comprehensive test suite
Installation
Add this line to your application's Gemfile:
gem 'swhid'And then execute:
bundle installOr install it yourself as:
gem install swhidUsage
Library Usage
Parsing SWHIDs
require 'swhid'
# Parse a SWHID string
swhid = Swhid.parse("swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2")
puts swhid.scheme # => "swh"
puts swhid.version # => 1
puts swhid.object_type # => "cnt"
puts swhid.object_hash # => "94a9ed024d3859793618152ea559a168bbcbb5e2"
puts swhid.to_s # => "swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2"
# Parse SWHID with qualifiers
swhid = Swhid.parse("swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2;origin=https://github.com/example/repo;lines=5-10")
puts swhid.qualifiers[:origin] # => "https://github.com/example/repo"
puts swhid.qualifiers[:lines] # => "5-10"Generating SWHIDs
Content (Files)
# From a file
content = File.read("example.txt")
swhid = Swhid.from_content(content)
puts swhid.to_s # => "swh:1:cnt:..."
# Empty file
swhid = Swhid.from_content("")
puts swhid.to_s # => "swh:1:cnt:e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"Directory
entries = [
{ name: "README.md", type: :file, target: "94a9ed024d3859793618152ea559a168bbcbb5e2" },
{ name: "src", type: :dir, target: "4b825dc642cb6eb9a060e54bf8d69288fbee4904" },
{ name: "script.sh", type: :exec, target: "84a9ed024d3859793618152ea559a168bbcbb5e1" }
]
swhid = Swhid.from_directory(entries)
puts swhid.to_sRevision (Commit)
metadata = {
directory: "4b825dc642cb6eb9a060e54bf8d69288fbee4904",
author: "John Doe <john@example.com>",
author_timestamp: 1234567890,
author_timezone: "+0000",
committer: "Jane Smith <jane@example.com>",
committer_timestamp: 1234567890,
committer_timezone: "+0000",
message: "Initial commit",
parents: [] # Optional
}
swhid = Swhid.from_revision(metadata)
puts swhid.to_sRelease (Tag)
metadata = {
name: "v1.0.0",
target: { hash: "94a9ed024d3859793618152ea559a168bbcbb5e2", type: "rev" },
author: "John Doe <john@example.com>",
author_timestamp: 1234567890,
message: "Release version 1.0.0"
}
swhid = Swhid.from_release(metadata)
puts swhid.to_sSnapshot
branches = [
{ name: "refs/heads/main", target_type: "revision", target: "94a9ed024d3859793618152ea559a168bbcbb5e2" },
{ name: "refs/tags/v1.0", target_type: "release", target: "84a9ed024d3859793618152ea559a168bbcbb5e1" },
{ name: "HEAD", target_type: "alias", target: "refs/heads/main" }
]
swhid = Swhid.from_snapshot(branches)
puts swhid.to_sWorking with Qualifiers
# Create SWHID with qualifiers
swhid = Swhid::Identifier.new(
object_type: "cnt",
object_hash: "94a9ed024d3859793618152ea559a168bbcbb5e2",
qualifiers: {
origin: "https://github.com/example/repo",
visit: "swh:1:snp:d7f1b9eb7ccb596c2622c4780febaa02549830f9",
anchor: "swh:1:rev:2db189928c94d62a3b4757b3eec68f0a4d4113f0",
path: "/src/main.rb",
lines: "10-20",
bytes: "0-100"
}
)
puts swhid.to_s
# => "swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2;origin=https://github.com/example/repo;visit=swh:1:snp:...;anchor=swh:1:rev:...;path=/src/main.rb;lines=10-20;bytes=0-100"CLI Usage
The gem includes a command-line tool for working with SWHIDs:
Parse a SWHID
$ swhid parse "swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2"
SWHID: swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2
Core: swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2
Type: cnt
Hash: 94a9ed024d3859793618152ea559a168bbcbb5e2Generate SWHID from file content
$ cat file.txt | swhid content
swh:1:cnt:9daeafb9864cf43055ae93beb0afd6c7d144bfa4
$ echo "Hello, World!" | swhid content
swh:1:cnt:96898574d1b88e619be24fd90bb4cd399acbc5caGenerate SWHID from directory
$ swhid directory /path/to/directory
swh:1:dir:4b825dc642cb6eb9a060e54bf8d69288fbee4904Generate SWHID from git commit
$ swhid revision /path/to/repo
swh:1:rev:bc0195aad0daa2ad5b0d76cce22b167bc3435590
$ swhid revision /path/to/repo main
swh:1:rev:bc0195aad0daa2ad5b0d76cce22b167bc3435590
$ swhid revision /path/to/repo abc123
swh:1:rev:bc0195aad0daa2ad5b0d76cce22b167bc3435590Generate SWHID from git tag
$ swhid release /path/to/repo v1.0.0
swh:1:rel:2b10839e32c4c476e9d94492756bb1a3e1ec4aa8Generate SWHID from git snapshot
$ swhid snapshot /path/to/repo
swh:1:snp:6e65b86363953b780d92b0a928f3e8fcdd10db36Add qualifiers
$ cat file.txt | swhid content -q origin=https://github.com/example/repo -q lines=1-10
swh:1:cnt:9daeafb9864cf43055ae93beb0afd6c7d144bfa4;origin=https://github.com/example/repo;lines=1-10JSON output
$ swhid parse "swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2" -f json
{
"swhid": "swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2",
"core": "swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2",
"object_type": "cnt",
"object_hash": "94a9ed024d3859793618152ea559a168bbcbb5e2",
"qualifiers": {}
}Object Types
- cnt (content): Individual files or blobs
- dir (directory): Directory trees with entries
- rev (revision): Git commits or equivalent
- rel (release): Tags or releases
- snp (snapshot): Repository snapshots at a point in time
Qualifiers
SWHIDs can include optional qualifiers to provide context:
- origin: URL of the software origin
- visit: Core SWHID of the snapshot when visited
- anchor: Core SWHID of the anchor node (directory, revision, release, or snapshot)
- path: Absolute file path from the root directory
- lines: Line range (e.g., "10-20")
- bytes: Byte range (e.g., "100-500")
Git Compatibility
The hash computation for content, directory, revision, and release objects is compatible with Git's object hashing. This means you can use this gem to compute the same hashes that Git would produce for the same objects.
Development
After checking out the repo, run bin/setup to install dependencies. Then, run rake test to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and the created tag, and push the .gem file to rubygems.org.
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/andrew/swhid. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the code of conduct.
Code of Conduct
Everyone interacting in the Swhid project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.