0.0
The project is in a healthy, maintained state
A lightweight Ruby library that automatically detects and redacts sensitive information like national ID numbers, IBANs, phone numbers, emails, and faces in documents and images. Supports GDPR, HIPAA, and KVKK compliance.
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
 Dependencies

Development

>= 1.17
~> 13.0
~> 3.0
~> 1.0

Runtime

~> 2.0
 Project Readme

AI Redactor

Gem Version Build Status

AI-powered redaction tool for automatically detecting and masking Personally Identifiable Information (PII) in text and images. Designed to help organizations comply with GDPR, HIPAA, and KVKK regulations.

Features

  • Text Analysis: Regex-based detection of PII including:

    • Turkish National ID numbers
    • IBAN numbers
    • Credit card numbers
    • Phone numbers
    • Email addresses
    • Social Security Numbers
    • License plates
    • IP addresses
    • Passport numbers
    • Tax ID numbers
    • Bank account numbers
  • Flexible Masking Options:

    • Custom mask characters
    • Configurable mask length
    • Format-preserving masking
    • Pattern-specific filtering
  • Comprehensive Reporting:

    • JSON reports with detection details
    • Confidence scores for each detection
    • Position information
    • Summary statistics
  • CLI Interface: Command-line tool for batch processing

  • Developer-friendly API: Simple Ruby interface

Installation

Add this line to your application's Gemfile:

gem 'ai_redactor'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install ai_redactor

Usage

Basic Text Masking

require 'ai_redactor'

# Simple masking
text = "John Smith, National ID: 12345678901, IBAN: GB29NWBK60161331926819"
masked = AiRedactor.mask_text(text)
puts masked
# => "John Smith, National ID: **********, IBAN: ********"

# Custom masking options
masked = AiRedactor.mask_text(text, 
  mask_char: 'X', 
  mask_length: 4,
  preserve_format: true
)

Detailed Analysis

# Get detailed analysis report
report = AiRedactor.analyze_text(text)

puts "Found #{report.detection_count} PII items"
puts "Detection types: #{report.detection_types.join(', ')}"
puts "Average confidence: #{report.summary[:average_confidence]}"

# Access individual detections
report.detections.each do |detection|
  puts "#{detection[:type]}: #{detection[:original]} (confidence: #{detection[:confidence]})"
end

# Export as JSON
json_report = report.to_json
File.write('analysis_report.json', json_report)

Pattern Filtering

# Only detect specific patterns
email_only = AiRedactor.mask_text(text, patterns: [:email])

# Multiple specific patterns
financial_only = AiRedactor.mask_text(text, patterns: [:iban, :credit_card, :bank_account])

CLI Usage

# Basic masking
ai_redactor "Contact John at john@example.com or call 555-123-4567"

# Custom options
ai_redactor --mask-char X --mask-length 4 "Email: john@example.com"

# Detailed analysis
ai_redactor --analyze --format json "ID: 12345678901, Email: john@test.com"

# Save to file
ai_redactor --output report.txt --analyze "Sensitive data here"

# List available patterns
ai_redactor --list-patterns

# Filter specific patterns
ai_redactor --patterns email,phone "Contact info: john@test.com, 555-1234"

Configuration Options

Option Description Default
mask_char Character used for masking "*"
mask_length Length of mask 8
preserve_format Preserve original format false
patterns Array of patterns to detect All patterns
case_sensitive Case-sensitive matching false

Supported PII Patterns

  • turkish_id: Turkish National ID numbers (11 digits)
  • iban: International Bank Account Numbers
  • credit_card: Credit card numbers (various formats)
  • phone: Phone numbers (international and local)
  • email: Email addresses
  • ssn: Social Security Numbers (US format)
  • license_plate: License plate numbers
  • ip_address: IP addresses
  • passport: Passport numbers
  • tax_id: Tax ID numbers
  • bank_account: Bank account numbers

Privacy & Security

  • Offline Processing: No data sent to external services
  • No Data Storage: Text is processed in memory only
  • Configurable: Full control over detection and masking
  • Compliance Ready: Supports GDPR, HIPAA, and KVKK requirements

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt.

To install this gem onto your local machine, run bundle exec rake install.

Running Tests

bundle exec rspec

Code Quality

bundle exec rubocop

Roadmap

  • v0.1: ✅ Text redaction with regex patterns
  • v0.2: 🔄 ONNX-powered face detection in images
  • v0.3: 📋 REST/gRPC API service mode
  • v1.0: 🎯 Full compliance suite with audit logs

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/ahmetxhero/ai_redactor.

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

License

The gem is available as open source under the terms of the MIT License.

Support


Protecting Privacy, One Redaction at a Time 🛡️