AI Redactor
An AI-powered redaction tool for automatically detecting and masking Personally Identifiable Information (PII) in text, with image support (face detection) on the roadmap. Designed to help organizations comply with GDPR, HIPAA, and KVKK regulations.
Features
- Text Analysis: Regex-based detection of PII including:
  - Turkish National ID numbers
  - IBAN numbers
  - Credit card numbers
  - Phone numbers
  - Email addresses
  - Social Security Numbers
  - License plates
  - IP addresses
  - Passport numbers
  - Tax ID numbers
  - Bank account numbers
- Flexible Masking Options:
  - Custom mask characters
  - Configurable mask length
  - Format-preserving masking
  - Pattern-specific filtering
- Comprehensive Reporting:
  - JSON reports with detection details
  - Confidence scores for each detection
  - Position information
  - Summary statistics
- CLI Interface: Command-line tool for batch processing
- Developer-friendly API: Simple Ruby interface
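To make the regex-based approach concrete, here is a minimal, self-contained sketch of detection followed by fixed-length masking. It does not use the gem: `SKETCH_PATTERNS`, `sketch_detect`, and `sketch_mask` are hypothetical names, and the two regexes are illustrative assumptions rather than the patterns the gem ships with.

```ruby
# Hypothetical pattern table; these regexes are illustrative only.
SKETCH_PATTERNS = {
  email: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/,
  turkish_id: /\b[1-9]\d{10}\b/
}.freeze

# Collect every match together with its type and character offsets.
def sketch_detect(text, patterns = SKETCH_PATTERNS)
  found = []
  patterns.each do |type, regex|
    pos = 0
    while (m = regex.match(text, pos))
      found << { type: type, original: m[0], start: m.begin(0), end: m.end(0) }
      # Always advance, even if a pattern could match an empty string.
      pos = [m.end(0), m.begin(0) + 1].max
    end
  end
  found
end

# Replace each detection with a fixed-length mask, working right to left
# so that earlier offsets stay valid while the string changes length.
def sketch_mask(text, mask_char: "*", mask_length: 8)
  masked = text.dup
  sketch_detect(text).sort_by { |d| -d[:start] }.each do |d|
    masked[d[:start]...d[:end]] = mask_char * mask_length
  end
  masked
end

puts sketch_mask("ID: 12345678901, mail: john@example.com")
# => "ID: ********, mail: ********"
```

This sketch assumes detections never overlap; a fuller implementation would need to merge or rank overlapping matches before masking.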
Installation
Add this line to your application's Gemfile:
gem 'ai_redactor'
And then execute:
$ bundle install
Or install it yourself as:
$ gem install ai_redactor
Usage
Basic Text Masking
require 'ai_redactor'
# Simple masking
text = "John Smith, National ID: 12345678901, IBAN: GB29NWBK60161331926819"
masked = AiRedactor.mask_text(text)
puts masked
# => "John Smith, National ID: ********, IBAN: ********"
# Custom masking options
masked = AiRedactor.mask_text(text,
  mask_char: 'X',
  mask_length: 4,
  preserve_format: true
)
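The exact behavior of `preserve_format` is not spelled out here. One common interpretation of format-preserving masking is to replace letters and digits while leaving separators in place, so the value keeps its shape. The sketch below illustrates that idea only; `sketch_preserve_format` and its `keep_last` parameter are hypothetical and not part of the gem's API.

```ruby
# Mask every letter and digit but keep separators (spaces, dashes, dots, @)
# so the shape of the value stays recognizable. `keep_last` leaves that many
# trailing alphanumeric characters visible.
def sketch_preserve_format(value, mask_char: "X", keep_last: 0)
  total = value.count("A-Za-z0-9")
  seen = 0
  value.gsub(/[A-Za-z0-9]/) do |ch|
    seen += 1
    seen > total - keep_last ? ch : mask_char
  end
end

puts sketch_preserve_format("4111-1111-1111-1111", keep_last: 4)
# => "XXXX-XXXX-XXXX-1111"
puts sketch_preserve_format("555-123-4567")
# => "XXX-XXX-XXXX"
```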
Detailed Analysis
# Get detailed analysis report
report = AiRedactor.analyze_text(text)
puts "Found #{report.detection_count} PII items"
puts "Detection types: #{report.detection_types.join(', ')}"
puts "Average confidence: #{report.summary[:average_confidence]}"
# Access individual detections
report.detections.each do |detection|
  puts "#{detection[:type]}: #{detection[:original]} (confidence: #{detection[:confidence]})"
end
# Export as JSON
json_report = report.to_json
File.write('analysis_report.json', json_report)
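For readers curious how summary statistics like the ones above could be derived, here is a rough sketch that builds a summary from a list of detection hashes and serializes it with Ruby's standard `json` library. The detection records, the `sketch_summary` helper, and the `total` and `by_type` keys are assumptions made for illustration; the gem's actual report layout may differ.

```ruby
require "json"

# Hypothetical detection records, shaped like the hashes read in the loop above.
detections = [
  { type: :email, original: "john@test.com", confidence: 0.9 },
  { type: :turkish_id, original: "12345678901", confidence: 0.7 }
]

# Derive simple summary statistics from the detection list.
def sketch_summary(detections)
  confidences = detections.map { |d| d[:confidence] }
  average = confidences.empty? ? 0.0 : (confidences.sum / confidences.size).round(2)
  {
    total: detections.size,
    by_type: detections.map { |d| d[:type] }.tally, # requires Ruby 2.7+
    average_confidence: average
  }
end

report = { detections: detections, summary: sketch_summary(detections) }
puts JSON.pretty_generate(report)
```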
Pattern Filtering
# Only detect specific patterns
email_only = AiRedactor.mask_text(text, patterns: [:email])
# Multiple specific patterns
financial_only = AiRedactor.mask_text(text, patterns: [:iban, :credit_card, :bank_account])
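Conceptually, pattern filtering amounts to narrowing a registry of named regexes before scanning. The following sketch shows one way this might work; `SKETCH_REGISTRY` and `sketch_select_patterns` are hypothetical names, the regexes are simplified placeholders, and raising on unknown names is a design assumption rather than documented gem behavior.

```ruby
# Hypothetical registry of named patterns (simplified regexes).
SKETCH_REGISTRY = {
  email: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/,
  phone: /\b\d{3}-\d{3}-\d{4}\b/,
  ip_address: /\b(?:\d{1,3}\.){3}\d{1,3}\b/
}.freeze

# Return only the requested patterns; nil or an empty list means "all".
def sketch_select_patterns(requested = nil, registry = SKETCH_REGISTRY)
  return registry if requested.nil? || requested.empty?

  unknown = requested - registry.keys
  raise ArgumentError, "Unknown patterns: #{unknown.join(', ')}" unless unknown.empty?

  registry.slice(*requested)
end

p sketch_select_patterns([:email, :phone]).keys
# => [:email, :phone]
```

Failing loudly on an unknown name is one option; silently ignoring it would hide typos such as `:emial` and leave data unmasked.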
CLI Usage
# Basic masking
ai_redactor "Contact John at john@example.com or call 555-123-4567"
# Custom options
ai_redactor --mask-char X --mask-length 4 "Email: john@example.com"
# Detailed analysis
ai_redactor --analyze --format json "ID: 12345678901, Email: john@test.com"
# Save to file
ai_redactor --output report.txt --analyze "Sensitive data here"
# List available patterns
ai_redactor --list-patterns
# Filter specific patterns
ai_redactor --patterns email,phone "Contact info: john@test.com, 555-1234"
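As an illustration of how flags like these map onto the Ruby options, here is a small sketch using `OptionParser` from Ruby's standard library. It is not the gem's actual CLI code: `sketch_parse_cli` is a hypothetical helper that covers only a subset of the flags shown above, and the defaults are taken from the configuration table in this README.

```ruby
require "optparse"

# Parse a subset of the CLI flags into an options hash plus the input text.
def sketch_parse_cli(argv)
  options = { mask_char: "*", mask_length: 8, analyze: false, patterns: nil }

  parser = OptionParser.new do |opts|
    opts.on("--mask-char CHAR") { |v| options[:mask_char] = v }
    opts.on("--mask-length N", Integer) { |v| options[:mask_length] = v }
    opts.on("--analyze") { options[:analyze] = true }
    # The Array acceptor splits a comma-separated value: email,phone
    opts.on("--patterns LIST", Array) { |v| options[:patterns] = v.map(&:to_sym) }
  end

  # parse (without the bang) leaves argv untouched and returns the
  # remaining non-option arguments.
  text = parser.parse(argv).first
  [options, text]
end

options, text = sketch_parse_cli(["--mask-char", "X", "--mask-length", "4", "Email: john@example.com"])
p options
p text
```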
Configuration Options
Option | Description | Default |
---|---|---|
mask_char | Character used for masking | "*" |
mask_length | Length of mask | 8 |
preserve_format | Preserve original format | false |
patterns | Array of patterns to detect | All patterns |
case_sensitive | Case-sensitive matching | false |
Supported PII Patterns
- turkish_id: Turkish National ID numbers (11 digits)
- iban: International Bank Account Numbers
- credit_card: Credit card numbers (various formats)
- phone: Phone numbers (international and local)
- email: Email addresses
- ssn: Social Security Numbers (US format)
- license_plate: License plate numbers
- ip_address: IP addresses
- passport: Passport numbers
- tax_id: Tax ID numbers
- bank_account: Bank account numbers
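Purely regex-based matching tends to flag any digit run of the right length. Some of these identifier types carry public checksum rules that can be used to filter false positives. The sketch below shows two well-known ones: the Turkish National ID check digits and the Luhn algorithm used by credit card numbers. Whether the gem applies such checks is not stated here; both helper names are hypothetical.

```ruby
# Turkish National ID: 11 digits, first digit non-zero.
# Digit 10 = ((d1 + d3 + d5 + d7 + d9) * 7 - (d2 + d4 + d6 + d8)) mod 10
# Digit 11 = (sum of the first ten digits) mod 10
def sketch_valid_turkish_id?(value)
  return false unless value.match?(/\A[1-9]\d{10}\z/)

  d = value.chars.map(&:to_i)
  odd_sum = d[0] + d[2] + d[4] + d[6] + d[8]
  even_sum = d[1] + d[3] + d[5] + d[7]
  d[9] == (odd_sum * 7 - even_sum) % 10 && d[10] == d[0..9].sum % 10
end

# Luhn check: starting from the right, double every second digit,
# subtract 9 from results above 9, and require the total to end in 0.
def sketch_luhn_valid?(value)
  digits = value.delete("^0-9").chars.map(&:to_i)
  return false if digits.size < 12

  total = digits.reverse.each_with_index.map do |digit, index|
    next digit if index.even?

    doubled = digit * 2
    doubled > 9 ? doubled - 9 : doubled
  end.sum
  (total % 10).zero?
end

puts sketch_valid_turkish_id?("10000000146")   # => true
puts sketch_valid_turkish_id?("12345678901")   # => false
puts sketch_luhn_valid?("4111 1111 1111 1111") # => true
```

A check like this could also feed the confidence score: a value that matches the regex and passes its checksum is a stronger candidate than one that only matches the regex.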
Privacy & Security
- Offline Processing: No data sent to external services
- No Data Storage: Text is processed in memory only
- Configurable: Full control over detection and masking
- Compliance Ready: Supports GDPR, HIPAA, and KVKK requirements
Development
After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt.
To install this gem onto your local machine, run bundle exec rake install.
Running Tests
bundle exec rspec
Code Quality
bundle exec rubocop
Roadmap
- v0.1: ✅ Text redaction with regex patterns
- v0.2: 🔄 ONNX-powered face detection in images
- v0.3: 📋 REST/gRPC API service mode
- v1.0: 🎯 Full compliance suite with audit logs
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/ahmetxhero/ai_redactor.
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request
License
The gem is available as open source under the terms of the MIT License.
Support
- 📧 Email: ahmetxhero@gmail.com
- 🐛 Issues: GitHub Issues
- 📖 Documentation: GitHub Wiki
- 🌐 Portfolio: ahmetxhero.web.app
- 🐤 Twitter: @ahmetxhero
- 💼 LinkedIn: linkedin.com/in/ahmetxhero
Protecting Privacy, One Redaction at a Time 🛡️