BetterTranslate
AI-powered YAML locale file translator for Rails and Ruby projects
BetterTranslate automatically translates your YAML locale files using cutting-edge AI providers (ChatGPT, Google Gemini, and Anthropic Claude). It's designed for Rails applications but works with any Ruby project that uses YAML-based internationalization.
Why BetterTranslate?
- Production-Ready: Tested with real APIs via VCR cassettes (18 cassettes, 260KB)
- Interactive Demo: Try it in 2 minutes with ruby spec/dummy/demo_translation.rb
- Variable Preservation: %{name} placeholders maintained in translations
- Nested YAML Support: Complex structures preserved perfectly
- Multiple Providers: Choose ChatGPT, Gemini, or Claude
| Provider | Model | Speed | Quality | Cost |
|---|---|---|---|---|
| ChatGPT | GPT-5-nano | Fast | Excellent | Medium |
| Gemini | gemini-2.0-flash-exp | Very Fast | Very Good | Low |
| Claude | Claude 3.5 | Medium | Excellent | High |
Features
Core Translation Features
- Multiple AI Providers: Support for ChatGPT (GPT-5-nano), Google Gemini (gemini-2.0-flash-exp), and Anthropic Claude
- Intelligent Caching: LRU cache with optional TTL reduces API costs and speeds up repeated translations
- Translation Modes: Choose between override (replace entire files) or incremental (merge with existing translations)
- Smart Strategies: Automatic selection between deep translation (< 50 strings) and batch translation (≥ 50 strings)
- Flexible Exclusions: Global exclusions for all languages + language-specific exclusions for fine-grained control
- Translation Context: Provide domain-specific context for medical, legal, financial, or technical terminology
- Similarity Analysis: Built-in Levenshtein distance analyzer to identify similar translations
- Orphan Key Analyzer: Find unused translation keys in your codebase with comprehensive reports (text, JSON, CSV)
New in v1.1.1
- Automatic File Creation: Input files are automatically created if they don't exist
- Initializer Priority: Rake task now checks for initializer configuration before YAML config
- Fixed Loop Issues: Removed problematic after_initialize hook that caused deadlocks
- Ruby 3.4.0 Support: Added explicit CSV dependency for compatibility
New in v1.1.0
- Provider-Specific Options: Fine-tune AI behavior with model, temperature, and max_tokens
- Automatic Backups: Configurable backup rotation before overwriting files (.bak, .bak.1, .bak.2)
- JSON Support: Full support for JSON locale files (React, Vue, modern JS frameworks)
- Parallel Translation: Translate multiple languages concurrently with thread-based execution
- Multiple Files: Translate multiple files with arrays or glob patterns (**/*.en.yml)
Development & Quality
- Comprehensive Testing: Unit tests + integration tests with VCR cassettes (18 cassettes, 260KB)
- Rails Dummy App: Interactive demo with real translations (ruby spec/dummy/demo_translation.rb)
- VCR Integration: Record real API responses, test without API keys, CI/CD friendly
- Type-Safe Configuration: Comprehensive validation with detailed error messages
- YARD Documentation: Complete API documentation with examples
- Retry Logic: Exponential backoff for failed API calls (3 attempts, configurable)
- Rate Limiting: Thread-safe rate limiter prevents API overload
Quick Start
Try It Now (Interactive Demo)
Clone the repo and run the demo to see BetterTranslate in action:
git clone https://github.com/alessiobussolari/better_translate.git
cd better_translate
bundle install
# Set your OpenAI API key
export OPENAI_API_KEY=your_key_here
# Run the demo!
ruby spec/dummy/demo_translation.rb
What happens:
- Reads en.yml with 16 translation keys
- Translates to Italian and French using ChatGPT
- Generates it.yml and fr.yml files
- Shows progress, results, and sample translations
- Takes ~2 minutes (real API calls)
Sample Output:
# en.yml (input)
en:
  hello: "Hello"
  users:
    greeting: "Hello %{name}"
# it.yml (generated)
it:
  hello: "Ciao"
  users:
    greeting: "Ciao %{name}" # Variable preserved!
# fr.yml (generated)
fr:
  hello: "Bonjour"
  users:
    greeting: "Bonjour %{name}" # Variable preserved!
See spec/dummy/USAGE_GUIDE.md for more examples.
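Placeholder preservation can also be checked mechanically after a run. The sketch below is a hypothetical helper, not part of BetterTranslate's API: it assumes placeholders use the Rails %{name} interpolation syntax shown in the sample output and compares the variable names found in the source and translated strings.

```ruby
# Hypothetical helper (illustrative only): collect the %{...} variable
# names used in a string, sorted so that order does not matter.
def placeholders(text)
  text.scan(/%\{(\w+)\}/).flatten.sort
end

# True when the translation uses exactly the same variables as the source.
def placeholders_preserved?(source, translation)
  placeholders(source) == placeholders(translation)
end

puts placeholders_preserved?("Hello %{name}", "Ciao %{name}") # true
puts placeholders_preserved?("Hello %{name}", "Ciao %{nome}") # false
```

A check like this could run over the generated it.yml and fr.yml values to flag translations where a variable was renamed or dropped.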
Rails Integration
# config/initializers/better_translate.rb
BetterTranslate.configure do |config|
  config.provider = :chatgpt
  config.openai_key = ENV["OPENAI_API_KEY"]
  # IMPORTANT: Set these manually to match your Rails I18n configuration
  # (I18n.default_locale and I18n.available_locales are not yet available)
  config.source_language = "en" # Should match config.i18n.default_locale
  config.target_languages = [
    { short_name: "it", name: "Italian" },
    { short_name: "fr", name: "French" },
    { short_name: "es", name: "Spanish" }
  ]
  config.input_file = "config/locales/en.yml"
  config.output_folder = "config/locales"
  # Optional: Provide context for better translations
  config.translation_context = "E-commerce application with product catalog"
end
# Translate all files
BetterTranslate.translate_all
Installation
Add this line to your application's Gemfile:
gem "better_translate"
And then execute:
bundle install
Or install it yourself as:
gem install better_translate
Rails Integration
For Rails applications, generate the initializer:
rails generate better_translate:install
This creates config/initializers/better_translate.rb with example configuration for all supported providers.
Important Notes (v1.1.1+):
- The initializer now uses manual language configuration instead of I18n.default_locale
- You must set source_language and target_languages to match your config/application.rb I18n settings
- This prevents loop/deadlock issues when running rake tasks
- Input files are automatically created if they don't exist
Configuration
Provider Setup
ChatGPT (OpenAI)
BetterTranslate.configure do |config|
  config.provider = :chatgpt
  config.openai_key = ENV["OPENAI_API_KEY"]
  # Optional: customize request settings (defaults shown)
  config.request_timeout = 30 # seconds
  config.max_retries = 3
  config.retry_delay = 2.0 # seconds
  # v1.1.0: Provider-specific options
  config.model = "gpt-5-nano" # Specify model (optional)
  config.temperature = 0.3 # Creativity (0.0-2.0, default: 0.3)
  config.max_tokens = 2000 # Response length limit
end
Get your API key from OpenAI Platform.
Google Gemini
BetterTranslate.configure do |config|
  config.provider = :gemini
  config.google_gemini_key = ENV["GOOGLE_GEMINI_API_KEY"]
  # Same optional settings as ChatGPT
  config.request_timeout = 30
  config.max_retries = 3
end
Get your API key from Google AI Studio.
Anthropic Claude
BetterTranslate.configure do |config|
  config.provider = :anthropic
  config.anthropic_key = ENV["ANTHROPIC_API_KEY"]
  # Same optional settings
  config.request_timeout = 30
  config.max_retries = 3
end
Get your API key from Anthropic Console.
New Features (v1.1.0)
Automatic Backups
Protect your translation files with automatic backup creation:
config.create_backup = true # Enable backups (default: true)
config.max_backups = 5 # Keep up to 5 backup versions
Backup files are created with rotation:
- First backup: it.yml.bak
- Second backup: it.yml.bak.1
- Third backup: it.yml.bak.2
- Older backups are automatically deleted
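A minimal sketch of one way such a rotation could work. It assumes the newest copy always lives at .bak and older copies shift to .bak.1, .bak.2, and so on; the README does not spell out the exact order, so treat that as an assumption. The rotate_backup helper and its max_backups keyword are illustrative names, not the gem's internals.

```ruby
require "fileutils"

# Hypothetical helper: keep at most `max_backups` copies of `path`.
# Assumed naming: newest copy is "<path>.bak", older ones ".bak.1", ".bak.2", ...
def rotate_backup(path, max_backups: 5)
  return unless File.exist?(path)

  names = ["#{path}.bak"] + (1...max_backups).map { |i| "#{path}.bak.#{i}" }

  # Free the oldest slot, then shift every remaining backup one slot down.
  FileUtils.rm_f(names.last)
  (names.size - 2).downto(0) do |i|
    FileUtils.mv(names[i], names[i + 1]) if File.exist?(names[i])
  end

  # Finally, snapshot the current file as the newest backup.
  FileUtils.cp(path, names.first)
end
```

Calling rotate_backup("config/locales/it.yml") right before a file is overwritten would keep the previous contents recoverable.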
JSON File Support
Translate JSON locale files for modern JavaScript frameworks:
# Automatically detects JSON format from file extension
config.input_file = "config/locales/en.json"
config.output_folder = "config/locales"
# All features work with JSON: backups, incremental mode, exclusions, etc.
Example JSON file:
{
  "en": {
    "common": {
      "greeting": "Hello %{name}"
    }
  }
}
Parallel Translation
Translate multiple languages concurrently for faster processing:
config.target_languages = [
  { short_name: "it", name: "Italian" },
  { short_name: "fr", name: "French" },
  { short_name: "es", name: "Spanish" },
  { short_name: "de", name: "German" }
]
config.max_concurrent_requests = 4 # Translate 4 languages at once
Performance improvement: With 4 languages and max_concurrent_requests = 4, all four languages run at the same time, so translation time can drop by up to ~75% compared to sequential processing.
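The idea of bounded, thread-based concurrency can be sketched as follows. This is an illustration, not the gem's implementation: translate_in_parallel and its max_concurrent keyword are hypothetical names, and the block stands in for a real provider call.

```ruby
# Hypothetical sketch: run the given block once per language, using at most
# `max_concurrent` worker threads, and collect the results in a Hash.
def translate_in_parallel(languages, max_concurrent: 4, &translator)
  queue = Queue.new
  languages.each { |lang| queue << lang }
  queue.close # once drained, pop returns nil and the workers stop

  results = {}
  mutex = Mutex.new
  worker_count = [max_concurrent, languages.size].min

  workers = Array.new(worker_count) do
    Thread.new do
      while (lang = queue.pop)
        translated = translator.call(lang)
        mutex.synchronize { results[lang] = translated }
      end
    end
  end

  workers.each(&:join)
  results
end

# Stand-in for a real translation call.
results = translate_in_parallel(%w[it fr es de]) { |lang| "translated for #{lang}" }
puts results.keys.sort.inspect
```

With four workers and four languages each language gets its own thread, which is where the best-case saving over sequential processing comes from.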
Multiple Files Support
Translate multiple files in a single run:
# Array of specific files
config.input_files = [
  "config/locales/common.en.yml",
  "config/locales/errors.en.yml",
  "config/locales/admin.en.yml"
]
# Or use glob patterns (recommended)
config.input_files = "config/locales/**/*.en.yml"
# Or combine both approaches
config.input_files = [
  "config/locales/**/*.en.yml",
  "app/javascript/translations/*.en.json"
]
Output files preserve the original structure:
- common.en.yml → common.it.yml
- errors.en.yml → errors.it.yml
- admin/settings.en.yml → admin/settings.it.yml
Language Configuration
config.source_language = "en" # ISO 639-1 code (2 letters)
config.target_languages = [
  { short_name: "it", name: "Italian" },
  { short_name: "fr", name: "French" },
  { short_name: "de", name: "German" },
  { short_name: "es", name: "Spanish" },
  { short_name: "pt", name: "Portuguese" },
  { short_name: "ja", name: "Japanese" },
  { short_name: "zh", name: "Chinese" }
]
File Paths
config.input_file = "config/locales/en.yml" # Source file
config.output_folder = "config/locales" # Output directory
Note (v1.1.1+): If the input file doesn't exist, it will be automatically created with a minimal valid structure (e.g., { "en": {} }).
Features in Detail
Translation Modes
Override Mode (Default)
Replaces the entire target file with fresh translations:
config.translation_mode = :override # default
Use when: Starting fresh or regenerating all translations.
Incremental Mode
Merges with existing translations, only translating missing keys:
config.translation_mode = :incremental
Use when: Preserving manual corrections or adding new keys to existing translations.
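To make "only translating missing keys" concrete, here is a small sketch of how the missing subset could be computed from two nested hashes. The missing_keys helper is hypothetical and only illustrates the idea; it is not taken from the gem.

```ruby
# Hypothetical helper: return the part of `source` whose keys are absent
# from `existing`, preserving the nested structure.
def missing_keys(source, existing)
  source.each_with_object({}) do |(key, value), missing|
    if value.is_a?(Hash)
      existing_child = existing[key].is_a?(Hash) ? existing[key] : {}
      nested = missing_keys(value, existing_child)
      missing[key] = nested unless nested.empty?
    elsif !existing.key?(key)
      missing[key] = value
    end
  end
end

source   = { "hello" => "Hello", "users" => { "greeting" => "Hi", "bye" => "Bye" } }
existing = { "hello" => "Ciao",  "users" => { "greeting" => "Salve" } }

p missing_keys(source, existing)
```

Only the returned subset would need to be sent to the provider; existing entries such as a manually corrected "Salve" stay untouched.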
Caching System
The LRU (Least Recently Used) cache stores translations to reduce API costs:
config.cache_enabled = true # default: true
config.cache_size = 1000 # default: 1000 items
config.cache_ttl = 3600 # optional: 1 hour in seconds (nil = no expiration)
Cache key format: "#{text}:#{target_lang_code}"
Benefits:
- Reduces API costs for repeated translations
- Speeds up re-runs during development
- Thread-safe with Mutex protection
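For readers unfamiliar with the technique, a minimal LRU cache with an optional TTL might look like the sketch below. It is not the gem's Cache class: the LruCache name and its get/set methods are assumptions made for illustration. It relies on Ruby hashes preserving insertion order, so the first key is always the least recently used.

```ruby
# Illustrative LRU cache with optional TTL (not BetterTranslate's own class).
class LruCache
  def initialize(capacity: 1000, ttl: nil)
    @capacity = capacity
    @ttl = ttl
    @store = {} # insertion-ordered: first entry is the least recently used
    @mutex = Mutex.new
  end

  def get(key)
    @mutex.synchronize do
      entry = @store.delete(key)
      return nil if entry.nil?
      return nil if @ttl && (now - entry[:at]) > @ttl # expired, stays deleted

      @store[key] = entry # re-insert to mark as most recently used
      entry[:value]
    end
  end

  def set(key, value)
    @mutex.synchronize do
      @store.delete(key)
      @store[key] = { value: value, at: now }
      @store.shift while @store.size > @capacity # evict least recently used
    end
    value
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

cache = LruCache.new(capacity: 2)
cache.set("Hello:it", "Ciao") # key follows the "#{text}:#{target_lang_code}" format
puts cache.get("Hello:it")
```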
Rate Limiting
Prevent API overload with built-in rate limiting:
config.max_concurrent_requests = 3 # default: 3
The rate limiter enforces a 0.5-second delay between requests by default. This is handled automatically by the BaseHttpProvider.
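A fixed minimum delay between requests can be enforced with a mutex and a monotonic clock. The sketch below shows the general technique only; SimpleRateLimiter and its wait method are hypothetical names and do not describe the gem's RateLimiter API.

```ruby
# Illustrative thread-safe limiter: each call to `wait` blocks until at least
# `delay` seconds have passed since the previous call returned.
class SimpleRateLimiter
  def initialize(delay: 0.5)
    @delay = delay
    @last_request_at = nil
    @mutex = Mutex.new
  end

  def wait
    @mutex.synchronize do
      if @last_request_at
        remaining = @delay - (monotonic_now - @last_request_at)
        sleep(remaining) if remaining.positive?
      end
      @last_request_at = monotonic_now
    end
  end

  private

  def monotonic_now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

limiter = SimpleRateLimiter.new(delay: 0.1)
3.times do |i|
  limiter.wait
  puts "request #{i + 1} sent"
end
```

Because the sleep happens while holding the mutex, concurrent callers queue up and are released one at a time, each at least one delay apart.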
Exclusion System
Global Exclusions
Keys excluded from translation in all target languages (useful for brand names, product codes, etc.):
config.global_exclusions = [
  "app.name", # "MyApp" should never be translated
  "app.company", # "ACME Inc." stays the same
  "product.sku" # "SKU-12345" is language-agnostic
]
Language-Specific Exclusions
Keys excluded only for specific languages (useful for manually translated legal text, locale-specific content, etc.):
config.exclusions_per_language = {
  "it" => ["legal.terms", "legal.privacy"], # Italian legal text manually reviewed
  "de" => ["legal.terms", "legal.privacy"], # German legal text manually reviewed
  "fr" => ["marketing.slogan"] # French slogan crafted by marketing team
}
Example:
- legal.terms is translated for Spanish, Portuguese, etc.
- But excluded for Italian and German (already manually translated)
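The way the two exclusion lists combine can be expressed in a few lines. This is a sketch of the rule as described above, with a hypothetical excluded? helper; it assumes exact key matches and does not reflect how the gem implements the check internally.

```ruby
# Hypothetical helper: a key is skipped when it is excluded globally
# or excluded for the specific target language.
def excluded?(key, lang, global:, per_language:)
  global.include?(key) || per_language.fetch(lang, []).include?(key)
end

global = ["app.name", "app.company", "product.sku"]
per_language = {
  "it" => ["legal.terms", "legal.privacy"],
  "de" => ["legal.terms", "legal.privacy"],
  "fr" => ["marketing.slogan"]
}

puts excluded?("legal.terms", "it", global: global, per_language: per_language) # true
puts excluded?("legal.terms", "es", global: global, per_language: per_language) # false
puts excluded?("app.name", "es", global: global, per_language: per_language)    # true
```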
Translation Context
Provide domain-specific context to improve translation accuracy:
config.translation_context = "Medical terminology for healthcare applications"
This context is included in the AI system prompt, helping with specialized terminology in fields like:
- Medical/Healthcare: "patient", "diagnosis", "treatment"
- Legal: "plaintiff", "defendant", "liability"
- Financial: "dividend", "amortization", "escrow"
- E-commerce: "checkout", "cart", "inventory"
- Technical: "API", "endpoint", "authentication"
Translation Strategies
BetterTranslate automatically selects the optimal strategy based on content size:
Deep Translation (< 50 strings)
- Translates each string individually
- Detailed progress tracking
- Best for small to medium files
Batch Translation (≥ 50 strings)
- Processes in batches of 10 strings
- Faster for large files
- Reduced API overhead
You don't need to configure this: the strategy is selected automatically.
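The selection rule described above can be sketched as follows. The 50-string threshold and the batch size of 10 come from this section; the constant and method names are hypothetical and chosen for illustration.

```ruby
DEEP_THRESHOLD = 50 # below this many strings, translate one by one
BATCH_SIZE = 10     # batch strategy groups strings in tens

# Hypothetical helper: pick a strategy from the number of strings.
def choose_strategy(strings)
  strings.size < DEEP_THRESHOLD ? :deep : :batch
end

# Hypothetical helper: split strings into batches for the batch strategy.
def batches(strings)
  strings.each_slice(BATCH_SIZE).to_a
end

small = Array.new(16) { |i| "string #{i}" }
large = Array.new(55) { |i| "string #{i}" }

puts choose_strategy(small)  # deep
puts choose_strategy(large)  # batch
puts batches(large).size     # 6
```

With 55 strings the batch strategy would issue 6 requests (five full batches and one of 5 strings) instead of 55 individual ones.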
Rails Integration
BetterTranslate provides three Rails generators:
1. Install Generator
Generate the initializer with example configuration:
rails generate better_translate:install
Creates: config/initializers/better_translate.rb
2. Translate Generator
Run the translation process:
rails generate better_translate:translate
This triggers the translation based on your configuration and displays progress messages.
Note (v1.1.1+): The generator now prioritizes configuration from config/initializers/better_translate.rb over YAML config files. If no configuration is found, it provides helpful error messages suggesting both configuration methods.
3. Analyze Generator
Analyze translation similarities using Levenshtein distance:
rails generate better_translate:analyze
Output:
- Console summary with similar translation pairs
- Detailed JSON report: tmp/translation_similarity_report.json
- Human-readable summary: tmp/translation_similarity_summary.txt
Use cases:
- Identify potential translation inconsistencies
- Find duplicate or near-duplicate translations
- Quality assurance for translation output
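For reference, Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below is a standard two-row dynamic-programming version; the similarity score (1.0 means identical) is one common normalization and is an assumption here, not necessarily the formula the analyzer uses.

```ruby
# Standard Levenshtein distance using two rows of the DP table.
def levenshtein(a, b)
  return b.length if a.empty?
  return a.length if b.empty?

  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    curr = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Assumed normalization: 1.0 for identical strings, 0.0 for entirely different ones.
def similarity(a, b)
  longest = [a.length, b.length].max
  return 1.0 if longest.zero?

  1.0 - levenshtein(a, b).to_f / longest
end

puts levenshtein("kitten", "sitting") # 3
puts similarity("Salva", "Salve")     # 0.8
```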
Advanced Usage
Programmatic Translation
Translate Multiple Texts to Multiple Languages
texts = ["Hello", "Goodbye", "Thank you"]
target_langs = [
  { short_name: "it", name: "Italian" },
  { short_name: "fr", name: "French" }
]
results = BetterTranslate::TranslationHelper.translate_texts_to_languages(texts, target_langs)
# Results structure:
# {
#   "it" => ["Ciao", "Arrivederci", "Grazie"],
#   "fr" => ["Bonjour", "Au revoir", "Merci"]
# }
Translate Single Text to Multiple Languages
text = "Welcome to our application"
target_langs = [
  { short_name: "it", name: "Italian" },
  { short_name: "es", name: "Spanish" }
]
results = BetterTranslate::TranslationHelper.translate_text_to_languages(text, target_langs)
# Results:
# {
#   "it" => "Benvenuto nella nostra applicazione",
#   "es" => "Bienvenido a nuestra aplicación"
# }
Custom Configuration for Specific Tasks
# Separate configuration for different domains
medical_config = BetterTranslate::Configuration.new
medical_config.provider = :chatgpt
medical_config.openai_key = ENV["OPENAI_API_KEY"]
medical_config.translation_context = "Medical terminology for patient records"
medical_config.validate!
# Use the custom config...
Dry Run Mode
Test your configuration without writing files:
config.dry_run = true
This validates everything and simulates the translation process without creating output files.
Verbose Logging
Enable detailed logging for debugging:
config.verbose = true
Orphan Key Analyzer
The Orphan Key Analyzer helps you find unused translation keys in your codebase. It scans your YAML locale files and compares them against your actual code usage, generating comprehensive reports.
CLI Usage
Find orphan keys from the command line:
# Basic text report (default)
better_translate analyze \
  --source config/locales/en.yml \
  --scan-path app/
# JSON format (great for CI/CD)
better_translate analyze \
  --source config/locales/en.yml \
  --scan-path app/ \
  --format json
# CSV format (easy to share with team)
better_translate analyze \
  --source config/locales/en.yml \
  --scan-path app/ \
  --format csv
# Save to file
better_translate analyze \
  --source config/locales/en.yml \
  --scan-path app/ \
  --output orphan_report.txt
Sample Output
Text format:
============================================================
Orphan Keys Analysis Report
============================================================
Statistics:
Total keys: 50
Used keys: 45
Orphan keys: 5
Usage: 90.0%
Orphan Keys (5):
------------------------------------------------------------
Key: users.old_message
Value: This feature was removed
Key: products.deprecated_label
Value: Old Label
...
============================================================
JSON format:
{
  "orphans": ["users.old_message", "products.deprecated_label"],
  "orphan_details": {
    "users.old_message": "This feature was removed",
    "products.deprecated_label": "Old Label"
  },
  "orphan_count": 5,
  "total_keys": 50,
  "used_keys": 45,
  "usage_percentage": 90.0
}
Programmatic Usage
Use the analyzer in your Ruby code:
# Scan YAML file
key_scanner = BetterTranslate::Analyzer::KeyScanner.new("config/locales/en.yml")
all_keys = key_scanner.scan # Returns Hash of all keys
# Scan code for used keys
code_scanner = BetterTranslate::Analyzer::CodeScanner.new("app/")
used_keys = code_scanner.scan # Returns Set of used keys
# Detect orphans
detector = BetterTranslate::Analyzer::OrphanDetector.new(all_keys, used_keys)
orphans = detector.detect
# Get statistics
puts "Orphan count: #{detector.orphan_count}"
puts "Usage: #{detector.usage_percentage}%"
# Generate report
reporter = BetterTranslate::Analyzer::Reporter.new(
  orphans: orphans,
  orphan_details: detector.orphan_details,
  total_keys: all_keys.size,
  used_keys: used_keys.size,
  usage_percentage: detector.usage_percentage,
  format: :text
)
puts reporter.generate
reporter.save_to_file("orphan_report.txt")
Supported Translation Patterns
The analyzer recognizes these i18n patterns:
- t('key') - Rails short form
- t("key") - Rails short form with double quotes
- I18n.t(:key) - Symbol syntax
- I18n.t('key') - String syntax
- I18n.translate('key') - Full method name
- <%= t('key') %> - ERB templates
- I18n.t('key', param: value) - With parameters
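A single regular expression is enough to cover the patterns listed above. The sketch below is an approximation written for illustration: the I18N_CALL constant and used_keys helper are hypothetical, and the real CodeScanner may use different or stricter rules.

```ruby
require "set"

# Matches t(...), I18n.t(...), and I18n.translate(...) whose first argument
# is a quoted string or a symbol. Group 1 = quoted key, group 2 = symbol key.
I18N_CALL = /\b(?:I18n\.)?(?:t|translate)\(\s*(?:["']([\w.]+)["']|:(\w+))/

# Hypothetical helper: collect every key referenced in a chunk of source text.
def used_keys(source)
  source.scan(I18N_CALL).map { |quoted, symbol| quoted || symbol }.to_set
end

sample = <<~'SRC'
  <h1><%= t('users.profile.title') %></h1>
  message = I18n.t(:hello)
  label = I18n.translate("products.label", count: 2)
SRC

puts used_keys(sample).to_a.sort.inspect
```

Keys built dynamically at runtime (for example through string interpolation) cannot be found this way, which is a general limitation of static scanning.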
Nested keys:
en:
  users:
    profile:
      title: "Profile" # Detected as: users.profile.title
Use cases:
- Clean up unused translations before deployment
- Identify dead code after refactoring
- Reduce locale file size
- Improve translation maintenance
- Generate reports for translation teams
Development & Testing
BetterTranslate includes comprehensive testing infrastructure with unit tests, integration tests, and a Rails dummy app for realistic testing.
Test Structure
spec/
├── better_translate/           # Unit tests (fast, no API calls)
│   ├── cache_spec.rb
│   ├── configuration_spec.rb
│   ├── providers/
│   │   ├── chatgpt_provider_spec.rb
│   │   └── gemini_provider_spec.rb
│   └── ...
│
├── integration/                # Integration tests (real API via VCR)
│   ├── chatgpt_integration_spec.rb
│   ├── gemini_integration_spec.rb
│   ├── rails_dummy_app_spec.rb
│   └── README.md
│
├── dummy/                      # Rails dummy app for testing
│   ├── config/
│   │   └── locales/
│   │       ├── en.yml          # Source file
│   │       ├── it.yml          # Generated translations
│   │       └── fr.yml
│   ├── demo_translation.rb     # Interactive demo script
│   └── USAGE_GUIDE.md
│
└── vcr_cassettes/              # Recorded API responses (18 cassettes, 260KB)
    ├── chatgpt/ (7)
    ├── gemini/ (7)
    └── rails/ (4)
Running Tests
# Run all tests (unit + integration)
bundle exec rake spec
# or
bundle exec rspec
# Run only unit tests (fast, no API calls)
bundle exec rspec spec/better_translate/
# Run only integration tests (uses VCR cassettes)
bundle exec rspec spec/integration/
# Run specific test file
bundle exec rspec spec/better_translate/configuration_spec.rb
# Run tests with documentation-style output
bundle exec rspec --format documentation
VCR Cassettes & API Testing
BetterTranslate uses VCR (Video Cassette Recorder) to record real API interactions for integration tests. This allows:
- Realistic testing with actual provider responses
- No API keys needed after initial recording
- Fast test execution (no real API calls)
- CI/CD friendly (cassettes committed to repo)
- API keys anonymized (safe to commit)
Setup API Keys for Recording
# Copy environment template
cp .env.example .env
# Edit .env and add your API keys
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=...
ANTHROPIC_API_KEY=sk-ant-...
Re-record Cassettes
# Delete and re-record all cassettes
rm -rf spec/vcr_cassettes/
bundle exec rspec spec/integration/
# Re-record specific provider
rm -rf spec/vcr_cassettes/chatgpt/
bundle exec rspec spec/integration/chatgpt_integration_spec.rb
Note: The .env file is gitignored. API keys in cassettes are automatically replaced with <OPENAI_API_KEY>, <GEMINI_API_KEY>, etc.
Rails Dummy App Demo
Test BetterTranslate with a realistic Rails app:
# Run interactive demo
ruby spec/dummy/demo_translation.rb
Output:
Starting translation...
[BetterTranslate] Italian | hello | 6.3%
[BetterTranslate] Italian | world | 12.5%
...
Success: 2 language(s)
it.yml generated (519 bytes)
fr.yml generated (511 bytes)
Generated files:
- spec/dummy/config/locales/it.yml - Italian translation
- spec/dummy/config/locales/fr.yml - French translation
See spec/dummy/USAGE_GUIDE.md for more examples.
Code Quality
# Run RuboCop linter
bundle exec rubocop
# Auto-fix violations
bundle exec rubocop -a
# Run both tests and linter
bundle exec rake
Documentation
# Generate YARD documentation
bundle exec yard doc
# Start documentation server (http://localhost:8808)
bundle exec yard server
# Check documentation coverage
bundle exec yard stats
Interactive Console
# Load the gem in an interactive console
bin/console
Security Audit
# Check for security vulnerabilities
bundle exec bundler-audit check --update
Architecture
Provider Architecture
All providers inherit from BaseHttpProvider:
BaseHttpProvider (abstract)
├── ChatGPTProvider
├── GeminiProvider
└── AnthropicProvider
BaseHttpProvider responsibilities:
- HTTP communication via Faraday
- Retry logic with exponential backoff
- Rate limiting
- Timeout handling
- Error wrapping
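Retry with exponential backoff means doubling the wait after each failed attempt. The sketch below illustrates the technique under stated assumptions: with_retries is a hypothetical helper, max_retries is treated as the total number of attempts, and base_delay mirrors the retry_delay setting shown in the provider configuration. The injectable sleeper exists only so the example can run without real pauses.

```ruby
# Hypothetical helper: run the block, retrying on any StandardError.
# Waits base_delay, then 2x, then 4x ... between attempts.
def with_retries(max_retries: 3, base_delay: 2.0, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield(attempt)
  rescue StandardError
    raise if attempt >= max_retries # give up: re-raise the last error

    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

# Demo with a fake sleeper so nothing actually waits.
delays = []
result = with_retries(sleeper: ->(s) { delays << s }) do |attempt|
  raise "temporary failure" if attempt < 3

  "succeeded on attempt #{attempt}"
end

puts result
puts delays.inspect # [2.0, 4.0]
```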
Core Components
- Configuration: Type-safe config with validation
- Cache: LRU cache with optional TTL
- RateLimiter: Thread-safe request throttling
- Validator: Input validation (language codes, text, paths, keys)
- HashFlattener: Converts nested YAML ↔ flat structure
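Flattening turns a nested locale hash into dot-separated keys, the same users.profile.title form the analyzer reports. The sketch below assumes string keys; flatten_hash and unflatten_hash are illustrative names and not HashFlattener's actual methods.

```ruby
# Nested hash -> flat hash with dot-separated keys.
def flatten_hash(hash, prefix = nil)
  hash.each_with_object({}) do |(key, value), flat|
    path = prefix ? "#{prefix}.#{key}" : key.to_s
    if value.is_a?(Hash)
      flat.merge!(flatten_hash(value, path))
    else
      flat[path] = value
    end
  end
end

# Flat hash with dot-separated keys -> nested hash.
def unflatten_hash(flat)
  flat.each_with_object({}) do |(path, value), nested|
    *parents, leaf = path.split(".")
    node = parents.reduce(nested) { |hash, key| hash[key] ||= {} }
    node[leaf] = value
  end
end

nested = { "users" => { "profile" => { "title" => "Profile" } }, "hello" => "Hello" }
flat = flatten_hash(nested)

p flat
p unflatten_hash(flat) == nested
```

Working on the flat form makes it easy to translate, cache, or exclude individual strings by key before rebuilding the nested structure for output.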
Error Hierarchy
All errors inherit from BetterTranslate::Error:
BetterTranslate::Error
├── ConfigurationError
├── ValidationError
├── TranslationError
├── ProviderError
├── ApiError
├── RateLimitError
├── FileError
├── YamlError
└── ProviderNotFoundError
Documentation
- USAGE_GUIDE.md - Complete guide to dummy app and demos
- VCR Testing Guide - How to test with VCR cassettes
- CLAUDE.md - Developer guide for AI assistants (Claude Code)
- YARD Docs - Complete API documentation
Key Documentation Files
better_translate/
├── README.md                      # This file (main documentation)
├── CLAUDE.md                      # Development guide (commands, architecture)
├── spec/
│   ├── dummy/
│   │   ├── USAGE_GUIDE.md         # Interactive demo guide
│   │   └── demo_translation.rb    # Runnable demo script
│   └── integration/
│       └── README.md              # VCR testing guide
└── docs/
    └── implementation/            # Design docs
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/alessiobussolari/better_translate.
Development Guidelines
- TDD (Test-Driven Development): Always write tests before implementing features
- YARD Documentation: Document all public methods with @param, @return, @raise, and @example
- RuboCop Compliance: Ensure code passes bundle exec rubocop before committing
- Frozen String Literals: Include # frozen_string_literal: true at the top of all files
- HTTP Client: Use Faraday for all HTTP requests (never Net::HTTP or HTTParty)
- VCR Cassettes: Record integration tests with real API responses for CI/CD
Development Workflow
# 1. Clone and setup
git clone https://github.com/alessiobussolari/better_translate.git
cd better_translate
bundle install
# 2. Create a feature branch
git checkout -b my-feature
# 3. Write tests first (TDD)
# Edit spec/better_translate/my_feature_spec.rb
# 4. Implement the feature
# Edit lib/better_translate/my_feature.rb
# 5. Ensure tests pass and code is clean
bundle exec rspec
bundle exec rubocop
# 6. Commit and push
git add .
git commit -m "Add my feature"
git push origin my-feature
# 7. Create a Pull Request
Release Workflow
Releases are automated via GitHub Actions:
# 1. Update version
vim lib/better_translate/version.rb # VERSION = "1.0.1"
# 2. Update CHANGELOG
vim CHANGELOG.md
# 3. Commit and tag
git add -A
git commit -m "chore: Release v1.0.1"
git tag v1.0.1
git push origin main
git push origin v1.0.1
# 4. GitHub Actions automatically:
#    - Runs tests
#    - Builds gem
#    - Publishes to RubyGems.org
#    - Creates GitHub Release
Setup: See .github/RUBYGEMS_SETUP.md for configuring RubyGems trusted publishing (no API keys needed!).
License
The gem is available as open source under the terms of the MIT License.
Code of Conduct
Everyone interacting in the BetterTranslate project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.
Made with ❤️ by Alessio Bussolari