No release in over 3 years
There are a lot of open issues
Pure Ruby language detection library using character n-gram frequency profiles. Detects 48 languages including European, Asian, and African languages with script-based fast-path and mixed-language segment detection.
 Dependencies

Development

~> 5.0
~> 13.0
 Project Readme

langdetect-ruby

Language detection for Ruby using n-gram profiles. Supports 10+ languages with script-based fast paths for CJK, Arabic, Thai, and Devanagari.
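
To illustrate what comparing n-gram profiles can look like, here is a minimal sketch. It is not the gem's implementation: the NgramSketch module, its method names, the use of trigrams, the cosine similarity scoring, and the two toy training sentences are all assumptions made for this example.

```ruby
# Hypothetical helper module for illustration; not part of the gem's API.
module NgramSketch
  module_function

  # Build a relative-frequency profile of character n-grams.
  # Non-letter characters are collapsed to single spaces so that
  # word boundaries still contribute n-grams.
  def profile(text, n: 3)
    padded = " #{text.downcase.gsub(/[^\p{L}]+/, ' ').strip} "
    counts = Hash.new(0)
    padded.chars.each_cons(n) { |gram| counts[gram.join] += 1 }
    total = counts.values.sum.to_f
    return {} if total.zero?

    counts.transform_values { |count| count / total }
  end

  # Cosine similarity between two profiles. Frequencies are never
  # negative, so the result stays within 0.0..1.0.
  def similarity(a, b)
    dot = a.sum { |gram, freq| freq * b.fetch(gram, 0.0) }
    norm = Math.sqrt(a.values.sum { |v| v * v }) *
           Math.sqrt(b.values.sum { |v| v * v })
    norm.zero? ? 0.0 : dot / norm
  end

  # Return [language, score] for the closest reference profile.
  def detect(text, profiles)
    sample = profile(text)
    scores = profiles.transform_values { |ref| similarity(sample, ref) }
    scores.max_by { |_, score| score }
  end
end

# Toy reference profiles; a real detector would train on large corpora.
PROFILES = {
  en: NgramSketch.profile("the quick brown fox jumps over the lazy dog and then the cat"),
  id: NgramSketch.profile("selamat pagi dunia, saya sedang makan nasi dan minum air")
}.freeze

lang, score = NgramSketch.detect("the dog and the fox", PROFILES)
puts "#{lang} (#{score.round(2)})"
```

With reference texts this small the score is only a rough signal. The point is the shape of the computation: build a profile, compare it against each reference, and keep the best match.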

Installation

gem "langdetect-ruby"

Usage

require "lingua_ruby"

# Single detection
result = LinguaRuby.detect("This is an English sentence")
result.language  # => :en
result.confidence  # => 0.92

# Batch detection
results = LinguaRuby.detect_batch([
  "Hello world",
  "Halo dunia",
  "こんにちは世界"
])

# Restrict to specific languages
detector = LinguaRuby::Detector.new(languages: [:en, :id, :ms])
result = detector.detect("Selamat pagi")

Features

  • N-gram profile comparison with normalized confidence (0.0-1.0)
  • CJK/Arabic/Thai/Devanagari script fast-path detection
  • Short text mode (< 20 chars) with higher n-gram orders
  • Batch detection with single profile load
  • Indonesian/Malay/Sundanese differentiation
  • Input validation and error handling
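
The script fast path listed above can be sketched with Ruby's Unicode script properties. This is an assumption about the general approach, not the gem's code: the SCRIPT_PATTERNS table, the script_fast_path method, and the script-to-language mapping are hypothetical. The mapping is also deliberately coarse, since Arabic script is used for Persian and Urdu, and Devanagari for Marathi and Nepali.

```ruby
# Hypothetical script-to-language table, checked in order.
# Kana is tested before Han so that Japanese text mixing kanji and
# kana is not reported as Chinese.
SCRIPT_PATTERNS = {
  ja: /[\p{Hiragana}\p{Katakana}]/,
  zh: /\p{Han}/,
  ar: /\p{Arabic}/,
  th: /\p{Thai}/,
  hi: /\p{Devanagari}/
}.freeze

# Return a language code when the text contains a distinctive script,
# or nil so the caller can fall back to n-gram comparison.
def script_fast_path(text)
  SCRIPT_PATTERNS.each do |lang, pattern|
    return lang if text.match?(pattern)
  end
  nil
end

puts script_fast_path("こんにちは世界").inspect
puts script_fast_path("Hello world").inspect
```

A first-match check like this misfires on text that contains a single stray character from another script. A sturdier version would count characters per script and require a majority before skipping the n-gram step.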

License

MIT