No commit activity in last 3 years
No release in over 3 years
farsi_processor is a Ruby gem that processes (stems and normalizes) Persian/Farsi text.
 Dependencies

Development

~> 1.14
~> 10.0
~> 3.0
 Project Readme

Farsi Normalizer

FarsiProcessor is a Ruby gem that normalizes and stems Persian/Farsi text.

Normalization is defined as:

Stemming is defined as removing these suffixes (+ suffixes of plural form)

Installation

Add this line to your application's Gemfile:

gem "farsi_processor"

And then execute:

$ bundle

Or install it yourself as:

$ gem install farsi_processor

Usage

require 'farsi_processor'

[1] pry(main)> FarsiProcessor.process("ك")
=> "ک"

[2] pry(main)> FarsiProcessor.process("کتاب‌ ها")
=> "کتاب"

# process accepts :only and :except options
[3] pry(main)> FarsiProcessor.process("ك ي", only: ["ك"])
=> "ک ي"

[4] pry(main)> FarsiProcessor.process("ك ي", except: ["ك"])
=> "ك ی"

[5] pry(main)> FarsiProcessor.process('دخترهای', except: ['های'])
=> "دختره"
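
To make the effect of only and except in examples [3] and [4] more concrete, here is a minimal sketch of how such filtering could work for normalization. It is not the gem's actual implementation: the module name NormalizerSketch and the two-entry MAPPING table are assumptions made for illustration, covering only the two character pairs that appear in the examples above.

```ruby
# Hypothetical sketch; NOT the farsi_processor source code.
module NormalizerSketch
  # Arabic code points mapped to their Persian counterparts.
  # Only the two pairs seen in the usage examples are listed here;
  # the gem's real rule table is an unknown.
  MAPPING = {
    "\u0643" => "\u06A9", # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
    "\u064A" => "\u06CC"  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
  }.freeze

  # Returns a new string with the selected replacement rules applied.
  # only:   apply just the rules whose source character is listed
  # except: apply every rule apart from the listed ones
  def self.normalize(text, only: nil, except: nil)
    rules = MAPPING
    rules = rules.select { |from, _to| only.include?(from) } if only
    rules = rules.reject { |from, _to| except.include?(from) } if except
    return text.dup if rules.empty?

    # Replace every occurrence of a selected source character.
    text.gsub(Regexp.union(rules.keys)) { |ch| rules[ch] }
  end
end

puts NormalizerSketch.normalize("\u0643 \u064A", only: ["\u0643"])
puts NormalizerSketch.normalize("\u0643 \u064A", except: ["\u0643"])
```

With the Arabic kaf passed to only, just that character is converted and the Arabic yeh is left alone; with it passed to except, the opposite happens. This mirrors the outputs shown in [3] and [4].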

# you can also call normalize or stem on its own;
# both accept the only and except options
[6] pry(main)> FarsiProcessor.normalize("ك")
=> "ک"

[7] pry(main)> FarsiProcessor.stem("کتاب‌ ها")
=> "کتاب"
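
Example [7] suggests that stemming strips a plural suffix together with the zero-width non-joiner and space that precede it. The sketch below illustrates that idea under stated assumptions: StemSketch is a hypothetical module, its suffix list contains only the plural marker "ها" taken from the example, and the gem's real suffix list and matching rules may differ.

```ruby
# Hypothetical sketch; NOT the farsi_processor source code.
module StemSketch
  # Assumed suffix list: just the plural marker from the usage example.
  SUFFIXES = ["\u0647\u0627"].freeze # "ها"

  # Removes the first matching suffix, plus any whitespace or
  # zero-width non-joiners (U+200C) directly before it.
  def self.stem(word, only: nil, except: nil)
    suffixes = SUFFIXES
    suffixes = suffixes.select { |s| only.include?(s) } if only
    suffixes = suffixes.reject { |s| except.include?(s) } if except

    # Try longer suffixes first so a short one never shadows a long one.
    suffixes.sort_by { |s| -s.length }.each do |suffix|
      pattern = /[\s\u200C]*#{Regexp.escape(suffix)}\z/
      return word.sub(pattern, "") if word.match?(pattern)
    end
    word.dup
  end
end

# "کتاب" + ZWNJ + space + "ها", written with escapes to keep the ZWNJ visible
puts StemSketch.stem("\u06A9\u062A\u0627\u0628\u200C \u0647\u0627")
```

Excluding the suffix through except leaves the word unchanged in this sketch, which is the simplest reading of the option; the gem itself may fall back to other suffixes, as example [5] hints.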

Questions or Problems?

If you run into a problem with farsi_processor that you cannot solve, please open an issue on GitHub, or fork the project and send a pull request.

License

The gem is available as open source under the terms of the MIT License.