Project

rudachi-rb

0.0
No release in over a year
A Ruby wrapper for Sudachi.
 Dependencies

Runtime

>= 1.1.1
>= 1.4.0
 Project Readme

Rudachi-rb

Ruby wrapper for Sudachi.
(rudachi for Ruby)

Text

Rudachi::TextParser.parse('東京都へ行く')
=> "東京都\t名詞,固有名詞,地名,一般,*,*\t東京都\nへ\t助詞,格助詞,*,*,*,*\tへ\n行く\t動詞,非自立可能,*,*,五段-カ行,終止形-一般\t行く\nEOS\n"
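
The returned string holds one token per line as tab-separated fields (surface form, comma-separated part-of-speech tags, normalized form), followed by an `EOS` line. As a rough sketch of how that string might be turned into Ruby data, the helper below splits it into hashes. `parse_sudachi_output` is a hypothetical name and not part of rudachi-rb; the sample string is the `こんにちは世界` output quoted in this README.

```ruby
# Hypothetical helper (not part of rudachi-rb): converts the tab-separated
# output into an array of hashes, one per token.
def parse_sudachi_output(output)
  output.each_line(chomp: true).filter_map do |line|
    next if line.empty? || line == 'EOS' # skip blank lines and sentence terminators

    surface, pos, normalized = line.split("\t", 3)
    { surface: surface, pos: pos.to_s.split(','), normalized: normalized }
  end
end

sample = "こんにちは\t感動詞,一般,*,*,*,*\t今日は\n世界\t名詞,普通名詞,一般,*,*,*\t世界\nEOS\n"
tokens = parse_sudachi_output(sample)
tokens.each { |t| puts "#{t[:surface]} (#{t[:pos].first}) -> #{t[:normalized]}" }
```

The sketch relies on `filter_map`, so it assumes Ruby 2.7 or later.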

File

File.open('input.txt', 'w') { |f| f << '東京都へ行く' }
Rudachi::FileParser.parse('input.txt')
=> "東京都\t名詞,固有名詞,地名,一般,*,*\t東京都\nへ\t助詞,格助詞,*,*,*,*\tへ\n行く\t動詞,非自立可能,*,*,五段-カ行,終止形-一般\t行く\nEOS\n"

IO

Rudachi::StreamParser.parse(StringIO.new('東京都へ行く'))
=> "東京都\t名詞,固有名詞,地名,一般,*,*\t東京都\nへ\t助詞,格助詞,*,*,*,*\tへ\n行く\t動詞,非自立可能,*,*,五段-カ行,終止形-一般\t行く\nEOS\n"

Options

Rudachi::TextParser.new(o: 'output.txt', m: 'A').parse('東京都へ行く')
File.read('output.txt')
=> "東京\t名詞,固有名詞,地名,一般,*,*\t東京\n都\t名詞,普通名詞,一般,*,*,*\t都\nへ\t助詞,格助詞,*,*,*,*\tへ\n行く\t動詞,非自立可能,*,*,五段-カ行,終止形-一般\t行く\nEOS"
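
The `m:` option selects the split mode, which changes how finely the text is segmented. To compare modes side by side, something like the following sketch could collect the surface forms per mode. `surfaces_by_mode` is a hypothetical helper, not part of rudachi-rb; it takes the analyzer as a block so it runs without Sudachi installed. The modes `B` and `C` come from Sudachi's own documentation rather than from this README.

```ruby
# Hypothetical helper (not part of rudachi-rb): yields each split mode to a
# block that returns Sudachi-style output, then collects the surface forms.
def surfaces_by_mode(modes = %w[A B C])
  modes.to_h do |mode|
    output = yield(mode)
    surfaces = output.each_line(chomp: true)
                     .reject { |line| line.empty? || line == 'EOS' }
                     .map { |line| line.split("\t").first }
    [mode, surfaces]
  end
end

# With Rudachi configured, the block could call the parser. This assumes that
# `parse` returns the output string when no `o:` option is given:
#
#   surfaces_by_mode { |mode| Rudachi::TextParser.new(m: mode).parse('東京都へ行く') }
```

Passing the analyzer as a block keeps the helper independent of how the parser is configured.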

Requirements

For JRuby, see rudachi instead.

Installation

  1. Install the Sudachi JAR and dictionary
Install the Sudachi JAR file
$ wget https://github.com/WorksApplications/Sudachi/releases/download/v0.5.3/sudachi-0.5.3-executable.zip
$ unzip sudachi-0.5.3-executable.zip
$ ls sudachi-0.5.3
LICENSE-2.0.txt  README.md  javax.json-1.1.jar	jdartsclone-1.2.0.jar  licenses  sudachi-0.5.3.jar  sudachi.json  sudachi_fulldict.json
Install the Sudachi dictionary
$ wget http://sudachi.s3-website-ap-northeast-1.amazonaws.com/sudachidict/sudachi-dictionary-latest-full.zip
$ unzip -j -d sudachi-dictionary-latest-full sudachi-dictionary-latest-full.zip
$ mv sudachi-dictionary-latest-full/system_full.dic sudachi-dictionary-latest-full/system_core.dic
$ ls sudachi-dictionary-latest-full
LEGAL  LICENSE-2.0.txt	system_core.dic
  2. Install Rudachi
# Gemfile
gem 'rudachi-rb'

Then run bundle install.

  3. Initialize Rudachi
require 'rudachi/rb'

Rudachi.configure do |config|
  config.jar_path = 'sudachi-0.5.3/sudachi-0.5.3.jar'
end

Rudachi::Option.configure do |config|
  config.p = 'sudachi-dictionary-latest-full'
end
  4. Try it out
Rudachi::TextParser.parse('こんにちは世界')
=> "こんにちは\t感動詞,一般,*,*,*,*\t今日は\n世界\t名詞,普通名詞,一般,*,*,*\t世界\nEOS\n"
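
If the call fails instead, a common cause is a path that does not match what was unpacked during installation. The sketch below checks that the JAR and the dictionary file exist before Rudachi is configured. `missing_sudachi_files` is a hypothetical helper, not part of rudachi-rb, and it assumes the dictionary directory should contain `system_core.dic`, as in the installation listing above.

```ruby
# Hypothetical helper (not part of rudachi-rb): returns the expected files
# that are missing on disk.
def missing_sudachi_files(jar_path:, dict_dir:)
  # Assumption: the dictionary directory holds a file named system_core.dic.
  expected = [jar_path, File.join(dict_dir, 'system_core.dic')]
  expected.reject { |path| File.file?(path) }
end

missing = missing_sudachi_files(
  jar_path: 'sudachi-0.5.3/sudachi-0.5.3.jar',
  dict_dir: 'sudachi-dictionary-latest-full'
)
warn "Missing Sudachi files: #{missing.join(', ')}" unless missing.empty?
```

The paths are relative, so the check should run from the same working directory as the application.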