Project

ocrb

0.0
A long-lived project that still receives updates
OCR
Dependencies

Runtime

~> 0.3.0
Project Readme

Ocrb

OCR

Installation

Install the gem and add it to the application's Gemfile by executing:

bundle add ocrb

If Bundler is not used to manage dependencies, install the gem by executing:

gem install ocrb

Usage

Require the gem and call Ocrb.run with an image path and a prompt:

require "ocrb"

text = Ocrb.run("receipt.jpg", "Extract the text from this image.")
puts text

By default, Ocrb.run uses the ollama CLI with the glm-ocr:bf16 model:

Ocrb.run("receipt.jpg", "Summarize the line items.")

If you want to use an OpenAI-compatible API instead, pass the built-in extractor explicitly:

require "ocrb"

text = Ocrb.run(
  "receipt.jpg",
  "Recognize total amount.",
  extractor: Ocrb::Extractors::OpenAi.new(
    url: "http://127.0.0.1:1234/v1",
    model: "zai-org/glm-4.6v-flash",
    api_key: ENV.fetch("OPENAI_API_KEY", "asdf"),
    # json can be `nil`, `true`, or a schema for `response_format.json_schema.schema`
    json: {type: "object", properties: {amount: {type: "string"}}}
  )
)
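
The README does not say whether a call with a JSON schema returns parsed data or raw JSON text. The following is a minimal sketch, assuming the result is a JSON string that matches the schema above; extract_amount is a hypothetical helper and not part of the gem:

```ruby
require "json"

# Hypothetical helper (not part of ocrb). Assumes the OCR result is JSON text
# shaped like {"amount": "..."}; returns nil for anything else.
def extract_amount(result)
  data = JSON.parse(result.to_s)
  data.is_a?(Hash) ? data["amount"] : nil
rescue JSON::ParserError
  nil # the model returned something that is not valid JSON
end

puts extract_amount('{"amount": "12.50"}')
```

Returning nil on malformed output keeps the caller in control of retries, which may matter because model output is not guaranteed to follow the schema.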

Or use the Auge extractor, which takes a language code in place of the prompt:

require "ocrb"

text = Ocrb.run(
  "receipt.jpg",
  "zh-Hans", # language, can be en/zh-Hans/ja... or mixed
  extractor: Ocrb::Extractors::Auge.new
)

You can also resize the image before OCR by passing a resizer:

require "ocrb"

text = Ocrb.run(
  "receipt.jpg",
  "Extract all visible text.",
  resizer: Ocrb::Resizers::Sips.new(resample_width: 1024)
)

Both extractor and resizer are duck-typed. Any object that responds to extract(image_path, prompt) can be passed as the extractor, and any object that responds to resize(image_path) can be passed as the resizer.
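
As an illustration of the duck-typed interfaces, here is a minimal sketch of a custom extractor and resizer. StubExtractor and PassthroughResizer are hypothetical names, and the sketch assumes that resize returns the path of the image to hand on to OCR, which the README does not state:

```ruby
# Hypothetical extractor: only needs to respond to extract(image_path, prompt).
# Returns canned text, which can be handy in tests that should not call a model.
class StubExtractor
  def extract(image_path, prompt)
    "stub text for #{File.basename(image_path)} (prompt: #{prompt})"
  end
end

# Hypothetical resizer: only needs to respond to resize(image_path).
# Assumption: the return value is the path of the image to run OCR on.
class PassthroughResizer
  def resize(image_path)
    image_path
  end
end

# With the gem installed, these could be passed in like the built-in ones:
#   Ocrb.run("receipt.jpg", "Extract the text.",
#            extractor: StubExtractor.new, resizer: PassthroughResizer.new)
```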

License

The gem is available as open source under the terms of the MIT License.