A Ruby port of mlx-lm providing large language model inference, quantization, LoRA fine-tuning, and an OpenAI-compatible server built on the mlx gem. Supports Llama, Gemma, Qwen2, Phi3, Mixtral, DeepSeek, and many more architectures.
Dependencies

Development:

  • ~> 5.20
  • ~> 13.0

Runtime:

  • >= 0.30.7.5, < 1.0
Project Readme

mlx-ruby-lm


Ruby LLM inference toolkit built on the mlx gem.

Index

For full reference pages and deep dives, start at docs/index.md.

Installation

gem install mlx-ruby-lm

Or add it to a project:

bundle add mlx-ruby-lm

See docs/installation.md for requirements and source installs.

CLI Usage

Executable: mlx_lm

Commands:

  • mlx_lm generate — one-shot text generation
  • mlx_lm chat — interactive chat session
  • mlx_lm server — OpenAI-compatible HTTP server

Quick examples:

mlx_lm generate --model /path/to/model --prompt "Hello"
mlx_lm chat --model /path/to/model --system-prompt "You are concise."
mlx_lm server --model /path/to/model --host 127.0.0.1 --port 8080

See docs/cli.md for options, defaults, and current parser/behavior caveats.
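Because the server is OpenAI-compatible, a running `mlx_lm server` can be queried with Ruby's standard library alone. A minimal sketch — the `/v1/chat/completions` route and payload shape follow the OpenAI convention, and the `chat_payload`/`chat` helper names are illustrative, not part of the gem:

```ruby
require "json"
require "net/http"

# Build an OpenAI-style chat completion payload.
def chat_payload(prompt, model: "local", max_tokens: 64)
  {
    model: model,
    messages: [{ role: "user", content: prompt }],
    max_tokens: max_tokens
  }
end

# POST the payload to a locally running `mlx_lm server` and
# return the assistant's reply text.
def chat(prompt, host: "127.0.0.1", port: 8080)
  uri = URI("http://#{host}:#{port}/v1/chat/completions")
  res = Net::HTTP.post(uri, JSON.dump(chat_payload(prompt)),
                       "Content-Type" => "application/json")
  JSON.parse(res.body).dig("choices", 0, "message", "content")
end
```

Adjust the route or payload if the server's implementation diverges from the OpenAI schema.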

High-Level Ruby API Usage

require "mlx"
require "mlx_lm"

model, tokenizer = MlxLm::LoadUtils.load("/path/to/model")
text = MlxLm::Generate.generate(model, tokenizer, "Hello", max_tokens: 64)
puts text

Streaming:

MlxLm::Generate.stream_generate(model, tokenizer, "Hello", max_tokens: 64).each do |resp|
  print resp.text
end
puts

See docs/ruby-apis.md for the full API inventory.

High-Level Model Usage

LoadUtils.load expects a local model directory with files such as config.json, tokenizer.json, and model*.safetensors.

To inspect supported model keys at runtime:

require "mlx_lm"
puts MlxLm::Models::REGISTRY.keys.sort

See docs/models.md for full registry keys and remapping behavior.
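A directory missing those files will fail to load, so it can help to sanity-check it first. A small sketch using only the file names listed above — the `model_dir_ready?` helper is illustrative, not part of the gem:

```ruby
# Check that a directory contains the files LoadUtils.load expects:
# config.json, tokenizer.json, and at least one model*.safetensors shard.
def model_dir_ready?(dir)
  File.exist?(File.join(dir, "config.json")) &&
    File.exist?(File.join(dir, "tokenizer.json")) &&
    !Dir.glob(File.join(dir, "model*.safetensors")).empty?
end
```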