lex-llm-gateway

Version: 0.2.15 | License: MIT | Ruby: >= 3.4

Warning

lex-llm-gateway is legacy compatibility glue. New LegionIO LLM work should use legion-llm directly; the current legion-llm uplift owns in-tree metering, fleet transport, and inference routing through Legion::LLM.

LLM inference gateway for LegionIO. Provides centralized metering over RabbitMQ, fleet RPC dispatch to GPU workers, and local disk spooling for offline resilience.

Installation

Add to your Gemfile:

gem 'lex-llm-gateway'

Overview

lex-llm-gateway wraps all LLM calls with automatic metering and fleet routing. It is designed for clusters with 100k+ edge nodes that cannot have direct database access.

Three node roles:

Role                         What It Does
Publisher (all nodes)        Calls Inference.chat which auto-meters to RMQ or disk spool
Fleet Worker (GPU nodes)     Runs InferenceWorker actor, processes fleet requests
Metering Writer (DB nodes)   Runs MeteringWriter actor, writes to metering_records

Degradation Ladder

Full stack (transport + gateway + LLM + fleet)
  no transport  -> spool to disk, flush when reconnected
  no gateway    -> Legion::LLM direct (no metering)
  no fleet      -> local/cloud only
  no cloud      -> local LLM only
  no local      -> error
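
The ladder above can be read as a small decision function. The following is only a sketch: Availability and inference_plan are hypothetical names, not part of the gem. It also assumes that fleet dispatch needs both the gateway and the transport, and that targets are tried in the order fleet, cloud, local.

```ruby
# Hypothetical availability flags; not part of the lex-llm-gateway API.
Availability = Struct.new(:transport, :gateway, :fleet, :cloud, :local, keyword_init: true)

# Returns how to meter and which inference targets to try, in order.
def inference_plan(avail)
  metering =
    if !avail.gateway
      :none      # no gateway: Legion::LLM direct, nothing is metered
    elsif avail.transport
      :publish   # full stack: metering events go to RabbitMQ
    else
      :spool     # no transport: spool to disk, flush when reconnected
    end

  targets = []
  # Assumption: fleet dispatch needs both the gateway and the transport.
  targets << :fleet if avail.gateway && avail.transport && avail.fleet
  targets << :cloud if avail.cloud
  targets << :local if avail.local
  raise 'no inference target available' if targets.empty? # bottom rung: error

  { metering: metering, targets: targets }
end

plan = inference_plan(
  Availability.new(transport: false, gateway: true, fleet: true, cloud: false, local: true)
)
# plan => { metering: :spool, targets: [:local] }
```

Keeping the metering decision separate from the target list mirrors the ladder: losing the transport changes how events are recorded, while losing fleet, cloud, or local only shortens the list of places a request can run.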

Runners

  • Metering - build_event, publish_or_spool, flush_spool
  • Inference - chat, embed, structured (all auto-metered)
  • Fleet - dispatch to GPU workers with timeout and JWT auth
  • FleetHandler - handle_fleet_request (validates JWT, calls local LLM)
  • MeteringWriter - write_metering_record (DB insert consumed from RMQ)
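
To illustrate the publish-or-spool idea behind the Metering runner, here is a minimal standalone sketch. It does not reproduce the gem's implementation: DiskSpool, its JSON Lines file layout, and the publisher callable are assumptions made for illustration, and only the names publish_or_spool and flush come from the runner list above.

```ruby
require 'json'
require 'fileutils'

# Hypothetical append-only spool: one JSON-encoded event per line.
class DiskSpool
  def initialize(path)
    @path = path
    FileUtils.mkdir_p(File.dirname(path))
  end

  def push(event)
    File.open(@path, 'a') { |f| f.puts(JSON.generate(event)) }
  end

  # Yields spooled events in order and stops at the first failure.
  # Events that were not delivered stay on disk. Returns the number sent.
  def flush
    return 0 unless File.exist?(@path)

    pending = File.readlines(@path, chomp: true).reject(&:empty?)
    sent = 0
    pending.each do |line|
      yield JSON.parse(line) # note: keys come back as strings
      sent += 1
    rescue StandardError
      break
    end

    remaining = pending.drop(sent)
    if remaining.empty?
      File.delete(@path)
    else
      File.write(@path, "#{remaining.join("\n")}\n")
    end
    sent
  end
end

# publisher is any callable that raises when the transport is unavailable.
def publish_or_spool(event, publisher:, spool:)
  publisher.call(event)
  :published
rescue StandardError
  spool.push(event)
  :spooled
end
```

On reconnect, a caller would run something like spool.flush { |event| publisher.call(event) }. Stopping at the first failed event keeps the remaining events in their original order for the next attempt.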

Standalone Client

require 'legion/extensions/llm/gateway/client'

client = Legion::Extensions::Llm::Gateway::Client.new
result = client.chat(model: 'claude-opus-4-6', messages: [{ role: 'user', content: 'Hello' }])
result[:success]  # => true
result[:response] # => "Hello! How can I help you?"
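
Callers will usually want to branch on result[:success] before using the response. The helper below is a sketch that relies only on the two keys shown above. GatewayCallError and response_text are hypothetical names, and the shape of a failed result is not documented here, so the whole hash is included in the error message.

```ruby
# Hypothetical error class for failed gateway calls.
class GatewayCallError < StandardError; end

# Returns the response text, or raises if the call did not succeed.
def response_text(result)
  return result[:response] if result[:success]

  raise GatewayCallError, "LLM call failed: #{result.inspect}"
end

text = response_text({ success: true, response: 'Hello! How can I help you?' })
# text => "Hello! How can I help you?"
```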

Settings

{
  "llm": {
    "routing": {
      "use_fleet": true,
      "fleet": {
        "timeout_seconds": 30,
        "require_auth": false
      }
    }
  }
}
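
As a sketch of how these nested keys might be consumed, the snippet below parses a settings document and flattens the fleet routing options. fleet_options and FLEET_DEFAULTS are hypothetical, and using the values from the example above as fallbacks is an assumption: the gem's real defaults are not stated here.

```ruby
require 'json'

# Assumed fallbacks, copied from the example settings; not confirmed defaults.
FLEET_DEFAULTS = { use_fleet: true, timeout_seconds: 30, require_auth: false }.freeze

# Flattens llm.routing and llm.routing.fleet into one options hash.
def fleet_options(settings_json)
  routing = JSON.parse(settings_json, symbolize_names: true).dig(:llm, :routing) || {}
  fleet = routing[:fleet] || {}
  {
    use_fleet: routing.fetch(:use_fleet, FLEET_DEFAULTS[:use_fleet]),
    timeout_seconds: fleet.fetch(:timeout_seconds, FLEET_DEFAULTS[:timeout_seconds]),
    require_auth: fleet.fetch(:require_auth, FLEET_DEFAULTS[:require_auth])
  }
end

opts = fleet_options('{"llm":{"routing":{"use_fleet":false,"fleet":{"timeout_seconds":5}}}}')
# opts => { use_fleet: false, timeout_seconds: 5, require_auth: false }
```

Using fetch rather than || keeps an explicit false in the settings from being replaced by a truthy fallback.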

Requirements

  • Ruby >= 3.4
  • LegionIO framework
  • legion-transport (AMQP metering + inference queues)
  • legion-crypt (JWT signing for fleet auth, optional)
  • legion-data (MeteringWriter and disk spool, optional)
  • legion-llm (preferred inference, fleet, and metering owner for new work)

Development

bundle install
bundle exec rspec     # 199 examples, 0 failures
bundle exec rubocop   # 0 offenses

License

MIT