scout-ai

Agentic AI and machine learning for Scout: a compact layer to train/evaluate models (Ruby, Python/PyTorch, Hugging Face), talk to LLMs across multiple backends, wire Workflow tasks as tools, and build persistent, declarative conversations and agents.

This package sits on top of the Scout stack:

  • scout-essentials — low‑level functionality (Open, TSV, Persist, Path, ConcurrentStream, Log, etc.)
  • scout-gear — core data modules (TSV, KnowledgeBase, Entity, Association, Workflow, WorkQueue, etc.)
  • scout-rig — language bridges (notably Python via PyCall)
  • scout-camp — remote servers, cloud deployments, web interfaces
  • scout-ai — LLMs, agents and model wrappers (this repository)

All packages are available under github.com/mikisvaz.

Scout originates from the Rbbt ecosystem (bioinformatics workflows). Numerous end‑to‑end examples live in the Rbbt‑Workflows organization.

The sections below summarize the main components (LLM, Chat, Agent, Model), quick starts, and the command‑line interface. For full APIs, see the doc/ directory.

  • doc/LLM.md — multi‑backend LLM orchestration, tool calling, embeddings
  • doc/Chat.md — conversation builder/serializer
  • doc/Agent.md — stateful agents wired to Workflows and KnowledgeBases
  • doc/Model.md — model wrappers (ScoutModel, Python/Torch/Hugging Face)

Installation and requirements

Scout is a Ruby framework. Add scout-ai (and any other packages you need) to your project and require them as needed.

  • Ruby 3.x recommended
  • For Python‑backed models (Torch/Hugging Face):
    • Python 3 (installed and visible in PATH)
    • pycall gem (Ruby ↔ Python bridge)
    • Python packages: torch, transformers, numpy, pandas (as needed)
  • For OpenAI or similar backends: set API keys in environment or config (see LLM backend docs)
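
For OpenAI‑compatible backends, the key is typically supplied through the environment (doc/LLM.md lists the variable each backend reads; the value here is illustrative):

export OPENAI_API_KEY="sk-..."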

Typical Gemfile fragment:

gem 'scout-essentials', git: 'https://github.com/mikisvaz/scout-essentials'
gem 'scout-gear',       git: 'https://github.com/mikisvaz/scout-gear'
gem 'scout-rig',        git: 'https://github.com/mikisvaz/scout-rig'
gem 'scout-ai',         git: 'https://github.com/mikisvaz/scout-ai'

Backends and endpoints can be configured with YAML files under Scout.etc.AI (presets are merged into the options of each ask), or via environment variables per backend (see doc/LLM.md).

Quick starts

Ask a model

require 'scout-ai'
answer = LLM.ask "What is the capital of France?", backend: :openai, model: "gpt-4.1-mini"
puts answer

Or via the Chat builder:

chat = Chat.setup []
chat.system "You are a terse assistant"
chat.user   "List three colors"
puts chat.ask

Tool calling with a Workflow

Export Workflow tasks as callable tools so the model can invoke them through function calls.

require 'scout-gear'  # defines Workflow

m = Module.new do
  extend Workflow
  self.name = "Registration"

  input :name, :string
  input :age, :integer
  input :gender, :select, nil, select_options: %w(male female)
  task :person => :yaml do inputs.to_hash end
end

puts LLM.workflow_ask(m, "Register Eduard Smith, a 25 yo male, using a tool call",
                      backend: 'ollama', model: 'llama3')

Stateful agent with a KnowledgeBase

require 'scout-gear'  # defines KnowledgeBase
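# datafile_test resolves example datasets bundled with the test suites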

TmpFile.with_dir do |dir|
  kb = KnowledgeBase.new dir
  kb.register :brothers, datafile_test(:person).brothers, undirected: true
  kb.register :marriages, datafile_test(:person).marriages,
             undirected: true, source: "=>Alias", target: "=>Alias"
  kb.register :parents, datafile_test(:person).parents

  agent = LLM::Agent.new knowledge_base: kb
  puts agent.ask "Who is Miki's brother in law?"
end

Structured iteration

agent = LLM::Agent.new
agent.iterate("List three steps to bake bread") { |step| puts "- #{step}" }

agent.iterate_dictionary("Give capital cities for FR, ES, IT") do |country, capital|
  puts "#{country}: #{capital}"
end

Use a Hugging Face classifier inside a Workflow

From the ExTRI2 workflow (see below):

model = HuggingfaceModel.new 'SequenceClassification', tri_model_dir, nil,
  tokenizer_args: { model_max_length: 512, truncation: true },
  return_logits: true

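# Mark the TF/TG placeholders as inline tags before classification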
model.extract_features do |_, rows|
  rows.map do |text, tf, tg|
    text.sub("[TF]", "<TF>#{tf}</TF>").sub("[TG]", "<TG>#{tg}</TG>")
  end
end

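# Batch‑evaluate the (Text, TF, Gene) rows and append a score field to the TSV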
model.init
preds = model.eval_list tsv.slice(%w(Text TF Gene)).values
tsv.add_field "Valid score" do
  non_valid, valid = preds.shift
  Misc.softmax([valid, non_valid]).first rescue 0
end

Components overview

LLM (doc/LLM.md)

A compact, multi‑backend layer to ask LLMs, wire function‑calling tools, parse/print chats, and compute embeddings.

  • ask(question, options={}, &block) — normalize a question to messages (LLM.chat), merge endpoint/model/format options, run the backend, and return the assistant output (or the full messages with return_messages: true; see the sketch after this list)
  • Backends: OpenAI‑style, Responses (multimodal, JSON schema), Ollama, OpenWebUI, AWS Bedrock, and a simple Relay
  • Tools: export Workflow tasks (LLM.workflow_tools) and KnowledgeBase lookups; tool calls are handled via a block
  • Embeddings and a tiny RAG helper
  • Chat/print pipeline: imports, clean, tasks/jobs as function calls, files/directories as tagged content
  • Configuration: endpoint defaults in Scout.etc.AI/endpoint.yaml are merged into options automatically
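
For example, return_messages exposes the whole conversation rather than only the final text (a minimal sketch; messages follow the {role:, content:} shape described in the Chat section):

messages = LLM.ask "Name two rivers", return_messages: true
messages.each { |m| puts "#{m[:role]}: #{m[:content]}" }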

Chat (doc/Chat.md)

A lightweight builder over an Array of {role:, content:} messages with helpers:

  • user/system/assistant, file/directory tagging, import/continue
  • tool/workflow task declarations, jobs/inline jobs
  • association declarations (KnowledgeBase)
  • option, endpoint, model, format (including JSON schema requests)
  • ask, chat, json/json_format, print/save/write/write_answer, branch/shed

Use Chat to author “chat files” on disk or build conversations programmatically.
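
A small sketch combining these helpers (endpoint and model names are placeholders; exact argument forms are covered in doc/Chat.md):

chat = Chat.setup []
chat.endpoint :ollama                  # placeholder endpoint preset
chat.model "llama3"                    # placeholder model name
chat.system "Answer briefly"
chat.user "Give three prime numbers as a JSON array"
puts chat.json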

Agent (doc/Agent.md)

A thin orchestrator around Chat and LLM that keeps state and injects tools:

  • Maintains a live conversation (start_chat, start, current_chat)
  • Auto‑exports Workflow tasks and a KnowledgeBase traversal tool
  • ask/chat/json/iterate helpers; structured iteration over lists/dictionaries
  • load_from_path(dir) — bootstrap from a directory containing workflow.rb, knowledge_base, start_chat
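
A sketch of the directory bootstrap (assuming load_from_path is exposed as a class‑level constructor; the agent name is a placeholder):

dir = Scout.chats["my_agent"]          # directory with workflow.rb, knowledge_base, start_chat
agent = LLM::Agent.load_from_path dir
puts agent.ask "What tools do you have available?"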

Model (doc/Model.md)

A composable framework to wrap models with a consistent API:

  • ScoutModel — base: define init/eval/eval_list/extract_features/post_process/train; persist behavior and state to a directory
  • PythonModel — initialize and drive a Python class via ScoutPython
  • TorchModel — helpers for PyTorch: training loop, tensors, save/load state, layer introspection
  • HuggingfaceModel — Transformers convenience; specializations:
    • SequenceClassificationModel — text classification, logits→labels
    • CausalModel — chat/causal generation (supports apply_chat_template)
    • NextTokenModel — simple next‑token fine‑tuning loop

Pattern:

  • Keep feature extraction separate from evaluation
  • Use eval_list to batch large tables
  • Persist directory state and behavior to reuse
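
A minimal ScoutModel sketch following this pattern (block‑style definitions as in the Hugging Face example above; assumes eval defines behavior when given a block, evaluates when given a sample, and applies extract_features first):

TmpFile.with_dir do |dir|
  model = ScoutModel.new dir               # dir persists behavior and state
  model.extract_features do |_, text|
    text.downcase.split(/\W+/)             # toy featurization
  end
  model.eval do |_, features|
    features.length                        # toy scoring: token count
  end
  puts model.eval("Hello Scout world")     # => 3
end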

Example: ExTRI2 workflow (models in practice)

The ExTRI2 Workflow (Rbbt‑Workflows) uses HuggingfaceModel to score TRI sentences and determine Mode of Regulation (MoR):

  • Feature extraction marks [TF]/[TG] spans as inline tags for the model
  • Batch evaluation over a TSV (“Text”, “TF”, “Gene” columns)
  • Adds fields “Valid score” and “Valid” to the TSV
  • Runs a second SequenceClassification model to produce “MoR” and “MoR scores”

See workflows/ExTRI2/workflow.rb in that repository for the full implementation.

Command‑Line Interface

The bin/scout dispatcher locates scripts under scout_commands across installed packages and workflows using the Path subsystem. Resolution works by appending terms until a script is found to execute:

  • If the fragment maps to a directory, a listing of available subcommands is shown
  • Scripts can be nested arbitrarily (e.g., agent/kb)
  • Other packages or workflows can define their own scripts under share/scout_commands, and bin/scout will find them
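
For example:

scout llm                # resolves to a directory, so the llm subcommands are listed
scout llm ask "Hello"    # resolves to a script and runs scout_commands/llm/ask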

scout llm …

Ask an LLM, manage chat files, run a minimal web UI, or process queued requests. Scripts live under scout_commands/llm.

  • Ask

    • scout llm ask [options] [question]
      • -t|--template <file_or_key> — load a prompt template; substitutes “???” or appends
      • -c|--chat <chat_file> — load/extend a conversation (appends the reply)
      • -i|--inline — answer “# ask: …” directives inline in a source file
      • -f|--file — prepend file content or substitute where “...” appears
      • -m|--model, -e|--endpoint, -b|--backend — select backend/model; merged with Scout.etc.AI
      • -d|--dry_run — expand and print the conversation (no ask)
  • Relay processor (for the Relay backend)

    • scout llm process [directory] — watches a queue directory and answers the JSON ask requests deposited there
  • Web UI server

    • scout llm server — static chat UI over ./chats with a small JSON API
  • Templates

    • scout llm template — list installed prompt templates (Scout.questions)

Run “scout llm” alone, or any fragment that resolves to a directory, to print a help‑like listing of the available subcommands.
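
Combining the flags above (file names illustrative):

scout llm ask -b ollama -m llama3 "Summarize the Scout stack"
scout llm ask -c chats/notes.chat "Now compare scout-rig and scout-camp"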

scout agent …

Stateful agents with Workflow tasks and KnowledgeBase lookups wired in as tools. Scripts live under scout_commands/agent.

  • Ask via an Agent

    • scout agent ask [options] [agent_name] [question]
      • -l|--log — set log severity
      • -t|--template <file_or_key>
      • -c|--chat <chat_file>
      • -m|--model, -e|--endpoint
      • -f|--file
      • -wt|--workflow_tasks <comma_list> — export only selected tasks
      • agent_name resolves via Scout.workflows[agent_name] (a workflow) or Scout.chats[agent_name] (an agent directory with workflow.rb/knowledge_base/start_chat)
  • KnowledgeBase passthrough

    • scout agent kb <agent_name>
      • Loads the agent’s knowledge base and forwards to “scout kb …” (see scout-gear doc/KnowledgeBase.md for kb CLI)
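
For example (the agent name is a placeholder resolved as described above):

scout agent ask my_agent "Who are Miki's parents?"
scout agent kb my_agent     # loads the agent's knowledge base and forwards to scout kb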

As with other Scout CLIs, if you target a directory of commands (e.g., “scout agent”), bin/scout will show the subcommand listing.

Note: Workflows also have extensive CLI commands (scout workflow …) for job execution, provenance, orchestration, and queue processing. When you integrate models inside tasks, you drive them through the workflow CLI (see scout-gear doc/Workflow.md).

Configuration, persistence and reproducibility

  • Endpoint presets: place YAML files under Scout.etc.AI to preconfigure URLs, models, headers, etc.; CLI options and inline chat options override these defaults
  • Tool calling: Workflow tasks are exported as JSON schemas per backend; results are serialized back to the model as tool replies
  • Caching: LLM.ask persists responses (by default) using Persist.persist; disable with persist: false
  • Models: pass a directory to persist options/behavior/state (Torch/HF use state files or save_pretrained directories); save/restore to reuse
  • Chats: save printable conversations with Chat#save; reuse with “scout llm ask -c <chat_file>”
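
For example (paths illustrative; the exact Chat#save arguments are covered in doc/Chat.md):

LLM.ask "What is a TSV?", persist: false        # bypass the Persist cache for this call

chat = Chat.setup []
chat.user "Draft a release note for scout-ai"
chat.ask                                        # run the conversation
chat.save 'chats/session'                       # reuse later: scout llm ask -c chats/session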

For Python models, ensure scout-rig (ScoutPython) is installed and Python packages are present. See doc/Python.md in scout-rig for details.

Where to go next

For full APIs, see doc/LLM.md, doc/Chat.md, doc/Agent.md and doc/Model.md; for end‑to‑end examples, browse the Rbbt‑Workflows organization.

License and contributions

Issues and PRs are welcome across the Scout repositories. Please open tickets in the relevant package (e.g., scout-ai for LLM/Agent/Model topics).