# ollama_agent

Version: 1.0.0

Ruby gem that runs a CLI coding agent against a local Ollama model. It exposes tools to list files, read files, search the tree (ripgrep or grep), and apply unified diffs so the model can make small, reviewable edits.
## Contents
- Features
- Requirements
- Security and sandbox
- Installation
- Usage
- Skills
- Troubleshooting
- How it works
- Development
- License
## Features

- Tool `list_files` – list project files.
- Tool `read_file` – read file contents.
- Tool `search_code` – search code with ripgrep or grep.
- Tool `edit_file` – apply unified diffs safely.
- CLI built with Thor, entry point `exe/ollama_agent`.
- `self_review` – self-review / improvement with a `--mode`:
  - `analysis` (default, alias `1`) — read-only tools; report only; no writes.
  - `interactive` (alias `2`, `fix`) — full tools on `--root`; you confirm each patch (like `ask`); optional `-y` / `--semi`.
  - `automated` (alias `3`, `sandbox`) — temp copy, agent edits, `bundle exec rspec` in the sandbox, optional `--apply` to merge into your checkout.
- `improve` — same as `self_review --mode automated` (you can pass `--mode automated` explicitly; other modes belong on `self_review`).
- `orchestrate` / `OLLAMA_AGENT_ORCHESTRATOR=1` — optional orchestrator tools to probe and delegate to other local CLI agents (see Orchestrator); `agents` lists availability.
- Ruby API — embed `Runner`, `Agent`, custom tools, hooks, sessions, and (optionally) `ToolRuntime`; see Library usage (Ruby).
## Requirements

- Ruby ≥ 3.2 (enforced in the gemspec as `required_ruby_version`)
- Local: Ollama running and a capable tool-calling model, or
- Ollama Cloud: API key and a cloud-capable model name (see below)

### Prerequisites (external tools)

- `patch` — required for `edit_file` (GNU `patch` on `PATH`). On Windows, use Git Bash, WSL, GnuWin32, or another environment that provides `patch`.
- `rg` (ripgrep) or `grep` — text mode for `search_code` needs at least one of these on `PATH` (ripgrep is preferred when present).
## Security and sandbox

- Project root — file tools and search are constrained to the configured workspace (`--root` / `OLLAMA_AGENT_ROOT`). Treat that directory as the trust boundary: only aim the agent at trees you are willing to modify.
- `run_shell` (optional tool) — commands are parsed into an argument vector (no shell) and must match an allowlist; a denylist blocks obviously dangerous patterns. You can still shoot yourself in the foot with an allowed prefix (for example `git` with destructive subcommands), so keep profiles and permissions tight in automated setups.
- Timeouts — text search honors `OLLAMA_AGENT_SEARCH_TIMEOUT_SEC` (default 120). Shell execution has its own per-invocation timeout.
- Logging — budget, loop-detection, and `list_local_model_names` failures go through Ruby's `Logger` (stderr by default). Set `OLLAMA_AGENT_LOG_LEVEL=debug` or `OLLAMA_AGENT_DEBUG=1` for more detail.
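The project-root boundary above can be sketched as a simple path check. This is an illustrative sketch of the idea only, not the gem's sandbox code; `safe_path?` is a hypothetical helper:

```ruby
# Illustrative sketch: reject any requested path that escapes the
# configured root after ".." normalization.
def safe_path?(root, requested)
  root_abs = File.expand_path(root)
  candidate = File.expand_path(requested, root_abs)
  candidate == root_abs || candidate.start_with?(root_abs + File::SEPARATOR)
end

safe_path?("/work/repo", "lib/agent.rb")   # => true  (inside the root)
safe_path?("/work/repo", "../etc/passwd")  # => false (escapes the root)
```

A real sandbox must also resolve symlinks; `File.expand_path` only normalizes `.` and `..` segments.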
## Installation

From RubyGems (when published) or from this repository:

```
bundle install
```

## Usage
Default: run the gem with no subcommand to open the interactive TUI (same as `ask` with no query):

```
ollama_agent
# or from this repo:
bundle exec ruby exe/ollama_agent
```

Other entry points are opt-in: pass a subcommand (`self_review`, `sessions`, …) or `ask` / `orchestrate` with a query for a one-shot task, or flags for a plain line REPL (see below).

From the project you want the agent to modify (set the working directory accordingly):

```
bundle exec ruby exe/ollama_agent ask "Update the README.md with current codebase"
```

From this repository after `bundle install`, `ruby exe/ollama_agent` (without `bundle exec`) also works: the executable adds `lib` to the load path and loads `bundler/setup` when a Gemfile is present.

Apply proposed patches without interactive confirmation:

```
bundle exec ruby exe/ollama_agent ask -y "Your task"
# Review / audit only (no patches, writes, or delegation) — same as a report-style self_review
bundle exec ruby exe/ollama_agent ask --read-only "Summarize risks in this repo"
```

Long-running models (slow local inference):
```
bundle exec ruby exe/ollama_agent ask --timeout 300 "Your task"
```

### Agent budget (steps, tokens, cost)

Each model round-trip that runs during a session counts as one step toward `OLLAMA_AGENT_MAX_TURNS` (default 64), enforced together with token and optional cost limits in `OllamaAgent::Core::Budget`. Exploratory tasks that list, read, and search across a large repository can burn through steps quickly; if you see `budget exceeded — step limit (64)`, raise the limit — for example:

```
export OLLAMA_AGENT_MAX_TURNS=128
bundle exec ruby exe/ollama_agent ask "Your wide-ranging task"
```

Narrower prompts, `--read-only`, or a smaller `--root` also reduce step usage. With `OLLAMA_AGENT_DEBUG=1`, the agent prints an extra hint when the maximum tool rounds for a run are reached.
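The step-budget behavior above can be sketched in a few lines. This is a minimal illustration of the idea, not the gem's `OllamaAgent::Core::Budget` class (which also tracks tokens and cost); `StepBudget` is a hypothetical name:

```ruby
# Illustrative sketch: each model round-trip consumes one step, and
# exceeding the cap raises an error like the gem's
# "budget exceeded — step limit" message.
class StepBudget
  def initialize(max_turns)
    @max_turns = max_turns
    @steps = 0
  end

  def consume!
    @steps += 1
    raise "budget exceeded — step limit (#{@max_turns})" if @steps > @max_turns
    @steps
  end
end

budget = StepBudget.new(2)
budget.consume! # => 1
budget.consume! # => 2
# A third call would raise "budget exceeded — step limit (2)".
```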
### search_code and regex patterns

In text mode, the tool passes your pattern to ripgrep (or grep). Patterns are regular expressions: literal parentheses, brackets, and unbalanced groups can trigger errors (for example `unclosed group`). Escape metacharacters or use fixed-string mode when your tool schema exposes it.
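When you build such patterns from Ruby, `Regexp.escape` gives you a literal-safe pattern, as a small illustration of the escaping advice above:

```ruby
# Escape regex metacharacters before handing a "literal" string
# to a regex-based search such as ripgrep or grep.
literal = "call(foo[0])"
escaped = Regexp.escape(literal)
puts escaped # => call\(foo\[0\]\)
```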
Plain line REPL (no TUI boxes / markdown shell): use `ask` (or `orchestrate`) with `-i` and without `--tui` — for example, when you omit the query you must opt out of the default TUI this way:

```
bundle exec ruby exe/ollama_agent ask --interactive
# same idea: explicit -i, no --tui
```

Self-review modes (the default project root is the current working directory unless you set `--root` or `OLLAMA_AGENT_ROOT`):

```
# Mode 1 — analysis only (default)
bundle exec ruby exe/ollama_agent self_review
bundle exec ruby exe/ollama_agent self_review --mode analysis

# Mode 2 — optional fixes in the working tree (confirm each patch, or -y / --semi)
bundle exec ruby exe/ollama_agent self_review --mode interactive

# Mode 3 — sandbox + tests + optional merge back (same as `improve`)
# Without --apply, edits stay in a temp dir only; pass --apply to copy changed files into your checkout.
bundle exec ruby exe/ollama_agent self_review --mode automated
bundle exec ruby exe/ollama_agent self_review --mode automated --apply
bundle exec ruby exe/ollama_agent improve --apply
```

`ruby_mastery` (optional): when the `ruby_mastery` gem is installed (this repo lists it in the Gemfile for development), `self_review` (all modes) and `improve` prepend a markdown static-analysis section to the user prompt. Add the same gem to your app's Gemfile if you want that behavior outside this checkout. Disable with `--no-ruby-mastery` or `OLLAMA_AGENT_RUBY_MASTERY=0`. Limit size with `OLLAMA_AGENT_RUBY_MASTERY_MAX_CHARS` (default 60000).

For mode 3, `-y` skips all patch prompts; `--no-semi` prompts for every patch when not using `-y`.
### Reasoning / thinking output

On thinking-capable models, Ollama can return reasoning separately from the final answer (`message.thinking` vs `message.content`). The CLI labels them Thinking (dim) and Assistant (green / Markdown).

#### Enable think on the request

The agent sends Ollama's `think` field only when you set it (CLI or env). If you omit it, the server uses its own defaults — and some models then omit or change reasoning in the response.

| You want | CLI | Environment |
|---|---|---|
| Reasoning on (typical Qwen / DeepSeek-style) | `--think true` | `OLLAMA_AGENT_THINK=true` or `1` |
| Reasoning off | `--think false` | `OLLAMA_AGENT_THINK=false` or `0` |
| GPT-OSS style levels | `--think low`, `medium`, or `high` | `OLLAMA_AGENT_THINK=medium` (example) |

Examples:

```
OLLAMA_AGENT_THINK=true bundle exec ruby exe/ollama_agent ask -i
bundle exec ruby exe/ollama_agent ask -i --think true
# GPT-OSS: prefer a level, not only true/false
bundle exec ruby exe/ollama_agent ask --think medium "Your task"
```

#### Streaming vs one-shot (default)
| Mode | Flags | What you see |
|---|---|---|
| One-shot (default) | neither `--stream` nor `OLLAMA_AGENT_STREAM=1` | Each model round completes over HTTP; Thinking / Assistant are printed from the assembled message (including Gemma-style reasoning tags stripped from content when the API omits `thinking`). |
| Streaming | `--stream` or `OLLAMA_AGENT_STREAM=1` | Reasoning streams in dim text under one Thinking line, then Assistant and the reply stream — similar to Cursor. Uses `hooks[:on_thinking]` on the ollama-client chat stream (see `OllamaAgent::OllamaChatThinkingStreamPatch`). |

```
OLLAMA_AGENT_THINK=medium OLLAMA_AGENT_STREAM=1 bundle exec ruby exe/ollama_agent ask "Your task"
```

Note: subscribing only to `on_thinking` does not enable the streaming chat path; the agent uses streaming when something listens for `on_token` (the console streamer registers both). See CHANGELOG 1.0.0 if you embed the library.
#### Display style (TTY)

By default `OLLAMA_AGENT_THINKING_STYLE=compact`: one Thinking header per `ask` run; later reasoning chunks in the same run are separated by blank lines only (including after tool rounds). `OLLAMA_AGENT_THINKING_STYLE=framed` repeats the full boxed banner per message. The Thinking body is plain dim text unless `OLLAMA_AGENT_THINKING_MARKDOWN=1`.

The CLI uses ANSI colors on a TTY (banner, prompt, patch prompts). Assistant replies use Markdown via tty-markdown when stdout is a TTY and `NO_COLOR` is unset. Disable Markdown with `OLLAMA_AGENT_MARKDOWN=0`; disable colors with `NO_COLOR` or `OLLAMA_AGENT_COLOR=0`.
#### If you see no Thinking block

- Set `think` explicitly — especially for GPT-OSS (`low` / `medium` / `high`).
- Confirm the model returns `message.thinking` (e.g. `curl` / the `ollama` CLI against `/api/chat` with the same `think` value). If the API never sends `thinking`, the agent has nothing to show.
- Try streaming (`--stream` or `OLLAMA_AGENT_STREAM=1`) if you want live reasoning tokens.
- Embedded reasoning in `content`: some templates (e.g. Gemma) put tags such as `<|channel>thought…<channel|>` or `<redacted_thinking>…</redacted_thinking>` inside `content`. The agent strips those into Thinking when present (`OllamaAgent::GemmaThoughtContentParser`). If your model uses different delimiters, reasoning may stay inside the main reply until parsers are extended.
#### Ruby API

```ruby
OllamaAgent::Runner.build(stream: true, think: "medium").run("Your task")
```

Custom subscribers can attach to `hooks[:on_thinking]` and `hooks[:on_token]` on the same `Runner` instance (see `OllamaAgent::Streaming::Hooks`).
### Ollama Cloud

Ollama Cloud uses the same HTTP API as the local server, with HTTPS and a Bearer API key. The ollama-client gem sends `Authorization: Bearer <api_key>` when `Ollama::Config#api_key` is set (HTTPS is used when the URL scheme is `https`).

- Create a key at ollama.com/settings/keys.
- Point the agent at the cloud host and pass the key (same env names as ollama-client's docs):

```
export OLLAMA_BASE_URL="https://ollama.com"
export OLLAMA_API_KEY="your_key"
export OLLAMA_AGENT_MODEL="gpt-oss:120b-cloud"  # example; pick a cloud model from `ollama list` / the catalog
# Reasoning for GPT-OSS: set a level (see "Reasoning / thinking output" above)
export OLLAMA_AGENT_THINK=medium

bundle exec ruby exe/ollama_agent ask "Your task"
```

## Environment
| Variable | Purpose |
|---|---|
| `OLLAMA_BASE_URL` | Ollama API base URL (default from ollama-client: `http://localhost:11434`; use `https://ollama.com` for cloud) |
| `OLLAMA_API_KEY` | API key for Ollama Cloud (https://ollama.com); optional for local HTTP |
| `OLLAMA_AGENT_MODEL` | Model name (overrides default from ollama-client) |
| `OLLAMA_AGENT_ROOT` | Project root for tools (`list_files`, `read_file`, etc.). Defaults to the current working directory when unset (the CLI never falls back to the gem install path). |
| `OLLAMA_AGENT_DEBUG` | Set to `1` to print validation diagnostics on stderr |
| `OLLAMA_AGENT_STRICT_ENV` | Set to `1` so invalid numeric env values (e.g. `OLLAMA_AGENT_MAX_TURNS`) raise `ConfigurationError` instead of falling back to defaults |
| `OLLAMA_AGENT_MAX_TURNS` | Max chat rounds with tool calls (default: 64) |
| `OLLAMA_AGENT_TIMEOUT` | HTTP read/open timeout in seconds for Ollama requests (default 120; use `ask --timeout` / `-t` to override per run) |
| `OLLAMA_AGENT_PARSE_TOOL_JSON` | Set to `1` to run tools parsed from JSON lines in assistant text (fallback when the model does not emit native tool calls) |
| `NO_COLOR` | Set (any value) to disable ANSI colors (see no-color.org) |
| `OLLAMA_AGENT_COLOR` | Set to `0` to disable colors even on a TTY |
| `OLLAMA_AGENT_MARKDOWN` | Set to `0` to disable Markdown formatting of assistant replies (plain text only) |
| `OLLAMA_AGENT_THINKING_STYLE` | `compact` (default) = one Thinking label per run, blank lines between later reasoning chunks; `framed` = repeat full banner/rulers each message |
| `OLLAMA_AGENT_THINKING_MARKDOWN` | Set to `1` to render thinking text with Markdown (muted); default is plain dim text |
| `OLLAMA_AGENT_STREAM` | Set to `1` to stream tokens and reasoning to stdout (same as CLI `--stream` on `ask` / `self_review` / `improve`) |
| `OLLAMA_AGENT_THINK` | Model thinking mode for compatible models: `true` / `false`, or `high` / `medium` / `low` (see ollama-client `think:`). Empty = omit (server default). GPT-OSS: use `low` / `medium` / `high` |
| `OLLAMA_AGENT_PATCH_RISK_MAX_DIFF_LINES` | Max changed-line count before a diff is treated as "large" for semi-auto patch risk (default 80) |
| `OLLAMA_AGENT_INDEX_REBUILD` | Set to `1` to drop the cached Prism Ruby index before the next symbol search in this process. The index is rebuilt when this value changes (e.g. unset → `1`); it is not rebuilt on every tool call while it stays `1` |
| `OLLAMA_AGENT_RUBY_INDEX_MAX_FILES` | Max `.rb` files to parse per index build (default 5000) |
| `OLLAMA_AGENT_RUBY_INDEX_MAX_FILE_BYTES` | Skip Ruby files larger than this many bytes (default 512000) |
| `OLLAMA_AGENT_RUBY_INDEX_MAX_LINES` | Max result lines for `search_code` class/module/method modes (default 200) |
| `OLLAMA_AGENT_RUBY_INDEX_MAX_CHARS` | Max characters of index output per search (default 60000) |
| `OLLAMA_AGENT_MAX_READ_FILE_BYTES` | Max bytes for a full `read_file` (no line range); larger files return an error (default 2097152, 2 MiB). Line-range reads stream and are not limited by this cap |
| `OLLAMA_AGENT_RG_PATH` | Absolute path to `rg` for `search_code` text mode (optional; otherwise first `rg` on `PATH`) |
| `OLLAMA_AGENT_GREP_PATH` | Absolute path to `grep` fallback (optional; otherwise first `grep` on `PATH`) |
| `OLLAMA_AGENT_SKILLS` | `1`/`on` / `0`/`off` — include bundled prompt skills (default on). Same as `--no-skills` on the CLI when off |
| `OLLAMA_AGENT_SKILLS_INCLUDE` | Comma-separated manifest ids to load (omit = all bundled). Example: `ruby_style,rubocop,code_review` |
| `OLLAMA_AGENT_SKILLS_EXCLUDE` | Comma-separated ids to skip from the bundled set |
| `OLLAMA_AGENT_SKILL_PATHS` | Extra `.md` files or directories, colon-separated (Unix PATH style). Directory entries load all `*.md` in sorted order. Merged with `--skill-paths` |
| `OLLAMA_AGENT_EXTERNAL_SKILLS` | `1`/`0` — include content from `OLLAMA_AGENT_SKILL_PATHS` (default on). Set `0` to use bundled-only without unsetting paths |
## Prompt skills (bundled + optional paths)

The system prompt is the base agent instructions (`AgentPrompt`) plus optional Markdown sections. Bundled files live under `lib/ollama_agent/prompt_skills/` and are listed in `manifest.yml`. Each file may use Cursor-style YAML frontmatter (`--- … ---`); the loader strips frontmatter before sending text to the model.

Manifest ids (in load order): `clean_ruby`, `ruby_style`, `rubocop`, `solid`, `solid_ruby`, `design_patterns`, `rspec`, `rails_style`, `rails_best_practices`, `code_review`, `ollama_agent_patterns`.

Bundled bodies were copied from Cursor SKILL.md files under `~/.cursor/skills/` (and `ollama_agent_patterns` from this repo's `.cursor/skills/ollama-agent-patterns`). Re-copy when you update those skills upstream.

Many full skills can be large; use `OLLAMA_AGENT_SKILLS_INCLUDE` to trim for small-context models.

CLI flags (also available on `ask`, `self_review`, `improve`): `--no-skills`, `--skill-paths 'path1:path2/dir'`.

To run `self_review` / `ask` against the installed gem's source (e.g. to hack on ollama_agent itself), pass an explicit root, for example `--root "$(bundle show ollama_agent)"` or a path to a git clone.
## Orchestrator (external CLI agents)

Use the `orchestrate` command (or `OLLAMA_AGENT_ORCHESTRATOR=1` with `ask`) to expose the tools `list_external_agents` and `delegate_to_agent`. The Ollama model should gather context with `read_file` / `search_code`, list installed CLIs, then delegate a short task + context to an external agent (Claude Code, Gemini CLI, Codex, Cursor CLI, etc.). Definitions live in `lib/ollama_agent/external_agents/default_agents.yml`; override or extend via `~/.config/ollama_agent/agents.yml` or `OLLAMA_AGENT_EXTERNAL_AGENTS_CONFIG`.

- `ollama_agent agents` — print a table of configured agents and whether each binary is on `PATH`.
- `ollama_agent doctor` — alias for `agents`.
- `delegate_to_agent` runs a fixed argv (no shell) with `cwd` = project root; output is capped (`OLLAMA_AGENT_DELEGATE_MAX_OUTPUT_BYTES`, default 100k). Confirm each run unless `-y`.
- Delegation audit logs: set `OLLAMA_AGENT_DELEGATE_LOG=1` (or `OLLAMA_AGENT_DEBUG=1`) to emit a structured stderr line with agent id, argv, env keys (names only), exit code, and duration.
- Adjust `argv` / `version_argv` in the YAML to match your real CLI (vendor flags differ). If a tool has no stable non-interactive mode, do not expose it in the registry.
- Tool contract version: `OllamaAgent::ORCHESTRATOR_TOOLS_SCHEMA_VERSION`.
## Library usage (Ruby)

Most of this README is CLI-first (commands and environment variables above). The same capabilities exist as Ruby APIs — the Features list (file tools, `self_review` / `improve`, orchestrator, skills, etc.) is implemented under `lib/ollama_agent/`. For a layer diagram (agent → tools → hooks → session), see docs/ARCHITECTURE.md.
**Coding agent — `Runner` (facade)** — stable entry point for apps: `OllamaAgent::Runner.build(root:, model:, stream:, session_id:, resume:, read_only:, orchestrator:, skills_enabled:, skill_paths:, audit:, max_tokens:, context_summarize:, stdin:, stdout:, ...)` then `#run(query)`. Optional `stdin` / `stdout` (default TTY) feed patch/write/delegate confirmations; use `StringIO` in tests or automation to avoid blocking on `$stdin.gets`. Exposes `#hooks` (`Streaming::Hooks`) for `:on_token`, `:on_thinking` (streamed reasoning when `stream: true` and the model supports it), `:on_tool_call`, `:on_tool_result`, and `:on_complete`. Full keyword list: `lib/ollama_agent/runner.rb`.

**Coding agent — `Agent` (direct)** — `OllamaAgent::Agent.new(client:, root:, ...)` when you inject an `Ollama::Client` (or a test double), tweak options the CLI does not expose, or skip `Runner`.

**Custom tools (coding agent)** — `OllamaAgent::Tools.register("tool_name", schema: { ... }) { |args, root:, read_only:| ... }` merges extra function definitions into the chat tool list; handlers run in the same sandbox as built-in tools.
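A handler for such a registration might look like the sketch below. `line_count` is a hypothetical tool invented for illustration; only the handler contract (args hash plus `root:` / `read_only:` keywords) comes from the registration signature above, and the standalone driver stands in for the agent loop:

```ruby
require "tmpdir"

# Sketch of a custom tool handler body for OllamaAgent::Tools.register.
# The agent passes parsed JSON args plus the sandbox root; read_only:
# should be honored by handlers that write (this one only reads).
line_count_handler = lambda do |args, root:, read_only:|
  path = File.expand_path(args["path"], root)
  # A real handler should also verify that path stays inside root.
  { "path" => args["path"], "lines" => File.readlines(path).size }
end

# Standalone illustration of the contract, without the agent:
Dir.mktmpdir do |root|
  File.write(File.join(root, "a.txt"), "one\ntwo\n")
  p line_count_handler.call({ "path" => "a.txt" }, root: root, read_only: true)
end
```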
**Resilience and observability** — the default client path uses `Resilience::RetryMiddleware`. Structured step logging: enable `audit: true` on `Runner.build` or set `OLLAMA_AGENT_AUDIT=1` (see the Environment table). Context trimming: `max_tokens` / `context_summarize` on `Runner.build`.

**Sessions** — pass `session_id` and optional `resume: true` on `Runner.build` to persist messages under `.ollama_agent/sessions/` (`Session::Store`).

**Self-improvement (sandbox)** — the CLI commands `improve` / `self_review --mode automated` wrap `OllamaAgent::SelfImprovement` (sandbox copy, tests, optional merge). Use the CLI for the full flow; the module is available for advanced integration.

**ToolRuntime (alternate loop, optional)** — not used by the CLI. For non-file-edit agents (e.g. another gem that defines its own tools), a small JSON plan loop: the model returns one object per step (`{"tool":"name","args":{...}}`), `ToolRuntime::Registry` resolves it, `Executor` runs your `Tool` subclasses, and `Memory` holds short-term history. Use a swappable planner (anything implementing `next_step(context:, memory:, registry:)`) such as `OllamaJsonPlanner` (`Ollama::Client#chat` + JSON extraction). Step-by-step guide: docs/TOOL_RUNTIME.md.
- Termination: a tool may return `{ "status" => "done" }` to stop. Unknown tool names raise `OllamaAgent::ToolRuntime::InvalidPlanError`; too many steps raise `MaxStepsExceeded`. `Loop#run` returns the last tool result (the same value as the final `Executor#execute` return).
- Runnable examples: `spec/ollama_agent/tool_runtime/`.

Model and server: `OllamaJsonPlanner` uses the same default model as the coding agent: `OLLAMA_AGENT_MODEL` if set, otherwise `Ollama::Config.new.model` (from ollama-client). The model must exist on whatever host you use. Use the same client setup as the CLI: `OllamaAgent::OllamaConnection.apply_env_to_config` copies `OLLAMA_BASE_URL` and `OLLAMA_API_KEY` into `Ollama::Config`. If you only run `Ollama::Client.new(config: Ollama::Config.new)` in irb, you stay on localhost while `OLLAMA_AGENT_MODEL` may still name a cloud model from the cloud example above, which yields a 404. Either apply `apply_env_to_config` (below) or unset the cloud model / pass `model: "llama3.2"`.
```ruby
require "ollama_agent"
require "ollama_client"

class EchoTool < OllamaAgent::ToolRuntime::Tool
  def name = "echo"
  def description = "Echo args"
  def schema = { "type" => "object", "properties" => { "msg" => { "type" => "string" } } }

  def call(args)
    return { "status" => "done", "echo" => args["msg"] } if args["msg"] == "bye"

    { "status" => "ok", "echo" => args["msg"] }
  end
end

registry = OllamaAgent::ToolRuntime::Registry.new([EchoTool.new])
memory = OllamaAgent::ToolRuntime::Memory.new
config = Ollama::Config.new
OllamaAgent::OllamaConnection.apply_env_to_config(config)
client = Ollama::Client.new(config: config)
planner = OllamaAgent::ToolRuntime::OllamaJsonPlanner.new(client: client)

last = OllamaAgent::ToolRuntime::Loop.new(
  planner: planner,
  registry: registry,
  executor: OllamaAgent::ToolRuntime::Executor.new,
  memory: memory,
  max_steps: 10
).run(context: "Say hello then echo bye to finish.")
# last => e.g. { "status" => "done", "echo" => "bye" }
```

## Skills (deterministic JSON-contract pipelines)
Skills are single-purpose generators that bypass the tool-calling agent loop and return strict JSON validated against a schema. They are meant for pipelines that need predictable, parseable output — code review, refactoring suggestions, performance audits, debugging triage — without the unpredictability of free-form LLM prose.
Built-in skills:

- `architecture_refactor` — restructure code without changing behavior
- `performance_optimizer` — identify bottlenecks and emit optimized code
- `debug_engineer` — root-cause a bug and propose a fix
- `feature_builder` — design and implement a production-ready feature

Each skill:

- Renders a deterministic prompt (LLM `temperature: 0` by default).
- Extracts the first balanced JSON object from the response (tolerates surrounding prose and fenced `json` code blocks).
- Validates against the skill's `SCHEMA` and raises `ContractError` on mismatch.
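The extraction step above can be sketched like this. It is an illustrative stand-in, not the gem's implementation, and it is deliberately naive (braces inside JSON strings would confuse it):

````ruby
require "json"

# Illustrative sketch: find the first balanced {...} object in an LLM
# reply that may contain prose and fenced code blocks, then parse it.
# Naive: does not account for braces inside JSON string values.
def first_json_object(text)
  start = text.index("{")
  return nil unless start
  depth = 0
  text[start..].each_char.with_index do |ch, i|
    depth += 1 if ch == "{"
    depth -= 1 if ch == "}"
    return JSON.parse(text[start, i + 1]) if depth.zero?
  end
  nil
end

reply = <<~TEXT
  Here is the result:
  ```json
  {"bottlenecks": [], "optimized_code": "x"}
  ```
TEXT
first_json_object(reply) # => {"bottlenecks"=>[], "optimized_code"=>"x"}
````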
### CLI

```
# list registered skills
ollama_agent skill list

# run a single skill
ollama_agent skill run architecture_refactor --code-file lib/orders/manager.rb

# compose a pipeline; later skills receive earlier outputs merged in
ollama_agent skill pipeline architecture_refactor performance_optimizer \
  --code-file lib/exit_management.rb
```

Override the model with `--model`, `OLLAMA_AGENT_SKILL_MODEL`, or `OLLAMA_AGENT_MODEL`.
### Ruby

```ruby
result = OllamaAgent::Skills::ArchitectureRefactorer.new.call(
  code: File.read("lib/orders/manager.rb")
)
# => { folder_structure: [...], architecture_notes: "...", refactored_code: "..." }

OllamaAgent::Skills::Runner.new(
  [:architecture_refactor, :performance_optimizer]
).call(code: File.read("lib/exit_management.rb"))
```

Inject your own LLM client (anything responding to `#generate(prompt)` → String) in tests:

```ruby
class FakeLlm
  def generate(_prompt)
    '{"bottlenecks": [], "optimizations": [], "optimized_code": "x"}'
  end
end

OllamaAgent::Skills::PerformanceOptimizer.new(llm: FakeLlm.new).call(code: "...")
```

By default skills run against the local Ollama provider (local-first, auditable). They go through `OllamaAgent::Providers::Registry`, so any registered provider (OpenAI, Anthropic, custom) is usable by passing your own `LlmClient`.
## Troubleshooting

- Use a tool-capable model — set `OLLAMA_AGENT_MODEL` to a model that supports function/tool calling (e.g. a recent coder-tuned variant). If the model only prints `{"name": "read_file", ...}` in plain text, tools never run unless you enable `OLLAMA_AGENT_PARSE_TOOL_JSON=1`.
- Malformed diffs — headers must look like `git diff` output: `--- a/file`, then `+++ b/file`, then a unified hunk line starting with `@@` (not legacy `--- N,M ----`). Do not put commas after path tokens. The gem normalizes some mistakes and runs `patch --dry-run` before applying.
- Request timeouts — the agent defaults to a 120s HTTP timeout (longer than ollama-client's 30s). If you still hit `Ollama::TimeoutError`, raise it with `OLLAMA_AGENT_TIMEOUT=300`, `bundle exec ruby exe/ollama_agent ask --timeout 300 "..."`, or `-t 300`. Ensure the variable name is exactly `OLLAMA_AGENT_TIMEOUT` (a leading typo such as `vOLLAMA_AGENT_TIMEOUT` is ignored).
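For reference, a well-formed unified diff in the shape the malformed-diffs item above describes (file name and contents are made up for illustration):

```diff
--- a/lib/example.rb
+++ b/lib/example.rb
@@ -1,3 +1,3 @@
 def greet
-  puts "hi"
+  puts "hello"
 end
```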
## How it works

- The CLI starts `OllamaAgent::Agent`, which loops on `Ollama::Client#chat` with tool definitions.
- Tools are executed in-process under a path sandbox (`OLLAMA_AGENT_ROOT`).
- `search_code` defaults to ripgrep/grep (`mode` omitted or `text`). For Ruby, use `mode` `method`, `class`, `module`, or `constant` to query a Prism parse index (built lazily on first use). `read_file` accepts optional `start_line` / `end_line` (1-based, inclusive) to read only part of a file.
- Patches are validated and checked with `patch --dry-run` before you confirm (unless `-y`).
## Development

```
bundle exec rspec
bundle exec rubocop
```

Ongoing refactors (contributors): the `Agent` class is a thin façade over `TurnLoop`, `ChatCoordinator`, session/client wiring, and `Tools::BuiltInSchemas`, so new behavior should land in those collaborators instead of growing monolithic methods. See CONTRIBUTING.md.
### CI and RubyGems release

- CI — `.github/workflows/main.yml` runs RSpec and RuboCop on pushes to `main` / `master` and on pull requests (Ruby 3.3.4 and 3.2.0).
- Release — `.github/workflows/release.yml` runs on tags `v*`. It checks that the tag matches `OllamaAgent::VERSION` in `lib/ollama_agent/version.rb`, builds with `gem build ollama_agent.gemspec`, and pushes to RubyGems.

Repository secrets (Settings → Secrets and variables → Actions):

| Secret | Purpose |
|---|---|
| `RUBYGEMS_API_KEY` | RubyGems API key with push scope |
| `RUBYGEMS_OTP_SECRET` | Base32 secret for TOTP (RubyGems MFA); the workflow uses rotp to generate a one-time code for `gem push` |

Release steps:

- Bump `OllamaAgent::VERSION` in `lib/ollama_agent/version.rb` and commit to `main`.
- Tag: `git tag v1.0.0` (must match the version string) and `git push origin v1.0.0`.
## License

MIT. See LICENSE.txt.