# AgentC
A small Ruby wrapper around RubyLLM that helps you write a pipeline of AI prompts and run it many times, in bulk. Built for automating repetitive refactors across a large codebase.
Most of what's below is generated by an LLM. I take no responsibility for any of it, unless it's awesome... then it was pure prompting skills which I will take credit for.
## Overview
AgentC provides batch processing and pipeline orchestration for AI-powered tasks:
- Batch Processing - Execute pipelines across multiple records with automatic parallelization via worktrees
- Pipeline Orchestration - Define multi-step workflows with AI-powered agent steps and custom logic
- Resumable Execution - Automatically skip completed steps when pipelines are rerun
- Automatic query persistence - All interactions saved to SQLite
- Cost tracking - Detailed reports on token usage and costs
- Custom tools - File operations, grep, Rails tests, and more
- Schema validation - RubyLLM Schema support for structured responses
## Installation
This gem is not published to RubyGems. Instead, add a git reference to your Gemfile (pin to a specific revision, because I'm going to make changes with complete disregard for backwards compatibility).
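A sketch of the Gemfile entry; the repository URL and revision below are placeholders, not real values:

```ruby
# Gemfile: the git URL and ref are placeholders; point them at this
# repo and pin a known-good revision.
gem "agent_c", git: "https://github.com/OWNER/agent_c.git", ref: "REVISION_SHA"
```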
## Example template
See an example template you can run in the `template/` directory of this repo. Poke around there after perusing this section.
You can copy this template to start building your own.
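One way to copy it, shown here with stand-in paths so the snippet is runnable; in practice you'd point at your real clone of this repo:

```shell
# Stand-in checkout directory (substitute your actual clone path)
mkdir -p /tmp/agent_c_checkout/template
# Copy the bundled template to seed a new project
cp -r /tmp/agent_c_checkout/template /tmp/my_agent_project
```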
## Quick Start
A "Pipeline" is a series of prompts for Claude to perform. Data gathered from prior steps is fed into subsequent steps (you'll define an ActiveRecord class to capture the data). If any step fails, the pipeline aborts.
A "Batch" is a collection of pipelines to be run. They can be run against a single directory in series, or concurrently across multiple git worktrees. If a pipeline fails, the failure will be recorded but the batch will continue.
### The necessary structures
In this example, we'll have Claude choose a random file, summarize its contents in a language of our choosing, then write it to disk and commit.
```ruby
# Define the records your agent will interact with.
# Normally you'd only have one record.
#
# A versioned store saves a full db backup per-transaction
# so that you can recover from any step of the process.
# Just trying to save tokens...
class MyStore < VersionedStore::Base
  include AgentC::Store

  record(:summary) do
    # the migration schema is defined inline
    schema do |t|
      # we'll input this data
      t.string(:language)

      # claude will generate this data
      t.string(:input_path)
      t.text(:summary_text)
      t.text(:summary_path)
    end

    # this is the body of your ActiveRecord class;
    # add methods here as needed
  end
end

# A "pipeline" processes a single record
class MyPipeline < AgentC::Pipeline
  # The prompts for these steps will
  # live in our prompts.yml file
  agent_step(:analyze_code)
  agent_step(:write_summary_to_file)

  step(:finalize) do
    repo.commit_all("claude: analyzed code")
  end

  # if this pipeline fails, we want to
  # leave the repo in a clean state
  # for the next pipeline.
  on_failure do
    repo.reset_hard_all
  end
end
```

Define your prompts in a `prompts.yml` file:
```yaml
en:
  # the key names must match up to the `agent_step` invocations above
  analyze_code:
    # Prompts here will be cached across pipelines.
    # These prompts cannot interpolate any attributes.
    # Suggested use is to put as much in the cached_prompts
    # as possible and put variable data in the prompt.
    cached_prompts:
      - "Choose a random file. Read it and summarize it in the provided language."
    # You can interpolate any attribute from your record class
    prompt: "language: %{language}"
    # Tools available:
    # - dir_glob
    # - read_file
    # - edit_file
    # - grep
    # - run_rails_test
    # you can add more...
    tools: [read_file, dir_glob]
    # The response schema defines what Claude will return.
    # The keys must be attributes from your record. What Claude
    # returns will automatically be saved to your record.
    response_schema:
      summary_text:
        type: string   # this is the default
        required: true # this is the default
        description: "The summary text"
      input_path:
        type: string   # this is the default
        required: true # this is the default
        description: "The path of the file you summarized"
  write_summary_to_file:
    cached_prompts:
      - |
        You will be given some text.
        Choose a well-named file and write the text to it.
    prompt: "Here is the text to write: %{summary_text}"
    tools: [edit_file]
    response_schema:
      summary_path:
        description: "the path of the file you wrote"
```

Now, make a Batch and invoke it. A Batch requires a fair amount of configuration covering data storage, the location of your repo, and Claude API credentials:
```ruby
batch = Batch.new(
  record_type: :summary,  # the record name from your store
  pipeline: MyPipeline,   # the Pipeline class you made

  # A batch has a "project" and a "run". These are ways
  # to track Claude usage. Your Batch will have a
  # "project". Each time you call Batch.new you get
  # a new "run".
  project: "TemplateProject",

  # We'll set some spending limits. Once these are
  # reached, the Batch will abort.
  max_spend_project: 100.0,
  max_spend_run: 20.0,

  store: {
    class: MyStore, # the Store class you made
    config: {
      logger: Logger.new("/dev/null"), # a logger for the store
      dir: "/where/you/want/your/store/saved"
    }
  },

  # Where Claude will work
  workspace: {
    dir: "/where/claude/will/be/working",
    env: {
      # available to your tools;
      # only used by run_rails_test currently
      SOME_ENV_VAR: "1"
    }
  },

  # If you prefer, you can have the Batch manage
  # some git worktrees for you. It will parallelize
  # your tasks across your worktrees for MAXIMUM
  # TOKEN BURN.
  #
  # Worktrees will be created for you if you are
  # starting a new Batch. If you are continuing an
  # existing Batch (after an error, for example),
  # the worktrees will be left in their current
  # state.
  #
  # You must pass *either* a workspace or a repo
  repo: {
    dir: "/path/to/your/repo",
    # an existing git revision or branch name
    initial_revision: "main",
    # optional: limit Claude to a subdir of your repo
    working_subdir: "./",
    # Where to put your worktrees
    worktrees_root_dir: "/tmp/example-worktrees",
    # Each worktree gets a branch; they'll be suffixed
    # with a counter
    worktree_branch_prefix: "summary-examples",
    # Currently, this defines how many worktrees to
    # create. It's obnoxious, I know, but hey, it works.
    worktree_envs: [{}, {}],
  },

  # The claude configuration:
  session: {
    # All chats with claude are saved to a sqlite db.
    # This is separate from your Store's db, because
    # why throw anything away? Can be useful for
    # debugging why Claude did what it did.
    agent_db_path: "/path/to/your/claude/db.sqlite",
    logger: Logger.new("/dev/null"), # probably use the same logger for everything...
    i18n_path: "/path/to/your/prompts.yml",
    # As you debug your pipeline, you'll probably run it
    # many times. We tag all Claude chat records with a
    # project so you can track costs.
    project: "SomeProject",
    # only available for Bedrock...
    ruby_llm: {
      bedrock_api_key: ENV.fetch("AWS_ACCESS_KEY_ID"),
      bedrock_secret_key: ENV.fetch("AWS_SECRET_ACCESS_KEY"),
      bedrock_session_token: ENV.fetch("AWS_SESSION_TOKEN"),
      bedrock_region: ENV.fetch("AWS_REGION", "us-west-2"),
      default_model: ENV.fetch("LLM_MODEL", "us.anthropic.claude-sonnet-4-5-20250929-v1:0")
    }
  },
)

# WHEW, that's a lot of config.
# Now we add some records for processing.
# The batch's "store" is just a bunch of
# ActiveRecord classes, but you reference
# them by the name you gave them in the
# store.
#
# We'll add some summary records.
# This seeded data represents the input
# to your pipelines.
#
# Because your batch can be stopped and
# restarted, we need our data creation
# to be idempotent.
record_1 = (
  batch
    .store
    .summary
    .find_or_create_by!(language: "english")
)

record_2 = (
  batch
    .store
    .summary
    .find_or_create_by!(language: "spanish")
)

# Add the records to be processed.
# add_task is idempotent
batch.add_task(record_1)
batch.add_task(record_2)

batch.call

# See the details of what happened
puts batch.report
# =>
# Summary report:
# Succeeded: 2
# Pending: 0
# Failed: 0
# Run cost: $2.34
# Project total cost: $10.40
# ---
# task: 1 - wrote summary to /tmp/example-worktrees/summary-examples-0/CHAT_TEST_SUMMARY.md
# task: 2 - wrote summary to /tmp/example-worktrees/summary-examples-1/RESUMEN_BASE.md

# Get a more detailed breakdown
cost = batch.cost

# Explore the records created
tasks = batch.store.task.all
summaries = batch.store.summary.all
```

You can tail your logs to see what's happening. The full text of your Claude chats is logged at DEBUG.
If you just want to see your pipeline's progression:
```shell
# Only see INFO
tail -f /path/to/log.log | grep INFO
```

## Batch errors
If your batch is interrupted (by an exception, or because you kill it), you can continue it by simply running your batch again. The progress is persisted in the Batch's store.
If you need to correct any data or go back in time, you can peruse the store's versions by doing:
```ruby
# see how many versions
puts batch.store.versions.count

# peruse your store:
batch.store.versions[12].summary.count

# restore a prior version
batch.store.versions[12].restore

# re-run the batch
batch.call
```

## Resetting a Batch
To reset a batch, delete the SQLite database for your store: the one you configured at `store: { config: { dir: "/path/to/db" } }`.
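For example, a sketch of the reset, where `STORE_DIR` is a placeholder you'd substitute with your own configured path:

```shell
# STORE_DIR is a placeholder for the dir you passed as
# store: { config: { dir: ... } }. Removing it deletes the batch's
# progress and every saved version.
STORE_DIR="/tmp/agentc-example-store"
rm -rf "$STORE_DIR"
```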
## Debugging a Batch
If you make multiple worktrees, they will be processed concurrently. This makes things hard to debug using `binding.irb`.
I suggest making one worktree until it's running successfully.
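One way to do that, using the `worktree_envs` option shown in the Quick Start config: pass a single-element array so only one worktree exists and pipelines run serially while you debug.

```ruby
# Fragment of the Batch config from the Quick Start:
repo: {
  # ...same repo options as above...
  # one entry => one worktree => pipelines run serially
  worktree_envs: [{}]
}
```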
## Structuring your project
I suggest following the structure of the example template.
## Detailed Documentation
Detailed guides for all features:
- Batch - Batch configuration, methods, and pipeline integration
- Pipeline Tips and Tricks - Useful patterns and techniques for pipelines
- Chat Methods - Using session.prompt and session.chat for direct interactions
- Tools - Built-in tools for file operations and code interaction
- Testing - Using TestHelpers::DummyChat for testing without real LLM calls
- Cost Reporting - Track token usage and costs
- Session Configuration - All configuration options
- Store Versioning - Browsing, restoring, and configuring store versions
## Requirements
- Ruby >= 3.0.0
- AWS credentials configured for Bedrock access
- SQLite3
## License
WTFPL - Do What The Fuck You Want To Public License
## Author
Pete Kinnecom (git@k7u7.com)