browserctl

Persistent browser sessions with a human in the loop.

Built for AI agents. Useful to the engineers and QA folks who work with them.

Most browser automation tools tear down the browser when your script ends. That means re-authenticating, re-navigating, and re-loading state on every run. browserctl doesn't restart. The session stays alive between commands, so you pick up exactly where you left off.

browserd &                                             # start the daemon (headless)
browserctl page open main --url https://example.com/login
browserctl page snapshot main                          # AI-friendly JSON snapshot with ref IDs
browserctl fill main --ref e1 --value me@example.com   # interact by ref, no selectors needed
browserctl click main --ref e2
browserctl daemon stop

Quick Start

# 1. Install
gem install browserctl

# 2. Start the daemon
browserd &

# 3. Open a named page
browserctl page open main --url https://moatazeldebsy.github.io/test-automation-practices/#/auth

# 4. Snapshot — returns JSON with a ref ID per interactable element
browserctl page snapshot main
# → [{"ref":"e1","tag":"input","attrs":{"data-test":"username-input"}}, {"ref":"e2",...}, {"ref":"e3","tag":"button","text":"Login",...}]

# 5. Interact using the ref IDs from the snapshot
browserctl fill main --ref e1 --value admin
browserctl fill main --ref e2 --value admin
browserctl click main --ref e3

# 6. Observe
browserctl url main
browserctl page snapshot main --diff       # only what changed

# Session persistence: save now, pick up later
browserctl state save my-session
# On a fresh daemon tomorrow: `browserctl state load my-session`
# → tabs restored, cookies intact, no re-login needed

# 7. Done
browserctl daemon stop
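
When an agent drives these steps from code, the main parsing task is picking the right ref out of the snapshot. The Ruby sketch below is illustrative only. It assumes the snapshot is a JSON array of objects with `ref`, `tag`, `text`, and `attrs` keys, as in the step 4 sample output. The helper names `find_ref` and `sample_snapshot` are hypothetical, and so are the sample values that do not appear in that output (such as `password-input`). None of them are part of browserctl.

```ruby
require "json"

# Hypothetical helper: return the ref ID of the first snapshot element that
# matches the tag and, when given, the text and attributes; nil if none match.
def find_ref(snapshot_json, tag:, text: nil, attrs: {})
  elements = JSON.parse(snapshot_json)
  match = elements.find do |el|
    el["tag"] == tag &&
      (text.nil? || el["text"] == text) &&
      attrs.all? { |key, value| (el["attrs"] || {})[key] == value }
  end
  match && match["ref"]
end

# Sample data shaped like the step 4 output. Fields not shown there are made up.
def sample_snapshot
  <<~JSON
    [
      {"ref": "e1", "tag": "input", "attrs": {"data-test": "username-input"}},
      {"ref": "e2", "tag": "input", "attrs": {"data-test": "password-input"}},
      {"ref": "e3", "tag": "button", "text": "Login", "attrs": {}}
    ]
  JSON
end

username_ref = find_ref(sample_snapshot, tag: "input", attrs: { "data-test" => "username-input" })
login_ref = find_ref(sample_snapshot, tag: "button", text: "Login")
puts "fill #{username_ref}, click #{login_ref}"
```

Matching on a stable attribute such as `data-test` is likely more robust than relying on a fixed ref number, since a ref may differ from one snapshot to the next.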

→ Full Getting Started guide

See it in action

Terminal: CLI commands, live output, session persistence proof

Browser: what the browser sees as those commands run

Use cases

AI coding agent authenticating into a staging environment — the agent logs in once, the session persists, subsequent commands run inside the authenticated context without re-authenticating between steps.

Developer reproducing a multi-step bug report — navigate to the failure point once, then iterate on the fix with the browser already in the right state; no restarting from the home page each run.

Automated smoke test that needs human sign-off — the test runs until it hits something ambiguous, calls browserctl pause, lets a human inspect and act, then browserctl resume hands control back to the script with all state intact.

Why browserctl?

Most automation tools are stateless: every script spins up a fresh browser and tears it down. browserctl keeps one browser alive across commands.

Capability	browserctl	Playwright / Selenium
Session persists across commands	✓	✗ (per-script lifecycle)
Named page handles	✓	✗
AI-friendly DOM snapshot	✓	✗
Human-in-the-loop pause/resume	✓	✗
Lightweight CLI interface	✓	✗
Full browser automation API	—	✓
Parallel multi-browser testing	—	✓

Use browserctl when you need a browser that stays alive and remembers state — for AI agents, iterative dev workflows, or tasks that mix automation with human judgment.

Use Playwright/Selenium when you need parallel test suites, multi-browser support, or a full programmatic API.

Installation

Requirements: Ruby >= 3.3 · Chrome or Chromium installed

macOS (Homebrew — recommended)

brew install patrick204nqh/tap/browserctl

RubyGems

gem install browserctl

Or in your Gemfile (for projects using the client API directly):

gem "browserctl"

Claude Code Plugin

browserctl ships as a Claude Code plugin. Install it once and Claude automatically knows how to use the daemon, ref-based interaction, HITL patterns, and workflow authoring.

Interactive install

/plugin marketplace add patrick204nqh/browserctl
/plugin install browserctl@browserctl

Project settings — commit .claude/settings.json to share with your team:

{
  "extraKnownMarketplaces": {
    "browserctl": {
      "source": { "source": "github", "repo": "patrick204nqh/browserctl" }
    }
  },
  "enabledPlugins": {
    "browserctl@browserctl": true
  }
}

Once installed, the browserctl skill loads automatically.

How it works

browserd runs as a background process, listening on a Unix socket at ~/.browserctl/browserd.sock. It manages a Ferrum (Chrome DevTools Protocol) browser instance with named page handles. browserctl sends JSON-RPC commands over the socket and prints the result.
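
To make the request/response flow concrete, here is a minimal Ruby sketch of a client talking JSON-RPC over a Unix socket. It is not browserctl's client and may not match the real wire format: the newline-delimited framing, the JSON-RPC 2.0 envelope fields, the method name `page.open`, its parameter names, and the helper names `build_request` and `call_daemon` are all assumptions made for illustration. The API Stability document is the place to check the actual contract.

```ruby
require "json"
require "socket"

# Build one JSON-RPC 2.0 request as a single-line JSON string.
# ASSUMPTION: the envelope fields follow JSON-RPC 2.0; browserd may differ.
def build_request(id, method_name, params = {})
  JSON.generate({ "jsonrpc" => "2.0", "id" => id, "method" => method_name, "params" => params })
end

# Send one request over the Unix socket and return the parsed reply.
# ASSUMPTION: one JSON document per line in each direction.
def call_daemon(method_name, params = {}, socket_path: File.expand_path("~/.browserctl/browserd.sock"))
  UNIXSocket.open(socket_path) do |sock|
    sock.puts(build_request(1, method_name, params))
    line = sock.gets
    raise IOError, "daemon closed the connection without replying" if line.nil?

    JSON.parse(line)
  end
end

# Print what a request might look like. "page.open" is a hypothetical method name.
puts build_request(1, "page.open", { "name" => "main", "url" => "https://example.com/login" })
```

With a daemon that speaks this framing, `call_daemon("page.open", { "name" => "main" })` would return the reply as a Hash. In practice the `browserctl` CLI already does this work, so a hand-written client is only useful for understanding the architecture.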

Start multiple named instances for agent isolation:

browserd --name agent-a &
browserd --name agent-b &
browserctl --daemon agent-a page open main --url https://app.example.com
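
An orchestrator that runs several agents may want a thin wrapper that routes each call to the right daemon. The Ruby sketch below builds the same command line as the example above, using only the `--daemon` flag and subcommands shown in this README. The helper names `browserctl_argv` and `run_browserctl` are hypothetical, and treating a non-zero exit status as failure is an assumption about the CLI.

```ruby
require "open3"

# Build the argument vector for one browserctl call, optionally routed to a
# named daemon with the --daemon flag.
def browserctl_argv(*args, daemon: nil)
  argv = ["browserctl"]
  argv += ["--daemon", daemon] if daemon
  argv + args.map(&:to_s)
end

# Run the command and return its stdout.
# ASSUMPTION: browserctl signals failure with a non-zero exit status.
def run_browserctl(*args, daemon: nil)
  stdout, stderr, status = Open3.capture3(*browserctl_argv(*args, daemon: daemon))
  raise "browserctl failed: #{stderr.strip}" unless status.success?

  stdout
end

# Show the command line that would be executed for agent-a.
puts browserctl_argv("page", "open", "main", "--url", "https://app.example.com", daemon: "agent-a").join(" ")
```

Passing arguments as an array rather than a single shell string avoids quoting problems when a URL or value contains spaces or shell metacharacters.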

The daemon shuts itself down after 30 minutes of inactivity.

Documentation


Getting Started	Install, first session, first snapshot
Agent Integration	Call browserctl from Python, shell, or Anthropic tool-use agents
Concepts	Sessions, snapshots, state, flows, human-in-the-loop
Guides	Writing workflows, handling challenges, smoke testing
Debugging	Read traces, redaction, crash reports, filing a good issue
Examples	Runnable scripts: session reuse, Cloudflare HITL, and more
Command Reference	Every command and flag
API Stability	Wire protocol contract and stability zones
CHANGELOG	Release history
Product	What browserctl is and who it's for
Vision & Roadmap	Philosophy and release roadmap
vs. agent-browser	How browserctl differs from Vercel's agent-browser

Development

git clone https://github.com/patrick204nqh/browserctl
cd browserctl
bin/setup              # brew bundle (macOS) + bundle install + Chrome check

bundle exec rspec      # run tests
bundle exec rubocop    # lint

rake demo               # full pipeline: screenshots + browser GIF + terminal GIF
rake demo:screenshots   # smoke test screenshots only
rake demo:browser_gif   # browser animation only  (requires: ffmpeg)
rake demo:terminal      # terminal GIF only        (requires: vhs)

Demo assets are regenerated automatically on every push to main that touches demo/ or the login example.

Contributing

See CONTRIBUTING.md · SECURITY.md

License

MIT

Built by Patrick. I wrote this because my AI agents needed authenticated web sessions, and every automation tool I tried restarted the browser between runs.
