0.0
No release in over 3 years
CLI tool that queries SQLite databases across remote hosts via SSH, aggregating results in real-time. Designed for multi-tenant apps where each tenant has their own SQLite database.
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
 Dependencies

Runtime

 Project Readme

SQLiteSweep

Query millions of SQLite databases across remote hosts via SSH, aggregating results in real-time. Designed for multi-tenant apps where each tenant has their own SQLite database.

demo

Installation

gem install sqlitesweep

Or add to your Gemfile:

gem "sqlitesweep"

Usage

sqlitesweep \
  -q 'SELECT count(*) FROM products WHERE price > 0' \
  -a sum \
  -s 'rails runner "Account.active.find_each { |a| puts a.db_uri }"' \
  -c 16

The -s (source) command should output one database URI per line. URIs can be:

  • Local paths: /data/tenants/acme.sqlite3
  • File URIs: file:///data/tenants/acme.sqlite3
  • SSH URIs: ssh://deploy@web1.example.com/data/tenants/acme.sqlite3

Results go to stdout, progress goes to stderr — so it's pipe-friendly:

sqlitesweep -q "SELECT count(*) FROM users" -a sum -s "cat db_uris.txt" > result.txt

Actions

Action Description
sum Sum the first column across all databases (default)
average / avg Average the first column across all databases
list Write all rows to a JSONL file, print the file path

Options

Flag Long Default Description
-q --query (required) SQL query to execute on each database
-a --action sum sum, average/avg, or list
-s --source (required) Shell command that outputs database URIs
-c --concurrency 8 Max parallel query workers
--max-ssh 50 Max SSH master connections
--no-live false Disable live progress display
--batch-size 4 Databases to query per SSH call
--ssh-timeout 10 SSH connect timeout (seconds)
--query-timeout 30 Per-query timeout (seconds)

Architecture

┌──────────────────────────────────────────────────────────────────────────────┐
│                              sqlitesweep                                     │
└──────────────────────────────────────────────────────────────────────────────┘

  ┌─────────────────────────────┐
  │  Source Command  (-s)       │   Any shell command that outputs database
  │                             │   URIs to stdout, one per line.
  │  rails runner "..."        │   e.g. Rails runner, cat, curl, script...
  └─────────────┬───────────────┘
                │
                │  streams URIs line by line
                ▼
  ┌─────────────────────────────┐
  │  URI Router                 │   Parses each URI and routes it:
  │                             │
  │  /local/path.sqlite3  ─────────▶  direct to Worker Pool
  │  ssh://user@host/path  ────────▶  to Host Batcher
  └─────────────────────────────┘
                │
       ┌────────┴────────┐
       ▼                 ▼
  ┌──────────┐    ┌──────────────────────────────────────────────────────┐
  │  Local   │    │  Host Batcher                                       │
  │  Query   │    │                                                     │
  │          │    │  Groups URIs by host. Flushes when a batch reaches  │
  │  Opens   │    │  --batch-size (default 4) or after 200ms timeout.   │
  │  SQLite  │    │                                                     │
  │  directly│    │  ┌──────────┐  ┌──────────┐  ┌──────────┐          │
  │  via gem │    │  │ @host-1  │  │ @host-2  │  │ @host-3  │  ...     │
  │          │    │  │ db1,db2  │  │ db4      │  │ db5,db6  │          │
  │          │    │  │ db3      │  │          │  │ db7,db8  │          │
  │          │    │  └────┬─────┘  └────┬─────┘  └────┬─────┘          │
  └────┬─────┘    └───────┼─────────────┼─────────────┼────────────────┘
       │                  │             │             │
       │                  ▼             ▼             ▼
       │           ┌─────────────────────────────────────────────────┐
       │           │  SSH Connection Manager                         │
       │           │                                                 │
       │           │  One ControlMaster per host (reused across all  │
       │           │  queries to that host). Multiplexed via Unix    │
       │           │  socket — no repeated TCP/auth handshakes.      │
       │           │                                                 │
       │           │  Semaphore caps total masters at --max-ssh.     │
       │           │  BatchMode=yes — no password prompts.           │
       │           └──────────────────────┬──────────────────────────┘
       │                                  │
       │                                  ▼
       │           ┌─────────────────────────────────────────────────┐
       │           │  Remote Query (one SSH call per batch)          │
       │           │                                                 │
       │           │  ssh user@host '                                │
       │           │    printf "%s\t" /path/db1;                     │
       │           │    sqlite3 -json /path/db1 "SELECT ..."; echo;  │
       │           │    printf "%s\t" /path/db2;                     │
       │           │    sqlite3 -json /path/db2 "SELECT ..."; echo;  │
       │           │  '                                              │
       │           │                                                 │
       │           │  Multiple databases queried in a single SSH     │
       │           │  round-trip. Results parsed from tab-delimited  │
       │           │  output back into individual results.           │
       │           └──────────────────────┬──────────────────────────┘
       │                                  │
       ▼                                  ▼
  ┌──────────────────────────────────────────────────────────────────────┐
  │  Worker Pool  (-c, default 8 threads)                                │
  │                                                                      │
  │  Fixed thread pool runs queries in parallel. Back-pressure via       │
  │  caller-runs policy: when all threads are busy, the submitting       │
  │  thread executes the work itself instead of queuing unboundedly.     │
  └───────────────────────────────┬──────────────────────────────────────┘
                                  │
                                  │  results (thread-safe)
                                  ▼
  ┌──────────────────────────────────────────────────────────────────────┐
  │  Aggregator                                                          │
  │                                                                      │
  │  Mutex-protected accumulator. Mode depends on --action:              │
  │                                                                      │
  │    sum      Running total of first column value from each database.  │
  │    average  Running total / number of databases queried.             │
  │    list     Streams rows to a JSONL temp file (bounded memory).      │
  └───────────────────────────────┬──────────────────────────────────────┘
                                  │
                ┌─────────────────┴─────────────────┐
                ▼                                   ▼
  ┌──────────────────────────┐        ┌──────────────────────────┐
  │  Live Display (stderr)   │        │  Final Result (stdout)   │
  │                          │        │                          │
  │  Queried: 8421           │        │  sum/average ▶ number    │
  │  Errors: 2               │        │  list ▶ /tmp/file.jsonl  │
  │  Rate: 1204/s            │        │                          │
  │  Elapsed: 7.0s           │        │  Pipe-friendly: stderr   │
  │  Result: 421052          │        │  for progress, stdout    │
  │                          │        │  for the answer.         │
  └──────────────────────────┘        └──────────────────────────┘

Requirements

  • Ruby >= 4.0
  • sqlite3 available on remote hosts (for SSH queries)
  • SSH agent or key-based auth configured (BatchMode is enforced — no password prompts)

Development

bundle install
bundle exec rake test

Run the integration test:

bundle exec ruby test/integration/test_local_sweep.rb

Watch the live progress display with a slow-drip demo:

ruby test/integration/harness/live_demo.rb
ruby test/integration/harness/live_demo.rb --count 50 --delay 0.5
ruby test/integration/harness/live_demo.rb --action list

Run the benchmark:

ruby test/integration/harness/sweep_bench.rb --db-count 1000

Docker-based SSH testing

docker compose -f test/integration/harness/docker-compose.yml up -d
ruby test/integration/harness/sweep_bench.rb --docker --db-count 500

License

MIT