Fetches CI test results, stores failures in SQLite, ranks by frequency, and reproduces under simulated CI conditions.
 Dependencies

Development

~> 13.0
~> 3.0

Runtime

>= 7.0
~> 2.0
 Project Readme

Flaky

Track, rank, and reproduce flaky CI test failures in Rails projects.

Flaky fetches test results from your CI provider, stores failures in a local SQLite database, ranks tests by flakiness, and helps reproduce failures under simulated CI conditions.

Installation

Add to your Gemfile:

# From GitHub
gem 'flaky', github: 'Flytedesk/flaky', group: [:development, :test]

# Or from a local path during development
gem 'flaky', path: '../flaky', group: [:development, :test]

Then run bundle install.

Configuration

Create an initializer (e.g. config/initializers/flaky.rb):

if defined?(Flaky)
  Flaky.configure do |c|
    c.provider = :semaphore        # or :github_actions
    c.project  = "my-project"      # CI project name
    c.branch   = "main"            # branch to track
  end
end

Prerequisites by provider

Semaphore: Install and authenticate the sem CLI.

GitHub Actions: Install and authenticate the gh CLI.

Rake Tasks

rake flaky:fetch[age]

Fetch recent CI results and store failures in the local database.

rake flaky:fetch              # last 24 hours (default)
rake flaky:fetch[168h]        # last 7 days
rake flaky:fetch[2160h]       # last 90 days

For each workflow on the configured branch, the task fetches every test job log, parses the RSpec output for failures and random seeds, and inserts new records into tmp/flaky.db.
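
The parsing step can be illustrated with a minimal sketch. It assumes the log contains RSpec's standard "Failed examples" lines (rspec ./path_spec.rb:LINE # description) and the "Randomized with seed N" footer. The method name parse_rspec_log and the returned hash keys are hypothetical; the gem's actual parser may work differently.

```ruby
# Sketch only: extracts failures and the random seed from raw RSpec output.
# parse_rspec_log and the hash keys are illustrative, not the gem's API.
FAILURE_LINE = /^rspec (?<file>\S+?):(?<line>\d+) # (?<description>.+)$/
SEED_LINE    = /Randomized with seed (?<seed>\d+)/

def parse_rspec_log(log)
  # The seed is printed once per run; nil if the run was not randomized.
  seed = log[SEED_LINE, "seed"]&.to_i

  failures = log.each_line.filter_map do |raw|
    m = FAILURE_LINE.match(raw.strip)
    next unless m

    { file: m[:file], line: m[:line].to_i, description: m[:description], seed: seed }
  end

  { seed: seed, failures: failures }
end
```

Real CI logs may prefix lines with timestamps or ANSI color codes, which this sketch does not strip.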

rake flaky:rank[since]

Rank flaky tests by failure frequency and suggest the next one to investigate.

rake flaky:rank               # last 30 days (default)
rake flaky:rank[7]            # last 7 days

Output:

Flaky tests on main (last 30 days, 42 CI runs):

Fails  Location                                           Last Failure
------------------------------------------------------------------------------------------
5      ...spec/system/inventory_search_modal_spec.rb:83    2026-04-12 09:15:22

  > Next to investigate: packs/.../inventory_search_modal_spec.rb:83
    Inventory search modal filters by enrollment
    Seeds: 6432, 51203, 8891

rake flaky:history[spec_location]

Show the full failure timeline for a specific test, including every seed and CI job it failed in.

rake flaky:history[inventory_search_modal_spec.rb:83]
rake flaky:history[inventory_search_modal_spec.rb]     # all failures in this file

rake flaky:stress[spec,iterations,seed,ci]

Run a test repeatedly to reproduce a flaky failure or prove a fix is stable.

# 20 iterations with random seeds
rake flaky:stress[path/to/spec.rb:83]

# 50 iterations with a specific seed and CI simulation
rake flaky:stress[path/to/spec.rb:83,50,6432,true]

Arguments:

  • spec (required) -- spec file path, optionally with line number
  • iterations -- number of runs (default: 20)
  • seed -- RSpec random seed; omit for random each run
  • ci -- true to enable CI environment simulation (default: false)

Results are recorded in the database and appear in the output of rake flaky:report.
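
The stress loop can be approximated as follows. This is a sketch under stated assumptions, not the task's actual code: stress and stress_command are hypothetical names, the loop shells out to bundle exec rspec with RSpec's --seed flag, and it assumes CI simulation is switched on by exporting FLAKY_CI_SIMULATE=1 to the child process. The runner is injectable so the loop can be exercised without running RSpec.

```ruby
require "open3"

# Hypothetical helper: the command for a single iteration.
def stress_command(spec, seed)
  ["bundle", "exec", "rspec", spec, "--seed", seed.to_s]
end

# Sketch only: run a spec repeatedly and summarize pass/fail counts.
def stress(spec, iterations: 20, seed: nil, ci: false, runner: nil)
  raise ArgumentError, "iterations must be positive" unless iterations.positive?

  # Assumption: the simulation is toggled through this environment variable.
  env = ci ? { "FLAKY_CI_SIMULATE" => "1" } : {}
  runner ||= ->(e, cmd) { Open3.capture2e(e, *cmd).last.success? }

  results = Array.new(iterations) do
    run_seed = seed || rand(1..65_535) # fresh seed per run unless pinned
    { seed: run_seed, passed: runner.call(env, stress_command(spec, run_seed)) }
  end

  failed = results.count { |r| !r[:passed] }
  { passed: iterations - failed, failed: failed,
    failure_rate: (100.0 * failed / iterations).round(1), results: results }
end
```

Pinning the seed replays one known-bad ordering; leaving it unset samples many orderings, which suits proving that a fix holds.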

rake flaky:report

Summary dashboard showing overall flaky test health.

rake flaky:report

Output:

=== Flaky Test Report (main) ===

CI Runs tracked:     42
Failed runs:         8 (19.0%)
Total test failures: 14
Unique flaky specs:  6
Last fetch:          2026-04-14 20:39:57

7-day trend:         3 failures (prior 7 days: 5)
                     v Trending better

Top 5 flaky tests:
--------------------------------------------------------------------------------
  1. packs/.../inventory_search_modal_spec.rb:83 (5x)
     Inventory search modal filters by enrollment

Recent stress runs:
--------------------------------------------------------------------------------
  packs/.../inventory_search_modal_spec.rb:83 -- 18/20 passed (10.0% failure rate) [CI sim]
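
The percentages and the trend line in this report follow from simple arithmetic, sketched here. The helper names and the labels other than "Trending better" are hypothetical; only the figures are taken from the sample output.

```ruby
# Sketch only: the arithmetic behind the report's percentages and trend line.
def percent(part, total)
  total.zero? ? 0.0 : (100.0 * part / total).round(1)
end

# Compare failures in the last 7 days with the 7 days before that.
# Labels other than "Trending better" are illustrative.
def trend_label(current, prior)
  return "Trending better" if current < prior
  return "Trending worse" if current > prior

  "Unchanged"
end

failed_run_rate  = percent(8, 42)    # 8 failed runs out of 42 tracked
stress_fail_rate = percent(2, 20)    # 18/20 passed means 2 failures
trend            = trend_label(3, 5) # 3 failures now, 5 in the prior week
```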

CI Simulation

When ci=true is passed to rake flaky:stress, the gem simulates CI environment constraints:

  1. Rack middleware latency -- adds a 30 ms delay to each HTTP request, approximating the gap between a local Mac and an f1-standard-2 CI machine. Configurable via the FLAKY_LATENCY_MS environment variable.

  2. Reduced Puma threads -- the host app should conditionally reduce Capybara's Puma threads when FLAKY_CI_SIMULATE=1 is set:

# spec/support/capybara_drivers.rb (or equivalent)
# Cap Puma at 2 threads only when CI simulation is on (FLAKY_CI_SIMULATE=1).
max_threads = ENV["FLAKY_CI_SIMULATE"] == "1" ? 2 : 8
Capybara.server = :puma, { Silent: true, Threads: "1:#{max_threads}" }

The middleware is auto-inserted by the Railtie in the test environment when FLAKY_CI_SIMULATE=1.
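
For readers unfamiliar with Rack middleware, a latency shim of this kind can be very small. The sketch below is not the gem's implementation: the class name LatencyMiddleware is hypothetical, and it assumes FLAKY_LATENCY_MS holds an integer number of milliseconds with 30 as the default.

```ruby
# Sketch only: a Rack-style middleware that delays every request.
# LatencyMiddleware is a hypothetical name, not the gem's class.
class LatencyMiddleware
  def initialize(app, latency_ms: Integer(ENV.fetch("FLAKY_LATENCY_MS", "30")))
    @app = app
    @delay = latency_ms / 1000.0 # seconds, as expected by sleep
  end

  def call(env)
    sleep(@delay) if @delay.positive?
    @app.call(env) # pass the request through unchanged
  end
end
```

In a Rails app, a middleware like this would typically be added with config.middleware.use from an initializer or Railtie, guarded by the FLAKY_CI_SIMULATE check.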

Database

Failures are stored in SQLite at tmp/flaky.db (auto-created on first use). The schema is managed internally and migrated automatically.

Tables:

  • ci_runs -- one row per CI workflow on the tracked branch
  • job_results -- one row per test job (unit tests, system tests, etc.)
  • test_failures -- one row per individual test failure with spec file, line, description, and seed
  • stress_runs -- one row per stress test session

The database is local and should be gitignored; in most Rails projects it already is, because tmp/ is ignored.

Custom Providers

To add a CI provider, implement the three-method interface and register it:

class Flaky::Providers::CircleCI < Flaky::Providers::Base
  def fetch_workflows(age: "24h")
    # Return [{ id:, pipeline_id:, branch:, created_at: }, ...]
  end

  def fetch_jobs(pipeline_id:)
    # Return [{ id:, name:, block_name:, result: }, ...]
  end

  def fetch_log(job_id:)
    # Return raw log string
  end
end

Flaky.register_provider(:circleci, Flaky::Providers::CircleCI)

The log parser is CI-agnostic -- it extracts failures, seeds, and counts from standard RSpec output. Your provider just needs to return the raw log text.
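
To make the return shapes concrete, here is a sketch of a provider that serves canned data, which can be handy when testing without CI access. The InMemory class is hypothetical, and the Base stand-in exists only so the snippet runs outside the gem; inside a host app the gem supplies Flaky::Providers::Base, whose constructor may take different arguments.

```ruby
module Flaky
  module Providers
    # Stand-in so this sketch runs on its own; the gem defines the real Base.
    Base = Class.new unless const_defined?(:Base, false)

    # Hypothetical provider backed by in-memory data instead of a CI API.
    class InMemory < Base
      def initialize(workflows:, jobs:, logs:)
        @workflows = workflows # [{ id:, pipeline_id:, branch:, created_at: }, ...]
        @jobs = jobs           # { pipeline_id => [{ id:, name:, block_name:, result: }] }
        @logs = logs           # { job_id => raw log string }
      end

      # age is accepted for interface compatibility but ignored here.
      def fetch_workflows(age: "24h")
        @workflows
      end

      def fetch_jobs(pipeline_id:)
        @jobs.fetch(pipeline_id, [])
      end

      def fetch_log(job_id:)
        @logs.fetch(job_id, "")
      end
    end
  end
end
```

Registration would then follow the same Flaky.register_provider call shown above, with a symbol of your choosing.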

Typical Workflow

# 1. Fetch recent CI data
rake flaky:fetch[168h]

# 2. See what's flaky
rake flaky:rank

# 3. Investigate the top offender
rake flaky:history[the_flaky_spec.rb:42]

# 4. Try to reproduce it locally with CI simulation
rake flaky:stress[the_flaky_spec.rb:42,30,6432,true]

# 5. Fix the test, then prove the fix holds
rake flaky:stress[the_flaky_spec.rb:42,50,,true]

# 6. Check overall health
rake flaky:report

License

MIT