Service Downtime Simulator
🚨 If you work at Deliveroo and you're contributing to this project, please bear in mind that this repository is public.
This is a piece of Rack middleware that simulates failures you would want to tolerate in upstream services.
Installation
Rails
Add the following in application.rb:
config.middleware.use(
  ServiceDowntimeSimulator::Middleware,
  config # See below for info about how to configure this
)
Configuration
The middleware takes a config argument in the form of a hash. Said hash should have the following shape:
{
  enabled: Boolean,
  mode: Symbol,
  excluded_paths: Array<String>,
  logger: Logger?
}
Here's what you can supply for each of those options:
- enabled (Boolean)
  - true will enable simulation of failures (assuming you supply a valid mode, see below)
  - false will disable simulation and your application will function as normal
- mode (Symbol)
  - :hard_down will cause all requests to return a 500 error
  - :intermittently_down will cause 50% of requests to return a 500 error
  - :successful_but_gibberish will return a 200, but with a response body that is not machine readable
  - :timing_out will wait for 15 seconds on each request, and then return a 503
- excluded_paths (Array<String>)
  - You can supply a list of paths that you don't want to be affected by the simulation here (e.g. ['/foobar'])
  - The most common thing you're going to want to include here is your service's health check endpoint, as if it is returning a 5xx thanks to this middleware, your application will not deploy
- logger (Logger?)
  - If supplied, useful debug information will be sent here
In order for the middleware to kick in, enabled must be explicitly set to true and mode must be a valid option. Unless both are explicitly supplied, the underlying application will continue to function as normal.
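The rule above can be sketched as a small predicate. This is only an illustration of the documented behaviour, not the gem's actual source: the names simulation_active? and VALID_MODES are hypothetical, and both the strict `== true` check and the exact-match comparison for excluded paths are assumptions.

```ruby
# Hypothetical sketch of the gating rule; NOT the gem's actual implementation.
VALID_MODES = %i[
  hard_down
  intermittently_down
  successful_but_gibberish
  timing_out
].freeze

# Returns true only when simulation should apply to the given request path.
def simulation_active?(config, path)
  # Assumption: only a literal `true` counts, so a string such as 'true' does not.
  return false unless config[:enabled] == true
  # An unset or unknown mode leaves the application untouched.
  return false unless VALID_MODES.include?(config[:mode])

  # Assumption: excluded paths are compared by exact match.
  !Array(config[:excluded_paths]).include?(path)
end

config = { enabled: true, mode: :hard_down, excluded_paths: ['/health'] }

simulation_active?(config, '/orders')                        # => true
simulation_active?(config, '/health')                        # => false (excluded path)
simulation_active?(config.merge(mode: ''.to_sym), '/orders') # => false (no valid mode)
```

The last call mirrors what happens when a mode is read from an unset environment variable and converted with to_sym: the resulting empty symbol is not a valid mode, so requests pass through untouched.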
Examples
Here are a couple of example configurations:
Hard-coded Hard Down
This example will always return a 500 for all requests.
config.middleware.use(
  ServiceDowntimeSimulator::Middleware,
  {
    enabled: true,
    mode: :hard_down,
    excluded_paths: ['/health'],
    logger: Rails.logger
  }
)
Environment-variable Controlled Simulation
This is a more practical example, allowing failure simulation to be controlled by environment variables. Simulation is enabled only when FAILURE_SIMULATION_ENABLED is set to the exact value I_UNDERSTAND_THE_CONSEQUENCES_OF_THIS, and FAILURE_SIMULATION_MODE must also name a valid mode. If either is missing, the app continues as normal. You can also use this pattern for feature flagging. Probably.
config.middleware.use(
  ServiceDowntimeSimulator::Middleware,
  {
    enabled: ENV['FAILURE_SIMULATION_ENABLED'] == 'I_UNDERSTAND_THE_CONSEQUENCES_OF_THIS',
    mode: ENV.fetch('FAILURE_SIMULATION_MODE', '').to_sym,
    excluded_paths: ['/health'],
    logger: Rails.logger
  }
)
Development
- Clone this repository
- Ensure you have Ruby 2.5.1 installed
- make install to get the dependencies
- make test to run the tests
- make lint to lint your code
- ???
- Profit
Gem Publishing
TBC, but the very manual and involved flow is:
- Update version in lib/service_downtime_simulator.rb and commit
- Tag version via git tag XXX
- Push (git push origin HEAD --tags)
- Release to RubyGems (make publish)