jobba
Redis-based background job status tracking.
Installation
# Gemfile
gem 'jobba'
or
$> gem install jobba
Version 1.x.x follows the scheme, 1.major_change.minor_change. Normal semantic versioning (major/minor/patch) will begin with version 2.0.0.
Configuration
To configure Jobba, put the following code in your application's initialization logic (e.g. in config/initializers in a Rails app):
Jobba.configure do |config|
  # Whatever options should be passed to `Redis.new` (see https://github.com/redis/redis-rb)
  config.redis_options = { url: "redis://:p4ssw0rd@10.0.1.1:6380/15" }
  # top-level redis prefix
  config.namespace = "jobba"
end
Getting status objects
If you know you need a new Status, call create!:
Jobba.create!
If you are looking for a status:
Jobba.find(id)
which will return nil if no such Status is found. If you always want a Status object back, call:
Jobba.find!(id)
The result of find! will start in an unknown state if the ID doesn't exist in Redis.
Basic Use with ActiveJob
class MyJob < ::ActiveJob::Base
  # Accept the job's arguments generically so they can be forwarded to super
  def self.perform_later(*args, &block)
    status = Jobba.create!
    # Append the Status ID so that perform can look the Status up
    args.push(status.id)
    # In theory we'd mark as queued right after the call to super, but this messes
    # up when the activejob adapter runs the job right away
    status.queued!
    super(*args, &block)
    # return the Status ID in case it needs to be noted elsewhere
    status.id
  end
  def perform(*args, &block)
    # Pop the ID argument added by perform_later and get a Status
    status = Jobba.find!(args.pop)
    status.started!
    # ... do stuff ...
    status.succeeded!
  end
end
Change States
One of the main functions of Jobba is to let a job advance its status through a series of states:
- unqueued
- queued
- started
- succeeded
- failed
- killed
- unknown
Put a Status into one of these states by calling that_state!, e.g.
my_status.started!
The unqueued state is entered when a Status is first created. The unknown state is entered when find!(id) is called but the id is not known. You can re-enter these states with the ! methods, but note that the recorded_at timestamp will not be updated.
The first time a state is entered, a timestamp is recorded for that state. Not all timestamp names match the state names:
| State | Timestamp |
|---|---|
| unqueued | recorded_at |
| queued | queued_at |
| started | started_at |
| succeeded | succeeded_at |
| failed | failed_at |
| killed | killed_at |
| unknown | recorded_at |
There is also a special timestamp for when a kill is requested, kill_requested_at. More about this later.
The order of states is not enforced, and you do not have to use all states. However, note that you'll only be able to query for states you use (Jobba doesn't automatically travel through states you skip) and if you're using an unusual order your time-based queries will have to reflect that order.
Restarts
Generally speaking, you should only enter any state once. Jobba only records the timestamp the first time you enter a state.
The exception to this rule is that if you call started! a second time, Jobba will note this as a restart. The current values in the status will be archived and your status will look like a started status, with the exception that the attempt count will be incremented. A restarted status can then enter the succeeded, failed, or killed states and those timestamps will be stored. job_name, job_args and provider_job_id survive the restart.
The attempt field is zero-indexed, so the first attempt is attempt 0.
Mark Progress
If you want to have a way to track the progress of a job, you can call:
my_status.set_progress(0.7) # 70% complete
my_status.set_progress(7,10) # 70% complete
my_status.set_progress(14,20) # 70% complete
This is useful if you need to show a progress bar on your client, for example.
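As an illustration of the progress-bar use case, here is a minimal sketch of turning a status's progress value (a float between 0.0 and 1.0, possibly nil) into a text bar. The `progress_bar` helper and its `width:` option are hypothetical and not part of Jobba.

```ruby
# Hypothetical helper (not part of Jobba): render a progress float as text.
def progress_bar(progress, width: 20)
  # Treat a missing value as 0% and keep the fraction within 0.0..1.0.
  fraction = (progress || 0.0).clamp(0.0, 1.0)
  filled = (fraction * width).round
  "[" + "#" * filled + "-" * (width - filled) + "] #{(fraction * 100).round}%"
end

puts progress_bar(0.7, width: 10) # prints "[#######---] 70%"
```

In an application you would pass it something like my_status.progress.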
Recording Job Errors
The status can keep track of a list of errors. Errors can be anything, as long as they are JSON-friendly.
my_status.add_error("oh nooo!!")
my_status.add_error(msg: "oh nooo!!", data: 42)
Errors are available from an errors attribute:
my_status.errors # => ["oh nooo!!", {"msg" => "oh nooo!!", "data" => 42}]
Saving Job-specific Data
Jobba provides a data field in all Status objects that you can use for storing job-specific data. Note that the data must be in a format that can be serialized to JSON. We recommend sticking with basic data types: primitives, arrays, hashes, etc.
my_status.save({a: 'blah', b: [1,2,3]})
my_status.save("some string")
Normalization of Saved Data and Errors
Note that if the argument you pass to save or add_error contains a hash with symbol keys, those keys will be converted to strings. In fact, any argument passed in to these methods will be converted to JSON and parsed back again so that the data and errors attributes return the same thing regardless of whether they are retrieved immediately after being set or after being loaded from Redis.
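To see what that round trip does to a value, here is a small sketch using only Ruby's standard json library. The `json_round_trip` helper is hypothetical; it shows the general effect of serializing and re-parsing, not Jobba's actual code.

```ruby
require "json"

# Hypothetical helper (not Jobba's implementation): serialize a value to JSON
# and parse it back, which is the kind of normalization described above.
def json_round_trip(value)
  # Wrapping the value in an array lets bare strings and numbers round-trip too.
  JSON.parse(JSON.generate([value])).first
end

p json_round_trip({a: 'blah', b: [1, 2, 3]}) # symbol keys come back as strings
p json_round_trip("some string")             # plain strings are unchanged
```

Symbol values are affected in the same way as symbol keys: they come back as strings.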
Setting Job Name, Arguments and Provider Job ID
If you want to be able to query for all statuses for a certain kind of job, you can set the job's name in the status:
my_status.set_job_name("MySpecialJob")
If you want to be able to query for all statuses that take a certain argument as input, you can set job arguments on a status:
my_status.set_job_args(arg_1_name: arg_1, arg_2_name: arg_2)
where the keys are what the argument is called in your job (e.g. "input_1") and the values are a way to identify the argument (e.g. "gid://app/Person/72"). The values must currently be strings.
You probably will only want to track complex arguments, e.g. models in your application. E.g. you could have a Book model and a PublishBook background job and you may want to see all of the PublishBook jobs that have status for the Book with ID 53.
Note that you can set job args with names that are either symbols or strings, but you can only read the args back by the string form of their name, e.g.
my_status.set_job_args(foo: "bar")
my_status.job_args['foo'] # => "bar"
my_status.job_args[:foo] # => nil
If you want to be able to query for the status for a specific job record or to find the job record associated with a status, you can set the job's provider_job_id in the status:
my_status.set_provider_job_id(42)
Killing Jobs
While Jobba can't really kill jobs (it doesn't control your job-running library), it has a facility for marking that you'd like a job to be killed.
a_status.request_kill!
Then a job itself can occasionally come up for air and check
my_status.kill_requested?
and if that returns true, it can attempt to gracefully terminate itself.
Note that when a kill is requested, the job will continue to be in some other state (e.g. started) until it is in fact killed, at which point the job should call:
my_status.killed!
to change the state to killed.
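The "come up for air" pattern might look like the sketch below. To keep it runnable without Redis, it uses a `FakeStatus` stand-in that only mimics the request_kill!, kill_requested?, killed! and succeeded! calls named above; `process_items` and its `check_every:` option are hypothetical names, and a real job would receive its status from Jobba.find!.

```ruby
# Stand-in for a Jobba Status so this sketch runs on its own (an assumption,
# not Jobba's class). Only the calls used below are mimicked.
class FakeStatus
  attr_reader :state

  def initialize
    @state = :started
    @kill_requested = false
  end

  def request_kill!
    @kill_requested = true
  end

  def kill_requested?
    @kill_requested
  end

  def killed!
    @state = :killed
  end

  def succeeded!
    @state = :succeeded
  end
end

# Hypothetical job body: work through items in batches and check for a kill
# request between batches, so the job stops at a clean boundary.
def process_items(items, status, check_every: 100)
  processed = 0
  items.each_slice(check_every) do |batch|
    if status.kill_requested?
      status.killed!
      return processed
    end
    batch.each { |item| yield item }
    processed += batch.size
  end
  status.succeeded!
  processed
end
```

Checking between batches rather than after every item keeps the number of checks low; how often to check depends on how long a batch takes and how quickly you want a kill request honored.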
Status Objects
When you get hold of a Status, via create!, find, find!, or as the result of a query, it will have the following attributes (some of which may be nil):
| Attribute | Description |
|---|---|
| id | A Jobba-created UUID |
| state | One of the states above |
| progress | A float between 0.0 and 1.0 |
| errors | An array of errors |
| data | Job-specific data |
| job_name | The name of the job |
| job_args | A hash of job arguments, {arg_name: arg, ...} |
| recorded_at | Ruby Time timestamp |
| queued_at | Ruby Time timestamp |
| started_at | Ruby Time timestamp |
| succeeded_at | Ruby Time timestamp |
| failed_at | Ruby Time timestamp |
| killed_at | Ruby Time timestamp |
| kill_requested_at | Ruby Time timestamp |
A Status object also has methods to check if it is in certain states:
- unqueued?
- queued?
- started?
- succeeded?
- failed?
- killed?
- unknown?
And two convenience methods for checking groups of states:
- completed?
- incomplete?
You can also call reload! on a Status to have it reset its state to what is stored in Redis.
Deleting Job Statuses
Once jobs are completed or otherwise no longer interesting, it'd be nice to clear them out of Redis. You can do this with:
my_status.delete # freaks out if `my_status` isn't completed
my_status.delete! # always deletes
Querying for Statuses
Jobba has an ActiveRecord-like query interface for finding Status objects.
Basic Query Examples
Getting All Statuses
Jobba.all
State
Jobba.where(state: :unqueued)
Jobba.where(state: :queued)
Jobba.where(state: :started)
Jobba.where(state: :succeeded)
Jobba.where(state: :failed)
Jobba.where(state: :killed)
Jobba.where(state: :unknown)
Two convenience "state" queries have been added:
Jobba.where(state: :completed) # includes succeeded, failed
Jobba.where(state: :incomplete) # includes unqueued, queued, started, killed
You can query combinations of states too:
Jobba.where(state: [:queued, :started])
State Timestamp
Jobba.where(recorded_at: {after: time_1})
Jobba.where(queued_at: [time_1, nil])
Jobba.where(started_at: {before: time_2})
Jobba.where(started_at: [nil, time_2])
Jobba.where(succeeded_at: {after: time_1, before: time_2})
Jobba.where(failed_at: [time_1, time_2])
Note that you cannot query on kill_requested_at. The time arguments can be Ruby Time objects or a number of microseconds since the epoch represented as a float, integer, or string.
Note that, in operations having to do with time, this gem ignores anything beyond microseconds.
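If you prefer to pass timestamps as numbers, the conversion to and from microseconds since the epoch can be sketched as follows. Both helper names are hypothetical and use only core Ruby; note how anything finer than a microsecond is dropped, which matches the precision described above.

```ruby
# Hypothetical helpers (not part of Jobba) for microsecond timestamps.

# Whole microseconds since the epoch; sub-microsecond precision is discarded.
def to_microseconds(time)
  time.to_i * 1_000_000 + time.usec
end

# Rebuild a Time from an integer count of microseconds since the epoch.
def from_microseconds(usec)
  Time.at(usec / 1_000_000, usec % 1_000_000)
end

precise = Time.at(1, 500_000_999, :nsec) # 1.500000999 seconds after the epoch
puts to_microseconds(precise)            # prints 1500000
```

The integer result could then be used wherever a time argument is accepted, e.g. as the value of after: or before: in a timestamp query.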
Job Name
(requires having called the optional set_job_name method)
Jobba.where(job_name: "MySpecialBackgroundJob")
Jobba.where(job_name: ["MySpecialBackgroundJob", "MyOtherJob"])
Job Arguments
(requires having called the optional set_job_args method)
Jobba.where(job_arg: "gid://app/MyModel/42")
Jobba.where(job_arg: "gid://app/Person/86")
Status IDs
Jobba.where(id: nil)
Jobba.where(id: [])
Jobba.where(id: "some_id")
Jobba.where(id: ["an_id", "another_id"])
Query Chaining
Queries can be chained! (intersects the results of each where clause)
Jobba.where(state: :queued).where(recorded_at: {after: some_time})
Jobba.where(job_name: "MyTroublesomeJob").where(state: :failed)
Sort Order
Currently, results from queries are not guaranteed to be in any order. You can sort them yourself using normal Ruby calls.
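As one way of sorting in Ruby, the sketch below orders statuses by a timestamp attribute and puts those whose timestamp is nil at the end (timestamps can be nil for states a status never entered). `StatusRow` is a stand-in struct so the example runs without Redis, and `sort_by_time` is a hypothetical helper, not part of Jobba.

```ruby
# Stand-in for Status objects so this sketch runs on its own (an assumption).
StatusRow = Struct.new(:id, :queued_at)

# Hypothetical helper: sort by a timestamp attribute, nil timestamps last.
def sort_by_time(statuses, attribute, newest_first: false)
  present, missing = statuses.partition { |s| s.public_send(attribute) }
  sorted = present.sort_by { |s| s.public_send(attribute) }
  sorted.reverse! if newest_first
  sorted + missing
end

rows = [
  StatusRow.new("a", Time.at(30)),
  StatusRow.new("b", nil),
  StatusRow.new("c", Time.at(10))
]
p sort_by_time(rows, :queued_at).map(&:id) # prints ["c", "a", "b"]
```

With real statuses you would pass an array of them, for example the result of calling to_a on a query.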
Running a Query to get Statuses
Jobba.where(...).run
When you call run on a query, you'll get back a Statuses object, which is simply a collection of Status objects with a few convenience methods and bulk operations.
Bulk Methods on Statuses
- delete_all
- delete_all!
- request_kill_all!
These work as described above for individual Status objects.
There is also a not-very-tested multi operation that takes a block and executes the block inside a Redis multi call. Do not use it unless you really know what you are doing.
my_statuses.multi do |status, redis|
# do stuff on `status` using the `redis` connection
end
Array-like Methods on Statuses
- any?
- none?
- all?
- map
- collect
- empty?
- count
- select!
- reject!
If you want to get an array of Status objects from a Statuses object, just call
a_statuses_object.to_a
select! and reject!, as you would expect, operate in place and also return self.
Passthrough Methods on Queries
As a convenience, if you call a method on Query that isn't defined there but is defined on Statuses, a new Statuses object will be created for you and your method called on it.
Jobba.where(state: :queued).collect(&:queued_at)
is the same as
Jobba.where(state: :queued).run.collect(&:queued_at)
Query Counts
Notably, both Query and Statuses define the count and empty? methods. Which ones you use affects whether the counting is done in Redis or in Ruby:
Jobba.where(...).count # These count in Redis
Jobba.where(...).empty?
Jobba.all.count
Jobba.where(...).run.count # These pull data back to Ruby and count in Ruby
Jobba.where(...).run.empty?
Pagination
Pagination is supported with an ActiveRecord-like interface. You can call .limit(x) and .offset(y) on queries, e.g.
Jobba.where(state: :succeeded).limit(10).offset(20).to_a
Specifying a limit does not guarantee that you'll get that many elements back, as there may not be that many left in the result.
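Because a page can come back shorter than the limit, a paging loop should treat a short or empty page as the end of the results. The sketch below shows one way to do that; `each_page` is a hypothetical helper, and `fetch_page` is any callable taking (limit, offset) and returning an array, so the example runs against a plain Ruby array.

```ruby
# Hypothetical helper (not part of Jobba): walk through results page by page.
# `fetch_page` is any callable that takes (limit, offset) and returns an array.
def each_page(fetch_page, page_size:)
  offset = 0
  loop do
    page = fetch_page.call(page_size, offset)
    break if page.empty?
    yield page
    # A short page means there is nothing left to fetch.
    break if page.size < page_size
    offset += page_size
  end
end

# Demonstration against an in-memory array standing in for query results.
data = (1..25).to_a
fetch = ->(limit, offset) { data.drop(offset).first(limit) }
each_page(fetch, page_size: 10) { |page| puts page.size } # prints 10, 10, 5
```

For real queries, the callable would wrap a query ending in .limit(limit).offset(offset).to_a.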
Notes
Efficiency
Jobba strives to do all of its operations as efficiently as possible using built-in Redis operations. If you find a place where the efficiency can be improved, please submit an issue or a pull request.
Single-clause queries (those with one where call) have been optimized. Jobba.all is a single-clause query. If you have lots of IDs, try to get by with single-clause queries. Multi-clause queries (including count) have to copy sets into temporary working sets where query clauses are ANDed together. This can be expensive for large datasets.
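To picture what ANDing clauses together means, here is a toy model in plain Ruby using the standard Set class. It is only an illustration of the intersection semantics, not Jobba's Redis implementation; `intersect_clauses` is a hypothetical name.

```ruby
require "set"

# Toy model (not Jobba's implementation): each clause yields a set of status
# IDs, and a multi-clause query keeps only the IDs present in every set.
def intersect_clauses(*id_sets)
  return Set.new if id_sets.empty?
  # Starting from the smallest set keeps the intermediate results small.
  smallest, *rest = id_sets.sort_by(&:size)
  rest.reduce(smallest) { |working, ids| working & ids }
end

queued  = Set["a", "b", "c"]
my_job  = Set["b", "c", "d"]
p intersect_clauses(queued, my_job).to_a.sort # prints ["b", "c"]
```

Each extra clause adds another set to combine, which is why single-clause queries avoid this work entirely.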
Write from one; Read from many
Jobba assumes that any job is being run at one time by only one worker. Jobba makes no accommodations for multiple processes updating a Status at the same time; multiple processes reading a Status is fine, of course.
Development
By default, this gem uses fakeredis instead of real Redis. This is great most of the time, but occasionally fakeredis doesn't work exactly like real Redis. If you want to use real Redis, just set the USE_REAL_REDIS environment variable to true, e.g.
$> USE_REAL_REDIS=true rspec
Travis runs the specs with both fakeredis and real Redis.
Clauses need to implement three methods:
- to_new_set - puts the IDs indicated by the clause into a new sorted set in Redis
- result_ids - used to get the IDs indicated by the clause when the clause is the only one in the query
- result_count - used to get the count of IDs indicated by the clause when the clause is the only one in the query
TODO
- Provide job min, max, and average durations.
- Specs that test scale.
- Move Redis code in set_job_args, set_job_name, and save into set to match the rest of the code.