Que::Unique
Que::Unique is a gem that ensures that identical que jobs
are not scheduled multiple times during a
transaction block. If the same job with the same args is detected, it will be coalesced into one.
A typical use case would be modifying a customer at various points during
a code route, and wanting to index it once in elasticsearch afterwards.
Use:
# Add to Gemfile
gem 'que-unique'
# Add the `include` to your job
class SomeUniqueJob < Que::Job
include Que::Unique
end
Now, when in a transaction, only one job per distinct set of args (compared as JSON) will be enqueued.
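As a rough illustration of what comparing args as JSON implies, the sketch below serializes a few argument lists with Ruby's standard json library. The `dedup_key` helper is hypothetical and is not part of que-unique; the gem's actual key format may differ.

```ruby
require "json"

# Hypothetical helper, not que-unique's API: builds a comparison key
# from a job's arguments by serializing them to JSON.
def dedup_key(*args)
  JSON.generate(args)
end

dedup_key(3) == dedup_key(3)                     # same args, same key
dedup_key(3) == dedup_key("3")                   # integer vs string, different keys
dedup_key({ id: 3 }) == dedup_key({ "id" => 3 }) # symbol and string keys serialize alike
```

Under this assumption, `enqueue(3)` and `enqueue("3")` would count as different jobs, so it is worth passing args in a consistent type.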
Examples
Without que-unique:
IndexCustomer.enqueue(3)
... business logic
IndexCustomer.enqueue(3)
... business logic
IndexCustomer.enqueue(3)
=> Results in 3 identical index jobs
With que-unique:
IndexCustomer.enqueue(3)
... business logic
IndexCustomer.enqueue(3)
... business logic
IndexCustomer.enqueue(3)
=> Results in 1 index job
With que-unique, demonstrating different args:
IndexCustomer.enqueue(3)
... business logic
IndexCustomer.enqueue(426)
... business logic
IndexCustomer.enqueue(3)
=> Results in 2 index jobs, one with arg "3", one with arg "426"
Note: if you want to prevent two identical jobs that are already enqueued from executing concurrently, you probably want another excellent gem, que-locks.
Internal workings
Internally, Que::Unique works by prepending a module to
ActiveRecord::ConnectionAdapters::DatabaseStatements that wraps the transaction method,
where it starts a thread-local array which holds a hash of JSON strings of the arguments
that have been scheduled. We also start a monitor to track how deep we are in the
transaction nesting. Each nested transaction that is detected increments the count.
Once we detect that the transaction count has come back down to zero, we can conclude that we have left the transaction boundary, and the transaction is being committed. We enqueue the required jobs and clear the thread locals.
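The idea above can be sketched in plain Ruby, without ActiveRecord or Que. This is a minimal illustration under stated assumptions, not the gem's code: the `TransactionDedupSketch` module, its thread-local keys, and the `ENQUEUED` array (a stand-in for the jobs table) are all hypothetical, and discarding pending jobs when the block raises is a choice made for this sketch.

```ruby
require "json"
require "set"

# Hypothetical sketch only; the names here are not que-unique's internals.
module TransactionDedupSketch
  ENQUEUED = [] # stand-in for rows inserted into the Que jobs table

  def self.depth
    Thread.current[:dedup_sketch_depth] || 0
  end

  def self.pending
    Thread.current[:dedup_sketch_pending]
  end

  # Wraps a block the way a patched `transaction` method might.
  def self.transaction
    Thread.current[:dedup_sketch_depth] = depth + 1
    Thread.current[:dedup_sketch_pending] ||= Set.new
    result = yield
    flush if depth == 1 # outermost block finished without raising
    result
  ensure
    # Leave one nesting level; clear the thread locals at the boundary.
    Thread.current[:dedup_sketch_depth] = depth - 1
    Thread.current[:dedup_sketch_pending] = nil if depth.zero?
  end

  # Inside a transaction, remember the JSON key; outside, enqueue at once.
  def self.enqueue(job_name, *args)
    if depth.zero?
      ENQUEUED << [job_name, args]
    else
      pending << JSON.generate([job_name, args])
    end
  end

  # Enqueue each distinct job exactly once.
  def self.flush
    pending.each { |key| ENQUEUED << JSON.parse(key) }
  end
end

TransactionDedupSketch.transaction do
  TransactionDedupSketch.enqueue("IndexCustomer", 3)
  TransactionDedupSketch.transaction do # nested: nothing is flushed yet
    TransactionDedupSketch.enqueue("IndexCustomer", 426)
    TransactionDedupSketch.enqueue("IndexCustomer", 3)
  end
end
# ENQUEUED now holds two entries: one for arg 3, one for arg 426.
```

A `Set` is used here so that repeated keys collapse automatically; because the state lives in `Thread.current`, two threads running separate transactions never see each other's pending jobs.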
Comparison with que-locks
There is another gem called que-locks that does similar things to que-unique. They use very different techniques, so the semantics are not the same.
- The que-unique gem performs its deduping in-memory, in one transaction, in a single thread. This means it is fast and has no network overhead. It does mean, though, that if you have two concurrent transactions, they may both enqueue a job which needn't be run twice. The que-locks gem performs its deduping by locking rows in the DB. This can help mitigate cross-transaction duplicate enqueueing at the point of enqueue (though if the race is fast enough, some duplicate rows will make it through). It does entail more network / DB overhead.
- que-unique supports --worker-count greater than one. que-locks doesn't currently.
- que-unique does nothing to stop duplication once the rows are enqueued. que-locks does, by checking for duplicate rows and skipping duplicates where possible.
The above means that the gems can work in tandem. At enqueue time que-unique can prevent "trivial"
duplicates quickly in memory, then que-locks can do a (slightly more expensive) lower level
DB check before the final insert. que-locks can then also perform post-enqueue deduping.
It is important to note that even using both at the same time cannot prevent all duplicates in a fast-moving multi-threaded system. Make sure you always write idempotent jobs.
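To make "idempotent" concrete, here is a minimal sketch in plain Ruby. Everything in it is hypothetical: `FakeIndex` is an in-memory stand-in for a search index, and `index_customer` stands in for the body of a job's run method. The point is that the job overwrites an entry keyed by id rather than appending, so running it twice leaves the same state as running it once.

```ruby
# Hypothetical in-memory stand-in for a search index; not a real client.
class FakeIndex
  attr_reader :documents

  def initialize
    @documents = {}
  end

  # Upsert keyed by id: writing the same document twice leaves one entry.
  def upsert(id, document)
    @documents[id] = document
  end
end

# Hypothetical job body. It reads the current record and overwrites the
# index entry, so a duplicate run is harmless.
def index_customer(index, customers, customer_id)
  customer = customers.fetch(customer_id)
  index.upsert(customer_id, { "name" => customer[:name] })
end

index = FakeIndex.new
customers = { 3 => { name: "Ada" } }
index_customer(index, customers, 3)
index_customer(index, customers, 3) # duplicate run, same resulting state
```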
Development
- Ensure you have a postgres running locally. You can do so easily with docker:
docker run -p 5432:5432 -e POSTGRES_PASSWORD=postgres postgres:14.7
- Check out this repo, then run the tests with the following:
bundle install
bin/rspec
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/bambooengineering/que-unique.
License
The gem is available as open source under the terms of the MIT License.