# fiber_connection_pool

Fiber-based generic connection pool.
A connection pool meant to be used inside a Fiber-based reactor, such as any EventMachine or Celluloid server.
Largely based on `ConnectionPool` from the em-synchrony gem, with some things also borrowed from the threaded connection_pool gem.
Used in production environments with Goliath (EventMachine based) servers, and in promising experiments with Reel (Celluloid based) servers.
## Install
Add this line to your application's Gemfile:
```ruby
gem 'fiber_connection_pool'
```
Or install it yourself as:
```
$ gem install fiber_connection_pool
```
Inside your Ruby program, require FiberConnectionPool with:

```ruby
require 'fiber_connection_pool'
```
## How It Works
```ruby
pool = FiberConnectionPool.new(:size => 5){ MyFancyConnection.new }
```
It just keeps an array (the internal pool) holding the result of running the given block `size` times. Inside the reactor loop (either EventMachine's or Celluloid's), each request is wrapped in a Fiber, and then the pool works its magic.
```ruby
results = pool.query_me(sql)
```
When a method `query_me` is called on `pool`, it:

- Reserves one connection from the internal pool and associates it with the current fiber.
- If no connection is available, that fiber is put on a pending queue and yields until another connection is released.
- When a connection is available, the pool calls `query_me` on that `MyFancyConnection` instance.
- When `query_me` returns, the reserved instance is released again, and the next fiber on the pending queue is resumed.
- The return value is sent back to the caller.
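The following is a minimal sketch of that dispatch logic. It is an illustration only, not the gem's actual source; `TinyFiberPool` and its internals are made-up names, and a real reactor would schedule the wake-up (for example via `EM.next_tick`) instead of resuming inline:

```ruby
require 'fiber'

# Illustration only: a stripped-down pool that reserves a connection per
# call, parks the fiber when the pool is empty, and releases afterwards.
class TinyFiberPool
  def initialize(size, &block)
    @available = Array.new(size) { block.call } # the internal pool
    @pending   = []                             # fibers waiting for a connection
  end

  def method_missing(method, *args, &blk)
    while @available.empty?
      @pending << Fiber.current # queue up...
      Fiber.yield               # ...and yield until a release resumes us
    end
    conn = @available.pop       # reserve a connection for this fiber
    begin
      conn.send(method, *args, &blk) # run the real method on the connection
    ensure
      @available.push(conn)     # release the instance back into the pool
      waiting = @pending.shift
      waiting.resume if waiting # resume the next fiber on the pending queue
    end
  end
end
```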
Methods on the `MyFancyConnection` instance should yield the fiber before performing any blocking IO. That returns control to the underlying reactor, which spawns another fiber to process the next request while the previous one is still waiting for the IO response. That new fiber will get its own connection from the pool, or else it will yield until one is available. That behaviour is implemented, for example, in `Mysql2::EM::Client` from em-synchrony, and in a patched version of ruby-mysql.
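As an illustration, a connection method following that pattern could look like the sketch below; `send_query_async` is a hypothetical non-blocking primitive standing in for whatever the driver actually provides:

```ruby
class MyFancyConnection
  def query_me(sql)
    fiber = Fiber.current
    send_query_async(sql) do |result| # hypothetical non-blocking send
      fiber.resume(result)            # the reactor's callback hands the result back
    end
    Fiber.yield # return control to the reactor while the IO is in flight
  end
end
```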
The whole process looks synchronous from the fiber's perspective, because it is: the fiber really blocks (or rather yields) until it gets the result.
```ruby
results = pool.query_me(sql)
puts "I waited for this: #{results}"
```
The magic resides in the fact that other fibers are being processed while this one is waiting.
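As a toy demonstration of that interleaving (no pool or reactor here, just two hand-resumed fibers standing in for two requests):

```ruby
f1 = Fiber.new { puts "f1 got: #{Fiber.yield}" } # yields, as if waiting on IO
f2 = Fiber.new { puts "f2 got: #{Fiber.yield}" }

f1.resume          # f1 starts and parks on its "IO wait"
f2.resume          # f2 runs meanwhile, then parks too
f1.resume('slow')  # the "reactor" delivers f1's result
f2.resume('fast')  # ...and then f2's
```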
## Not thread-safe
`FiberConnectionPool` is not thread-safe. You will not be able to use it from different threads, as eventually it would try to resume a Fiber that resides on a different Thread, raising a `FiberError` ("calling a fiber across threads"). Maybe one day we will add that feature too. Or maybe it's not worth the added code complexity.
We use it with no need for thread-safety: on Goliath servers with one pool on each server instance, and on Reel servers with one pool on each Actor thread. Take a look at the examples folder for details.
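For example, the Celluloid pattern can be as simple as giving each actor its own pool. This is a sketch; `MyHandler` and `handle` are made-up names, see the examples folder for the real thing:

```ruby
require 'celluloid'
require 'fiber_connection_pool'

class MyHandler
  include Celluloid

  def initialize
    # created and used only inside this actor's thread, so no thread-safety needed
    @pool = FiberConnectionPool.new(:size => 5) { MyFancyConnection.new }
  end

  def handle(sql)
    @pool.query_me(sql) # always called from this actor's own thread
  end
end
```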
## Generic
We use it extensively with MySQL connections on Goliath servers by using `Mysql2::EM::Client` from em-synchrony, and on Celluloid by using a patched version of ruby-mysql. Since version 0.2 there is no MySQL-specific code, so the pool can be used with any kind of connection that can be fibered. Take a look at the examples folder to see it used seamlessly with MySQL and MongoDB. You could do the same with CouchDB, etc., or anything you would put in a pool inside a fiber reactor.
## Reacting to connection failure
When a call to a method raises an exception, it is raised as if there were no pool between your code and the connection itself. You can rescue the exception as usual and react as you normally would.
Be aware that the connection instance will remain in the pool, and other fibers will surely use it. If the exception you rescued indicates that the connection should be recreated or otherwise treated, there's a way to access that particular connection:
```ruby
pool = FiberConnectionPool.new(:size => 5){ MyFancyConnection.new }

# state which exceptions will need treatment
pool.treated_exceptions = [ BadQueryMadeMeWorse ]

begin
  pool.bad_query('will make me worse')
rescue BadQueryMadeMeWorse   # rescue and treat only classes on 'treated_exceptions'
  pool.with_failed_connection do |connection|
    puts "Replacing #{connection.inspect} with a new one!"
    MyFancyConnection.new
  end
rescue Exception => ex       # do not treat the rest of the exceptions
  log ex.to_s                # -> 'You have a typo in your sql...'
end
```
The pool saves the connection when it raises an exception on a fiber, and `with_failed_connection` lets you execute a block of code over it. The block must return a connection instance, which will be put into the pool in place of the failed one. It can be the same instance after being fixed, or a new one.
The call to `with_failed_connection` must be made from the very same fiber that raised the exception. The failed connection is kept out of the pool, reserved for treatment, only if the exception is one of those given in `treated_exceptions`. Otherwise `with_failed_connection` will raise `NoReservedConnection`.
Also, the reference to the failed connection is lost after any other method execution from that fiber, so you must call `with_failed_connection` before any other method that may acquire a new instance from the pool.
Any reference to a failed connection is released when the fiber is dead, but since you must access it from the fiber itself anyway, there is nothing to worry about.
## Save data
Sometimes we need to get something more than the return value of the `query_me` call, where that something is related to that call on that connection. For example, maybe you need to call `affected_rows` right after the query was made on that particular connection.
If you make that extra call on the `pool` object, it will acquire a new connection from the pool and run on it. So it's useless.
There is a way to gather all that data from the connection so we can work on it, while also releasing the connection for other fibers to use.
```ruby
# define the pool
pool = FiberConnectionPool.new(:size => 5){ MyFancyConnection.new }

# add a request to save data for each successful call on a connection
# it will save the return value inside a hash on the key ':affected_rows'
# and make it available for the fiber that made the call
pool.save_data(:affected_rows) do |connection, method, args|
  connection.affected_rows
end
```
Then from our fiber:

```ruby
pool.query_me('affecting 5 rows right now')

# recover gathered data for this fiber
puts pool.gathered_data  # => { :affected_rows => 5 }
```
You must access the gathered data from the same fiber that triggered its gathering. Also, any new call to `query_me`, or to any other method on the connection, will execute the block again, overwriting that key in the hash (unless you code to prevent it, of course). Usually you would use the gathered data right after making the query that generated it. But you could:
```ruby
# save only the first run
pool.save_data(:affected_rows) do |connection, method, args|
  pool.gathered_data[:affected_rows] || connection.affected_rows
end
```
You can define as many `save_data` blocks as you want, and run any wonder Ruby lets you. But great power comes with great responsibility. Keep in mind that every registered save-data block is executed for every call made on the pool from that fiber. So keep them stupid simple, and blindly fast, at least as much as you can; otherwise they will hurt performance.
Any `gathered_data` is released when the fiber is dead, but since you must access it from the fiber itself anyway, there is nothing to worry about.
## Manual acquire
Sometimes you may need to execute a sequence of methods on the same instance. Then you should manually acquire a connection from the pool. But you are then entirely responsible for releasing it back into the pool. See this example:
```ruby
def transaction
  @pool.acquire          # reserve one instance for this fiber
  @pool.query 'BEGIN'    # start SQL transaction
  yield                  # perform queries inside the transaction
  @pool.query 'COMMIT'   # confirm it
rescue => ex
  @pool.query 'ROLLBACK' # discard it
  raise ex
ensure
  @pool.release          # always release it back
end

transaction do
  @pool.query 'UPDATE ...'
  @pool.query 'SELECT ...'
end
```
When you call `acquire`, one connection is taken out of the pool and reserved for the exclusive use of the current fiber. Every call you make to the pool from this fiber will use that same connection instance until `release` is called. Then the connection is put back into the pool and made available to the other fibers.
If for some reason `release` is not called, the connection will remain unavailable to the other fibers until the fiber that acquired it dies; then it is returned to the pool. That is a garbage-collecting mechanism, not something to rely on for performance. You should definitely ensure you call `release`.
Notice that when you use `with_failed_connection` you may lose the actual instance. Remember that `with_failed_connection` replaces the failing connection with the return value of the given block; only if you return the same instance will you keep it.
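For instance, to keep the instance you can repair it inside the block and return it; `reconnect` here is a hypothetical repair method on your connection class:

```ruby
pool.with_failed_connection do |connection|
  connection.reconnect # hypothetical repair call on MyFancyConnection
  connection           # return the same instance to keep it in the pool
end
```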
Sometimes even that will not be enough; just look at the `transaction` example. If anything raises before the `COMMIT`, it's not easy to avoid starting the whole transaction all over again, whether you lost the actual instance or not.
## Supported Platforms
Used in production environments on Ruby 1.9.3, 2.0 and 2.1, and tested against those same versions (see details). It should work on any platform that implements fibers; there's no further magic involved.
## More to come!

See issues.