A connection pool is a standard technique used to maintain long-running connections in memory for efficient re-use, as well as to provide management for the total number of connections an application might use simultaneously.
Particularly for server-side web applications, a connection pool is the standard way to maintain a “pool” of active database connections in memory which are reused across requests.
SQLAlchemy includes several connection pool implementations which integrate with the Engine. They can also be used directly for applications that want to add pooling to an otherwise plain DBAPI approach.

The Engine returned by the create_engine() function in most cases has a QueuePool integrated, pre-configured with reasonable pooling defaults. If you’re reading this section only to learn how to enable pooling - congratulations! You’re already done.

The most common QueuePool tuning parameters can be passed directly to create_engine() as keyword arguments: pool_size, max_overflow, pool_recycle and pool_timeout. For example:
engine = create_engine('postgresql://me@localhost/mydb',
                       pool_size=20, max_overflow=0)
In the case of SQLite, the SingletonThreadPool or NullPool are selected by the dialect to provide greater compatibility with SQLite’s threading and locking model, as well as to provide a reasonable default behavior for SQLite “memory” databases, which maintain their entire dataset within the scope of a single connection.
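To confirm which pool a dialect selected, the Engine.pool attribute can be inspected. A minimal sketch; the exact classes chosen are dialect- and version-dependent:

from sqlalchemy import create_engine

file_engine = create_engine("sqlite:///file.db")
memory_engine = create_engine("sqlite://")

# the pool implementation the dialect selected for each URL
print(type(file_engine.pool))    # e.g. NullPool for file-based SQLite
print(type(memory_engine.pool))  # e.g. SingletonThreadPool for ":memory:"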
All SQLAlchemy pool implementations have in common that none of them “pre create” connections - all implementations wait until first use before creating a connection. At that point, if no additional concurrent checkout requests for more connections are made, no additional connections are created. This is why it’s perfectly fine for create_engine() to default to using a QueuePool of size five without regard to whether or not the application really needs five connections queued up - the pool would only grow to that size if the application actually used five connections concurrently, in which case the usage of a small pool is an entirely appropriate default behavior.
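This lazy behavior can be observed with a "connect" event listener. In this sketch the database URL is hypothetical, and the listener simply reports when a real DBAPI connection is made:

from sqlalchemy import create_engine, event

engine = create_engine('postgresql://me@localhost/mydb', pool_size=5)

@event.listens_for(engine, "connect")
def count_connect(dbapi_connection, connection_record):
    print("new DBAPI connection created")

conn = engine.connect()   # first checkout triggers the only connect
conn.close()              # connection returned to the pool
conn = engine.connect()   # reuses the pooled connection; no new message
conn.close()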
The usual way to use a different kind of pool with create_engine() is to use the poolclass argument. This argument accepts a class imported from the sqlalchemy.pool module, and handles the details of building the pool for you. Common options include specifying QueuePool with SQLite:
from sqlalchemy.pool import QueuePool
engine = create_engine('sqlite:///file.db', poolclass=QueuePool)
Disabling pooling using NullPool:
from sqlalchemy.pool import NullPool
engine = create_engine(
    'postgresql+psycopg2://scott:tiger@localhost/test',
    poolclass=NullPool)
All Pool classes accept an argument creator which is a callable that creates a new connection. create_engine() accepts this function to pass onto the pool via an argument of the same name:
import sqlalchemy.pool as pool
import psycopg2
def getconn():
    c = psycopg2.connect(user='ed', host='127.0.0.1', dbname='test')
    # do things with 'c' to set up
    return c
engine = create_engine('postgresql+psycopg2://', creator=getconn)
For most “initialize on connection” routines, it’s more convenient to use the PoolEvents event hooks, so that the usual URL argument to create_engine() is still usable. creator is there as a last resort for when a DBAPI has some form of connect that is not at all supported by SQLAlchemy.
To use a Pool by itself, the creator function is the only argument that’s required and is passed first, followed by any additional options:
import sqlalchemy.pool as pool
import psycopg2
def getconn():
    c = psycopg2.connect(user='ed', host='127.0.0.1', dbname='test')
    return c
mypool = pool.QueuePool(getconn, max_overflow=10, pool_size=5)
DBAPI connections can then be procured from the pool using the Pool.connect() function. The return value of this method is a DBAPI connection that’s contained within a transparent proxy:
# get a connection
conn = mypool.connect()
# use it
cursor = conn.cursor()
cursor.execute("select foo")
The purpose of the transparent proxy is to intercept the close() call, such that instead of the DBAPI connection being closed, it is returned to the pool:
# "close" the connection. Returns
# it to the pool.
conn.close()
The proxy also returns its contained DBAPI connection to the pool when it is garbage collected, though it is not deterministic in Python that this occurs immediately (it is, however, typical with CPython).
The close() step also performs the important step of calling the rollback() method of the DBAPI connection. This is so that any existing transaction on the connection is removed, not only ensuring that no existing state remains on next usage, but also so that table and row locks are released and any isolated data snapshots are removed. This behavior can be disabled using the reset_on_return option of Pool.
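If the application manages transactional state itself, this reset step can be turned off when building a pool directly. A small sketch, reusing the getconn() creator from the earlier example:

import sqlalchemy.pool as pool

# reset_on_return=False disables the rollback-on-checkin step; the
# application then owns transactional state on each connection
mypool = pool.QueuePool(getconn, reset_on_return=False)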
A particular pre-created Pool can be shared with one or more engines by passing it to the pool argument of create_engine():
e = create_engine('postgresql://', pool=mypool)
Connection pools support an event interface that allows hooks to execute upon first connect, upon each new connection, and upon checkout and checkin of connections. See PoolEvents for details.
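A brief sketch of these hooks, registered against an Engine (which associates them with the engine’s underlying pool); the SQLite URL here is just for illustration:

from sqlalchemy import create_engine, event

engine = create_engine("sqlite://")

@event.listens_for(engine, "connect")
def on_connect(dbapi_connection, connection_record):
    print("new DBAPI connection")

@event.listens_for(engine, "checkout")
def on_checkout(dbapi_connection, connection_record, connection_proxy):
    print("connection checked out from the pool")

@event.listens_for(engine, "checkin")
def on_checkin(dbapi_connection, connection_record):
    print("connection returned to the pool")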
The connection pool has the ability to refresh individual connections as well as its entire set of connections, marking the previously pooled connections as “invalid”. A common use case is to allow the connection pool to gracefully recover when the database server has been restarted, and all previously established connections are no longer functional. There are two approaches to this.
The pessimistic approach refers to emitting a test statement on the SQL connection at the start of each connection pool checkout, to test that the database connection is still viable. Typically, this is a simple statement like “SELECT 1”, but may also make use of some DBAPI-specific method to test the connection for liveness.
This approach adds a small bit of overhead to the connection checkout process, but is otherwise the simplest and most reliable way to completely eliminate database errors due to stale pooled connections. The calling application does not need to be concerned about organizing operations to be able to recover from stale connections checked out from the pool.
It is critical to note that the pre-ping approach does not accommodate for connections dropped in the middle of transactions or other SQL operations. If the database becomes unavailable while a transaction is in progress, the transaction will be lost and the database error will be raised. While the Connection object will detect a “disconnect” situation and recycle the connection as well as invalidate the rest of the connection pool when this condition occurs, the individual operation where the exception was raised will be lost, and it’s up to the application to either abandon the operation, or retry the whole transaction again.
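A hedged sketch of what such an application-level retry might look like; the engine, table name, and single-retry policy here are illustrative assumptions, not SQLAlchemy API:

from sqlalchemy import exc

for attempt in range(2):
    try:
        with engine.begin() as conn:
            conn.execute("UPDATE account SET balance = balance - 1")
        break
    except exc.DBAPIError as err:
        if err.connection_invalidated and attempt == 0:
            # the pool was refreshed on disconnect; retry the
            # whole transaction once
            continue
        raise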
Pessimistic testing of connections upon checkout is achievable by using the Pool.pre_ping argument, available from create_engine() via the create_engine.pool_pre_ping argument:
engine = create_engine("mysql+pymysql://user:pw@host/db", pool_pre_ping=True)
The “pre ping” feature will normally emit SQL equivalent to “SELECT 1” each time a connection is checked out from the pool; if an error is raised that is detected as a “disconnect” situation, the connection will be immediately recycled, and all other pooled connections older than the current time are invalidated, so that the next time they are checked out, they will also be recycled before use.
If the database is still not available when “pre ping” runs, then the initial connect will fail and the error for failure to connect will be propagated normally. In the uncommon situation that the database is available for connections, but is not able to respond to a “ping”, the “pre_ping” will try up to three times before giving up, propagating the database error last received.
Note
The “SELECT 1” emitted by “pre-ping” is invoked within the scope of the connection pool / dialect, using a very short codepath for minimal Python latency. As such, this statement is not logged in the SQL echo output, and will not show up in SQLAlchemy’s engine logging.
New in version 1.2: Added “pre-ping” capability to the Pool class.
Before create_engine.pool_pre_ping was added, the “pre-ping” approach historically was performed manually using the ConnectionEvents.engine_connect() engine event. The most common recipe for this is below, for reference purposes in case an application is already using such a recipe, or special behaviors are needed:
from sqlalchemy import exc
from sqlalchemy import event
from sqlalchemy import select
some_engine = create_engine(...)
@event.listens_for(some_engine, "engine_connect")
def ping_connection(connection, branch):
    if branch:
        # "branch" refers to a sub-connection of a connection,
        # we don't want to bother pinging on these.
        return

    # turn off "close with result".  This flag is only used with
    # "connectionless" execution, otherwise will be False in any case
    save_should_close_with_result = connection.should_close_with_result
    connection.should_close_with_result = False

    try:
        # run a SELECT 1.  use a core select() so that
        # the SELECT of a scalar value without a table is
        # appropriately formatted for the backend
        connection.scalar(select([1]))
    except exc.DBAPIError as err:
        # catch SQLAlchemy's DBAPIError, which is a wrapper
        # for the DBAPI's exception.  It includes a .connection_invalidated
        # attribute which specifies if this connection is a "disconnect"
        # condition, which is based on inspection of the original exception
        # by the dialect in use.
        if err.connection_invalidated:
            # run the same SELECT again - the connection will re-validate
            # itself and establish a new connection.  The disconnect detection
            # here also causes the whole connection pool to be invalidated
            # so that all stale connections are discarded.
            connection.scalar(select([1]))
        else:
            raise
    finally:
        # restore "close with result"
        connection.should_close_with_result = save_should_close_with_result
The above recipe has the advantage that we are making use of SQLAlchemy’s facilities for detecting those DBAPI exceptions that are known to indicate a “disconnect” situation, as well as the Engine object’s ability to correctly invalidate the current connection pool when this condition occurs, allowing the current Connection to re-validate onto a new DBAPI connection.
When pessimistic handling is not employed, as well as when the database is shutdown and/or restarted in the middle of a connection’s period of use within a transaction, the other approach to dealing with stale / closed connections is to let SQLAlchemy handle disconnects as they occur, at which point all connections in the pool are invalidated, meaning they are assumed to be stale and will be refreshed upon next checkout. This behavior assumes the Pool is used in conjunction with an Engine. The Engine has logic which can detect disconnection events and refresh the pool automatically.
When the Connection attempts to use a DBAPI connection, and an exception is raised that corresponds to a “disconnect” event, the connection is invalidated. The Connection then calls the Pool.recreate() method, effectively invalidating all connections not currently checked out so that they are replaced with new ones upon next checkout. This flow is illustrated by the code example below:
from sqlalchemy import create_engine, exc

e = create_engine(...)
c = e.connect()

try:
    # suppose the database has been restarted.
    c.execute("SELECT * FROM table")
    c.close()
except exc.DBAPIError as err:
    # an exception is raised, Connection is invalidated.
    if err.connection_invalidated:
        print("Connection was invalidated!")

# after the invalidate event, a new connection
# starts with a new Pool
c = e.connect()
c.execute("SELECT * FROM table")
The above example illustrates that no special intervention is needed to refresh the pool, which continues normally after a disconnection event is detected. However, one database exception is raised per connection that is in use when the database unavailability event occurs. In a typical web application using an ORM Session, the above condition would correspond to a single request failing with a 500 error, with the web application continuing normally beyond that. Hence the approach is “optimistic” in that frequent database restarts are not anticipated.
An additional setting that can augment the “optimistic” approach is to set the pool recycle parameter. This parameter prevents the pool from using a particular connection that has passed a certain age, and is appropriate for database backends such as MySQL that automatically close connections that have been stale after a particular period of time:
from sqlalchemy import create_engine
e = create_engine("mysql://scott:tiger@localhost/test", pool_recycle=3600)
Above, any DBAPI connection that has been open for more than one hour will be invalidated and replaced upon next checkout. Note that the invalidation only occurs during checkout - not on any connections that are held in a checked out state. pool_recycle is a function of the Pool itself, independent of whether or not an Engine is in use.
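For example, a sketch of the same one-hour recycle applied to a directly-constructed Pool, reusing the getconn() creator from the earlier examples:

import sqlalchemy.pool as pool

# connections older than one hour are replaced at checkout time
mypool = pool.QueuePool(getconn, recycle=3600)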
The Pool provides “connection invalidation” services which allow both explicit invalidation of a connection as well as automatic invalidation in response to conditions that are determined to render a connection unusable. “Invalidation” means that a particular DBAPI connection is removed from the pool and discarded. The .close() method is called on this connection unless it is known that the connection is already closed; if this method fails, the exception is logged but the operation still proceeds.
When using an Engine, the Connection.invalidate() method is the usual entrypoint to explicit invalidation (a brief sketch follows the list below). Other conditions by which a DBAPI connection might be invalidated include:

- A DBAPI exception such as OperationalError, raised when a method like connection.execute() is called, is detected as indicating a so-called “disconnect” condition. As the Python DBAPI provides no standard system for determining the nature of an exception, all SQLAlchemy dialects include a system called is_disconnect() which will examine the contents of an exception object, including the string message and any potential error codes included with it, in order to determine if this exception indicates that the connection is no longer usable. If this is the case, the _ConnectionFairy.invalidate() method is called and the DBAPI connection is then discarded.
- When the connection is returned to the pool, calling the connection.rollback() or connection.commit() method, as dictated by the pool’s “reset on return” behavior, throws an exception. A final attempt at calling .close() on the connection will be made, and it is then discarded.
- A listener implementing PoolEvents.checkout() raises the DisconnectionError exception, indicating that the connection won’t be usable and a new connection attempt needs to be made.

All invalidations which occur will invoke the PoolEvents.invalidate() event.
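A minimal sketch of the explicit form, assuming an engine created as in the earlier examples:

conn = engine.connect()

# the underlying DBAPI connection is closed and discarded; the pool
# will produce a replacement upon next use
conn.invalidate()
conn.close()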
The QueuePool class features a flag called QueuePool.use_lifo, which can also be accessed from create_engine() via the flag create_engine.pool_use_lifo. Setting this flag to True causes the pool’s “queue” behavior to instead be that of a “stack”, e.g. the last connection to be returned to the pool is the first one to be used on the next request. In contrast to the pool’s long-standing behavior of first-in-first-out, which produces a round-robin effect of using each connection in the pool in series, lifo mode allows excess connections to remain idle in the pool, allowing server-side timeout schemes to close these connections out. The difference between FIFO and LIFO is basically whether or not it’s desirable for the pool to keep a full set of connections ready to go even during idle periods:
engine = create_engine(
    "postgresql://", pool_use_lifo=True, pool_pre_ping=True)
Above, we also make use of the create_engine.pool_pre_ping flag so that connections which are closed from the server side are gracefully handled by the connection pool and replaced with a new connection. Note that the flag only applies to QueuePool use.
New in version 1.3.
It’s critical that, when using a connection pool, and by extension when using an Engine created via create_engine(), the pooled connections are not shared to a forked process. TCP connections are represented as file descriptors, which usually work across process boundaries, meaning this will cause concurrent access to the file descriptor on behalf of two or more entirely independent Python interpreter states. There are two approaches to dealing with this.
The first is, either create a new Engine within the child process, or upon an existing Engine, call Engine.dispose() before the child process uses any connections. This will remove all existing connections from the pool so that it makes all new ones. Below is a simple version using multiprocessing.Process, but this idea should be adapted to the style of forking in use:
from multiprocessing import Process

engine = create_engine("...")

def run_in_process():
    # discard the parent process's pooled connections before use
    engine.dispose()

    with engine.connect() as conn:
        conn.execute("...")

p = Process(target=run_in_process)
p.start()
The next approach is to instrument the Pool itself with events so that connections are automatically invalidated in the subprocess. This is a little more magical but probably more foolproof:
from sqlalchemy import event
from sqlalchemy import exc
import os
engine = create_engine("...")
@event.listens_for(engine, "connect")
def connect(dbapi_connection, connection_record):
    connection_record.info['pid'] = os.getpid()

@event.listens_for(engine, "checkout")
def checkout(dbapi_connection, connection_record, connection_proxy):
    pid = os.getpid()
    if connection_record.info['pid'] != pid:
        connection_record.connection = connection_proxy.connection = None
        raise exc.DisconnectionError(
            "Connection record belongs to pid %s, "
            "attempting to check out in pid %s" %
            (connection_record.info['pid'], pid)
        )
Above, we use an approach similar to that described in Disconnect Handling - Pessimistic to treat a DBAPI connection that originated in a different parent process as an “invalid” connection, coercing the pool to recycle the connection record to make a new connection.
class sqlalchemy.pool.Pool(creator, recycle=-1, echo=None, use_threadlocal=False, logging_name=None, reset_on_return=True, listeners=None, events=None, dialect=None, pre_ping=False, _dispatch=None)

Bases: sqlalchemy.log.Identified
Abstract base class for connection pools.
__init__(creator, recycle=-1, echo=None, use_threadlocal=False, logging_name=None, reset_on_return=True, listeners=None, events=None, dialect=None, pre_ping=False, _dispatch=None)

Construct a Pool.
connect()

Return a DBAPI connection from the pool.

The connection is instrumented such that when its close() method is called, the connection will be returned to the pool.
dispose()

Dispose of this pool.

This method leaves the possibility of checked-out connections remaining open, as it only affects connections that are idle in the pool.
recreate()

Return a new Pool, of the same class as this one and configured with identical creation arguments.

This method is used in conjunction with dispose() to close out an entire Pool and create a new one in its place.
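A sketch of that pattern at the Pool level, reusing the mypool object from the earlier examples; Engine.dispose() performs an equivalent replacement internally:

# build a fresh pool configured identically, then close out the
# idle connections held by the old one
new_pool = mypool.recreate()
mypool.dispose()
mypool = new_pool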
unique_connection()

Produce a DBAPI connection that is not referenced by any thread-local context.

This method is equivalent to Pool.connect() when the Pool.use_threadlocal flag is not set to True. When Pool.use_threadlocal is True, the Pool.unique_connection() method provides a means of bypassing the threadlocal context.
class sqlalchemy.pool.QueuePool(creator, pool_size=5, max_overflow=10, timeout=30, use_lifo=False, **kw)

Bases: sqlalchemy.pool.base.Pool

A Pool that imposes a limit on the number of open connections.

QueuePool is the default pooling implementation used for all Engine objects, unless the SQLite dialect is in use.
__init__(creator, pool_size=5, max_overflow=10, timeout=30, use_lifo=False, **kw)

Construct a QueuePool.
connect()

Inherited from the connect() method of Pool.

Return a DBAPI connection from the pool.

The connection is instrumented such that when its close() method is called, the connection will be returned to the pool.
unique_connection()

Inherited from the unique_connection() method of Pool.

Produce a DBAPI connection that is not referenced by any thread-local context.

This method is equivalent to Pool.connect() when the Pool.use_threadlocal flag is not set to True. When Pool.use_threadlocal is True, the Pool.unique_connection() method provides a means of bypassing the threadlocal context.
class sqlalchemy.pool.SingletonThreadPool(creator, pool_size=5, **kw)

Bases: sqlalchemy.pool.base.Pool
A Pool that maintains one connection per thread.
Maintains one connection per each thread, never moving a connection to a thread other than the one which it was created in.
Warning

SingletonThreadPool will call .close() on arbitrary connections that exist beyond the size setting of pool_size, e.g. if more unique thread identities than what pool_size states are used. This cleanup is non-deterministic and not sensitive to whether or not the connections linked to those thread identities are currently in use.

SingletonThreadPool may be improved in a future release, however in its current status it is generally used only for test scenarios using a SQLite :memory: database and is not recommended for production use.
Options are the same as those of Pool, as well as:

Parameters: pool_size – The number of threads in which to maintain connections at once. Defaults to five.
SingletonThreadPool is used by the SQLite dialect automatically when a memory-based database is used. See SQLite.
__init__(creator, pool_size=5, **kw)

Construct a Pool.
class sqlalchemy.pool.AssertionPool(*args, **kw)

Bases: sqlalchemy.pool.base.Pool

A Pool that allows at most one checked out connection at any given time.
This will raise an exception if more than one connection is checked out at a time. Useful for debugging code that is using more connections than desired.
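A short sketch of the failure mode this class is meant to surface; the second checkout raises while the first connection remains checked out:

from sqlalchemy import create_engine
from sqlalchemy.pool import AssertionPool

engine = create_engine("sqlite://", poolclass=AssertionPool)

c1 = engine.raw_connection()   # ok
c2 = engine.raw_connection()   # raises - c1 is still checked out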
class sqlalchemy.pool.NullPool(creator, recycle=-1, echo=None, use_threadlocal=False, logging_name=None, reset_on_return=True, listeners=None, events=None, dialect=None, pre_ping=False, _dispatch=None)

Bases: sqlalchemy.pool.base.Pool
A Pool which does not pool connections.
Instead it literally opens and closes the underlying DB-API connection per each connection open/close.
Reconnect-related functions such as recycle and connection invalidation are not supported by this Pool implementation, since no connections are held persistently.
class sqlalchemy.pool.StaticPool(creator, recycle=-1, echo=None, use_threadlocal=False, logging_name=None, reset_on_return=True, listeners=None, events=None, dialect=None, pre_ping=False, _dispatch=None)

Bases: sqlalchemy.pool.base.Pool
A Pool of exactly one connection, used for all requests.
Reconnect-related functions such as recycle and connection invalidation (which is also used to support auto-reconnect) are not currently supported by this Pool implementation but may be implemented in a future release.
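A sketch of a common StaticPool use: sharing a single SQLite “:memory:” database across threads. The check_same_thread flag shown here is a pysqlite driver argument, not a pool option:

from sqlalchemy import create_engine
from sqlalchemy.pool import StaticPool

engine = create_engine(
    "sqlite://",
    connect_args={"check_same_thread": False},
    poolclass=StaticPool)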
class sqlalchemy.pool._ConnectionFairy(dbapi_connection, connection_record, echo)

Proxies a DBAPI connection and provides return-on-dereference support.

This is an internal object used by the Pool implementation to provide context management to a DBAPI connection delivered by that Pool.
The name “fairy” is inspired by the fact that the _ConnectionFairy object’s lifespan is transitory, as it lasts only for the length of a specific DBAPI connection being checked out from the pool, and additionally that as a transparent proxy, it is mostly invisible.
_connection_record = None

A reference to the _ConnectionRecord object associated with the DBAPI connection.

This is currently an internal accessor which is subject to change.
connection = None

A reference to the actual DBAPI connection being tracked.
cursor(*args, **kwargs)

Return a new DBAPI cursor for the underlying connection.

This method is a proxy for the connection.cursor() DBAPI method.
detach()

Separate this connection from its Pool.
This means that the connection will no longer be returned to the pool when closed, and will instead be literally closed. The containing ConnectionRecord is separated from the DB-API connection, and will create a new connection when next used.
Note that any overall connection limiting constraints imposed by a Pool implementation may be violated after a detach, as the detached connection is removed from the pool’s knowledge and control.
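A brief sketch of detach in use, via the raw connection checkout available from an Engine:

conn = engine.raw_connection()
conn.detach()   # the DBAPI connection is now the application's to manage
conn.close()    # really closes; nothing is returned to the pool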
info

Info dictionary associated with the underlying DBAPI connection referred to by this _ConnectionFairy, allowing user-defined data to be associated with the connection.

The data here will follow along with the DBAPI connection including after it is returned to the connection pool and used again in subsequent instances of _ConnectionFairy. It is shared with the _ConnectionRecord.info and Connection.info accessors.
The dictionary associated with a particular DBAPI connection is discarded when the connection itself is discarded.
invalidate(e=None, soft=False)

Mark this connection as invalidated.

This method can be called directly, and is also called as a result of the Connection.invalidate() method. When invoked, the DBAPI connection is immediately closed and discarded from further use by the pool. The invalidation mechanism proceeds via the _ConnectionRecord.invalidate() internal method.
is_valid

Return True if this _ConnectionFairy still refers to an active DBAPI connection.
record_info

Info dictionary associated with the _ConnectionRecord container referred to by this _ConnectionFairy.

Unlike the _ConnectionFairy.info dictionary, the lifespan of this dictionary is persistent across connections that are disconnected and/or invalidated within the lifespan of a _ConnectionRecord.
New in version 1.1.
class sqlalchemy.pool._ConnectionRecord(pool, connect=True)

Internal object which maintains an individual DBAPI connection referenced by a Pool.

The _ConnectionRecord object always exists for any particular DBAPI connection whether or not that DBAPI connection has been “checked out”. This is in contrast to the _ConnectionFairy which is only a public facade to the DBAPI connection while it is checked out.
A _ConnectionRecord may exist for a span longer than that of a single DBAPI connection. For example, if the _ConnectionRecord.invalidate() method is called, the DBAPI connection associated with this _ConnectionRecord will be discarded, but the _ConnectionRecord may be used again, in which case a new DBAPI connection is produced when the Pool next uses this record.
The _ConnectionRecord is delivered along with connection pool events, including PoolEvents.connect() and PoolEvents.checkout(), however _ConnectionRecord still remains an internal object whose API and internals may change.
connection = None

A reference to the actual DBAPI connection being tracked.

May be None if this _ConnectionRecord has been marked as invalidated; a new DBAPI connection may replace it if the owning pool calls upon this _ConnectionRecord to reconnect.
info

The .info dictionary associated with the DBAPI connection.

This dictionary is shared among the _ConnectionFairy.info and Connection.info accessors.
Note
The lifespan of this dictionary is linked to the DBAPI connection itself, meaning that it is discarded each time the DBAPI connection is closed and/or invalidated. The _ConnectionRecord.record_info dictionary remains persistent throughout the lifespan of the _ConnectionRecord container.
invalidate(e=None, soft=False)

Invalidate the DBAPI connection held by this _ConnectionRecord.

This method is called for all connection invalidations, including when the _ConnectionFairy.invalidate() or Connection.invalidate() methods are called, as well as when any so-called “automatic invalidation” condition occurs.
record_info

An “info” dictionary associated with the connection record itself.

Unlike the _ConnectionRecord.info dictionary, which is linked to the lifespan of the DBAPI connection, this dictionary is linked to the lifespan of the _ConnectionRecord container itself and will remain persistent throughout the life of the _ConnectionRecord.
New in version 1.1.