10.2. Pooling Connections with DBCP
Database operations are typically pretty fast, with the notable exception of establishing the initial connection. This is because there are many steps to establishing such a connection, including setting up the raw network connection, sending and verifying user authentication information, and so on.In-process hsqldb usage avoids this overhead by not using a connection at all, but at the cost of restricting the database to a single process. For databases that must be used simultaneously by a number of people or programs, a server is necessary, and hence so is the need to make the connection. For desktop applications this penalty is not too severe because it is paid only once when the application first starts up. However, for Web applications a connection to the database must often be opened for every request, and the time needed to do so may noticeably impact the responsiveness of the site to the end user.The solution to this problem is a technique called pooling. The idea of pooling predates JDBC and even the Web; it is applicable whenever a program will need many instances of some expensive resource. The general idea is to allocate as many instances of the resource as are likely to be needed early in the program's life cycle. These instances are placed in a pool. When one instance is needed, it is removed from the pool, and when the instance has served its purpose, it is returned to the pool, ready for the next use.In the case of databases, the issue of establishing a connection can be reconsidered as the time needed to obtain a Connection object from the DriverManager. A number of Connections could therefore be obtained early and handed out as needed.To illustrate these concepts further, Listing 10.4 demonstrates a simple pooling class.
Listing 10.4. A simple connection pool
[1]or the call can hang until an instance is available, or it can fail and throw a NoSuchElementException.A pool can also be told to evict objects that have been idle for a specified period of time. There are also parameters that control the minimum time after which an object may be eligible for eviction and the maximum number of idle objects that are allowed to be in the pool before some are considered for eviction.The pool can also test objects to ensure that they are valid and can forcibly evict any objects that have "gone bad." There are many reasons why a connection may go bad; for example, some database servers may shut down connections due to inactivity. Pool can run these validity tests when an object is handed out, when an object is returned to the pool, or periodically.Fine-tuning all these parameters may take some effort and experimentation, and the correct values will depend heavily on the details of the application. In a typical Web application each use of the database is likely to be short-lived; often just a few queries are needed to render a page. In this case the number of objects in the pool will closely parallel the number of simultaneous users of the site. Some allowance must be made for peak times when the number of users may jump significantly. Ideally users should never have to wait for a response, which would seem to imply that the pool should be allowed to grow as needed. However, after a certain point this would slow down performance for everyone, so a better solution may be to allow the pool to fail, resulting in a notice to the user that the site is busy and he or she should come back later. The issue of turning away users beyond a certain threshold was also addressed in the "How to Test" section in Chapter 7, and indeed JMeter could be used to ensure that Pool is behaving properly with an application.If a single application server is accessing a single database, there is no great rush to release idle connections, although they should be released eventually. However, if there are multiple instances of the application running simultaneously, and they all access a central database, then idle connections should be released sooner so the other application servers can use them if needed.Again, getting this all correct may take time. A tool like JMeter can be a great asset here by making it easy to test the performance implications of various pool configurations.There are constructors available that set various combinations of parameters, or they can be set (and changed) through accessors after the pool is constructed. The only parameter that the pool absolutely requires is the factory; one is usually provided in the constructor. In the following examples the factory is provided to the pool in a later step, so building the pool is as simple as
package com.awl.toolbook.chapter10;
import java.sql.*;
public class SimplePool {
private static int NUM_CONNECTIONS=20;
private Connection connections[];
private boolean inUse[];
public SimplePool(String dbclass,String url) {
connections = new Connection[NUM_CONNECTIONS];
inUse = new boolean[NUM_CONNECTIONS];
try {
Class.forName(dbclass).newInstance();
for (int i=0;i<NUM_CONNECTIONS;i++) {
connections[i] =
DriverManager.getConnection(url);
inUse[i] = false;
}
} catch (Exception e) {
System.err.println("Unable to build pool: " +
e);
}
}
public synchronized Connection getConnection() {
for (int i=0;i<NUM_CONNECTIONS;i++) {
if(!inUse[i]) {
inUse[i] = true;
return connections[i];
}
}
return null;
}
public synchronized void releaseConnection(
Connection c)
{
for (int i=0;i<NUM_CONNECTIONS;i++) {
if(connections[i].equals(c)) {
inUse[i] = false;
return;
}
}
}
}
ObjectPool thePool = new GenericObjectPool();The next step is to create the factory. This proceeds in two parts. The first is to create a factory that will return a new Connection or DataSource, which can be done with
or new DataSourceConnectionFactory() with the same arguments. Here the factory is told to use a server instance of hsqld because it makes no sense to pool connections to the in-process version.connFactory could now be used to obtain new connections, essentially taking over DriverManager's job
ConnectionFactory connFactory =
new DriverManagerConnectionFactory("jdbc:hsqldb:....",
"sa",
");
Connection c = connFactory.createConnection();However, there is no reason to make such a call directly as this would provide none of the needed pooling functionality. Therefore, the second step of creating the factory is to build a poolable factory from the connection factory.
Statement s = c.createStatement(...);
...
c.close();
PoolableConnectionFactory poolFactory =The first and second arguments to the constructor are the base connection factory and the pool, respectively. The third argument is another factory that may be used to pool prepared statements. This will be discussed in more detail shortly. The fourth argument is a string representing a validation query. If this is not null, it will enable the testing mechanism of the underlying pool. The final two arguments are booleans indicating whether connections should be read-only and whether changes should be committed automatically. Auto commit should be set to false if an application will be doing any sort of complex transaction processing.Recall that thePool needs a factory, which was not provided when the pool was initially constructed. This is handled by the poolFactory constructor, which will register the factory with thePool by calling thePool.setFactory(this). Developers uncomfortable with this kind of side effect can explicitly call this method, although beyond clarity there is no real motivation to do so.This factory could in principle be invoked directly by calling makeObject(), which returns a poolable object that wraps the underlying connection. However, this is not the intended usage, and for all practical purposes the factory should be managed only by the pool.The final step is to register the pool and associate it with a name. This is done through a class called PoolingDriver, which acts as an intermediary between the pool API and JDBC.
new PoolableConnectionFactory(
connFactory,
thePool,
null,
null,
false,
true);
PoolingDriver driver = new PoolingDriver();This will register the URL "jdbc:apache:commons:dbcp:toolbook" with JDBC. Subsequently when the DriverManager is asked for a connection with this name, it will come from the pool. The pool may fulfill this request by returning an idle connection if one is available or going to the factory if one is not. It is now almost trivially easy to use a connection from the pool:
driver.registerPool("toolbook",thePool);
It was stated previously that the creation of the pool could be done through a configuration file, in addition to programmatically. Before examining this configuration file, note that most of the programmatic steps could be done in a single statement, although this statement would be horribly long and complex. Most of the constructed objects are used only as arguments in a subsequent constructors, so it would be possible to do without a lot of the intermediate variables. For example, rather than pass connFactory as the second argument when constructing the PoolableConnectionFactory it would have been possible to use the call to the constructor.Once the code had been reduced to a series of nested constructors, introspection could be used to map an XML expression to the arguments. This is the idea behind Pool's configuration files, which describe an object tree as an XML tree giving constructor information at each node. The XML document that exactly recreates the previous pool built up programmatically is shown in Listing 10.5.
Connection conn =
DriverManager.getConnection(
"jdbc:apache:commons:dbcp:toolbook");
Statement s = conn.createStatement("select * from artist");
... execute the statement, get result set, etc ...
conn.close();
Listing 10.5. The configuration file
Note that most of the objects encountered when building the pool by hand are present in Listing 10.5. Note, too, that the nested-constructor approach Pool uses for its configuration explains the slightly odd behavior of the PoolableConnection Factory with respect to taking the pool as an argument rather than doing the more natural thing and requiring that the factory be constructed first and then passed to the pool's constructor.Assuming this file is installed in one of the directories in the CLASSPATH as "config/toolbook.jocl," a database connection could be obtained with
<object
xmlns="http://apache.org/xml/xmlns/jakarta/commons/jocl">
<!-- The ConnectionFactory -->
<object class=
"org.apache.commons.dbcp.DriverManagerConnectionFactory">
<!-- The raw JDBC url for each connection -->
<string value="jdbc:some:connect:string"/>
<!-- The username -->
<string value="sa"/>
<!-- The password -->
<string value="/>
</object>
<!-- Second arg for PoolableConnectionFactory: -->
<!-- the ObjectPool -->
<object
>
<!-- The ObjectFactory: none -->
<object
null="true"/>
</object>
<!-- end of the second arg for PoolableConnectionFactory -->
<!-- Third argument for PoolableConnectionFactory: -->
<!-- the KeyedObjectPoolFactory -->
<object
null="true"/>
</object>
<!-- Fourth argument for PoolableConnectionFactory: -->
<!-- Query to use for validation -->
<string null="true"/>
<!-- Fifth argument for PoolableConnectionFactory: -->
<!-- default read only -->
<boolean value="false"/>
<!-- Sixth argument for PoolableConnectionFactory: -->
<!-- default auto commit -->
<boolean value="true"/>
</object>
Finally, it was mentioned that statements could be cached along with connections. This may be worth doing depending on how the database client code handles statements and how statements are constructed by the application. If the database client code simply transmits queries in plain text to the backend, and the application only sends static queries, then there is nothing to be gained by pooling statements. On the other hand, if the client code does heavy parsing and preprocessing of queries, which will often be the case when using PreparedStatements, then pooling can be a significant time saver. Pooling statements is clearly different from pooling connections because all connections are identical, but when a particular statement is needed, no other will do. Therefore, a different kind of pool is needed.Statements can be pooled easily, by passing
Connection conn =
DriverManager.getConnection(
"jdbc:apache:commons:dbcp:/config/toolbook.jocl");
as the third argument in the constructor for PoolableConnectionFactory; everything else is handled automatically. The "keyed" in the name refers to the fact that each entry in the pool is identified by a key, which is what allows the system to retrieve a particular statement from the pool when it is needed.
new StackKeyedObjectPoolFactory()