Do you believe in connection pooling?
David hayden's post about explicitly opening SqlConnection objects when doing multiple SqlDataAdapter.Fill calls didn't sit well with me. The premise is that if you are going to make multiple calls to Fill(), you should open the connection first yourself, to avoid Fill() opening and closing the connection for each call. I'll quote his conclusion:
"...in the case of multiple back-to-back Fill requests to the DataAdapter, it is more performant to explicitly open the connection in the beginning so that each Fill request on the DataAdapter does not open and close the database connection, resulting in the database connection being opened and closed several times."
The reason it doesn't sit well with me, is that I believe in connection pooling. I don't know all of the gory details of how it works under the hood, but I believe it exists. I believe that when I call SqlConnection.Open(), I am not necessarily creating a new network connection to the database - I may just be re-using an existing connection that is sitting idle in the pool.
But he was quoting Sahil's Pro ADO.NET 2.0, and I have no doubt that Sahil knows more about ADO and SQL Server than I do, so I started to doubt my belief. Maybe connection pooling really DOESN'T exist! Time to get some cold hard facts, (in the form of completely un-scientific homebrew benchmarks).
The test harness:
static void Main(string[] args)
{
System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
sw.Start();
for (int i = 0; i < 1000; i++)
{
ExplicitOpen();
//ImplicitOpen();
}
sw.Stop();
Console.WriteLine(sw.Elapsed);
}
Very simple. Run one of the methods a thousand times, and report the cumulative elapsed time. Comment it out, un-comment the other one, and re-run.
The code for the two different methods are almost exactly the same. I only made very minor changes to the code from David's post, to account for the fact that I didn't have the pubs database handy (I used the IssueVision database from the DevDays 2004 sample app).
Here is the code for ExplicitOpen():
static void ExplicitOpen()
{
using (SqlConnection connection = new SqlConnection("data source=(local);initial catalog=issuevision;integrated security=true;"))
{
SqlCommand sqlCat = connection.CreateCommand();
sqlCat.CommandText = "Select * from issues";
SqlCommand sqlProd = connection.CreateCommand();
sqlProd.CommandText = "Select * from staffers";
SqlDataAdapter adapter = new SqlDataAdapter();
adapter.SelectCommand = sqlCat;
DataSet dataset = new DataSet();
connection.Open();
adapter.Fill(dataset, "Issues");
adapter.SelectCommand = sqlProd;
adapter.Fill(dataset, "Staffers");
connection.Close();
}
}
ImplicitOpen() is exactly the same, only the connection.Open() and connection.Close() lines are removed.
Running ExplicitOpen() 1000 times took an average of 2.5 seconds. Running ImplicitOpen() 1000 times took an everage of 2.9 seconds. I ran them both multiple times and the numbers were always consistent, so I think they are somewhat believable.
How about that. Opening the connection explicitly IS a bit faster. So does that mean connection pooling doesn't exist?
I opened the Profiler tool included with SQL Server 2000 and created a connection to my database with the default profiling template. I removed the looping in the main method so that the method would only be called once. This is what I saw when I ran ExplicitOpen():
Audit Login
SQL:BatchCompleted Select * from issues
SQL:BatchCompleted Select * from staffers
Audit Logout
This is what I saw when I ran ImplicitOpen():
Audit Login
SQL:BatchCompleted Select * from issues
RPC:Completed exec sp_reset_connection
SQL:BatchCompleted Select * from staffers
Audit Logout
Aha! As you can see, there is still only 1 connection being made (a single Audit Login and Audit Logout). So it looks like we're both right. There is always 1 connection, but explictly opening is still slightly faster. The difference is the call to sp_reset_connection (which is undocumented in Books Online) which cleans up the connection so it can be re-used.
So, if I could get rid of that call to sp_reset_connection, could I get the same performance as ExplicitOpen()? I copied ImplicitOpen() to a new method and named it ImplicitOpenNoReset(). The only change I made was appending "connection reset=false;" to the end of the connection string. Running 1000 iterations took 2.6 seconds. Much closer to ExplicitOpen(), but there is still a bit of overhead. (Warning: I have no idea how safe or advisable it is to disable the connection reset - do not interpret this as a recommendation for improving your connection pooling performance for general usage - it was just for the purposes of this test.)
Finally, to prove to myself that connection pooling really does exist, and really does help, I created a new copy of ImplicitOpen() named NoConnectionPooling(). The only difference is the connection string includes "pooling=false;". Running 1000 iterations took... 7.5 seconds! Ouch.
My take away is that connection pooling really exists, it really does help you, and in general, you do not have to worry about maximing the use of an open SqlConnection object by keeping it around for multiple queries. But if you have to squeeze out the absolute best performance possible, you will get a slight gain by calling Open() once when you have consecutive queries. Thanks to David for a thought-provoking post.
Update: I discovered a flaw in my benchmarking approach which skewed some of these results. I wrote up the details in a new post.
Comments
We are using several data adapters and we are not explicitly opening and closing the connections. Now we are getting an error like this.
connections were in use and max pool size was reached
Will there be any remote possibility of getting the above error because we are not opening and closing the connection explicitly.
Thanking you
Nirmala
The first thing I would look for is if any other code in your application is creating connection objects and not closing/disposing them as soon as possible.