0

When inserting data into the database using Parallel.ForEach I get the following error:

"The connection pool has been exhausted"

after inserting some amount of data into the database.

try
{
    var connectionString = ConfigurationManager.ConnectionStrings["Connection"].ConnectionString;

    Parallel.ForEach(Enumerable.Range(0, 1000), (_) =>
    {
        using (var connection = new NpgsqlConnection(connectionString))
        {
            connection.Open();
            using (var command = new NpgsqlCommand("fn_tetsdata", connection) { CommandType = CommandType.StoredProcedure })
            {
                command.Parameters.AddWithValue("firstname", "test");
                command.Parameters.AddWithValue("lastname", "test");
                command.Parameters.AddWithValue("id", 10);
                command.Parameters.AddWithValue("designation", "test");

                command.ExecuteNonQuery();
            }
            connection.Close();
        }
    });
}
catch (Exception ex)
{
    Console.WriteLine(ex.Message);
}
Craig W.
Tejal Barot
  • Default connection pool size is 20, so either increase it or reduce the number of connections. – Tim Schmelter Jun 06 '16 at 14:18
  • I'm just wondering why you would do it like that... You do realize that it's usually better to do a lot of work in a single batch instead of throwing a lot of threads that connect, do a bit of work, disconnect, etc? – atlaste Jun 06 '16 at 14:23
  • I am not convinced the default pool size is 20 - http://stackoverflow.com/questions/18943703/how-many-threads-parallel-forforeach-will-create-default-maxdegreeofparalleli – Murray Foxcroft Jun 06 '16 at 14:29
  • @MurrayFoxcroft He's referring to the _connection_ pool size, not the _thread_ pool size, although I think the default is 100, not 20. In any case, there is absolutely no need to create more than _one_ connection in this scenario. – D Stanley Jun 06 '16 at 14:32
  • You're creating a connection and then creating multiple connections based on that initial connection. Of course you're going to run out of connections in the pool. That's not the best way to do this. You only need one connection; simply reuse it. – ManoDestra Jun 06 '16 at 14:32
  • Npgsql's default maximum pool size was 20 until version 3.1, when it was changed to 100 (which is also the default in SqlClient). It's true that endlessly spawning concurrent tasks is a bad idea, you should probably control your concurrency level in Parallel.ForEach. – Shay Rojansky Jun 06 '16 at 14:33
  • Agreed, he should open a single connection, then parallelize / batch depending on what yields the best perf, though the SQL connection manager should do some of the intelligence here for you. – Murray Foxcroft Jun 06 '16 at 14:34
  • `Parallel.ForEach` is just a sample here. In my real application, I have thousands of records which I have to process asynchronously. Actually the problem is the `npgsql` library. Somehow it is not closing the connection while disposing the object, and that is why the pool gets filled and I get this error. – Tejal Barot Jun 07 '16 at 08:34

2 Answers

6

Constrain the amount of parallelism with MaxDegreeOfParallelism; by default it could exceed the number of DB connections you have. Find a balance between parallelising your work and not killing the DB :)

Parallel.ForEach(yourListOfStuff,
    new ParallelOptions { MaxDegreeOfParallelism = 10 },
    stuff => { YourMethod(stuff); }
);
Murray Foxcroft
2

I assume you're using parallelism to improve performance. If that's the case then first you need a baseline. Run the 1,000 queries in serial, creating a new connection each time (which in reality just pulls one from the pool).
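
A minimal sketch of that baseline (assuming the same "Connection" string and fn_tetsdata procedure from the question), with a Stopwatch to record the serial time:

var sw = System.Diagnostics.Stopwatch.StartNew();
var connectionString = ConfigurationManager.ConnectionStrings["Connection"].ConnectionString;

foreach (var i in Enumerable.Range(0, 1000))
{
    // A "new" connection per iteration really just borrows an existing physical connection from the pool.
    using (var connection = new NpgsqlConnection(connectionString))
    {
        connection.Open();
        using (var command = new NpgsqlCommand("fn_tetsdata", connection) { CommandType = CommandType.StoredProcedure })
        {
            command.Parameters.AddWithValue("firstname", "test");
            command.Parameters.AddWithValue("lastname", "test");
            command.Parameters.AddWithValue("id", 10);
            command.Parameters.AddWithValue("designation", "test");
            command.ExecuteNonQuery();
        }
    }
}

Console.WriteLine(sw.Elapsed); // baseline: 1,000 serial calls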

Then try it with the same connection object and see if the performance improves.

Then try it with the same command object, just changing the parameter values.
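
For example (again just a sketch built on the question's code, reusing the connectionString from the previous snippet), one connection and one command live for the whole loop and only the parameter values are updated between calls:

using (var connection = new NpgsqlConnection(connectionString))
{
    connection.Open();
    using (var command = new NpgsqlCommand("fn_tetsdata", connection) { CommandType = CommandType.StoredProcedure })
    {
        // Add the parameters once...
        command.Parameters.AddWithValue("firstname", "test");
        command.Parameters.AddWithValue("lastname", "test");
        command.Parameters.AddWithValue("id", 10);
        command.Parameters.AddWithValue("designation", "test");

        for (var i = 0; i < 1000; i++)
        {
            // ...then only change the values on each iteration and reuse the command.
            command.Parameters["id"].Value = i;
            command.ExecuteNonQuery();
        }
    }
}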

Then try it in parallel with the same connection so you're not creating 1,000 connection objects, which is what you've already tried.
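
One caveat when sketching that last step: a single NpgsqlConnection can't execute commands from several threads at once, so the hedged version below caps the number of simultaneous pooled connections with MaxDegreeOfParallelism (as in the other answer) instead of literally sharing one object:

Parallel.ForEach(
    Enumerable.Range(0, 1000),
    new ParallelOptions { MaxDegreeOfParallelism = 10 }, // well below the pool limit
    i =>
    {
        // Each task borrows one connection from the pool for the duration of the call.
        using (var connection = new NpgsqlConnection(connectionString))
        {
            connection.Open();
            using (var command = new NpgsqlCommand("fn_tetsdata", connection) { CommandType = CommandType.StoredProcedure })
            {
                command.Parameters.AddWithValue("firstname", "test");
                command.Parameters.AddWithValue("lastname", "test");
                command.Parameters.AddWithValue("id", i);
                command.Parameters.AddWithValue("designation", "test");
                command.ExecuteNonQuery();
            }
        }
    });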

I would be surprised if you got a significant performance improvement by using parallelism, since Parallel improves the performance of CPU-bound tasks, and data queries are generally much more bound by I/O than CPU.

D Stanley