2

Background

We have a C#/VB.net client application consuming a WCF service which connects to an Oracle database. The web service connects to the database using the .NET framework's data provider for Oracle (not to be confused with ODP). Our testers have experienced sporadic Oracle account locking which seems to happen shortly after changing the user's Oracle password. The dba_audit_trail logs have revealed what seem to be automated connection attempts at very regular intervals (i.e. every two minutes on the dot) without any user or client activity. Numerous logs (IIS, WCF tracing, message logging, etc.) have confirmed these connection attempts are not initiated directly by the client application; they must be coming independently from the web service or from inside the System.Data.OracleClient library. The automatic attempts continue forever until the web service's worker process (single worker) dies from inactivity.

In some instances, these automatic attempts get started before the password change, and they connect successfully to the database, but as soon as the password changes, the next attempt fails for invalid username/password. After three attempts, the account gets locked. We are trying to find the source of these periodic connection attempts.

I found a very similar, but unanswered, problem on Oracle's forum here.

Current Thoughts

Our latest investigations have led us to believe it is an unexpected behavior from connection pooling. If the user connected to the web service before the password change, a connection pool would be created for the original connection string. After changing the password and logging back into the web service, the data provider would create a new connection pool based on the new connection string.

Could something inside the data provider be trying to keep the old connection alive from the first connection pool? Perhaps the first connection pool is discarding the old connection and attempting to replenish it with a new one (with the now invalid connection string). What could cause this? Note: we are using the default settings for min/max pool size (0/100).

We do not believe our code is directly attempting to access a connection from the first connection pool. The user's session does not have any memory of the previous session's password, and therefore would not be using the old connection string to reference the first connection pool. Additionally, nothing in our code would explain the very precise connection intervals we are seeing.

T. Yates
  • 51
  • 1
  • 9

3 Answers3

3

The underlying problem ended up being unreleased database connections. When a connection is opened, it gets checked out of the connection pool. If the connection is never closed, the pool thinks it is still being used. This causes the pool management logic to periodically re-authenticate with the database using the original connection string. When the password changes, this quickly leads to failed login attempts and account locking.

// Problem logic; connection is never closed/returned to the connection pool.
public static void ConnPoolTest1()
{
    OracleConnection conn = new OracleConnection(connectionStringWithPooling);
    conn.Open();

    //...Do some work

    // Sit on this line for 5-10 minutes and examine Oracle's dba_audit_trail.
    Console.ReadKey(); // Since connection was never released back to the connection pool, the
                       // data provider's pool management will regularly re-authenticate with DB.
                       // If user's password changes before this process dies (releasing the
                       // connection pools), you start accumulating failed password attempts.
}

The proper fix for this problem is to ensure connections are always returned to the pool when you are done with them!

// Best practice: ALWAYS CLOSE YOUR CONNECTIONS WHEN YOU ARE DONE!
public static void ConnPoolTest2()
{
    OracleConnection conn = new OracleConnection(connectionStringWithPooling);
    conn.Open();

    //...Do some work

    conn.Close();

    // Sit on this line for 5-10 minutes and examine Oracle's dba_audit_trail.
    Console.ReadKey(); // No problem here! No recurring authentication attempts because the
                       // connection has been returned to the pool.
}

NOTE: Other answers suggested turning off pooling and clearing old connection pools when the password changed. These suggestions worked for us as a temporary patch while we searched for the unreleased resources, and they greatly helped us to isolate the problem.

T. Yates
  • 51
  • 1
  • 9
  • +1 for answering your own question, but actually giving an answer – iMortalitySX Oct 16 '12 at 13:11
  • 2
    Other best practice. Anything that implements IDisposable goes into a using block. Up voted since I've seen other similar problems unanswered on the web. – iivel Oct 16 '12 at 23:43
2

This may help a little.

Oracle ODP.Net and connection pooling

and

OLE DB, ODBC, and Oracle Connection Pooling

Basically, in the second webpage there, MSDN states "Once created, connection pools are not destroyed until the active process ends.". It seems that your web service may be holding on to so many connections/connection pools that it is having some issues.

So my suggestions: Besides doing a little more troubleshooting with maybe some added logging of connections (maybe just a text file), or that first link has a nice command to track the connections to the database, I would try and turn off the connection pooling for now. The issue you seem to be having is called "pool fragmentation". This is where you have heavy traffic to a single computer that is piping all the database connection through connection pools. There end up being so many pools that memory issues start to occur and connections are not closed properly. That second one would be your issue, if the connection is not closed, or lets say your password change command executes before a list of other commands that used the old connection pool, you are going to have issues.

Ultimately, in your situation, you would have a single point (web service) that created its own web connection pools (not for the users) and served the data back to the users through its own connection. That would mean that there would have to be different type of authentication, handled by the web service side, to deal with the users connecting. I'm sure that would probably be too much change for now in your model, but I would highly suggest looking into finding that solution in the end.

Community
  • 1
  • 1
iMortalitySX
  • 1,478
  • 1
  • 9
  • 23
  • I agree we are ending up with many pools, especially when the testers repeatedly exercise the change password functionality, but I am still having trouble seeing what is causing the additional connections even after the user terminates his session. There are no queued commands waiting to execute (i.e. no apparent race conditions); the password change happens well after the first session is dead. I tried building a test driver to create excessive pools and connections, but I have not been able to reproduce the recurring automatic connections. The problem is still very unpredictable. – T. Yates Oct 14 '12 at 17:41
  • "Pool fragmentation is a common problem in many Web applications where the application can create a large number of pools that are not freed until the process exits." I lean toward believing this is your issue because each time you change the password, you are creating another pool for the same user. ADO (which is what the System.Net.OracleClient uses in the background) is notorious for leaving connections open even after you call close because they are in the pool. The problem is the connection is being "kept alive" even though you don't want it to, so that "dead" connection pings the DB. – iMortalitySX Oct 15 '12 at 01:31
  • Thanks for the guidance. We disabled pooling like you suggested and the problem went away. It helped us to identify the actual cause (unreleased database resources), posted as a separate answer. – T. Yates Oct 15 '12 at 21:54
2

Whenever any event that would invalidate a connection occurs you need to destroy the pool so that any leaked connections and/or keepalive for the pool is appropriately marked to prevent re-use. For this you'll want to use the clearpool or clearallpools method of the data provider.

http://msdn.microsoft.com/en-us/library/system.data.oracleclient.oracleconnection.clearpool.aspx

Additionally, your global exception manager can listen for the exception thrown for an invalid user and destroy that pool by enumerating the connections for the user identified as part of the connection string. Probably not efficient, but should get the job done.

iivel
  • 2,576
  • 22
  • 19
  • We discussed this, and I believe clearing out old pools should solve the problem, but we cannot regularly reproduce it yet. I would like have enough understanding of the problem to reproduce it and prove that clearing the pools stops the issue. Sometimes the accounts can go a day or two before being locked. Other times, the accounts can lock multiple times in a single day. – T. Yates Oct 15 '12 at 12:07
  • You can disable connection pooling, or overload the dispose the connection to determine unequivocally if that's the issue. – iivel Oct 15 '12 at 13:26
  • Thank you for the suggestion! We ended up clearing the pools as a temporary fix while we searched for the source of the problem (unreleased database connections), posted as a separate answer. This mitigated the problem as a short-term solution. – T. Yates Oct 15 '12 at 21:52