I had a simple Perl script that runs some DB operations using the DBD::Oracle package and also forks some processes. The child processes also connect to the DB, but each creates its own DBH object. This was working fine until I migrated from Linux RH4 to Linux RH6. With the OS migration, the Perl version also changed from 5.8 to 5.10. After the migration I get the following error:

DBD::Oracle::db DESTROY failed: ORA-03135: connection lost contact
Process ID: 1984105
Session ID: 1511 Serial number: 32085 (DBD ERROR: OCITransRollback) during global destruction.

My Perl script:

use DBI;
use Parallel::ForkManager;
use English qw(-no_match_vars);   # provides $PID

# $server, $username and $password are set earlier in the script
$dbh = DBI->connect("DBI:Oracle:$server", $username, $password) or die "Couldn't connect to database: " . DBI->errstr;

$dbh->{AutoCommit} = 0;

load_metrics(); # here I do some DB operations using $dbh

## Create ForkManager instance allowing up to 15 concurrent child processes
my $pm = Parallel::ForkManager->new(15);

for (my $proc_num = 100; $proc_num >= 1; $proc_num--)
{
    # start() returns the child's PID in the parent, so the parent moves on
    $pm->start($proc_num) and next;

    print scalar(localtime) . " [$PID] Start sampling for Metric\n";
    system("./sample_impl.pl");
    print scalar(localtime) . " [$PID] Finished Resampling Metric\n";
    $pm->finish;   # child exits here, with $dbh still open
}

print scalar(localtime) . " [$PID] Parent process waiting for Child processes\n";

$pm->wait_all_children;

print scalar(localtime) . " sampling Finished\n";

The above error gets repeated 100 times.

If I remove ForkManager, the script works just fine. I am not sharing the DBH in the parent with the children, hence I am a bit surprised to see this error.

Please advise on the cause of this error and how to fix it.

user1545583
  • Did you also update/recompile the DBD::Oracle package? – choroba Oct 08 '14 at 12:07
  • So theoretically you could close the DB connection after load_metrics? Probably not; what else are you doing with that connection? Maybe, since you fork, it somehow thinks that the OCI lib was already initialized. When you fork, you also share the lib's internal structures with your child. – ibre5041 Oct 08 '14 at 12:16
  • The error was possibly silent before, but your code structure is broken. – ikegami Oct 08 '14 at 16:53
  • I am doing nothing with the connection after load_metrics. Today I added $dbh->disconnect; at the bottom of the code (ideally it should not be necessary to call it explicitly, should it?) and ran it, but no luck: I still get the same error. – user1545583 Oct 09 '14 at 03:22
  • Then I added $dbh->disconnect; just after load_metrics(), and surprisingly it worked. Now I am more confused and have two questions: 1) Why do I have to call $dbh->disconnect to get rid of the error after migrating to the RH6 server? 2) Why didn't $dbh->disconnect work when called at the bottom of the code? – user1545583 Oct 09 '14 at 03:26
  • @user1545583 It is not related to RH6. The root cause is either a newer version of DBD::Oracle or the version of libclntsh. In the background your code calls OCIEnvAlloc in the OCI lib and then creates the connection. This env handle is then shared between the parent and all the children, and this somehow causes the trouble. You're right that you do NOT share the $dbh connection handle between parent and child, but maybe DBD::Oracle or libclntsh share some structures. – ibre5041 Oct 09 '14 at 10:23
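
One possible reading of the comments above, offered as a sketch and not a confirmed diagnosis: a forked child gets its own copy of every live Perl object, including a still-connected $dbh, and runs that object's destructor when it exits. That would fit both observations: disconnecting before the fork helps, while disconnecting at the bottom of the parent is too late for children that already inherited the open handle. The pure-Perl example below only imitates the mechanism. FakeHandle, its guard option, and count_cleanups are hypothetical names made up for illustration and are not part of DBI or DBD::Oracle.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for a database handle (NOT a DBI class).
package FakeHandle;

sub new {
    my ($class, %args) = @_;
    # Remember which process created the handle.
    return bless { log => $args{log}, guard => $args{guard}, owner => $$ }, $class;
}

sub DESTROY {
    my ($self) = @_;
    # With the guard on, only the creating process performs the cleanup.
    return if $self->{guard} && $$ != $self->{owner};
    open my $fh, '>>', $self->{log} or return;
    print {$fh} "cleanup in pid $$\n";
    close $fh;
}

package main;

# Fork one child while a handle is alive and count how many cleanups ran.
sub count_cleanups {
    my ($guard) = @_;
    my ($tmp, $log) = tempfile();
    close $tmp;
    {
        my $handle = FakeHandle->new(log => $log, guard => $guard);
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        exit 0 if $pid == 0;   # child: its inherited copy is destroyed at exit
        waitpid($pid, 0);
    }                          # parent: its own copy is destroyed here
    open my $in, '<', $log or die "cannot read $log: $!";
    my @lines = <$in>;
    close $in;
    unlink $log;
    return scalar @lines;
}

my $unguarded = count_cleanups(0);
my $guarded   = count_cleanups(1);
print "without guard: $unguarded cleanups, with guard: $guarded\n";
```

With a real DBI handle, the documented counterparts of this guard are the InactiveDestroy attribute (set on the inherited handle in the child) and AutoInactiveDestroy (passed to DBI->connect in newer DBI releases). Calling $dbh->disconnect before the first $pm->start, as found above, avoids the situation entirely because no live handle exists at fork time.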

0 Answers