
I have created an ETL tool in Perl. There are three database servers with which the ETL tool communicates, say dbserver1 (OLTP server, Windows box), dbserver2 (staging server, Linux box), and dbserver3 (OLAP server, Linux box). My ETL script is on dbserver2.

The script reads the data from dbserver1, brings it onto dbserver2 for some transformations, performs the transformations, and then puts the data into dbserver3. To achieve this, the script creates some OUTFILE data files on dbserver2. So there are two types of OUTFILE queries:

  1. an OUTFILE query which runs on dbserver1 and creates a .data file on dbserver2 (see the sketch below), and
  2. an OUTFILE query which runs on dbserver2 and creates a .data file on dbserver2.
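
For illustration, the first type looks roughly like this (a hypothetical sketch; the database handle, table, and column names are placeholders):

# Hypothetical sketch of query type 1: executed through the dbserver1
# connection, but naming a path on dbserver2's filesystem.
$dbh_dbserver1->do(q{
    SELECT col1, col2
    INTO OUTFILE '/home/dbserver2/dumpfile.2011-11-04-03:02.data'
    FROM src_table
});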

The second query works fine, as it creates a file on the same server. But the first type of query gives me the following error:

DBD::mysql::st execute failed: Can't create/write to file '\home\dbserver2\dumpfile.2011-11-04-03:02.data' (Errcode: 2) at stagingtransform.pl line 223, <> line 8.

I guess this is related to user permissions. And if I am not wrong, MySQL on dbserver2 has permission to read from and write to dbserver2, but MySQL on dbserver1 does not.

Can it be because dbserver1 is a Windows box and dbserver2 is a Linux box?

How can I resolve this?

FYI: The file format is dumpfile.yyyy-mm-dd-hh:mm.data, and I have also set up the AppArmor settings for MySQL on dbserver2.

– Rahul Shelke

1 Answer


The problem is that an OUTFILE query running on dbserver1 can only write to dbserver1's own filesystem, so you need a different approach.
One very easy method is to use mysqldump (on dbserver2) to connect to dbserver1 and pipe the output into a mysql client, which injects the SQL into dbserver2.
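
For example, a minimal sketch run from dbserver2 (the credentials and the sourcedb / stagingdb / src_table names are placeholders):

# Dump the table from dbserver1 and stream it straight into the local
# MySQL on dbserver2. Adjust credentials and names to your setup.
my $cmd = 'mysqldump --host=dbserver1 --user=etl --password=secret'
        . ' --single-transaction sourcedb src_table'
        . ' | mysql --host=localhost --user=etl --password=secret stagingdb';
system($cmd) == 0 or die "mysqldump pipe failed: $?";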

On the other hand, if you want to use DBI:

# Prepare a SELECT on the source handle and an INSERT on the target handle.
my $source_sql = q{SELECT ...};
my $target_sql = q{INSERT ... VALUES (?, ?, ...)};
my $source = $source_dbh->prepare($source_sql);
my $target = $target_dbh->prepare($target_sql);

# execute_array fetches one row at a time from the source statement handle
# and binds it to the INSERT's placeholders until the SELECT is exhausted.
$source->execute;
my $qty = $target->execute_array({ ArrayTupleFetch => $source });

For large transfers of data, the mysqldump approach is usually faster, since it streams the rows in bulk instead of executing an INSERT per row through DBI.

– niczero
  • Thank you niczero. The data is huge (around 20 million records), so I guess it will be really slow if I use the above approach... Also, mysqldump is going to take time to import from dbserver1 to dbserver2, and it will be tedious to have mysqldump in between the ETL tool's operations. No issue, I will try this. – Rahul Shelke Nov 05 '11 at 12:42
  • One more thing: dbserver2 has an OUTFILE query whose data the ETL will write to dbserver3 using INFILE. Is this going to cause any permission issues, like this one with OUTFILE? – Rahul Shelke Nov 05 '11 at 12:47
  • Yes, exactly the same issue. Another approach would be to do your reading/writing locally using your first technique, but wrapped in a script that then transfers the files to the next server. – niczero Nov 08 '11 at 10:39
  • Thanks niczero. I used system(mysql -e "query") and that solved my problem. Now I am able to import data from dbserver1 to dbserver2 without modifying my ETL much. Also, I used LOCAL in the queries that import data from dbserver2 to dbserver3, since my ETL tool is on dbserver2 and acts as the client. So it worked. – Rahul Shelke Nov 10 '11 at 13:21
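
A minimal sketch of the approach from the last comment (hypothetical credentials, file, and table names). Run on dbserver2, where the .data file lives; LOCAL makes the mysql client read the file and stream it to dbserver3:

# The client on dbserver2 reads the local file and sends its contents to
# dbserver3, so dbserver3's mysqld never touches dbserver2's filesystem.
my $load = q{LOAD DATA LOCAL INFILE '/home/dbserver2/dumpfile.data'
             INTO TABLE olap_table};
system('mysql', '--host=dbserver3', '--user=etl', '--password=secret',
       '--local-infile=1', '-e', $load, 'olapdb') == 0
    or die "LOAD DATA LOCAL failed: $?";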