I have 4 million records that needs daily load of data from source to target and we are doing truncate everyday. It takes like 9 hours as there are like 10 tables doing 4 million records data loading every day. Could you please tell me how do i improve the performance. Thanks
Asked
Active
Viewed 2,023 times
0
-
What RDBMS are you using? – Nikhil Oct 31 '16 at 18:19
-
My Source is Oracle and Destination is MySQL – Ujjwal Chowdary Oct 31 '16 at 18:32
-
I know this solution and has already implemented but no use and I even increased number of copies to 4. – Ujjwal Chowdary Nov 01 '16 at 18:51
-
Ok. Then I can't really think of any other way in Pentaho to make it faster. What about the bulk loader? – Nikhil Nov 01 '16 at 18:54
-
I tried MySQL bulk loader but i am getting some error in creating mkfifo file on the Linux server as we can only use MySQL bulk loader in Linux environment – Ujjwal Chowdary Nov 01 '16 at 19:49
-
Hello Nikhil, where do i add the settings useServerPrepStmts=false rewriteBatchedStatements=true useCompression=true inside the pentaho server for the MySQL Pentaho Admin Console to pick up the settings? – Ujjwal Chowdary Nov 09 '16 at 15:09
-
In the Database connections window, you will see "options" on the left side of the window. You can enter these parameters over there. – Nikhil Nov 09 '16 at 16:46
-
i want to make changes in the BI server, as we have database connections configured in the server. So Please tell me how to do it. Do i have to edit any JDBC file. If so what is the path for it. Thanks – Ujjwal Chowdary Nov 09 '16 at 18:14
-
I am not sure. See if this helps: https://help.pentaho.com/Documentation/5.1/0H0/060/010/030/020 – Nikhil Nov 09 '16 at 20:05
1 Answers
1
You have two options:
- Use the MY SQL bulk loader step available in PDI. It will definitely make the data loading faster.
Speed can be boosted by using some simple JDBC-connection setting.
useServerPrepStmts=false rewriteBatchedStatements=true useCompression=true
These options should be entered in PDI at the connection. Double click the connection go to Options and set these values.
Used together, useServerPrepStmts=false
and rewriteBatchedStatements=true
will “fake” batch inserts on the client. Specifically, the insert statements:
INSERT INTO t (c1,c2) VALUES ('One',1);
INSERT INTO t (c1,c2) VALUES ('Two',2);
INSERT INTO t (c1,c2) VALUES ('Three',3);
will be rewritten into:
`INSERT INTO t (c1,c2) VALUES ('One',1),('Two',2),('Three',3`);
The third option useCompression=true
compresses the traffic between the client and the MySQL server.
Finally you can increase the number of copies of the output step to 2 so that there are two threads inserting into the database.
Hope it helps!
Reference (Option 2): https://anonymousbi.wordpress.com

Nikhil
- 621
- 1
- 13
- 25