I am using Pentaho Data Integration (PDI) Spoon to create ETLs, and performance is my main concern. I developed a transformation that copies 2,500,000 rows (each row has 104 columns) from MySQL 8 to a ClickHouse database, and it takes 30 minutes. The destination table has no indexes or constraints, and ClickHouse is a columnar database.
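To put the figures above in perspective, here is a quick calculation of the current throughput. It uses only the numbers stated in the question (2,500,000 rows, 104 columns, 30 minutes); nothing here is measured separately.

```python
# Throughput implied by the numbers in the question.
rows = 2_500_000
columns = 104
elapsed_seconds = 30 * 60  # 30 minutes

rows_per_second = rows / elapsed_seconds
cells_per_second = rows_per_second * columns

print(f"{rows_per_second:.0f} rows/s, {cells_per_second:.0f} cells/s")
# prints "1389 rows/s, 144444 cells/s"
```

So the transformation currently moves roughly 1,400 rows per second.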
I am on Ubuntu 22.04, and the transformation runs on the Pentaho server through spoon.sh.
How can I increase the transformation's input/output speed?
The transformation has only 4 steps: truncate the table with EXECUTE SQL SCRIPT --> fetch the data with TABLE INPUT --> change the date formats with SELECT VALUES --> insert the data into the destination table with TABLE OUTPUT.
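For anyone not familiar with PDI, the four steps above are roughly equivalent to the sketch below. This is not the actual transformation: it uses in-memory SQLite as a stand-in for both MySQL and ClickHouse, and the table names (`src_table`, `dest_table`), the two columns, the date formats (DD/MM/YYYY to ISO), and the batch size are all hypothetical, chosen only to illustrate the flow.

```python
import sqlite3
from datetime import datetime

BATCH_SIZE = 1000  # hypothetical batch/commit size, not taken from the real job


def convert_date(value: str) -> str:
    """Reformat DD/MM/YYYY to YYYY-MM-DD (both formats are assumptions)."""
    return datetime.strptime(value, "%d/%m/%Y").strftime("%Y-%m-%d")


def copy_rows(source, destination, batch_size=BATCH_SIZE) -> int:
    # Step 1 (EXECUTE SQL SCRIPT): empty the destination table.
    # SQLite has no TRUNCATE, so DELETE stands in for it here.
    destination.execute("DELETE FROM dest_table")

    # Step 2 (TABLE INPUT): read the source rows through a cursor.
    cursor = source.execute("SELECT id, created FROM src_table")

    copied = 0
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        # Step 3 (SELECT VALUES): change the date format of each row.
        converted = [(row_id, convert_date(created)) for row_id, created in batch]
        # Step 4 (TABLE OUTPUT): insert the batch and commit it.
        destination.executemany(
            "INSERT INTO dest_table (id, created) VALUES (?, ?)", converted
        )
        destination.commit()
        copied += len(converted)
    return copied


# Tiny demo with two made-up rows.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE src_table (id INTEGER, created TEXT)")
source.executemany(
    "INSERT INTO src_table VALUES (?, ?)",
    [(1, "31/12/2023"), (2, "01/02/2024")],
)

destination = sqlite3.connect(":memory:")
destination.execute("CREATE TABLE dest_table (id INTEGER, created TEXT)")

copied = copy_rows(source, destination)
result = destination.execute(
    "SELECT id, created FROM dest_table ORDER BY id"
).fetchall()
print(copied, result)
```

In the real transformation the same flow runs over 2,500,000 rows with 104 columns each.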
I want to increase the I/O speed of this PDI Spoon transformation.