I'm using Talend Open Studio for Big Data (7.3.1) to load data from various sources into Cloudera Impala (Cloudera QuickStart 5.13), but the load takes too long: it writes only ~3300 rows/s (see the screenshots).

[Screenshots: CSV to Impala; Oracle XE to Impala; Impala output settings]

Is there a way to raise the write speed to ~10,000-100,000 rows/s or even higher?
Am I using the wrong approach for the load?
Or do I need to configure Impala/Talend better?
Any advice is welcome!

UPDATE
I installed the Impala JDBC driver: [screenshot]

But the output component (tDBOutput) does not look like it is configured for Impala: [screenshot]

Error:
Exception in component tDBOutput_1 (db_2_impala) org.talend.components.api.exception.ComponentException: UNEXPECTED_EXCEPTION:{message=[Cloudera]ImpalaJDBCDriver ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000, errorMessage:AnalysisException: Impala does not support modifying a non-Kudu table: algebra_db.source_data_textfile_2 ), Query: DELETE FROM algebra_db.source_data_textfile_2.} at org.talend.components.jdbc.CommonUtils.newComponentException(CommonUtils.java:583)

pf_man
  • Just wondering: are you using the Impala ODBC driver or the Hive one? They can make a huge performance difference. Use the native Impala ODBC/JDBC driver to connect to Impala. You can also increase the commit interval to 100k, but if the read is slow, you need to fix that bottleneck first. – Koushik Roy May 09 '21 at 17:43
  • Hi, I updated my question. I installed the Impala JDBC driver and made a new connection, but the output component looks different. There is no overwrite option (only delete, which is not supported in Impala), and that raises an error... – pf_man May 10 '21 at 23:01
  • From the log, it is trying to delete from the Impala table: `DELETE FROM algebra_db.source_data_textfile_2`. Are you trying to delete? – Koushik Roy May 11 '21 at 02:38
  • I can't delete data from Impala (Impala doesn't support delete). Clearing the table should be done with an overwrite, but the JDBC output is set up as if for a relational table, not for an Impala or Hive table – pf_man May 11 '21 at 10:15
  • What's the infa sess doing? Ins/upd/del? Remove the other options from the target load property. – Koushik Roy May 11 '21 at 11:00
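
The error and the comments above touch on two separate things: clearing a non-Kudu table without `DELETE`, and sending rows in larger batches. Below is a minimal sketch of the SQL text involved, not a Talend configuration. It assumes the statements would be sent over the Impala JDBC/ODBC connection by whatever component runs them. All helper names (`truncate_statement`, `overwrite_statement`, `chunk_rows`, `multi_row_insert`, `quote_literal`) are hypothetical, and so is the staging table in the usage note. `TRUNCATE TABLE`, `INSERT OVERWRITE`, and multi-row `INSERT ... VALUES` are Impala statements; the quoting helper is only for illustration and is not a safe general-purpose escaper.

```python
from typing import Any, Iterable, Iterator, List, Sequence


def truncate_statement(table: str) -> str:
    # TRUNCATE TABLE empties a table without issuing DELETE,
    # which Impala rejects for non-Kudu tables.
    return f"TRUNCATE TABLE {table}"


def overwrite_statement(target: str, staging: str) -> str:
    # INSERT OVERWRITE replaces the target's contents in one statement.
    # The staging table is an assumption: it is not part of the question.
    return f"INSERT OVERWRITE TABLE {target} SELECT * FROM {staging}"


def quote_literal(value: Any) -> str:
    # Illustrative only: render a Python value as a SQL literal.
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # check bool before int (bool is an int subclass)
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + text + "'"


def chunk_rows(rows: Iterable[Sequence[Any]], size: int) -> Iterator[List[Sequence[Any]]]:
    # Group rows into lists of at most `size` rows each.
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[Sequence[Any]] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def multi_row_insert(table: str, rows: Iterable[Sequence[Any]]) -> str:
    # One INSERT carrying many rows instead of one statement per row.
    tuples = ", ".join(
        "(" + ", ".join(quote_literal(v) for v in row) + ")" for row in rows
    )
    return f"INSERT INTO {table} VALUES {tuples}"
```

For example, `truncate_statement("algebra_db.source_data_textfile_2")` produces the statement that would stand in for the failing `DELETE`, and `overwrite_statement("algebra_db.source_data_textfile_2", "algebra_db.staging_csv")` shows the overwrite form with a hypothetical staging table. Whether batching alone reaches the throughput asked for in the question is not something this sketch measures.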

0 Answers