I want to import multiple CSV files into an Exasol database. It is actually one huge file that I have already split into chunks to massively improve import performance. Exasol supports parallel import of multiple files:

IMPORT INTO target_table
FROM CSV AT 'https://someurl'
FILE 'file1.csv'
FILE 'file2.csv'
...
;

The problem is that I want to ignore import errors and log them in some way. I would love to use an error table, which Exasol supports, but unfortunately not for multiple files in a single statement:

IMPORT INTO target_table
FROM CSV AT 'https://someurl'
FILE 'file1.csv'
FILE 'file2.csv'
...
REJECT LIMIT 100 -- tolerate up to 100 rejected rows; the import completes, but nothing is logged
ERRORS INTO IMPORT_ERROR_TABLE -- does not work for a multiple-file import statement
;

I could just not chunk my CSV file and everything would work, but I'd rather not do that, for performance reasons. ;-)

Any suggestions on what to do? How would you check for errors when importing multiple files at once?

phgie

1 Answer

You could try sending errors to a FILE instead of a TABLE. That option might work with multiple files.

Alternatively, there is a longer route: create a UDF script and use it to read and parse all files in parallel. For Java, you can use the Univocity CSV parser. You would be able to fine-tune logging and transformations and gain a lot of flexibility overall, but it requires coding.

I can share some code if a UDF script is an acceptable option.
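
To give a rough idea of the parse-and-log part only, here is a minimal sketch in plain Java. It is not Exasol UDF code and it does not use Univocity: it assumes each chunk is already available as a list of lines, splits naively on commas (no quoting or escaping), and treats a wrong column count as the only kind of error. The names `ChunkParser`, `Reject`, `Result`, `parseChunk` and `parseAll` are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of Exasol or Univocity.
class ChunkParser {

    /** One rejected line: which chunk it came from, where, and why. */
    public static final class Reject {
        public final String file;
        public final int lineNumber; // 1-based within the chunk
        public final String reason;

        public Reject(String file, int lineNumber, String reason) {
            this.file = file;
            this.lineNumber = lineNumber;
            this.reason = reason;
        }
    }

    /** Accepted rows and rejected lines for a single chunk. */
    public static final class Result {
        public final List<String[]> rows = new ArrayList<>();
        public final List<Reject> rejects = new ArrayList<>();
    }

    // Assumption: naive comma split, no quoted fields. A real CSV parser
    // would replace this line.
    public static Result parseChunk(String file, List<String> lines, int expectedColumns) {
        Result result = new Result();
        for (int i = 0; i < lines.size(); i++) {
            String[] fields = lines.get(i).split(",", -1);
            if (fields.length == expectedColumns) {
                result.rows.add(fields);
            } else {
                // Record the bad line instead of aborting the whole chunk.
                result.rejects.add(new Reject(file, i + 1,
                        "expected " + expectedColumns + " columns, got " + fields.length));
            }
        }
        return result;
    }

    // Parses all chunks in parallel; every chunk keeps its own reject list,
    // so errors can be traced back to the file they came from.
    public static Map<String, Result> parseAll(Map<String, List<String>> chunks, int expectedColumns) {
        Map<String, Result> results = new ConcurrentHashMap<>();
        chunks.entrySet().parallelStream().forEach(entry ->
                results.put(entry.getKey(),
                        parseChunk(entry.getKey(), entry.getValue(), expectedColumns)));
        return results;
    }
}
```

Calling `ChunkParser.parseAll(chunks, 2)` with a map from file name to lines returns one `Result` per file. In an actual UDF the accepted rows would be emitted to the database and the rejects written to whatever log or table you prefer.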

wildraid