I have successfully loaded a very clean CSV file (plain English, no fancy symbols or images) into MarkLogic using MLCP (MarkLogic Content Pump), with the first row taken as the column names. However, when I try to load a file that is not clean (i.e. one that mixes several languages and encodings), the load fails.
I read in the Ingestion Guide (http://docs.marklogic.com/guide/ingestion/encoding?print=yes) that the encoding cannot be controlled with MLCP, so I decided to try the Java API and the xdmp functions in XQuery instead.
When I use the Java API, I get this error: Invalid UTF-8 escape sequence at line 1549 -- document is not UTF-8 encoded
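In case it helps to narrow the problem down, here is a minimal Python sketch for finding which lines of a file fail strict UTF-8 decoding. It is not my loading code and does not involve MarkLogic at all; the function name `find_invalid_utf8_lines` is made up for illustration, and it assumes lines are separated by `\n` bytes.

```python
from typing import List


def find_invalid_utf8_lines(data: bytes) -> List[int]:
    """Return the 1-based numbers of lines whose bytes are not valid UTF-8."""
    bad_lines = []
    # Splitting on the newline byte is safe for UTF-8 input, because 0x0A
    # never occurs inside a multi-byte UTF-8 sequence.
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")  # strict mode raises on invalid sequences
        except UnicodeDecodeError:
            bad_lines.append(number)
    return bad_lines
```

It could be run against the raw bytes of the file, for example `find_invalid_utf8_lines(open("data.csv", "rb").read())`, where `data.csv` is a placeholder name.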
If I load it with xdmp using automatic encoding, either in Query Console or in an Information Studio flow, it loads without a problem, but MarkLogic does not take the first row as column names. Instead, it takes in the entire file as one document, which is not what I am looking for.
Is there a way to load the CSV file without the encoding problem and have it take in the first row as column names?
Thanks in advance.