This question is different from what I have found on Stack Overflow because of the size of the data; it is NOT a duplicate.
We are using Cloudera.
I have seen solutions for small xlsx files with only a handful of columns in the header; in my case, the CSV file to be loaded into a new Hive table has 618 columns.
Would it be saved as Parquet by default if I upload it (saved as CSV first) through Hue -> File Browser? If not, where can I specify the file format?
What would be the best way to create an external Impala table on top of that location? It would be impractical to write the DDL/schema by hand with that many columns; below is a sketch of what I could script myself, but I am hoping there is a better way.
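For context, here is a rough sketch of what I mean by scripting the DDL from the CSV header. Everything in it is my own assumption: the file path, HDFS directory, and table name are hypothetical placeholders, every column is just typed as STRING for a first pass, the delimiter is assumed to be a comma, and I am assuming the skip.header.line.count table property can be used to skip the header row.

```python
import csv
import re

CSV_PATH = "data_618cols.csv"        # hypothetical: local copy of the CSV
HDFS_DIR = "/user/myname/618_cols"   # hypothetical: HDFS dir the file was uploaded to via Hue
TABLE_NAME = "my_618_col_table"      # hypothetical table name

def clean(raw: str) -> str:
    """Normalize a raw header cell into a legal Hive/Impala identifier."""
    name = re.sub(r"\W+", "_", raw.strip().lower()).strip("_")
    return name or "col"

with open(CSV_PATH, newline="") as f:
    header = next(csv.reader(f))     # first row holds the 618 column names

# Deduplicate cleaned names so the DDL stays valid if headers repeat.
seen, cols = set(), []
for raw in header:
    base = clean(raw)
    name, i = base, 1
    while name in seen:
        i += 1
        name = f"{base}_{i}"
    seen.add(name)
    cols.append(f"`{name}` STRING")  # default every column to STRING for now

ddl = (
    f"CREATE EXTERNAL TABLE {TABLE_NAME} (\n  "
    + ",\n  ".join(cols)
    + "\n)\nROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
    + "STORED AS TEXTFILE\n"
    + f"LOCATION '{HDFS_DIR}'\n"
    # Assumes the engine honors this property to skip the CSV header row.
    + "TBLPROPERTIES ('skip.header.line.count'='1');"
)
print(ddl)
```

If Hue's table-creation wizard or Impala itself can infer the schema so I do not have to script this at all, that would be the ideal answer.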
Thank you very much.