I have a MEAN stack application that connects to customer databases and third-party data. From the JS front end I need to be able to read Parquet files and big-data CSV files. Please confirm whether my understanding is correct:

  1. I cannot read a Parquet file directly with the Arrow libraries (because of this issue: JIRA#2786), so I have to use something like parquetjs-lite instead.
  2. To read a big-data CSV into apache-arrow, I first have to use Python (pyarrow) to convert the CSV to Arrow format (as in here) and then read the Arrow file in my JS application.
     a. If (2) is correct, can I convert any third-party CSV to Arrow, or do I need a predefined schema ahead of time?
     b. Are nulls and NaNs allowed in the CSV?

Thanks

user14013917
  • Have you found anything re. point 2? – iflp Feb 04 '21 at 09:55
  • I went a different way, so I didn't get a chance to finish what I started, but I had been following Paul's sample (Paul Taylor, who contributes to Apache Arrow) here: https://github.com/trxcllnt/csv-to-arrow-js . Hope it helps. Regards – user14013917 Feb 05 '21 at 12:45

0 Answers