
I am writing a table from Hadoop to Snowflake using the Spark Connector, but my file in Hadoop is in Apache Parquet format.

While writing to Snowflake, the connector writes in CSV format and applies the default delimiter ",", but I need to apply "^" as the delimiter. How can I do that?

df.write
  .format(SNOWFLAKE_SOURCE_NAME)
  .options(sfOptions)
  .option("dbtable", "t24122018")
  .mode(SaveMode.Overwrite)
  .save()
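
For context, here is a minimal end-to-end sketch of the flow described above, assuming the standard spark-snowflake connector. It reads the Parquet table with Spark's native Parquet reader and hands the DataFrame to the connector; the HDFS path and all connection values are placeholders, not taken from the question.

import org.apache.spark.sql.{SaveMode, SparkSession}
import net.snowflake.spark.snowflake.Utils.SNOWFLAKE_SOURCE_NAME

val spark = SparkSession.builder().appName("HadoopToSnowflake").getOrCreate()

// Connection properties for the Snowflake connector (placeholder values).
val sfOptions = Map(
  "sfURL"       -> "<account>.snowflakecomputing.com",
  "sfUser"      -> "<user>",
  "sfPassword"  -> "<password>",
  "sfDatabase"  -> "<database>",
  "sfSchema"    -> "<schema>",
  "sfWarehouse" -> "<warehouse>"
)

// Read the Parquet table from Hadoop (hypothetical path).
val df = spark.read.parquet("hdfs:///data/t24122018")

// Write the DataFrame into the Snowflake table. Any CSV files observed
// during the write are the connector's internal staging format, not the
// final storage format of the table.
df.write
  .format(SNOWFLAKE_SOURCE_NAME)
  .options(sfOptions)
  .option("dbtable", "t24122018")
  .mode(SaveMode.Overwrite)
  .save()

As a side note, if the goal were only to produce caret-delimited files yourself (for example, to stage and load them manually), Spark's built-in CSV writer accepts a custom separator: df.write.option("sep", "^").csv("hdfs:///tmp/out") (path hypothetical). The delimiter used by the connector's internal staging, by contrast, does not appear to be exposed as a df.write option.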

skomisa
sivaraj
  • Can you please elaborate on the issue you're facing and why you need this? AFAIK the data written by Snowflake to S3 is only consumed by Snowflake queries, so I'm not sure why the format matters for you. – Marcin Zukowski Dec 26 '18 at 00:49
  • I am migrating one of the tables in Hadoop to Snowflake. That table in Hadoop is in Parquet format, so I need it in Parquet format in Snowflake as well. I am writing directly into Snowflake; I am not using S3. That's the Scala code for writing the table into Snowflake directly. – sivaraj Dec 27 '18 at 06:04
  • I'm sorry, I don't understand. Snowflake never stores data internally in Parquet, and you almost always use S3 to communicate with Snowflake if you use the Snowflake connector (on AWS). Please post a more detailed description of what you're trying to achieve exactly (please use more than two sentences); otherwise your question is impossible to answer. – Marcin Zukowski Dec 29 '18 at 17:07
  • Hi, thanks for your reply. We can access the data in Snowflake from S3 or from the Snowflake database directly. My data in Hadoop is in Parquet format. I moved my table using the Spark connector into the Snowflake DB directly, and you are right that by default it is written in CSV format. I need the syntax to write it in Parquet format in Snowflake; if you help me with the syntax it will be useful. Note: S3 is one of the storage layers Snowflake uses. For this POC we are not using S3; we are writing directly into the Snowflake database. – sivaraj Dec 31 '18 at 05:24
  • Just to clarify: do you want to export data that is already stored inside Snowflake to Parquet? Sorry, this is still not clear. Please ask someone familiar with the problem to review/improve your post, as it is really not phrased in a way where others can help (as you can tell, since nobody answered). Please specify EXACTLY what you are trying to achieve: where the input data is, where you want it to be saved, etc. Also, please review https://docs.snowflake.net/manuals/sql-reference/sql/copy-into-location.html – maybe it helps. PARQUET is supported there. – Marcin Zukowski Jan 02 '19 at 05:55
  • @MarcinZukowski Sorry for the phrasing; I hope this info helps. **Objective**: move a Hadoop table that is in Parquet format to Snowflake, keeping Parquet format. **Source**: Hadoop, format: Parquet. **Target**: Snowflake, format: Parquet. **This code**: df.write.format(SNOWFLAKE_SOURCE_NAME).options(sfOptions).option("dbtable", "t24122018").mode(SaveMode.Overwrite).save() pulls the data from Hadoop and writes it into Snowflake in CSV format, but I need it in Parquet format. If you help me with the syntax for Parquet format it will be helpful. – sivaraj Jan 03 '19 at 07:21
  • Again, there is no such thing as "write into Snowflake in CSV format" – Snowflake stores neither CSV nor Parquet internally. Do you mean move it to the Snowflake `stage`? Sorry, what you write is still not clear. Again, maybe another pair of eyes on your side can make your question clearer. – Marcin Zukowski Jan 05 '19 at 22:32
  • It is very tough to write here in proper format, as comments don't support normal writing. Answering your question: yes, it is moving to the Snowflake stage. I think it is stored in CSV format by default, because after moving the table I checked its file format and it shows CSV with the delimiter ",". Will you be able to give me the syntax for Parquet file storage? – sivaraj Jan 07 '19 at 02:34
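
Following up on the COPY INTO <location> documentation linked in the comments: once the data is inside a Snowflake table, it can be unloaded to a stage as Parquet with a COPY INTO <location> statement. A minimal sketch using the connector's Utils.runQuery helper, where the stage name, path, and table name are placeholders:

import net.snowflake.spark.snowflake.Utils

// Unload the Snowflake table to an internal stage as Parquet files.
// sfOptions is the same connection map used for the write above.
Utils.runQuery(sfOptions,
  """COPY INTO @my_unload_stage/t24122018/
    |FROM t24122018
    |FILE_FORMAT = (TYPE = PARQUET)""".stripMargin)

This covers only the unload direction; as noted in the comments, the table itself is always stored in Snowflake's internal format regardless of how it was loaded.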

0 Answers