I am struggling to write a pyarrow table as a Parquet file to an ADLS Gen2 storage container. I'm working in a notebook in Azure Synapse Analytics.

Here is what I am able to do:

  1. Mount the ADLS Gen2 account to access files. Spark uses its own path syntax for this, e.g.:
df = spark.read.load(
    "synfs:/" + jobId + "/mnt/bronze/workday" + varFilepath,
    format='csv',
    header=True,
)
print(type(df))
df.show()

This works fine. I then convert it to a pandas DataFrame to do some manipulation. Now I want to write the result as a Parquet file.

import pyarrow as pa
import pyarrow.parquet as pq

df_csv = df.toPandas()
pq_tbl = pa.Table.from_pandas(df_csv)
print(type(pq_tbl))
# This is the call that fails
pq.write_table(pq_tbl, "workday/example.parquet", filesystem="synfs:/" + jobId + "/mnt/bronze")

I get an error: Unrecognized filesystem type in URI: synfs:/7/mnt/bronze
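One direction that may be worth trying: as far as I understand, pyarrow does not recognize the `synfs` URI scheme, but a Synapse mount may also be reachable through ordinary local file APIs under `/synfs/<jobId>/<mount point>`. The sketch below assumes that layout holds for this mount and that the `workday` directory already exists. The helper names `synfs_local_path` and `write_parquet_to_mount` are hypothetical, made up for illustration.

```python
import posixpath


def synfs_local_path(job_id, mount_point, relative_path):
    """Build the local-filesystem form of a Synapse mount path.

    Assumption: the mount is exposed to local file APIs under
    /synfs/<jobId>/<mount point>, so no URI scheme is involved.
    """
    return posixpath.join(
        "/synfs",
        str(job_id),
        mount_point.strip("/"),
        relative_path.lstrip("/"),
    )


def write_parquet_to_mount(table, job_id, mount_point, relative_path):
    """Write a pyarrow Table via the local path instead of filesystem=."""
    # Imported here so the path helper above works without pyarrow installed.
    import pyarrow.parquet as pq

    target = synfs_local_path(job_id, mount_point, relative_path)
    # A plain path with no scheme makes pyarrow use its local filesystem.
    pq.write_table(table, target)
    return target


print(synfs_local_path(7, "/mnt/bronze", "workday/example.parquet"))
```

In the notebook this would be called as `write_parquet_to_mount(pq_tbl, jobId, "/mnt/bronze", "workday/example.parquet")`. I have not confirmed that this works for every mount type.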

Rakesh Govindula