Need help to write a pyarrow table as parquet file in ADLS Gen2 account

Asked Jan 21 '23 at 07:23

Active Feb 01 '23 at 03:44

Viewed 136 times

I am struggling to write a pyarrow table as a Parquet file to an ADLS Gen2 storage container. I am working in a notebook in Azure Synapse Analytics.

Here is what I am able to do:

I can mount the ADLS Gen2 account and access its files. Spark uses its own path syntax for the mount, e.g.:

df = spark.read.load(
    "synfs:/" + jobId + "/mnt/bronze/workday" + varFilepath,
    format="csv",
    header=True,
)
print(type(df))
df.show()

This works fine. I then convert it to a pandas DataFrame to do some manipulation. Now I want to write the result as a Parquet file.

import pyarrow as pa
import pyarrow.parquet as pq

df_csv = df.toPandas()
pq_tbl = pa.Table.from_pandas(df_csv)
print(type(pq_tbl))
# This is the call that fails
pq.write_table(pq_tbl, "workday/example.parquet", filesystem="synfs:/" + jobId + "/mnt/bronze")

I get an error: Unrecognized filesystem type in URI: synfs:/7/mnt/bronze
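
For context, pyarrow treats a string passed as `filesystem` as a URI, and `synfs` is a Spark-side scheme in Synapse that pyarrow does not know. One direction that might be worth trying is to hand pyarrow a plain local path instead. The sketch below assumes that the mount is also visible through the local file system under `/synfs/<jobId>/...` and that it is writable from there; neither is confirmed in this question. The helper names `synfs_uri_to_local_path` and `write_parquet_to_mount` are hypothetical, made up for illustration.

```python
from pathlib import PurePosixPath


def synfs_uri_to_local_path(uri: str) -> str:
    """Turn 'synfs:/<jobId>/mnt/...' into '/synfs/<jobId>/mnt/...'.

    ASSUMPTION: the mount point is exposed locally under /synfs.
    """
    scheme, sep, rest = uri.partition(":")
    if scheme != "synfs" or not sep:
        raise ValueError(f"not a synfs URI: {uri!r}")
    return str(PurePosixPath("/synfs") / rest.lstrip("/"))


def write_parquet_to_mount(table, mount_uri: str, relative_path: str) -> str:
    """Write a pyarrow Table below the mount point and return the path used."""
    import pyarrow.parquet as pq  # imported lazily; requires pyarrow

    target = PurePosixPath(synfs_uri_to_local_path(mount_uri)) / relative_path
    # A plain path string (no URI scheme) makes pyarrow use the local file system.
    # ASSUMPTION: the parent directory already exists and is writable.
    pq.write_table(table, str(target))
    return str(target)
```

With the variables from the question, the call would look like `write_parquet_to_mount(pq_tbl, "synfs:/" + jobId + "/mnt/bronze", "workday/example.parquet")`. If the local mount turns out to be read-only, this approach will not work as written.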

edited Feb 01 '23 at 03:44

Rakesh Govindula


asked Jan 21 '23 at 07:23

Swati Vishwanathan

0 Answers