Azure Apache Synapse Spark Pool Display Function Issue

Question

When I run the display function in Databricks, I get the option to download the results as CSV. I am trying to do the same with Azure Synapse Spark pools, but I am getting the following error:

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<ipython-input-11-2d01e14> in <module>
      1 df = spark.sql("SELECT * FROM `*********")
----> 2 display(df)

~/cluster-env/env/lib/python3.8/site-packages/notebookutils/visualization/display.py in display(data, summary)
    197         log4jLogger \
    198             .error(f"display failed with error, language: python, error: {err}")
--> 199         raise err
    200 
    201     log4jLogger \

This should be very simple.

Any thoughts?

Hi, community, I just need to know how to display the results of 'df'. In Databricks it's simply display(df). Can someone let me know what the equivalent is in Azure Synapse Apache Spark please? — Patterson, Nov 07 '22 at 21:08
`display` also exists in azure synapse notebooks as well. You can refer to [this image](https://i.imgur.com/ZWAqDsD.png). Is the final requirement to download the resultant data? — Saideep Arikontham, Nov 08 '22 at 04:08
Hi Saideep, yes the final result is to download the resultant data. — Patterson, Nov 08 '22 at 08:43
Are you able to perform write operations to your ADLS account using Spark notebooks? — Saideep Arikontham, Nov 08 '22 at 08:59
Hi Saideep, the data is located in our Lake Database, (see updated image in question) and I simply want to display the data using display(df), but I'm getting the error shown — Patterson, Nov 08 '22 at 09:40
I have tried reading from lake database and it works fine. Can you retry with another spark pool? — Saideep Arikontham, Nov 08 '22 at 10:25
Yes, reading from the lake database worked for me. Maybe changing the Spark pool configuration will help? If it doesn't, one possible alternative is to write the data to a storage account and download it from there. Since there is already an ADLS account associated with Azure Synapse, you can write the dataframe to that ADLS account and then download it. — Saideep Arikontham, Nov 08 '22 at 17:08

score 1 · Accepted Answer · answered Dec 02 '22 at 04:53

I tried using display on a dataframe whose data is read from a lake database, and it worked as expected.


Create another Spark pool and try the display() operation again. If this does not work, as an alternative, write the data to your data lake storage (the ADLS account associated with the Synapse workspace) and download it from there.

df.write.option("header",True).csv("abfss://<container>@<storage>.dfs.core.windows.net/output")
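If the result is small enough to fit in a single file, a variation on the line above may be easier to download. This is only a sketch: the helper names `abfss_path` and `write_single_csv` are hypothetical, and `<container>` and `<storage>` remain placeholders for your own values. It assumes `df` is the dataframe from the question. Spark still writes a folder containing one `part-*.csv` file rather than a bare file, and `coalesce(1)` funnels all rows through one task, so it is not suitable for large results.

```python
def abfss_path(container, account, folder):
    """Build an abfss:// URI for a folder in an ADLS Gen2 account (hypothetical helper)."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{folder.strip('/')}"


def write_single_csv(df, path):
    """Write df as a single CSV part file with a header row, replacing any earlier output."""
    (df.coalesce(1)            # one partition, so one part file in the output folder
       .write
       .mode("overwrite")      # overwrite the folder if it already exists
       .option("header", True)
       .csv(path))


# Usage in a Synapse notebook, where `df` already exists:
# write_single_csv(df, abfss_path("<container>", "<storage>", "output"))
```

The part file can then be downloaded from the output folder in the storage account.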

