When I run the display function in Databricks, I get the option to download the results as CSV. I am trying to do the same with Azure Synapse Spark pools, but I am getting this error:

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<ipython-input-11-2d01e14> in <module>
      1 df = spark.sql("SELECT * FROM `*********")
----> 2 display(df)

~/cluster-env/env/lib/python3.8/site-packages/notebookutils/visualization/display.py in display(data, summary)
    197         log4jLogger \
    198             .error(f"display failed with error, language: python, error: {err}")
--> 199         raise err
    200 
    201     log4jLogger \

This should be very simple.

enter image description here

Any thoughts?

Patterson
  • Hi, community, I just need to know how to display the results in 'df'. In Databricks it's simply display(df). Can someone let me know what the equivalent is in Azure Synapse Apache Spark, please? – Patterson Nov 07 '22 at 21:08
  • `display` is also available in Azure Synapse notebooks. You can refer to [this image](https://i.imgur.com/ZWAqDsD.png). Is the final requirement to download the resulting data? – Saideep Arikontham Nov 08 '22 at 04:08
  • Hi Saideep, yes the final result is to download the resultant data. – Patterson Nov 08 '22 at 08:43
  • Are you able to perform write operations to your ADLS account from Spark notebooks? – Saideep Arikontham Nov 08 '22 at 08:59
  • Hi Saideep, the data is located in our Lake Database (see updated image in question), and I simply want to display the data using display(df), but I'm getting the error shown. – Patterson Nov 08 '22 at 09:40
  • I have tried reading from a lake database and it works fine. Can you retry with another Spark pool? – Saideep Arikontham Nov 08 '22 at 10:25
  • Hi, did you read from the lake database? – Patterson Nov 08 '22 at 16:52
  • Yes, reading from the lake database worked for me. Maybe changing the Spark pool configuration will help? If it doesn't, one possible alternative is to write the data to a storage account and download it from there. Since there is already an ADLS account associated with Azure Synapse, you can write the dataframe to that ADLS account and then download it. – Saideep Arikontham Nov 08 '22 at 17:08

1 Answer

  • I tried using display on a dataframe whose data is read from a lake database, and it produced the expected output.


  • Create another Spark pool and try the display() operation again. If this does not work, as an alternative, write the data to your data lake storage (associated with the Synapse workspace) and download it from there.
df.write.option("header",True).csv("abfss://<container>@<storage>.dfs.core.windows.net/output")
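
If a single downloadable file is preferred, the write above can be wrapped as in the sketch below. This is only an illustration: the helper names `abfss_path` and `write_single_csv` and the example container, account, and folder values are hypothetical, and `coalesce(1)` is an addition that assumes the result is small enough to fit in one partition.

```python
def abfss_path(container, storage_account, folder):
    """Build an ADLS Gen2 URI in the same form as the path used above.

    All three arguments are placeholders for your own values.
    """
    return (
        f"abfss://{container}@{storage_account}"
        f".dfs.core.windows.net/{folder.strip('/')}"
    )


def write_single_csv(df, path):
    """Write a Spark DataFrame as CSV with a header row.

    coalesce(1) reduces the data to one partition so that Spark emits a
    single part-*.csv file; this is only sensible for small results.
    mode("overwrite") replaces any previous output at the same path.
    """
    (
        df.coalesce(1)
        .write
        .mode("overwrite")
        .option("header", True)
        .csv(path)
    )


# Hypothetical values, for illustration only.
output_path = abfss_path("mycontainer", "mystorageaccount", "output")
print(output_path)

# In a Synapse notebook, where `df` is the dataframe from the question:
# write_single_csv(df, output_path)
```

Spark writes a folder at the given path rather than a single named file; the CSV to download is the `part-*.csv` file inside that folder.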


Saideep Arikontham