Can someone help me understand how to get pandas-profiling working with a dataframe?

Using this post (Unable to run Pandas Profiling on Databricks), I was able to replicate the output using a dictionary, but when using a dataframe I get the following errors:

[Screenshot of the error output]

I have installed all the libraries without errors, and I can view the dataframe with no issues. Is this something to do with the storage location? I have read/write access to this location.

teelove

1 Answer

You can't run pandas-profiling directly on a Spark DataFrame; you need to convert it to a pandas DataFrame first using the .toPandas() function (doc), like this:

from pandas_profiling import ProfileReport
profile = ProfileReport(df.toPandas(), title='EDA Report', explorative=True)
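
Since .toPandas() collects every row onto the driver, a large Spark DataFrame can run out of memory during the conversion. The following is only a rough sketch of one way to cap the row count before profiling; the helper names `spark_to_pandas` and `build_profile` and the `max_rows` parameter are made up for illustration and are not part of pandas-profiling or PySpark.

```python
def spark_to_pandas(spark_df, max_rows=None):
    """Convert a Spark DataFrame to pandas, optionally capping the row count.

    Hypothetical helper: relies only on the Spark DataFrame methods
    .limit() and .toPandas().
    """
    if max_rows is not None:
        # limit() keeps at most max_rows rows, so less data reaches the driver
        spark_df = spark_df.limit(max_rows)
    return spark_df.toPandas()


def build_profile(spark_df, title="EDA Report", max_rows=None):
    """Build a ProfileReport from a Spark DataFrame (hypothetical helper)."""
    # Imported here so spark_to_pandas works without pandas-profiling installed
    from pandas_profiling import ProfileReport

    pandas_df = spark_to_pandas(spark_df, max_rows=max_rows)
    return ProfileReport(pandas_df, title=title, explorative=True)
```

With these helpers, `profile = build_profile(df, max_rows=100000)` would profile at most the first 100,000 rows returned by Spark. Note that limit() does not take a random sample, so the report may not be representative of the full table.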
Alex Ott