Databricks can create a data profiling report after using display(dataframe_name).
I have created a data profiling report using Azure Databricks, but I do not know how to export it.
Can you please suggest how to export or download this report to my local system?

venus
-
Do you want to be able to use the data profile's information somewhere else? – Saideep Arikontham Nov 25 '22 at 06:44
-
I want to create a report from whatever is available in the data profile. As I have multiple dataframes, I will collate all the information. – venus Nov 25 '22 at 06:48
-
There is no option to download this data profile to a local machine directly; the only option is to add it to a dashboard. You can export the notebook as an HTML file if required. – Saideep Arikontham Nov 25 '22 at 07:15
-
but I want to download the report in tabular format only :( – venus Nov 25 '22 at 07:16
-
As far as I know, there might not be an option present to download the report in tabular format. These data profiles might be created as a tool to visualize/analyze data within the databricks workspace. – Saideep Arikontham Nov 25 '22 at 07:34
-
Thanks a lot for the help and the efforts. I really appreciate it – venus Nov 25 '22 at 08:11
-
Will post it as an answer so it might help other community members. I will update the answer if I find any other alternative to download the data profile in tabular format. – Saideep Arikontham Nov 25 '22 at 08:19
1 Answer
There is no direct option to download the data profiling report from Azure Databricks to a local machine in a tabular format.
Data profiling itself is a new feature that was introduced to reduce the manual work needed to summarize the statistics of our dataframes.
As specified in this official Microsoft documentation, we can only add the data profile to our dashboard.
There are also no other APIs that can be used to download this data in tabular format.
As a possible workaround, you could compute the required attributes manually using the pandas or pandas-on-Spark API.
In general, some of these stats can be obtained directly using df.describe() as shown below. Here, df is a PySpark dataframe:
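The manual workaround could look roughly like the sketch below. It is not how Databricks builds its data profile, only an approximation in plain pandas. The function name profile_table, the sample data, and the chosen statistics are illustrative assumptions. On Databricks you would likely start from a pandas copy of your Spark dataframe (for example df.toPandas(), which is only reasonable for data that fits in driver memory).

```python
import io

import pandas as pd


def profile_table(pdf: pd.DataFrame) -> pd.DataFrame:
    """Build a one-row-per-column summary table (hypothetical helper)."""
    rows = []
    for name in pdf.columns:
        col = pdf[name]
        row = {
            "column": name,
            "dtype": str(col.dtype),
            "count": int(col.count()),        # non-null values
            "missing": int(col.isna().sum()),  # null / NaN values
            "distinct": int(col.nunique()),    # distinct non-null values
        }
        # Numeric statistics only make sense for numeric columns;
        # other columns get NaN in these fields.
        if pd.api.types.is_numeric_dtype(col):
            row.update(mean=col.mean(), std=col.std(), min=col.min(), max=col.max())
        rows.append(row)
    return pd.DataFrame(rows)


# Made-up sample data standing in for df.toPandas()
sample = pd.DataFrame({"id": [1, 2, 3, 4], "city": ["Pune", None, "Delhi", "Pune"]})
report = profile_table(sample)

# Serialize the report as CSV text; report.to_csv("profile.csv", index=False)
# would write it to a file instead.
buffer = io.StringIO()
report.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

If only the basic statistics are needed, df.describe().toPandas() gives a pandas dataframe that can be saved with to_csv in the same way. To collate several dataframes, one option is to add a column naming the source dataframe to each report and combine them with pd.concat.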

Saideep Arikontham