
I'm calling another notebook that reads some data from files, and I want to pass the variables with the read data back to the calling notebook. Is this possible?

user626528

1 Answer

Currently, getting variables back from the called notebook is not supported in Synapse.
We can call one notebook from another using the command below.

mssparkutils.notebook.run("notebook path", <timeoutSeconds>, <parameters>)

To exit from the called notebook, we can use this command, which works like a return in a normal function.

mssparkutils.notebook.exit("value string")

This function can only return a string value to the calling notebook.
So, to get the file data from the called notebook, we can use a workaround: temporary views.
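As a side note, because exit() only passes a string, small scalar results (row counts, status flags) can also be round-tripped by serializing them to JSON. A minimal sketch; the mssparkutils calls are shown as comments since they only exist inside Synapse:

```python
import json

# Callee side: serialize a small summary into the exit string
summary = {"rows_read": 3, "status": "ok"}
payload = json.dumps(summary)
# mssparkutils.notebook.exit(payload)

# Caller side: parse the string returned by mssparkutils.notebook.run(...)
result = json.loads(payload)
print(result["rows_read"])   # -> 3
```

This only works for small, serializable values; for whole dataframes the temporary-view approach described here is still needed.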

Temporary views are scoped to the Spark session. When we run the callee notebook, it executes in the Spark session of the calling notebook, so the caller can access the temporary views created in the callee, because they live in the same session.

Here Notebook1 is the caller and sample2 is the callee.

Code in Notebook1:

from notebookutils import mssparkutils
# Run the callee notebook; it returns the name of the temp view it created
returned_dfview = mssparkutils.notebook.run("/sample2")
# Query that temp view from the caller's Spark session
df2 = spark.sql("select * from {0}".format(returned_dfview))
df2.show()

You can see that I am able to show the sample data that was read in the callee, accessed through the temporary view.


When you click on the View notebook run: sample2, you can see the executed code and output of the callee notebook.

sample2 code:

# read your files as dataframes, e.g. df = spark.read.csv(...)
df.createOrReplaceTempView("dftable")   # expose the dataframe as a temp view
mssparkutils.notebook.exit("dftable")   # return the view's name to the caller

We get the dataframe back by returning the name of the temporary view in the exit() call.


If you want to get the data of multiple files, then create temporary views for all of them and pass their names as an array. exit() will return the array as a string to the caller, for example:

"['view1','view2','view3']"
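For illustration, the callee can build that string simply by calling str() on the list of view names (view1, view2, view3 are placeholder names, not from the original notebooks):

```python
# Callee side sketch: after creating the temp views, return their names as one string
view_names = ["view1", "view2", "view3"]   # placeholder view names
payload = str(view_names)                  # "['view1', 'view2', 'view3']"
# mssparkutils.notebook.exit(payload)
print(payload)
```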

You can use the code below to get the dataframes from that string. Here ast.literal_eval parses the string into a list; it is a safer alternative to exec.

import ast

tempview_list = ast.literal_eval(source_views)   # parse the string into a list of view names
dfs = {}
for i in tempview_list:
    df_name = "{0}_df".format(i)
    dfs[df_name] = spark.sql("select * from {0}".format(i))

Here source_views is the returned string of view names. The code above converts the string into a list of views and stores the corresponding dataframes in a dictionary, with keys like view1_df and view2_df.


If you don’t want to do it with temporary views, as they are only scoped to their Spark session, you can try global temporary views in Synapse, as described in this Microsoft documentation.

Txt files can be passed as strings, and for JSON files you can try the code in the documentation above in Synapse.

NOTE: Make sure you publish the callee notebook before executing the code in the caller notebook.

Rakesh Govindula