Returning results from a Python script to a variable in a Jupyter notebook

Question

I have a Python script that returns a pandas DataFrame. I want to run the script in a Jupyter notebook and save the result to a variable.

The data are in a file called data.csv. A shortened version of dataframe.py, the script whose result I want to access in my Jupyter notebook, is:

# dataframe.py
import pandas as pd
import sys

def return_dataframe(file):
    df = pd.read_csv(file)
    return df

if __name__ == '__main__':
    return_dataframe(sys.argv[1])

I tried running:

data = !python dataframe.py data.csv

in my Jupyter notebook, but data does not contain the DataFrame that dataframe.py is supposed to return.

mechanical_meat · Accepted Answer · 2020-04-29T03:26:46.253


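The `!` shell capture only collects what the script writes to standard output, and the original script returns the DataFrame without printing anything. Writing the DataFrame to stdout as CSV fixes that. This is how I did it: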

# dataframe.py 
import pandas as pd
import sys

def return_dataframe(f):  # avoid the name `file`, which was a built-in in Python 2
    df = pd.read_csv(f)
    return df

if __name__ == '__main__':
    return_dataframe(sys.argv[1]).to_csv(sys.stdout, index=False)

Then, in the notebook, convert the resulting IPython.utils.text.SList into a DataFrame, as shown in the comments on the question "Convert SList to Dataframe":

import pandas as pd
data = !python3 dataframe.py data.csv
df = pd.DataFrame(data=data)[0].str.split(',', expand=True)
# Note: row 0 holds the CSV header line, and every value is a string.
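
A possible alternative, sketched here under assumptions rather than taken from the linked question: the same stdout capture can be done with subprocess instead of the `!` syntax, and handing the captured text to pd.read_csv keeps the header line as column names and lets pandas infer the dtypes again. The helper name run_script_to_dataframe is hypothetical.

```python
import io
import subprocess
import sys

import pandas as pd


def run_script_to_dataframe(script, *args):
    """Run `script` in a child Python process and parse its stdout as CSV."""
    result = subprocess.run(
        [sys.executable, script, *args],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script exits non-zero
    )
    # read_csv treats the first line as the header and infers column dtypes.
    return pd.read_csv(io.StringIO(result.stdout))


# Usage in a notebook cell (file names taken from the question):
# df = run_script_to_dataframe("dataframe.py", "data.csv")
```

If you prefer to keep the `!` capture, pd.read_csv(io.StringIO(data.n)) should give the same result, since the SList exposes the captured lines joined by newlines as data.n.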

If the DataFrame is going to be written out as CSV anyway, you could simply read the file directly in the notebook:

df = pd.read_csv('data.csv')
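
Because dataframe.py guards its command-line entry point with if __name__ == '__main__', another option may be to skip the subprocess and call return_dataframe inside the notebook's own process, which hands back the DataFrame object itself. When dataframe.py sits in the notebook's working directory, a plain `from dataframe import return_dataframe` should be enough. The sketch below loads the file by path instead; load_module_from_path is a hypothetical helper name.

```python
import importlib.util


def load_module_from_path(path, module_name="dataframe"):
    """Import the Python file at `path` as a module and return it."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    # Runs the file's top-level code. __name__ is set to module_name, so the
    # `if __name__ == '__main__':` block is skipped.
    spec.loader.exec_module(module)
    return module


# Usage in a notebook cell (file names taken from the question):
# df = load_module_from_path("dataframe.py").return_dataframe("data.csv")
```

No text round trip is involved this way, so the column dtypes stay exactly as the function produced them.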

edited Apr 29 '20 at 03:26

answered Apr 29 '20 at 02:45


The real script is a rather long data wrangling and cleaning process; I just left in the bottom line, which returns the DataFrame. – tshwizz Apr 29 '20 at 03:08
Makes sense. I hope the answer helps your process. – mechanical_meat Apr 29 '20 at 03:10
Any advice on how to keep the original dtypes of the different columns? – tshwizz Apr 29 '20 at 04:50
Instead of saving and reading from CSV you can use pickle: https://stackoverflow.com/a/51177054/42346 which will preserve dtypes. – mechanical_meat Apr 29 '20 at 04:55
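
A minimal sketch of the pickle suggestion, assuming the script and the notebook can share a file path. The sample columns and the file name result.pkl are made up for illustration. In practice the script would call to_pickle on its cleaned DataFrame and the notebook would call pd.read_pickle on the same path.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the DataFrame the script produces.
df = pd.DataFrame({
    "when": pd.to_datetime(["2020-04-28", "2020-04-29"]),
    "kind": pd.Categorical(["a", "b"]),
    "count": [1, 2],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "result.pkl")  # hypothetical file name
    df.to_pickle(path)               # what the script would do
    restored = pd.read_pickle(path)  # what the notebook would do

# Unlike a CSV round trip, datetime and categorical dtypes survive.
print(restored.dtypes)
```

Only unpickle files that come from a source you trust, since loading a pickle can execute arbitrary code.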
