As @matszwecja pointed out in the comments, the most reasonable way is to collect the dataframes as you make them. That will also be clearest to you and others later, and it is more robust and easier to debug as you develop the code.
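To make "collect them as you make them" concrete, here is a minimal sketch. The registry name `frames`, the small sample values, and the temporary output directory are assumptions for illustration only, not something from the question:

```python
import os
import tempfile

import pandas as pd

# Hypothetical registry: dataframe name -> dataframe, filled in as each one is created.
frames = {}

frames["df_name_one"] = pd.DataFrame(
    {"River_Level": [0.876, 0.877], "Rainfall": [0.0, 0.8]}
)
frames["df_name_two"] = pd.DataFrame(
    {"River_Level": [0.976, 0.977], "Rainfall": [0.1, 0.5]}
)

# Pickle only what was deliberately registered.
# A temporary directory is used here just to keep the demo tidy.
out_dir = tempfile.mkdtemp()
for name, df in frames.items():
    df.to_pickle(os.path.join(out_dir, f"{name}.pkl"))

saved_files = sorted(os.listdir(out_dir))
print(saved_files)
```

Because nothing is saved unless it was explicitly added to the dictionary, stray dataframes left over in the kernel never end up on disk.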
However, you seemed to be thinking more abstractly of iterating over the dataframes in the kernel's namespace. It is possible to do that and step through pickling the dataframes automatically; it's just not that easy. For example, you already found that you cannot simply make a usable list with df_list = %who DataFrame. (It shows the names in the output cell, but not in a way Python can use.)
Here's an option that would work if you really did want to do it. This first part sets up some dummy dataframes and then makes a list of dictionaries of them:
import pandas as pd
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO
input ='''
River_Level Rainfall
0.876 0.0
0.877 0.8
0.882 0.0
0.816 0.0
0.826 0.0
0.836 0.0
0.817 0.8
0.812 0.0
0.816 0.0
0.826 0.0
0.836 0.0
0.807 0.8
0.802 0.0
'''
df_name_one = pd.read_table(StringIO(input), header=0, index_col=None, sep=r'\s+')
input ='''
River_Level Rainfall
0.976 0.1
0.977 0.5
0.982 0.0
0.916 0.3
0.926 0.0
0.996 9.0
0.917 0.8
0.912 0.0
0.916 0.0
0.926 0.1
0.836 0.0
0.907 0.6
0.902 0.0
'''
df_name_two = pd.read_table(StringIO(input), header=0, index_col=None, sep=r'\s+')
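list_of_dfs_dicts = []
for obj_name in dir():
    # Look up each name in the namespace and get its type as a string
    obj_type_str = str(type(eval(obj_name)))
    #print(obj_type_str)
    if "DataFrame" in obj_type_str:
        #print(obj_name)
        #print(obj_type_str)
        # Store a one-entry dictionary: variable name -> dataframe
        list_of_dfs_dicts.append({obj_name: eval(obj_name)})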
Now each entry in the list is a one-item dictionary that maps the name of a dataframe object to the dataframe itself. The list can be iterated over and everything pickled via a single line in a notebook:
[df.to_pickle(f'{varname}.pkl') for d in list_of_dfs_dicts for varname,df in d.items()];
That actually equates to this, which is easier to read:
for d in list_of_dfs_dicts:
    for varname,df in d.items():
        df.to_pickle(f'{varname}.pkl')
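If you want to confirm that a pickled dataframe comes back intact, a round trip through pd.read_pickle is a quick check. This is only a sketch: the three sample rows and the temporary directory are assumptions chosen to keep the example self-contained.

```python
import os
import tempfile

import pandas as pd

# Small stand-in for one of the dataframes above.
df_name_one = pd.DataFrame(
    {"River_Level": [0.876, 0.877, 0.882], "Rainfall": [0.0, 0.8, 0.0]}
)

# Write the pickle into a temporary directory, then read it back.
out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, "df_name_one.pkl")
df_name_one.to_pickle(path)

restored = pd.read_pickle(path)

# DataFrame.equals compares shape, values, and dtypes.
round_trip_ok = restored.equals(df_name_one)
print(round_trip_ok)
```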
For this self-contained answer, I stored each entire dataframe in the collected list of dictionaries. Memory wasn't a concern here with these small dataframes, and I wanted to illustrate things in small steps.
However, memory was a concern of yours. You can just vary the collection step to not add the entire dataframe to the list, like so:
import pandas as pd
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO
input ='''
River_Level Rainfall
0.876 0.0
0.877 0.8
0.882 0.0
0.816 0.0
0.826 0.0
0.836 0.0
0.817 0.8
0.812 0.0
0.816 0.0
0.826 0.0
0.836 0.0
0.807 0.8
0.802 0.0
'''
df_name_one = pd.read_table(StringIO(input), header=0, index_col=None, sep=r'\s+')
input ='''
River_Level Rainfall
0.976 0.1
0.977 0.5
0.982 0.0
0.916 0.3
0.926 0.0
0.996 9.0
0.917 0.8
0.912 0.0
0.916 0.0
0.926 0.1
0.836 0.0
0.907 0.6
0.902 0.0
'''
df_name_two = pd.read_table(StringIO(input), header=0, index_col=None, sep=r'\s+')
df_list = []
for obj_name in dir():
    obj_type_str = str(type(eval(obj_name)))
    if "DataFrame" in obj_type_str:
        df_list.append(obj_name)
for df_name in df_list:
    eval(df_name).to_pickle(f'{df_name}.pkl')
Bear in mind, though, that eval() is something to be careful with. In particular, it opens the gate to code injection.
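One way to sidestep eval() entirely is to look at a namespace dictionary directly and test each object with isinstance. The sketch below is an illustration under assumptions: the helper name `find_dataframes` and the demo namespace are hypothetical, and in a notebook you would presumably pass `globals()` instead. Names starting with an underscore are skipped because IPython keeps recent cell outputs under names such as `_` and `_12`, which could otherwise pick up the same dataframe twice.

```python
import pandas as pd


def find_dataframes(namespace):
    """Return {name: dataframe} for every DataFrame bound in `namespace`.

    Hypothetical helper: no eval() is involved, only a dictionary scan.
    """
    return {
        name: obj
        # list(...) takes a snapshot so the scan is safe even if the
        # namespace is modified while iterating (as globals() can be).
        for name, obj in list(namespace.items())
        if isinstance(obj, pd.DataFrame) and not name.startswith("_")
    }


# Demo namespace standing in for the kernel's globals().
sample = pd.DataFrame({"River_Level": [0.876, 0.877], "Rainfall": [0.0, 0.8]})
demo_namespace = {
    "df_name_one": sample,
    "not_a_frame": [1, 2, 3],
    "_": sample,  # mimics IPython's "last output" variable
}

found = find_dataframes(demo_namespace)
print(list(found))
```

Only names are collected alongside references to the existing objects, so no dataframe is copied in the process.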
And by doing it this way, you aren't checking what gets saved. For example, while developing you could erroneously make a lot of dataframes at some point (example), and if those were still in your kernel's namespace, they'd ALL get pickled by the pickling step. That's why collecting what you want as you go along is more practical and more robust in the long run. I just thought your idea of using df_list = %who DataFrame was intriguing.