I have a few dataframes stored inside a dict called my_dict. The keys of the dict are also stored inside a list called filter_list:
filter_list = ["A", "B", "C", ...]
my_dict["A"] gives me the following result:

         links  A
0    Q11@8.jpg  1
1   Q11@11.jpg  1
2  Q11@4.2.jpg  1
3  Q11@4.3.jpg  1
my_dict["B"] gives me the following result:

        links  B
0   Q11@8.jpg  1
1  A11@21.jpg  1
2  Q11@42.jpg  1
3   C11@4.jpg  1
and so on...
Now I want to merge all the dataframes into one. I use outer-join logic because the final dataframe should contain every link that appears in any of the dataframes, keyed on the "links" column.
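To illustrate the intended result, here is a minimal sketch of the outer join on two toy frames (the values are made up, mimicking the samples above):

```python
import pandas as pd

# Two small frames sharing a "links" column (hypothetical values)
df_a = pd.DataFrame({"links": ["Q11@8.jpg", "Q11@11.jpg"], "A": [1, 1]})
df_b = pd.DataFrame({"links": ["Q11@8.jpg", "A11@21.jpg"], "B": [1, 1]})

# Outer join keeps every link that appears in either frame;
# counts missing on one side become NaN (later filled with 0)
merged = pd.merge(df_a, df_b, on="links", how="outer")
print(merged)
```

Links present in only one frame get NaN in the other frame's column, which is why I fill with 0 afterwards.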
As such, I merge them iteratively in a loop, but I keep getting

MemoryError:

with no further info. To release RAM during the loop I save the intermediate result to a pickle file and delete it from memory, but this doesn't seem to help either; I still get the same error.
This is the code I am using:
import pandas as pd
from tqdm import tqdm

for index in tqdm(range(2, len(filter_list))):
    try:
        # reload the running result saved at the end of the previous iteration
        result = pd.read_pickle("result.pkl")
    except FileNotFoundError:
        pass
    if index == 2:
        # first iteration: merge the first two dataframes
        result = pd.merge(my_dict[filter_list[0]], my_dict[filter_list[1]],
                          on="links", how="outer")
    # merge the next dataframe into the running result
    result = pd.merge(result, my_dict[filter_list[index]], on="links", how="outer")
    result.fillna(0, inplace=True)
    result[result.columns[1:]] = result[result.columns[1:]].astype(int)
    # save the intermediate result and free the RAM it occupies
    result.to_pickle("result.pkl")
    del result