Pandas pd.melt throwing a memory error when unpivoting a 3.5 GB CSV, even with 500 GB of RAM. Is there any solution/function available to unpivot gigantic CSV files? The current CSV has more than 5,000 columns.
I haven't tried this myself, so it's only a suggestion, but have you tried dividing the dataframe by rows to make it smaller and then concatenating at the end? You could do this easily in a loop. – run-out Apr 26 '19 at 05:07
@run-out I was struggling with the same issue and tried your suggestion: iterating over the rows in chunks and concatenating at the end indeed made the melt a lot faster. I will provide my solution below. – ldoe Aug 30 '19 at 13:04
1 Answer
I was struggling with the same issue and stumbled on your topic. Here is my implementation of @run-out's suggestion (melting the dataframe chunk by chunk and concatenating the results):
import pandas as pd

pivot_list = list()
chunk_size = 100000

# Melt the dataframe one chunk of rows at a time instead of all at once.
for i in range(0, len(df_final), chunk_size):
    row_pivot = df_final.iloc[i:i + chunk_size].melt(id_vars=new_vars, value_vars=new_values)
    pivot_list.append(row_pivot)

# Combine the melted chunks into the final long-format dataframe.
df = pd.concat(pivot_list)
Very simple, but this indeed made the melt a lot faster.
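If the wide dataframe itself barely fits in memory alongside the melted copy, the same idea can be applied while streaming the CSV from disk. The sketch below is only an assumption-based variant, not part of the original answer: the file names wide.csv and long.csv, the id column list id_cols, and the chunk size are placeholders you would replace with your own. Each chunk is read, melted, and appended to an output CSV, so the full long-format result never has to sit in RAM at once.

import pandas as pd

input_path = "wide.csv"    # placeholder: the large wide-format CSV
output_path = "long.csv"   # placeholder: destination for the melted rows
id_cols = ["id"]           # placeholder: columns to keep as identifiers

first_chunk = True
# Read 100,000 rows at a time instead of loading the whole file.
for chunk in pd.read_csv(input_path, chunksize=100000):
    # value_vars defaults to all columns not listed in id_vars.
    melted = chunk.melt(id_vars=id_cols)
    # Append each melted chunk to disk; write the header only once.
    melted.to_csv(output_path, mode="a", header=first_chunk, index=False)
    first_chunk = False

This trades a single in-memory concat for incremental writes, which may be preferable when even the concatenated result is too large to hold in memory.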