Pandas pd.melt throwing a memory error when unpivoting a 3.5 GB CSV, even with 500 GB of RAM. Is there any solution/function available to unpivot gigantic CSV files? The current CSV has more than 5,000 columns.
I haven't tried this myself, so it's only a suggestion, but have you tried dividing the dataframe by rows to make it smaller and then concatenating at the end? You could do this easily in a loop. – run-out Apr 26 '19 at 05:07
@run-out I was struggling with the same issue and tried your suggestion: iterating over the rows in chunks and concatenating at the end indeed made the melt a lot faster. I will provide my solution below. – ldoe Aug 30 '19 at 13:04
1 Answer
I was struggling with the same issue and stumbled on your topic. Here is my implementation of @run-out's suggestion (melting the dataframe chunk by chunk and concatenating the results):
import pandas as pd

pivot_list = list()
chunk_size = 100000

# Melt the dataframe one chunk of rows at a time instead of all at once.
for i in range(0, len(df_final), chunk_size):
    row_pivot = df_final.iloc[i:i + chunk_size].melt(id_vars=new_vars, value_vars=new_values)
    pivot_list.append(row_pivot)

# Combine the melted chunks into the final long-format dataframe.
df = pd.concat(pivot_list)
Very simple, but this indeed made the melt a lot faster.
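If the wide dataframe itself barely fits in memory alongside the melted copy, the same idea can be applied while streaming the CSV from disk. The sketch below is only an assumption-based variant, not part of the original answer: the file names wide.csv and long.csv, the id column list id_cols, and the chunk size are placeholders you would replace with your own. Each chunk is read, melted, and appended to an output CSV, so the full long-format result never has to sit in RAM at once.

import pandas as pd

input_path = "wide.csv"    # placeholder: the large wide-format CSV
output_path = "long.csv"   # placeholder: destination for the melted rows
id_cols = ["id"]           # placeholder: columns to keep as identifiers

first_chunk = True
# Read 100,000 rows at a time instead of loading the whole file.
for chunk in pd.read_csv(input_path, chunksize=100000):
    # value_vars defaults to all columns not listed in id_vars.
    melted = chunk.melt(id_vars=id_cols)
    # Append each melted chunk to disk; write the header only once.
    melted.to_csv(output_path, mode="a", header=first_chunk, index=False)
    first_chunk = False

This trades a single in-memory concat for incremental writes, which may be preferable when even the concatenated result is too large to hold in memory.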