
Pandas pd.melt is throwing a memory error when unpivoting a 3.5 GB CSV, even with 500 GB of RAM available. Is there any solution/function available to unpivot gigantic CSV files? The current CSV has more than 5000 columns.

  • I haven't tried this myself, so it's only a suggestion, but have you tried dividing the dataframe by rows to make it smaller, melting each piece, and then concatenating at the end? You could do this easily in a loop. – run-out Apr 26 '19 at 05:07
    @run-out I was struggling with the same issue and tried your suggestion, iterating row by row and concatenating at the end indeed made the melt a lot faster. I will provide my solution below. – ldoe Aug 30 '19 at 13:04

1 Answer


I was struggling with the same issue and stumbled on your topic. Here is my implementation of @run-out's suggestion (melting the dataframe in row chunks and concatenating at the end):

import pandas as pd

# Melt the wide dataframe one chunk of rows at a time
pivot_list = list()
chunk_size = 100000

for i in range(0, len(df_final), chunk_size):
    # Melt only this slice of rows, using the same id/value columns
    row_pivot = df_final.iloc[i:i + chunk_size].melt(id_vars=new_vars, value_vars=new_values)
    pivot_list.append(row_pivot)

# Stitch the melted chunks back together into one long dataframe
df = pd.concat(pivot_list)

Very simple, but this indeed made the melt a lot faster.
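If even the concatenated result is too large to keep in memory, a variation of the same idea is to append each melted chunk to a CSV on disk instead of collecting the chunks in a list. This is only a sketch, assuming df_final, new_vars, and new_values are defined as above; melted.csv is a hypothetical output path.

import os
import pandas as pd

out_path = "melted.csv"  # hypothetical output file
chunk_size = 100000

for i in range(0, len(df_final), chunk_size):
    chunk = df_final.iloc[i:i + chunk_size].melt(id_vars=new_vars, value_vars=new_values)
    # Write the header only for the first chunk, then append without it
    chunk.to_csv(out_path,
                 mode="a",
                 header=not os.path.exists(out_path),
                 index=False)

This way only one melted chunk is ever held in memory, at the cost of reading the result back from disk afterwards if you still need it as a dataframe.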
