I am trying to get a 4 GB CSV file into a DataFrame, but I get a memory error.
I tried the chunksize solution but I still have the problem.
I have also heard it could be related to the Python version, but I am running Anaconda Python 3.6 64-bit (at least that is what it is supposed to be).
Here is my function:
import os
import pandas as pd

def dataFrameCreationCsv(path):
    print('Loading Csv file...')
    if os.path.isfile(path):
        # dataFrameCsv = pd.read_csv(path, encoding='latin1', delimiter='|')  # memory issues on a large dataset
        tp = pd.read_csv(path, encoding='latin1', delimiter='|', iterator=True, chunksize=1000)  # gives a TextFileReader, iterable in chunks of 1000 rows
        dfCsv = pd.concat(tp, ignore_index=True)
        return dfCsv
    else:
        print('wrong path used')
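
In case it is relevant, here is the per-chunk variant I could fall back to, where each chunk is processed separately so the whole file is never concatenated in memory. It is only a sketch: the aggregation and the column name 'amount' are placeholders, not the real columns of my file.

import pandas as pd

def processCsvInChunks(path):
    # Keep only one 1000-row chunk in memory at a time instead of
    # building the full DataFrame with pd.concat.
    total = 0
    reader = pd.read_csv(path, encoding='latin1', delimiter='|', chunksize=1000)
    for chunk in reader:
        # 'amount' is a hypothetical column name used for illustration.
        total += chunk['amount'].sum()
    return total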