low_memory = True in pd.read_csv()

Question

Setting low_memory = True in pandas' pd.read_csv() makes it load the data into memory in chunks, but in the end we still have the entire DataFrame in memory. When we set low_memory = False, it loads the entire dataset into memory and reads it into a DataFrame all at once; again, the whole DataFrame is in memory. So how does low_memory = True save memory? What is low_memory actually doing under the hood? I went through the pandas documentation but could not find an answer there.

score 1 · Answer 1 · answered Aug 29 '23 at 20:29

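low_memory=True is sometimes described as deprecated and as no longer doing anything, but the pandas docstring quoted below still documents it as an active option for the C parser.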

Main idea: setting low_memory=True saves memory during parsing by parsing the dataframe in chunks rather than all at once.
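As a rough analogy for that idea (this is not pandas' internal code path, just a sketch using the public chunksize parameter on made-up data): parsing piece by piece and concatenating at the end produces the same single DataFrame as a one-shot parse. The difference is only in how much is being parsed at any one moment.

```python
import io

import pandas as pd

# Small in-memory CSV standing in for a large file (illustrative data only).
csv_text = "id,value\n" + "\n".join(f"{i},{i * 0.5}" for i in range(10))

# One-shot parse: the whole file is tokenized and converted in a single pass.
one_shot = pd.read_csv(io.StringIO(csv_text))

# Chunked parse: only three rows are parsed at a time, then the pieces are
# concatenated into one DataFrame at the end.
pieces = list(pd.read_csv(io.StringIO(csv_text), chunksize=3))
chunked = pd.concat(pieces, ignore_index=True)

print(len(pieces))               # number of intermediate pieces
print(chunked.equals(one_shot))  # whether the final frames match
```

Either way the finished DataFrame occupies the same amount of memory; what chunking can reduce is the temporary working memory needed while parsing.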

Details:

Looking at line 408 of the pandas GitHub repo here, we see the following docstring entry:

low_memory : bool, default True
    Internally process the file in chunks, resulting in lower memory use
    while parsing, but possibly mixed type inference.  To ensure no mixed
    types either set ``False``, or specify the type with the ``dtype`` parameter.
    Note that the entire file is read into a single :class:`~pandas.DataFrame`
    regardless, use the ``chunksize`` or ``iterator`` parameter to return the data in
    chunks. (Only valid with C parser).
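
To make the three options in that docstring concrete, here is a minimal sketch on a tiny made-up CSV (the column names and values are assumptions for illustration). It shows that low_memory=True still returns one complete DataFrame, that passing dtype pins a column's type explicitly, and that chunksize is the parameter that actually hands the data back in pieces.

```python
import io

import pandas as pd

# Tiny illustrative CSV; the "code" column mixes numeric-looking and text values.
csv_text = "id,code\n1,100\n2,200\n3,abc\n4,400\n"

# low_memory=True (the default) still returns one complete DataFrame.
df = pd.read_csv(io.StringIO(csv_text), low_memory=True)

# Pinning the dtype removes any dependence on per-chunk type inference.
df_typed = pd.read_csv(io.StringIO(csv_text), dtype={"code": str})

# chunksize returns an iterator of DataFrames instead of a single one.
chunk_lengths = [len(chunk) for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2)]

print(type(df).__name__)
print(df_typed["code"].tolist())
print(chunk_lengths)
```

On a file this small the mixed-type problem will not show up; it is a concern for large files where different internal chunks can be inferred as different types, which is why the docstring suggests dtype or low_memory=False.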

Where is it written that low_memory = True is deprecated? – Ali Haider, Aug 29 '23 at 20:57
