
I'm dealing with a rather large text dataset (5.4 million short texts) and I'm trying to perform sentiment analysis on them with 16 GB of RAM.

I keep running out of memory whenever I try to build the language model:

from fastai.text import *  # fastai v1 text API

# Build the language model data in chunks
data_lm = text_data_from_csv(DATASET_PATH, data_func=lm_data, chunksize=4000)
# MemoryError is raised here
data_clas = text_data_from_csv(DATASET_PATH, data_func=classifier_data, vocab=data_lm.train_ds.vocab, chunksize=500)

I've experimented with the chunksize, but memory usage keeps rising over time and eventually ends in a memory error.

Is there any way to work around this?

1 Answer


Keep the chunksize below 100 and try to use the GPU. Refer to this link for more information: fastai.
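
A minimal sketch of that change, reusing the `text_data_from_csv` calls and the `DATASET_PATH`, `lm_data`, and `classifier_data` names from the question (the import and the chunksize value of 100 are assumptions, not a tested configuration):

from fastai.text import *  # assumed import for the v1 text API used in the question

# Smaller chunks mean fewer rows are tokenized and held in memory at once,
# at the cost of more passes over the CSV.
data_lm = text_data_from_csv(DATASET_PATH, data_func=lm_data, chunksize=100)
data_clas = text_data_from_csv(DATASET_PATH, data_func=classifier_data,
                               vocab=data_lm.train_ds.vocab, chunksize=100)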