I get a MemoryError (Unable to allocate 61.4 GiB for an array with shape (50000, 164921) and data type float64) when converting the output of TfidfVectorizer to a pandas DataFrame:
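import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

tfidf = TfidfVectorizer(analyzer=remove_stopwords)  # remove_stopwords: my own analyzer, not shown
X = tfidf.fit_transform(df['lemmatize'])  # X is a SciPy sparse matrix
print(X.shape)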
Output: (50000, 164921)
Now, here comes the memory error:
df = pd.DataFrame(X.toarray(), columns=tfidf.get_feature_names())
MemoryError: Unable to allocate 61.4 GiB for an array with shape (50000, 164921) and data type float64
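
For reference, the 61.4 GiB figure in the error can be reproduced from the shape alone. This is a minimal sketch, assuming that X.toarray() allocates one dense float64 array (8 bytes per value) of the reported shape; the helper name dense_size_gib is hypothetical and only used for illustration.

```python
def dense_size_gib(rows: int, cols: int, itemsize: int = 8) -> float:
    """Size in GiB of a dense rows x cols array with `itemsize` bytes per element."""
    return rows * cols * itemsize / 2**30


# Shape reported by print(X.shape); float64 uses 8 bytes per value.
size = dense_size_gib(50000, 164921)
print(f"{size:.1f} GiB")  # prints 61.4 GiB
```

So the allocation fails in the toarray() call, before pandas is involved: the sparse matrix stores only the non-zero entries, while the dense copy stores every one of the 50000 x 164921 cells. If a DataFrame is really needed, pandas provides pd.DataFrame.sparse.from_spmatrix, which accepts a SciPy sparse matrix and a columns argument and keeps the data sparse. Whether that fits depends on what is done with the DataFrame afterwards.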