Saving vectorized and especially compressed (sparse) data to a TXT/CSV file is not the best approach: when you read it back you will lose the dtypes, the compression/sparsity, and so on. You may even hit cases where you cannot read your TXT/CSV file back into memory at all.
Here you can see an example where converting a sparse matrix to a normal (dense NumPy) one ends with a MemoryError. The same may happen to you if you save your sparse (compressed) matrix to CSV and then try to read it back (uncompressed).
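To get a feel for the scale of the problem, here is a minimal sketch (the shape and density below are made up for illustration) comparing the memory a sparse matrix actually occupies with what its dense equivalent would need:

import numpy as np
from scipy import sparse

# a hypothetical document-term matrix: 100k rows, 50k columns, 0.1% nonzero
X = sparse.random(100_000, 50_000, density=0.001, format='csr', dtype=np.float64)

# CSR stores only the nonzero values plus two index arrays
sparse_bytes = X.data.nbytes + X.indices.nbytes + X.indptr.nbytes
dense_bytes = X.shape[0] * X.shape[1] * X.dtype.itemsize

print(f"sparse storage: ~{sparse_bytes / 1e9:.2f} GB")  # ~0.06 GB
print(f"dense storage:  ~{dense_bytes / 1e9:.2f} GB")   # ~40 GB -> MemoryError on most machines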
So I would recommend pickling instead:
saving / serializing your data:
import joblib  # in older scikit-learn versions: from sklearn.externals import joblib
joblib.dump(clf, 'filename.pkl')
where clf is your trained model or any other sparse/compressed data structure.
reading it back from disk:
import joblib  # in older scikit-learn versions: from sklearn.externals import joblib
clf = joblib.load('filename.pkl')
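Here is a hedged end-to-end sketch of the full round trip (the corpus and file names are made up for illustration): fit a vectorizer, dump both the vectorizer and its sparse output, then load them back with the dtype and sparsity intact.

import joblib
from scipy import sparse
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = ["sparse data is best kept sparse",
          "csv round trips lose dtypes and sparsity"]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)        # scipy.sparse CSR matrix

joblib.dump(vectorizer, 'vectorizer.pkl')   # the fitted transformer
joblib.dump(X, 'features.pkl', compress=3)  # optional on-disk compression

X_loaded = joblib.load('features.pkl')
print(sparse.issparse(X_loaded), X_loaded.dtype)  # True float64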