I have a relatively large NumPy array (nearly 300k rows and 20+ columns, though most values are 0) for which I need to compute a distance matrix using scikit-learn's pairwise_distances function.
Unfortunately, this process runs into a memory error unless I convert the input array to a sparse matrix. SciPy offers many sparse matrix classes and I do not know which one is best for this particular situation.
I found a Stack Overflow answer that favors CSR or CSC, but I am unclear which of the two is better suited to computing a distance matrix. Any suggestions are welcome!
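
For reference, here is a minimal sketch of what I am doing, scaled down so it runs quickly. The array `dense`, its size, the roughly 10% density, and the Euclidean metric are all stand-ins chosen for illustration, not my real data or settings. It converts the same array to both CSR and CSC and passes each to pairwise_distances.

```python
import numpy as np
from scipy import sparse
from sklearn.metrics import pairwise_distances

# Hypothetical stand-in for my data: far fewer rows, but also mostly zeros.
rng = np.random.default_rng(0)
dense = rng.random((1000, 20))
dense[dense < 0.9] = 0.0  # zero out most entries (about 10% stay non-zero)

# Convert the dense array to the two candidate sparse formats.
X_csr = sparse.csr_matrix(dense)  # compressed sparse row
X_csc = sparse.csc_matrix(dense)  # compressed sparse column

# Euclidean is assumed here; it is one of the metrics that accepts sparse input.
D_csr = pairwise_distances(X_csr, metric="euclidean")
D_csc = pairwise_distances(X_csc, metric="euclidean")

# Both formats should produce the same distance matrix.
print(D_csr.shape, np.allclose(D_csr, D_csc))
```

On this small sample both formats are accepted and give matching results, so the sketch alone does not tell me which one to prefer at full scale. Note that the returned matrix is a dense n-by-n array, which is why the example is deliberately kept small.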