
I want to create a 2D matrix in Python where the number of rows and columns is equal, around 231,000 each. Most of the cell entries would be zero; only some [i][j] entries would be non-zero.

The reason for creating this matrix is to apply SVD and get the [U S V] matrices with a rank of, say, 30.

Can anyone suggest how to implement this using the proper libraries? I tried a pandas DataFrame, but it raises a MemoryError.

I have also seen scipy.sparse matrices but couldn't figure out how to use them to compute the SVD.

1 Answer


I think this is a duplicate question, but I'll answer it anyway.

There are several libraries in Python aimed at computing partial SVDs of very sparse matrices.

My personal preference is scipy.sparse.linalg.svds, an ARPACK-based implementation of iterative partial SVD calculation.
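As a minimal sketch of how svds might be called for a rank-30 decomposition: the matrix below is a small random stand-in (the size `n`, the number of non-zeros, and the random values are assumptions for illustration, not the asker's data). Since svds does not guarantee the order of the singular values it returns, the sketch sorts them in descending order afterwards.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import svds

# Hypothetical stand-in data: a small random sparse matrix.
# The real problem would use n of about 231000 and its own entries.
n = 500
nnz = 3000
rng = np.random.default_rng(0)
rows = rng.integers(0, n, size=nnz)
cols = rng.integers(0, n, size=nnz)
vals = rng.random(nnz)

# Duplicate (row, col) pairs are summed when the matrix is built.
A = csc_matrix((vals, (rows, cols)), shape=(n, n))

k = 30  # number of singular triplets to compute
U, s, Vt = svds(A, k=k)

# Reorder so the largest singular value comes first.
order = np.argsort(s)[::-1]
U, s, Vt = U[:, order], s[order], Vt[order, :]

print(U.shape, s.shape, Vt.shape)
```

Note that svds returns the third factor already transposed (Vt), so the rank-k approximation is `U @ np.diag(s) @ Vt`.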

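You can also try the function sparsesvd.sparsesvd, which uses the SVDLIBC implementation. There is also scipy.linalg.svd, which uses the LAPACK implementation, but it works on dense arrays and computes the full decomposition, so it is not practical for a matrix of this size (a dense 231,000 x 231,000 array of 64-bit floats would need roughly 427 GB).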

To convert your table to a format that these algorithms accept, you will need to import scipy.sparse, which gives you the csc_matrix class.
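One possible way to do that conversion, assuming the non-zero entries are available as (row, column, value) triplets: the three entries below are made up for illustration, and going through coo_matrix first is just one convenient route to a csc_matrix.

```python
import numpy as np
from scipy.sparse import coo_matrix

n = 231000  # matrix dimension from the question

# Hypothetical non-zero entries: A[0, 1] = 2.0, A[5, 5] = 3.5, A[230999, 0] = 1.0
rows = np.array([0, 5, 230999])
cols = np.array([1, 5, 0])
vals = np.array([2.0, 3.5, 1.0])

# Build in coordinate (COO) format, then convert to compressed sparse column.
A = coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsc()

print(A.shape, A.nnz)
```

Only the non-zero values and their indices are stored, so memory use grows with the number of non-zero entries rather than with n squared.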

Use the above links to help you out. There are a lot of resources already here on Stack Overflow and many more on the internet.

Rushabh Mehta