I'm using the sparse library to construct, store, and read a large sparse matrix. I'd like to wrap it in a Dask array so I can take advantage of Dask's blocked algorithms.
Here's a simplified version of what I'm trying to do:

    import os

    import dask.array as da
    import sparse

    # Function name is illustrative; the snippet is the body of a function
    def load_matrix():
        file_path = './{}'.format('myfile.npz')
        if os.path.isfile(file_path):
            # Load file with sparse matrix
            X_sparse = sparse.load_npz(file_path)
        else:
            # All matrix elements are initially equal to 0
            coords, data = [], []
            X_sparse = sparse.COO(coords, data, shape=(88506, 1440000))
            # Create file for later retrieval
            sparse.save_npz(file_path, X_sparse)
        # Create Dask array from matrix to allow usage of blocked algorithms
        X = da.from_array(X_sparse, chunks='auto').map_blocks(sparse.COO)
        return X

Unfortunately, calling compute() on X raises the following error:

    Cannot convert a sparse array to dense automatically. To manually densify, use the todense method.

However, I cannot convert the sparse matrix to a dense one in memory, since that also results in an error.
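For a sense of scale, here is a rough estimate of what a dense copy of a matrix with this shape would need. It assumes 8-byte float64 elements. The dtype is not stated above, so treat the figure as an order-of-magnitude sketch rather than an exact number.

```python
# Rough estimate of the memory a dense copy would need.
# Assumption: 8-byte (float64) elements; the actual dtype is not stated above.
rows, cols = 88506, 1440000
bytes_per_element = 8

n_elements = rows * cols
dense_bytes = n_elements * bytes_per_element

print(f"{n_elements:,} elements")                       # 127,448,640,000 elements
print(f"{dense_bytes / 1e12:.2f} TB if stored densely")  # 1.02 TB if stored densely
```

At roughly a terabyte under that assumption, a dense copy is far beyond typical workstation memory, so the data would need to stay sparse through every step, including the result of compute().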
Any ideas on how to accomplish this?