Is it possible to convert a breeze dense matrix to numpy array using spark?
I have here a breeze dense matrix I want to convert to numpy array.
Is it possible to convert a breeze dense matrix to numpy array using spark?
I have here a breeze dense matrix I want to convert to numpy array.
Here is a way that works correctly but is slow / inefficient (creates multiple copies). i used zeppelin spark and pyspark interpreters (i guess toree should also be possible):
in spark:
%spark
import breeze.linalg._
import breeze.numerics._
z.put("matrix", DenseMatrix.eye[Double](4));
z.get("matrix")
then in python:
%pyspark
import numpy as np
def breeze2numpy(breeze_matrix):
data = list(breeze_matrix.copy().data())
return np.array(data).reshape(breeze_matrix.rows(), breeze_matrix.cols(), order='F')
breeze2numpy(z.z.get("matrix"))
this works but will be impractical for big datasets (because of the multiple copies involved via a python list). it would be nice to have a zero-copy method using python's buffer protocol like there is for C++ Eigen matrix --> numpy array.