In Scalding we can easily work with matrices through the Matrix API, for example:
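// Brings toMatrix into scope on pipes
import com.twitter.scalding.mathematics.Matrix._

// Each input line is one (row, col, value) entry
val matrix = Tsv(path, ('row, 'col, 'val))
  .read
  .toMatrix[Long, Long, Double]('row, 'col, 'val)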
But how can I convert a matrix to that format from the usual dense text layout, the way we normally write a matrix? Is there an elegant way? For example, from
1 2 3
3 4 5
5 6 7
to
1 1 1
1 2 2
1 3 3
2 1 3
2 2 4
2 3 5
3 1 5
3 2 6
3 3 7
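To be precise about the mapping I mean, here is a minimal in-memory sketch in plain Scala collections (no Scalding involved). The names `DenseToTriples` and `toTriples` are made up for illustration, and it assumes whitespace-separated numeric cells with no blank lines. It also assumes the row index is available from the position in the sequence, which is exactly the part I don't know how to get on Hadoop.

```scala
// Plain Scala only; this is NOT a Scalding job. Names are hypothetical.
object DenseToTriples {
  // Turn dense text rows into (row, col, value) triples with 1-based indices.
  // Assumes whitespace-separated numeric cells and no blank lines.
  def toTriples(lines: Seq[String]): Seq[(Long, Long, Double)] =
    for {
      (line, r) <- lines.zipWithIndex
      (cell, c) <- line.trim.split("\\s+").toSeq.zipWithIndex
    } yield (r + 1L, c + 1L, cell.toDouble)

  def main(args: Array[String]): Unit = {
    val dense = Seq("1 2 3", "3 4 5", "5 6 7")
    // Print one tab-separated triple per line, like the Tsv input above
    toTriples(dense).foreach { case (r, c, v) => println(s"$r\t$c\t$v") }
  }
}
```

The column index is easy because it only depends on the position inside one line; the row index is the hard part, since it depends on where the line sits in the whole file.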
I need this in order to run operations on huge matrices, and I don't know the number of rows and columns in advance (would it be possible to specify the size in the file, e.g. NxM?).
I tried to do something with TextLine(args("input")),
but I don't know how to get the line number. I want to convert the matrix on Hadoop; maybe there are other ways to deal with this format? Is this possible with Scalding?