Is there a way to normalize a MatrixStore using the ojAlgo matrix library?

I'm looking for a routine that, once applied to a MatrixStore, gives each of the rows a mean of 0 and a standard deviation of 1.

For those familiar with sklearn: I'm looking for something in ojAlgo that works like sklearn's preprocessing module.

INDRAJITH EKANAYAKE
YAMAZAKI1996

1 Answer

Not directly. You have to write some loops and calculations yourself. Here's one possible way to do it:

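import org.ojalgo.matrix.store.PrimitiveDenseStore;
import org.ojalgo.random.SampleSet;

PrimitiveDenseStore matrix = ...;

// One reusable SampleSet; swap(...) points it at a new column each iteration.
SampleSet sampleSet = SampleSet.make();
for (int j = 0; j < matrix.countColumns(); j++) {
    sampleSet.swap(matrix.sliceColumn(j));
    for (int i = 0; i < matrix.countRows(); i++) {
        // Standard score: (value - column mean) / column standard deviation
        matrix.set(i, j, sampleSet.getStandardScore(i));
    }
}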

With ojAlgo I strongly recommend organising data in columns.

I didn't actually test that code. Updating the matrix in place like this could possibly cause a problem.
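
To make the calculation itself explicit, here is a minimal plain-Java sketch of column-wise standard scores that does not use ojAlgo at all. The class and method names (`ColumnStandardizer`, `standardizeColumns`) are hypothetical, and the data is assumed to be a `double[rows][cols]` array. It computes each column's mean and standard deviation before writing anything back, which sidesteps the in-place concern mentioned above. It divides by n (population standard deviation), as sklearn's StandardScaler does; whether SampleSet uses n or n - 1 is not checked here. A constant column is mapped to zeros, which is also an assumption of this sketch.

```java
import java.util.Arrays;

class ColumnStandardizer {

    /** Rescales each column in place to mean 0 and (population) standard deviation 1. */
    static void standardizeColumns(double[][] data) {
        int rows = data.length;
        if (rows == 0) {
            return;
        }
        int cols = data[0].length;

        for (int j = 0; j < cols; j++) {
            // First pass: column mean
            double sum = 0.0;
            for (int i = 0; i < rows; i++) {
                sum += data[i][j];
            }
            double mean = sum / rows;

            // Second pass: population standard deviation (divide by n)
            double squares = 0.0;
            for (int i = 0; i < rows; i++) {
                double diff = data[i][j] - mean;
                squares += diff * diff;
            }
            double std = Math.sqrt(squares / rows);

            // Third pass: write the standard scores back.
            // A constant column (std == 0) is set to zeros to avoid dividing by zero.
            for (int i = 0; i < rows; i++) {
                data[i][j] = (std == 0.0) ? 0.0 : (data[i][j] - mean) / std;
            }
        }
    }

    public static void main(String[] args) {
        double[][] data = {
            {1.0, 10.0},
            {2.0, 20.0},
            {3.0, 30.0}
        };
        standardizeColumns(data);
        for (double[] row : data) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

To standardize rows instead, as the question asks, swap the roles of the two loop indices, or keep the data transposed so that each sample set sits in a column.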

...

With v47.1.1 (just released) it is now possible to do it this way:

matrix.modifyAny(DataPreprocessors.STANDARD_SCORE);
apete
  • I notice some of the interfaces come with looprow or loopcolumn methods that allow one to attach callbacks. In your opinion, which is the recommended way to loop through a matrix when speed is a factor? – YAMAZAKI1996 Mar 29 '19 at 09:16
  • I doubt it makes any difference in a case like this. If you're looking for that last bit of performance then go low-level, and that typically means you write the code/loops yourself. What does matter is how you organise your data. In the example above sliceRow() would be much slower than sliceColumn() when the matrices get a bit larger. – apete Mar 29 '19 at 10:05
  • Thanks for your feedback! I truly appreciate the work you put into ojAlgo. – YAMAZAKI1996 Mar 29 '19 at 10:57
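
The remark about sliceColumn() versus sliceRow() comes down to memory layout. Below is a minimal sketch, assuming the matrix is stored column-major in one flat array (the assumption behind the "organise data in columns" advice; the class `ColumnMajorLayout` and its methods are hypothetical and not part of ojAlgo). Under that layout the elements of a column sit next to each other, while the elements of a row are `rows` positions apart, so walking a row jumps through memory.

```java
class ColumnMajorLayout {

    /** Flat-array index of element (row, col) in a column-major matrix with the given row count. */
    static int index(int row, int col, int rows) {
        return row + col * rows;
    }

    /** Flat indices visited when walking down one column: consecutive positions. */
    static int[] columnIndices(int col, int rows) {
        int[] result = new int[rows];
        for (int i = 0; i < rows; i++) {
            result[i] = index(i, col, rows);
        }
        return result;
    }

    /** Flat indices visited when walking along one row: positions spaced `rows` apart. */
    static int[] rowIndices(int row, int rows, int cols) {
        int[] result = new int[cols];
        for (int j = 0; j < cols; j++) {
            result[j] = index(row, j, rows);
        }
        return result;
    }
}
```

For a matrix with 3 rows and 4 columns, column 1 maps to consecutive flat indices while row 1 maps to indices that are 3 apart; as the row count grows, that stride grows with it, which is why row slices tend to be less cache-friendly on larger matrices.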