I need to find the best-fitting regression line for a set of points. For example, for this matrix:

int b [][] = {      { 3, 1, 0, 0, 0, 0, 0, 0, 0 },
                    { 1, 2, 3, 1, 0, 1, 0, 0, 0 },
                    { 0, 1, 2, 1, 0, 0, 0, 0, 0 },
                    { 0, 0, 0, 3, 0, 0, 0, 0, 0 }, 
                    { 0, 0, 0, 0, 0, 0, 0, 0, 0 }, 
                    { 0, 0, 0, 0, 0, 1, 3, 0, 0 }, 
                    { 0, 0, 0, 0, 0, 1, 2, 3, 1 }, 
                    { 0, 0, 0, 0, 0, 1, 1, 1, 2 }, 
                    { 0, 0, 0, 0, 0, 0, 0, 0, 1 }   }; 

Every number represents the number of data points (the weight, I suppose) at that location, where rows are the X axis and columns are the Y axis. I have attempted to use the SimpleRegression class from the Apache Commons Math library and am having some issues. First, it doesn't appear to support weights. Second, I believe that I'm doing something wrong: even for a matrix that is nothing but 1s on the main diagonal, the slope and intercept results make no sense.

public static void main(String[] args) {

        double a[][] = new double[9][9];
        for (int i = 0; i < 9; i++)
            a[i][i] = 1;


        SimpleRegression r = new SimpleRegression(true);

        r.addData(a);

        System.out.println("Slope = " + r.getSlope());
        System.out.println("Intercept = " + r.getIntercept());

}

This gives me incorrect results. I would assume that this matrix represents the function f(x) = x, yet the slope I'm getting is -0.12499...

Could anyone point me at what I'm doing wrong? I have a feeling I'm not only misusing the code but also the mathematics.

duplode
TheFooBarWay
  • addData([][]) expects a 2xN matrix I think, so why don't you try addData(x, y) individually? – gpasch Feb 26 '16 at 21:23
  • And I think you should give the coordinates (i, j), not the values. Again, I'm not a user of SimpleRegression; this is just from reading. – gpasch Feb 26 '16 at 21:25
  • Thank you, it seems to be working more along the lines I would expect now. I must have misunderstood the way the method works. Reading about linear regression on Wikipedia got me thinking along the lines of full matrices. It also appears that adding the same coordinate more than once increases its weight. Useful to know. I hesitate to call this solved until I test a bit more tomorrow morning, but this is good progress. Thank you. – TheFooBarWay Feb 26 '16 at 21:53

1 Answer

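As the comments say, addData() expects coordinates, not a grid of counts: either a single (x, y) pair or an array with one (x, y) pair per row (N rows of 2 columns). Passing the 9x9 matrix makes it read the first two columns of each row as one observation, which is where the slope of -0.125 comes from. The following example adds each (x, y) index pair once per counted data point and returns a slope of 1 for a diagonal matrix, as expected: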

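import org.apache.commons.math3.stat.regression.SimpleRegression;

public class RegressionExample {

    public static void main(String[] args) {
        // 9x9 grid with a single data point in each diagonal cell
        double[][] a = new double[9][9];
        for (int i = 0; i < 9; i++) {
            a[i][i] = 1;
        }

        // true = include an intercept term in the model
        SimpleRegression r = new SimpleRegression(true);

        addData(r, a);

        System.out.println("Slope = " + r.getSlope());
        System.out.println("Intercept = " + r.getIntercept());
    }

    // Adds the observation (x, y) once for every data point counted in data[x][y]
    public static void addData(SimpleRegression r, double[][] data) {
        for (int x = 0; x < data.length; x++) {
            for (int y = 0; y < data[x].length; y++) {
                for (int i = 0; i < data[x][y]; i++) {
                    r.addData(x, y);
                }
            }
        }
    }
}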

The example assumes that index 0 corresponds to a position of 0, index 1 corresponds to a position of 1, and so on. If this is not the case, you need to add a function that transforms an index into a position.
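
If you would rather not add the same point repeatedly, the weighted fit can also be computed directly from the counts. The sketch below is plain Java and does not use SimpleRegression; the class WeightedLineFit, its Line holder, and the two index-to-position functions (rowToX, colToY) are names made up for this illustration. It treats each count as a weight in the usual closed-form least-squares sums, which for integer counts should give the same line as calling addData(x, y) that many times.

```java
import java.util.function.IntToDoubleFunction;

public class WeightedLineFit {

    /** Holds the fitted line y = slope * x + intercept. */
    public static final class Line {
        public final double slope;
        public final double intercept;

        public Line(double slope, double intercept) {
            this.slope = slope;
            this.intercept = intercept;
        }
    }

    /**
     * Fits a least-squares line to a grid of counts.
     * counts[i][j] is the number of points at x = rowToX(i), y = colToY(j).
     */
    public static Line fit(int[][] counts,
                           IntToDoubleFunction rowToX,
                           IntToDoubleFunction colToY) {
        double w = 0, sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < counts.length; i++) {
            double x = rowToX.applyAsDouble(i);
            for (int j = 0; j < counts[i].length; j++) {
                int c = counts[i][j];
                if (c == 0) {
                    continue; // empty cell, contributes nothing
                }
                double y = colToY.applyAsDouble(j);
                w += c;           // total weight (number of points)
                sx += c * x;      // weighted sum of x
                sy += c * y;      // weighted sum of y
                sxx += c * x * x; // weighted sum of x^2
                sxy += c * x * y; // weighted sum of x*y
            }
        }
        double denom = w * sxx - sx * sx;
        if (denom == 0) {
            // Also covers an all-zero grid, where w is 0
            throw new IllegalArgumentException(
                "Need data at two or more distinct x positions");
        }
        double slope = (w * sxy - sx * sy) / denom;
        double intercept = (sy - slope * sx) / w;
        return new Line(slope, intercept);
    }

    public static void main(String[] args) {
        int[][] diag = new int[9][9];
        for (int k = 0; k < 9; k++) {
            diag[k][k] = 1;
        }

        // Index equals position, as in the example above
        Line identity = fit(diag, r -> r, c -> c);
        System.out.println("Slope = " + identity.slope
            + ", Intercept = " + identity.intercept);

        // Hypothetical grid: x starts at 10 with step 0.5, y has step 2
        Line scaled = fit(diag, r -> 10.0 + 0.5 * r, c -> 2.0 * c);
        System.out.println("Slope = " + scaled.slope
            + ", Intercept = " + scaled.intercept);
    }
}
```

For the diagonal matrix with index equal to position, this gives a slope of 1 and an intercept of 0. Passing the transforms as functions keeps the index-to-position mapping in one place, so the same fit works whether the grid starts at 0 or not.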

Chris K