
I have searched a lot but have not found exactly what I need. I have two integer arrays, int[] x and int[] y. I want to compute the simple linear correlation between them and get the result as a double. Is there a Java library function that provides this, or a code snippet I could use?

Dmitry Bychenko
Surjya Narayana Padhi

2 Answers


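There is nothing for this in core Java, but there are libraries you can use. Apache Commons Math has a statistics package; check the PearsonsCorrelation class. Note that its correlation method takes double[] arguments, so int[] arrays have to be converted first.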

Sample code:

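import org.apache.commons.math3.stat.correlation.PearsonsCorrelation;

public class CorrelationExample {
    public static void main(String[] args) {
        // y is exactly 2 * x, so the two arrays are perfectly linearly correlated
        double[] x = {1, 2, 4, 8};
        double[] y = {2, 4, 8, 16};
        double corr = new PearsonsCorrelation().correlation(y, x);

        System.out.println(corr);
    }
}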

prints out 1.0
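
Since the question starts from int[] arrays while the sample above uses double[], a conversion step is needed first. The following is a minimal sketch of one way to do it with the standard library; the class name `IntArrays` and the helper `toDoubles` are hypothetical names chosen for illustration, not part of Apache Commons Math.

```java
import java.util.Arrays;

final class IntArrays {
    private IntArrays() {}

    // Hypothetical helper: widens each int to a double.
    // The widening is exact for every int value.
    static double[] toDoubles(int[] values) {
        return Arrays.stream(values).asDoubleStream().toArray();
    }

    public static void main(String[] args) {
        int[] x = {1, 2, 4, 8};
        double[] xd = toDoubles(x);
        System.out.println(Arrays.toString(xd)); // prints [1.0, 2.0, 4.0, 8.0]
        // xd, together with a converted y, could then be passed to
        // new PearsonsCorrelation().correlation(xd, yd)
    }
}
```

The stream-based conversion keeps the example short; a plain loop that copies each element into a new double[] would work just as well.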

Neeme Praks
Hamed Moghaddam

Correlation is quite easy to compute manually:

http://en.wikipedia.org/wiki/Correlation_and_dependence

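  public static double correlation(int[] xs, int[] ys) {
    if (xs == null || ys == null || xs.length != ys.length || xs.length == 0) {
      throw new IllegalArgumentException("arrays must be non-null, non-empty and of equal length");
    }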

    double sx = 0.0;
    double sy = 0.0;
    double sxx = 0.0;
    double syy = 0.0;
    double sxy = 0.0;

    int n = xs.length;

    for(int i = 0; i < n; ++i) {
      double x = xs[i];
      double y = ys[i];

      sx += x;
      sy += y;
      sxx += x * x;
      syy += y * y;
      sxy += x * y;
    }

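    // covariance (population form): mean(x*y) - mean(x) * mean(y)
    double cov = sxy / n - sx * sy / n / n;
    // standard deviation of x
    double sigmax = Math.sqrt(sxx / n -  sx * sx / n / n);
    // standard deviation of y
    double sigmay = Math.sqrt(syy / n -  sy * sy / n / n);

    // correlation is the covariance normalized by both standard deviations;
    // it is undefined (typically NaN here) when either array is constant
    return cov / sigmax / sigmay;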
  }
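
The single-pass formula above subtracts two nearly equal quantities when the values are large compared with their spread, which can lose precision. As a hedged alternative, here is a sketch of the common two-pass variant, which computes the means first and then sums deviations from them. This is a swapped-in technique, not the method from the answer; the class name `TwoPassCorrelation` is a hypothetical name used for illustration.

```java
final class TwoPassCorrelation {
    private TwoPassCorrelation() {}

    static double correlation(int[] xs, int[] ys) {
        if (xs == null || ys == null || xs.length != ys.length || xs.length == 0) {
            throw new IllegalArgumentException("arrays must be non-null, non-empty and of equal length");
        }
        int n = xs.length;

        // First pass: compute the means.
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += xs[i];
            meanY += ys[i];
        }
        meanX /= n;
        meanY /= n;

        // Second pass: accumulate products of deviations from the means.
        double sxx = 0.0;
        double syy = 0.0;
        double sxy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = xs[i] - meanX;
            double dy = ys[i] - meanY;
            sxx += dx * dx;
            syy += dy * dy;
            sxy += dx * dy;
        }

        // The 1/n factors of covariance and standard deviations cancel out.
        // The result is NaN when either array is constant (0 / 0).
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        int[] x = {1, 2, 4, 8};
        int[] y = {2, 4, 8, 16};
        System.out.println(correlation(x, y));
    }
}
```

For small integer inputs both versions should agree closely; the two-pass form mainly matters when the values are large and tightly clustered.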
Dmitry Bychenko
  • @Mvorisek: it can be either `i++` or `++i`; `++i` could be a bit faster (no need to return the previous state) with old compilers. It is just a habit from the Intel 8086 days and the `C` compilers for them... – Dmitry Bychenko Sep 05 '16 at 15:56
  • This doesn't cover the case where xs and ys have different lengths. – htellez Oct 18 '17 at 18:27
  • @htellez: correlation (or even covariance) requires equal lengths; otherwise one would have to extend the standard definition of correlation. – Dmitry Bychenko Oct 18 '17 at 19:42