I have tried many times already (haha) to implement the Anderson-Darling normality test in C++. Here is my code. I know there is a similar topic here, but unfortunately it did not solve my problem.

The variance is calculated properly, and I would guess the standardized values are too. The problem is that b and c give me NaN for my sample data 1, 2, ..., 10.

Do you have any idea where the error is in the formula in Anderson_Darling() (see the code below)?

The code is cut out of a class for clarity. I have left out obvious methods like mean().

double variance()
{
    double var_sum = 0.0;
    for(int i = 0; i < int(size); i++)            //(size is taken from a class)
        var_sum += pow(data.at(i)-mean(),2);
    return var_sum / (int(size)-1);
}

double phi(double x)
{
    double res =0.5 * erfc(-x * M_SQRT1_2);
    return res;
}

vector<double> tostdnormal()
{
    vector<double> Y (size);
    for(int i = 0; i < int(size); i++)
        Y.at(i) = (data.at(i) - mean())/(sqrt(variance()));
    return Y;
}

double Anderson_Darling()
{
    sort(data.begin(),data.end());
    int n = int(size);
    vector<double> Y = tostdnormal();

    double S = 0, a = 0, b = 0, c = 0;
    for(int i = 0; i < n; i++)
    {
        a = 2.0 * (i+1) - 1;
        b = log(Y.at(i));
        c = log(1-Y.at(n-i-1));
        S += a * (b + c);
    }
    return -n - S / n;
}

Update: I changed b and c to the following, and now I get the output I expected.

b = log(phi(Y.at(i)));
c = log(1-phi(Y.at(n - i - 1)));
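
For anyone landing here, the updated lines can be put together into a standalone sketch. This is only an illustration under a few assumptions: `anderson_darling` is a hypothetical free function (the original is a class method using `data`, `size` and `mean()`), the sample is passed in as a `std::vector<double>`, and `1 / sqrt(2)` is computed directly because `M_SQRT1_2` is not guaranteed by the C++ standard.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Standard normal CDF, written with erfc as in the question.
double phi(double x)
{
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Hypothetical free-function version of Anderson_Darling().
// The sample is taken by value so it can be sorted without touching the caller's data.
double anderson_darling(std::vector<double> x)
{
    const int n = static_cast<int>(x.size());
    assert(n >= 2);  // the sample variance divides by n - 1

    std::sort(x.begin(), x.end());

    // Sample mean and standard deviation (n - 1 in the denominator, as in variance()).
    const double mean = std::accumulate(x.begin(), x.end(), 0.0) / n;
    double ss = 0.0;
    for (double v : x)
        ss += (v - mean) * (v - mean);
    const double sd = std::sqrt(ss / (n - 1));

    double S = 0.0;
    for (int i = 0; i < n; i++)
    {
        // Both logs are applied to CDF values, which lie between 0 and 1,
        // never to the standardized values themselves (those can be negative).
        const double low  = phi((x[i] - mean) / sd);
        const double high = phi((x[n - i - 1] - mean) / sd);
        S += (2.0 * (i + 1) - 1) * (std::log(low) + std::log(1.0 - high));
    }
    return -n - S / n;
}
```

Calling `anderson_darling` on the sample 1, 2, ..., 10 should give a small finite value instead of NaN. Note that for extreme outliers `phi` can round to exactly 0 or 1, in which case the logs become infinite; this sketch does not guard against that.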
vlad
  • `Y[i]` can be negative and you are taking the `log`. Besides, I cannot see where you are using the `phi` function – Damien Mar 12 '20 at 16:32
  • You are right, this is my version no. 56 :-D. I will try the corrected code tomorrow; nevertheless, I think there will be a problem with log(negative value), since it gives a complex number. How do I deal with this? – vlad Mar 12 '20 at 19:30
  • On the Wikipedia page, they say to take the log of `F(Y[i])`, which is always positive, where `F()` is the CDF, not the log of `Y[i]` – Damien Mar 12 '20 at 20:05
  • Please correct me if I am wrong, but taking F(Y[i]) can result in a negative value for N(0, 1). Red curve: https://en.m.wikipedia.org/wiki/Normal_distribution#/media/File%3ANormal_Distribution_PDF.svg – vlad Mar 12 '20 at 20:28
  • Actually I take the CDF https://en.m.wikipedia.org/wiki/Normal_distribution#/media/File%3ANormal_Distribution_CDF.svg so this should be positive. – vlad Mar 12 '20 at 20:30
  • If I understand correctly, `phi` is the CDF, but where are you using it? And a CDF is always positive (or zero) – Damien Mar 12 '20 at 20:34
  • I corrected the program since phi() was missing there, and now it calculates properly. Thx Damien. – vlad Mar 13 '20 at 07:28
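
For reference, the statistic discussed in the comments above can be written out as follows. This is my reading of the formula on the Wikipedia page for the Anderson-Darling test, where Φ is the standard normal CDF (`phi()` in the code) and Y_1 ≤ ... ≤ Y_n are the sorted, standardized sample values:

```latex
A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1)
      \left[ \ln \Phi(Y_i) + \ln\bigl(1 - \Phi(Y_{n+1-i})\bigr) \right]
```

With the zero-based loop index in the code, `2.0 * (i+1) - 1` corresponds to 2i − 1, `Y.at(i)` to Y_i, and `Y.at(n-i-1)` to Y_{n+1−i}. Since Φ takes values strictly between 0 and 1 for finite arguments, both logarithms are defined, whereas `log(Y.at(i))` returns NaN for every value below the mean.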

0 Answers