I have a dataset, and the same dataset normalized to its maximum value (so the values lie in [0, 1]). I am trying to fit a gamma distribution to each of them using fitdistrplus, which gives me the parameter estimates along with the log-likelihood, AIC and BIC. With the original data the log-likelihood is negative, but with the normalized data it is positive. Can you tell me why? Also, the shape parameter is similar in both fits while the rate is not. Any comment on that? Thank you.

> library(fitdistrplus)
> data <- c(130, 200, 830, 380, 680, 260, 280, 219, 330, 77, 360, 170, 240, 110, 170)
> fit_gammaB <- fitdist(data, "gamma")
> summary(fit_gammaB)
Fitting of the distribution ' gamma ' by maximum likelihood 
Parameters : 
         estimate  Std. Error
shape 1.784525060 0.571213823
rate  0.006316464 0.002271273
Loglikelihood:  -98.38866   AIC:  200.7773   BIC:  202.1934 
Correlation matrix:
          shape      rate
shape 1.0000000 0.8519429
rate  0.8519429 1.0000000

And when my data is normalized to the max value:

> fit_gammaB <- fitdist(data_norm, "gamma")
> summary(fit_gammaB)
Fitting of the distribution ' gamma ' by maximum likelihood 
Parameters : 
      estimate Std. Error
shape 1.784173   0.600506
rate  5.241396   2.034361
Loglikelihood:  2.432731   AIC:  -0.8654627   BIC:  0.5506377 
Correlation matrix:
          shape      rate
shape 1.0000000 0.8671602
rate  0.8671602 1.0000000
Warren Weckesser
MariaK
  • Check the output in the first example (not normalized); I get a different result. – Warren Weckesser Sep 23 '22 at 16:51
  • Did you normalize `data` as `data_norm <- data / max(data)`? If I do that, I get a different result than what you show for the fit to the normalized data. – Warren Weckesser Sep 23 '22 at 16:56
  • If I run your example, and create `data_norm` as in my previous comment, I get the same shape parameter (to 4 significant figures) in both cases (2.718...), and the `rate` parameters differ by the multiplicative factor 830. Since 830 is the maximum of `data`, and therefore the value that `data` was divided by to create `data_norm`, the difference in `rate` is exactly what one would expect it to be. – Warren Weckesser Sep 23 '22 at 17:01
  • Hi, thank you for your time. You are absolutely correct: I had a mistake in one number in my data. The correct data is `data <- c(130, 200, 830, 380, 680, 260, 280, 21, 330, 77, 360, 170, 240, 110, 170)`. Yes, I normalized using the relationship that you mentioned. Do you have any comment on the positive and negative log-likelihood, AIC and BIC? – MariaK Sep 26 '22 at 10:21
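
On the remaining question about the sign of the log-likelihood, here is a minimal sketch in base R. It is not a re-fit: it assumes the corrected data from the last comment and plugs in the estimates copied from the first summary, and the names `scale_factor` and `loglik` are made up for illustration. The idea is that dividing the data by a constant c multiplies every density value by c (with the rate multiplied by c and the shape unchanged), so the log-likelihood moves by n * log(c). A log-likelihood can be positive whenever density values exceed 1, which easily happens for data squeezed into [0, 1], so the sign by itself carries no meaning.

```r
# Corrected data from the last comment (21, not 219, as the eighth value).
data <- c(130, 200, 830, 380, 680, 260, 280, 21, 330, 77, 360, 170, 240, 110, 170)
scale_factor <- max(data)              # 830, the normalizing constant
data_norm <- data / scale_factor
n <- length(data)

# Estimates copied from the first summary in the question (assumed, not re-fitted).
shape <- 1.784525060
rate  <- 0.006316464

# Gamma log-likelihood of x at a given (shape, rate).
loglik <- function(x, shape, rate) {
  sum(dgamma(x, shape = shape, rate = rate, log = TRUE))
}

# Same shape on both scales; the rate is multiplied by the scale factor.
ll_raw  <- loglik(data, shape, rate)
ll_norm <- loglik(data_norm, shape, rate * scale_factor)

# The two log-likelihoods differ by exactly n * log(scale_factor).
shift <- n * log(scale_factor)
print(c(ll_raw = ll_raw, ll_norm = ll_norm, shift = shift))
stopifnot(isTRUE(all.equal(ll_norm, ll_raw + shift)))

# AIC = 2k - 2 * logLik, so it shifts by -2 * n * log(scale_factor).
k <- 2                                 # number of fitted parameters
aic_raw  <- 2 * k - 2 * ll_raw
aic_norm <- 2 * k - 2 * ll_norm
stopifnot(isTRUE(all.equal(aic_norm, aic_raw - 2 * shift)))
```

The same shift applies to BIC, since it also contains the term -2 * logLik. Because of this, log-likelihood, AIC and BIC values are only comparable between models fitted to the same data on the same scale; comparing the raw fit with the normalized fit says nothing about which one is better.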
