I have a large dataset with more than 20 million values. For privacy reasons, and to keep the code reproducible, I use mydata as a stand-in for it.

set.seed(1234)
mydata <- rlnorm(28000000,3.14,1.3)

I want to find which known distribution fits mydata best, so I chose the fitdist function from the fitdistrplus package.

library(fitdistrplus)
fit.lnorm <- fitdist(mydata,"lnorm")
fit.weibull <- fitdist(mydata, "weibull")
fit.gamma <- fitdist(mydata, "gamma", lower = c(0, 0))
fit.exp <- fitdist(mydata,"exp")

Then I use the ppcomp function to draw a P-P plot to help me choose the best-fitting distribution.

library(RColorBrewer)
tiff("./pplot.tiff",res = 300,compression = "lzw",height = 6,width = 10,units = "in",pointsize = 12)
ppcomp(list(fit.lnorm,fit.weibull, fit.gamma,fit.exp), fitcol = brewer.pal(9,"Set1")[1:4],legendtext = c("lnorm","weibull", "gamma","exp"))
dev.off()

[image: pplot]

Clearly, the lognormal fits mydata best. But look at the legend of the plot: the colored line samples are missing and only the text labels are shown. What should I do?

I tried some datasets with only a few values, and the legend worked there, so the problem seems to be caused by the size of the data. What should I do to get a complete legend?

Mike Stockdale
Ling Zhang

1 Answer

Many questions about how a function behaves can be answered by calling fix() on it: this opens the function's source in an editor, so we can see how it works.

fix(ppcomp)
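
If you only want to read the source rather than edit it, base R has read-only ways to do so. The following is a minimal sketch, shown on graphics::legend (chosen here only because it ships with every R installation); the same calls should work on ppcomp once fitdistrplus is loaded.

```r
# Read-only inspection of a function's source and arguments.
# graphics::legend is used purely as an always-available example.
src <- deparse(graphics::legend)        # source code as a character vector
arg_names <- names(formals(graphics::legend))

# Does the function accept an lty argument?
has_lty <- "lty" %in% arg_names
print(has_lty)

# Show the first few source lines that mention lty
print(head(grep("lty", src, value = TRUE), 3))
```

Typing the function name on its own at the prompt (for example, ppcomp) also prints its source without opening an editor.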

In the source I found the code that draws the legend:

if (addlegend) {
    if (missing(legendtext)) 
      legendtext <- paste("fit", 1:nft)
    if (!largedata) 
      legend(x = xlegend, y = ylegend, bty = "n", legend = legendtext, 
        pch = fitpch, col = fitcol, ...)
    else legend(x = xlegend, y = ylegend, bty = "n", legend = legendtext, 
      col = fitcol, ...)
  }

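With large data the else branch is used, and that legend() call passes col but neither pch nor lty, so there is no symbol for the colors to apply to. I added lty = 1 to that legend() call, and it works.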
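
The effect of lty on a legend can be reproduced with base graphics alone. This is a rough sketch, not the ppcomp code itself: the colors, labels, and legend positions are placeholders, and pdf(NULL) is used only so that it runs without opening a plot window.

```r
# Placeholder labels and colors (not taken from the fitted objects)
labs <- c("lnorm", "weibull", "gamma", "exp")
cols <- c("red", "blue", "darkgreen", "purple")

pdf(NULL)  # null device: draw without creating a file or window
plot(0:1, 0:1, type = "n",
     xlab = "Theoretical probabilities", ylab = "Empirical probabilities")

# col alone: no pch or lty, so legend() draws only the text labels
leg_text_only <- legend("bottomright", bty = "n", legend = labs, col = cols)

# col plus lty = 1: each entry gets a colored line sample
leg_with_lines <- legend("topleft", bty = "n", legend = labs, col = cols,
                         lty = 1)
invisible(dev.off())
```

Replace pdf(NULL) with a real device such as tiff(...) to see the two legends side by side.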

Ling Zhang