2

I have a dataset with the number of occurrences by month and would like to calculate the centre of the distribution/mean month. If possible, I would also like to have confidence intervals.

I have read the manuals for circular and CircStats and have looked at similar questions here and here.

I managed to get a seemingly sensible result in some cases but not in others and have not yet figured out how to calculate the confidence interval.

To illustrate my point, here is some dummy data:

library(CircStats)

# The number of observations by month (Jan-Dec):
obsMonths1 <- c(12,15,1,2,3,1,1,4,1,2,7,1) 
obsMonths2 <- c(1,1,1,1,2,10,11,2,1,1,2,1)

# Convert data to radians:
obsRadians1 <- (obsMonths1/12*2)*pi
obsRadians2 <-(obsMonths2/12*2)*pi

# Calculate circular mean:
mean1 <- circ.mean(obsRadians1-1)#assume January is 0
mean2 <- circ.mean(obsRadians2-1)#assume January is 0

# Convert radians to months:
mean1*12/(2*pi)+12 
mean2*12/(2*pi)+12 

For the first set of observations the answer seems sensible, but for the second set of observations it should be July-August.

Ben Bolker
  • 211,554
  • 25
  • 370
  • 453

2 Answers2

0

It looks like your code has no issue. You asked "confidence interval", but it doesn't seem like you have done any statistical tests or bootstrapping procedures. I am not sure how you would calculate confidence interval without these results first. If you're sure that you've done the right thing, you might want to raise this question in a math/stat-specific StackExchange community.

David C.
  • 1,974
  • 2
  • 19
  • 29
  • Thanks for the bootstrapping suggestion - I'm looking into that. I agree that the code runs without error, but the answer for the second set of dummy data isn't correct. Any insight in where I'm going wrong? –  Feb 05 '17 at 13:46
  • I am not entirely sure why you didn't get the correct result. I would suggest that you make sure `obsRadians2` and `mean2` are appropriately defined, as the date/year conversion could be error-prone. – David C. Feb 05 '17 at 17:17
0

tl;dr I agree that your calculations are basically correct; I think it's your intuition that's wrong, as I illustrate below with some pictures.

library(CircStats)
# The number of observations by month (Jan-Dec):
obsMonths1 <- c(12,15,1,2,3,1,1,4,1,2,7,1) 
obsMonths2 <- c(1,1,1,1,2,10,11,2,1,1,2,1)

I did the conversion to and from radians slightly differently; using the modulo (%%) operator automatically converts 12 to zero. I added 11 to make January==0 but keep everything positive ...

to_rad <- function(x) (x+11 %% 12)/12*2*pi
## check results
stopifnot(to_rad(1)==0,to_rad(7)==pi,to_rad(4)==pi/2,to_rad(10)==3*pi/2)

And convert back:

from_rad <- function(x) (12/(2*pi)*x)+1
## check round-trip with an arbitrary number
stopifnot(isTRUE(all.equal(from_rad(to_rad(7.931)),7.931)))

Conversion:

(m1 <- from_rad(circ.mean(to_rad(obsMonths1))))  ## 1.77
(m2 <- from_rad(circ.mean(to_rad(obsMonths2))))  ## 0.93

Bootstrapping code:

bootquant <- function(x,n=1000,alpha=0.05) {
    bootsamp  <- replicate(n,
                           from_rad(circ.mean(to_rad(sample(x,replace=TRUE)))))
    qq <- quantile(bootsamp,c(alpha/2,1-alpha/2))
    names(qq) <- c("lwr","upr")
    return(qq)
}
(bq1 <- bootquant(obsMonths1))
##      lwr      upr 
## 1.076794 2.670130 
(bq2 <- bootquant(obsMonths2))
##      lwr      upr 
## 0.231873 1.414766 

I'm not sure I would trust bootstrapping for such small data sets; you might also check the ?circ.disp function from CircStats ...

library(ggplot2)
dd <- data.frame(OM1=obsMonths1,OM2=obsMonths2)
ggplot(dd,aes(x=OM1,y=1))+stat_sum()+coord_polar()+
    scale_x_continuous(limits=c(0,12), breaks=c(0,3,6,9,12))+
    annotate(geom="point",y=1,x=m1,colour="red")+
    annotate(geom="segment",x=bq1[["lwr"]],xend=bq1[["upr"]],y=1,yend=1,colour="red")

ggplot(dd,aes(x=OM2,y=1))+stat_sum()+coord_polar()+
    scale_x_continuous(limits=c(0,12), breaks=c(0,3,6,9,12))+
    annotate(geom="point",y=1,x=m2,colour="red")+
    annotate(geom="segment",x=bq2[["lwr"]],xend=bq2[["upr"]],y=1,yend=1,colour="red")

enter image description here

Ben Bolker
  • 211,554
  • 25
  • 370
  • 453