I am trying to identify the period of the year in which 80% of bird breeding observations fall. Typically I would answer a question like this by finding the median or quartiles, but how do I deal with situations where 80% of the observations run from day 285 through day 366 (leap year) and extend to day 30 of the following year?

The data are circular: day 366 is as close to day 1 as it is to day 365.

I am reading the manuals for the CircStats and circular packages, but I could really use some help on this.

My questions are: what is the shortest span of days that contains 50% of the observations, and what are the start day and end day of that period?

Here is some dummy data:

library(CircStats)
#dummy data
obsDay<-c(rep(1:30,10),rep(45:65,2),65:180,
    rep(181:265,2),rep(266:330,4),rep(331:366,6))
#density plot
plot(density(obsDay))
#convert data to Radians
obsRadians <- (obsDay/366*360)*pi / 180
#make a circular plot
circ.plot(obsRadians, stack=TRUE, bins=100,shrink=1.8)
daisy
  • This is an interesting question. I can't believe some ... downvoted it. – Mike Wise May 27 '15 at 10:47
  • Are you looking for the 50 percent or the 80 percent point? You have both in the question. – Mike Wise May 27 '15 at 10:50
  • Ah, you mean cross-validated. I didn't know it had two names... – Mike Wise May 27 '15 at 10:55
  • @mike I'm interested in calculating a variety of percentiles focused on the days where observations are most frequent. Many of my species have bimodal breeding patterns. Thanks for thinking it is an interesting question. – daisy May 27 '15 at 11:44
  • Well, you could write it as a double for-loop, one increasing the number of days in your interval, the other over all the possible start dates, and terminate when you get 50 percent. That would find your answer, though it is probably not the most efficient. – Mike Wise May 27 '15 at 11:49
  • I wonder if your obsDay does what you think it does. For example rep(45:65,2) means create two sequences from 45 to 65. In total obsDay has a length of 1104. Is that what you intended? I think not. – Mike Wise May 27 '15 at 12:27
  • It is what I intended. The numbers are the decimal day of the year when the first day of breeding was observed. – daisy May 27 '15 at 12:42
  • Post the plots too. Will help people understand the issue. – Mike Wise May 27 '15 at 12:46

1 Answer

I have figured out the solution. With the dummy data, the period in which 80% of the data are most dense runs from Oct 6 to July 23:

library(circular)
library(CircStats)
obsDay<-c(rep(1:30,10),rep(45:65,2),65:180,rep(181:265,2),rep(266:330,4),rep(331:366,6))
#density plot
plot(density(obsDay))
#convert data to Radians
obsRadians <-(obsDay/366*360)*pi / 180
#make a circular plot
circ.plot(obsRadians, stack=TRUE, bins=100,shrink=1.8)
quant <- quantile.circular(obsRadians, c(0.10, 0.90)) ## for interval of 80% of obs
start <- (quant[[1]]*180/pi)/360*366 # convert radians to days - Aug 27
end <- (quant[[2]]*180/pi)/360*366 # March 1
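
For the "shortest span of days" part of the question, a brute-force search like the double loop suggested in the comments is another option. The following is only a minimal sketch under stated assumptions: `shortest_window` is a hypothetical helper written for illustration (it is not part of circular or CircStats), and it assumes the observations are whole day-of-year values in 1..366.

```r
# Hypothetical helper (not from circular/CircStats): find the shortest
# wrap-around window of consecutive days holding at least a share p of obs.
# Assumes `days` are whole day-of-year values between 1 and year_len.
shortest_window <- function(days, p = 0.5, year_len = 366) {
  counts <- tabulate(days, nbins = year_len)  # observations per day of year
  cs <- c(0, cumsum(c(counts, counts)))       # doubled so windows can wrap past the last day
  target <- p * length(days) - 1e-9           # small tolerance for floating-point products
  starts <- seq_len(year_len)                 # every possible start day
  for (w in seq_len(year_len)) {              # window length, shortest first
    in_window <- cs[starts + w] - cs[starts]  # obs in days start .. start + w - 1
    if (max(in_window) >= target) {
      s <- which.max(in_window)               # first start day reaching the maximum
      e <- (s + w - 2) %% year_len + 1        # last day, wrapped back into 1..year_len
      return(list(start = s, end = e, n_days = w,
                  prop = max(in_window) / length(days)))
    }
  }
}

# Usage with the dummy data from the question
obsDay <- c(rep(1:30, 10), rep(45:65, 2), 65:180,
            rep(181:265, 2), rep(266:330, 4), rep(331:366, 6))
res <- shortest_window(obsDay, p = 0.8)
print(res)
```

If several windows of the same length tie, this returns the one with the earliest start day; an end day smaller than the start day means the window wraps around the end of the year.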
daisy