-2

I would like your help implementing the calculation of an integral over two vectors. I have read pages on integral calculation in R, but I have little training in mathematics, so I still cannot do it myself.

My objective is to implement the idea in this sentence: "If you plot the rate estimates by position, the genetic map is just the integral of this plot." That is, I have two variables (rates, positions), and each position has its own rate. I want to calculate the integral of the rates at each position. The positions are monotonically increasing.

This task should not be too complex for those with a good background in mathematical computation. Could you please give me some directions on how to do it?

Thanks in advance.

# here I make dummy data

position <- c(2,34,58)
rate <- c(14, 20, 5)  
Andrie
jianfeng.mao
  • Erm, if by "integral" you mean the Riemann integral (http://en.wikipedia.org/wiki/Integral), then that's for functions that are continuous (or, to be technical, functions that have a countable number of discontinuities). So, I'm afraid that your question doesn't make a lot of sense as asked. – Sep 09 '11 at 08:07
  • It is not homework. It is in fact work. I need it for my research. Thanks for your reply. – jianfeng.mao Sep 09 '11 at 08:24
  • To Jack, I have tens of millions of pairs of such positions and rates. I have not fully grasped your meaning. Someone told me that I can calculate the integral using the trapezoidal rule. I have so little background knowledge on the issue that I have no idea about any of it. – jianfeng.mao Sep 09 '11 at 08:28
  • Sorry. I asked a naive question here, so I got a negative vote. But in fact I cannot see the reason for it. – jianfeng.mao Sep 09 '11 at 08:31
  • Reason is probably because we do not exactly know what you mean by calculating the integral of two vectors. What is the “genetic map”? – mzuba Sep 09 '11 at 09:58

1 Answer

4

In mathematics, a definite integral gives the area under a curve. In your example, you want the area under the curve defined by position and rate.

position <- c(2,34,58)
rate <- c(14, 20, 5)  

plot(position, rate, type="l", ylim=c(0, 25))

[Plot: rate against position, drawn as a line]

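You can calculate the area under the curve by hand, using the trapezoidal rule: multiply the width of each interval (32 and 24) by the mean of the two rates at its ends (17 and 12.5), then add the results: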

32*17 + 24*12.5 = 844
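For reference, the hand calculation is an instance of the general trapezoidal rule. Writing the positions as x_1, ..., x_n and the rates as y_1, ..., y_n (symbols introduced here for illustration, not taken from the question), it can be sketched as:

```latex
\int_{x_1}^{x_n} y \, dx \;\approx\; \sum_{i=1}^{n-1} (x_{i+1} - x_i)\,\frac{y_i + y_{i+1}}{2}
\qquad\text{so here}\qquad
(34 - 2)\,\frac{14 + 20}{2} + (58 - 34)\,\frac{20 + 5}{2} = 544 + 300 = 844
```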

Or, to do it programmatically:

library(zoo)  # provides rollmean()

# Area under the curve by the trapezoidal rule
AUC <- function(x, y){
  sum(diff(x)*rollmean(y,2))
}

AUC(position, rate)
[1] 844
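The question asks for the integral at each position rather than only the total. One possible sketch in base R, assuming the trapezoidal rule is acceptable and that the map is taken to start at 0 at the first position (`cum_auc` is a hypothetical helper name, not a function from any package):

```r
# Cumulative trapezoidal integral of y over x (hypothetical helper).
# Assumes x is sorted in increasing order and x, y have equal length >= 2.
cum_auc <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2, !is.unsorted(x))
  # Area of the trapezoid between each pair of consecutive points
  areas <- diff(x) * (head(y, -1) + tail(y, -1)) / 2
  # Running total, starting from 0 at the first position
  c(0, cumsum(areas))
}

position <- c(2, 34, 58)
rate <- c(14, 20, 5)

cum_auc(position, rate)
# [1]   0 544 844
```

The last element equals the total area (844). Because the function uses only vectorised base operations, it should also cope with long vectors, though that has not been measured here.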
Andrie