0

I'm building the pdf of a random variable and, in order to do so, I need to compute the integral of a function. The pdf looks like the following where I have intentionally separated the exponentials so to make it easier to read and distinguish the parts that depend from the x under integration.

enter image description here

The values of alpha, beta and miu are know and the summations are done on the values of t_k which are also given and known as

enter image description here

I have read countless posts, ppt presentations, etc about how to use vectorize in R but I have to admit that I still don't understand what's going on there.

I thought I had fixed the problem in my code but when I run it using the dataset available here (https://onedrive.live.com/redir?resid=BBBAA03E98D384C6!3542&authkey=!AF8x7Wudm0LTubc&ithint=file%2ccsv)

test<-probability.function(c(0.1, 5, 19), test.times, 30, 0.1)

it returns warning messages telling me that

Warning messages:
1: In values[i] <- integrate(Vectorize(integrand, vectorize.args = "x"),  ... :
number of items to replace is not a multiple of replacement length$

which makes me believe that I still haven't vectorized my function properly. Any suggestions on what I'm doing wrong? Also, where can I find a clear explanation (possibly with several examples) of why we need to vectorize a function in order to run integrate? I would really appreciate!

## Pdf function for the random variable (1-D)

probability.function=function(paramss, event.times, length.interval, granularity.interval) {

tt<-sort(event.times)
li<-length.interval
gi<-granularity.interval  
s<-seq(tail(tt,1), tail(tt,1)+li, by=gi) #seed.times on which I compute the value of the pdF (starting point is the last occurence time)

parr1<-paramss[1]
parr2<-paramss[2]
parr3<-paramss[3]

integrand <- function(x, top, params, data, seed.time) {

    t<-sort(data)
    par1<-params[1]
    par2<-params[2]
    par3<-params[3]
    top<-top
    s<-seed.time

    value <- (par1 + sum(par2*exp(-par3*(x - t)))) * 
             (par1 + sum(par2*exp(-par3*(top - t))) + par2*exp(-par3*(top - x))) * 
             exp(-par1*(x-tail(t,1))) * 
             exp(-par2/par3*sum(exp(-par3*(tail(t,1)-t)) - exp(-par3*(top-t)))) *  
             exp(-par2/par3*(1-exp(-par3*(top-x))))            
    return(value)
}

values<- rep(0,length(s))
for(i in 1:length(values)) {
    values[i] <- integrate(Vectorize(integrand, vectorize.args='x'), lower=tail(tt,1), upper=s [i], top=s[i], params=c(parr1, parr2, parr3), data=tt, seed.time=s[i])
}

return(values)
}
g_puffo
  • 613
  • 3
  • 11
  • 21
  • `integrate` needs to be able to accept a vector of x-values and generate a vector of y-values ... hence the need for the integrand to be "vectorized". – IRTFM Nov 08 '14 at 23:09
  • @BondedDust I understand that there are two ways of making sure that a user-defined function is "vectorized": one is to use the "Vectorize" command (like I have tried doing) and the other is to re-write the function, correct? But what does it mean to re-write the function in a "vectorized" form? isn't the equation of the function the same regardless of where it is evaluated? – g_puffo Nov 08 '14 at 23:23
  • If you use all vectorized functions in the expression, then the returned object will be a vector of the same length. (`sum` is not 'vectorized' but `+` is.) – IRTFM Nov 08 '14 at 23:27
  • So, the difference is that if I have a vector, say `B<-c(0.7,0.3,0.7,0.4)` doing `B+1` will add `1` to every element of `B` and still return a vector, while if I do `sum()` on a vector it will return me a number? – g_puffo Nov 08 '14 at 23:45
  • ha, do you actually mean that if I do `sum(c(1,2,3), c(5,6,7))` I get a number rather than a vector? – g_puffo Nov 08 '14 at 23:53
  • So, if I understand the problem correctly, and after doing a test with a very simple function, re-writing `sum()` with a simple `for` loop seems to do the job, right!? The problem is that I have 3 `sum()` in my integrand which means that I would have to make 3 nested loops? – g_puffo Nov 09 '14 at 00:49
  • I will say that using `sort(x)` and then taking `tail(,1)` makes me wonder why you aren't using `max`? Also not clear why sum() is in there. The 'x' in the integrand is supposed to be one value. – IRTFM Nov 09 '14 at 04:49
  • Sample input for all your function variables and the desired output would help make this easier for programmers that are rusty on their math to help. – ARobertson Nov 09 '14 at 17:47
  • @BondedDust: would using `max` make it faster or better in some way? Also, I'm not sure I understand your remark on `x` and `sum`: the function is a summation of exponentials that have the variable `x` in it... @ARobertson: good idea! I will do it tomorrow as soon as I get back to the office! – g_puffo Nov 10 '14 at 03:38
  • Well, tt's good to know you're not taking your work home on the weekends. – IRTFM Nov 10 '14 at 05:05
  • BondeDust: I never do! I don't want to become a 65 year old frustrated-californian posting snarky remarks online ;) – g_puffo Nov 10 '14 at 05:25
  • @ARobertson: I have edited the post so to make it more readable and re-written my R code so to make it more clear; I have also posted a link to the datafile that one can use to run my code. Unfortunately, I can't post a "desired" output since I will have to compare my output to a theoretical pdf and see if they match... so there isn't a clear "desired" output. – g_puffo Nov 10 '14 at 21:03
  • To remove the error and see where the/a problem is, you can add [1] at the end of the values[i] <- integrate...) line. values[i] is expecting one value each iteration, and your function sends it an entire vector but only takes the first and warns you that it is doing so. I doubt that you need to use the Vectorize function. Take a look at the third ## in the integrate() examples. The last 2 lines are both vectorized though the second does not need the Vectorize function. – ARobertson Nov 11 '14 at 20:22

0 Answers0