3

I need to put in code a piecewise function and store generated values in a data frame. The rules are the following:

  • I have an object X that is generated by a Bernoulli(1/3).
  • If X=0, another object, Y, is generated by E = Exponential(1).
  • If X=1, Y is generated by E if E <= P, and by (P + EL) if E > P, where P is a constant (1, for example) and EL = Exponential(Lambda) independent from E.
  • I want to generate a data frame with 100 samples of X and Y obtained using this method, and additionally, do this process 10000 times (or in other words, generate 10000 data frames).

I tried to do something like this, but as I'm a total novice using R, I couldn't figure out how to define properly each element, and obviously, how to store my results in a data frame.

Here's the "code" I did:

test <- function(lambda,p) {
for (i in 1:10000) { # Number of simulations that I want to do.
for(j in 1:100) { # Sample size of each simulation / data frame.
  x <- rbinom(1,1,1/3)
  e <- rexp(1,1)
  if (x==0) {y <- e}
  else {
    if (e<=p) {y <- e}
  else {y <- p + rexp(1, lambda)}
    }
  }
}    

But even before testing I know it's impossible that it works properly. The data frame that I want to make only may contain values for X and Y.

I know this could be a very basic question, so thank you very much for your answers.

anxoestevez
  • 190
  • 2
  • 18

1 Answers1

3

I think:

simfun <- function(n=100,lambda=1,P=1,ret.df=TRUE) {
    X <- rbinom(n,prob=1/3,size=1)
    E <- rexp(n,1)
    EL <- rexp(n,lambda)
    Y <- ifelse(X==0,E,
                ifelse(E<=P,E,P+E*EL))
    ## return data frame or matrix
    if (ret.df) data.frame(X,Y) else cbind(X,Y)
}
system.time(res <- replicate(1e4,simfun(),simplify=FALSE))
## 6 seconds

## or:
library(plyr)
system.time(res2 <- raply(1e4,simfun(ret.df=FALSE),.progress="text")  
## returns a 1e4 x 100 x 2 array, maybe more convenient: use
## rlply instead of raply (and ret.df=TRUE) to match the previous
## results
Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
  • +1. Many thanks to you! But, what about the way each data frame is generated? I mean... If I need to apply a test for every data frame, what will be the name or indicator for each one? And another thing. If I want to do Z simulations, I only need to put this function between `{}`in a `for (i in 1:Z)` argument? How about "evaluate" each data frame generated? – anxoestevez Jul 19 '13 at 01:59
  • 1
    you can `lapply()` to apply a test for every data frame, or index them as `res[[i]]` where `i` goes from 1 to 1e4 ... as for Z simulations -- do you need yet *another* level of looping ?? The code above is already generating 10,000 data frames ... – Ben Bolker Jul 19 '13 at 02:03
  • I get it! Yesterday I asked you this because I couldn't check how works the code you wrote, and I misinterpreted it. Thanks! – anxoestevez Jul 19 '13 at 07:29