
I am trying to fit a Bayesian model in R using the BAS package, with the following settings:

databas <- bas.lm(at_areabuilding ~ ., data = dataCOMMA, method = "MCMC", prior = "ZS-null", modelprior = uniform())

I am trying to predict the building area for a given state from the areas already recorded for that state under different zip codes. The model looks up the zip codes present in the data for a given state (using a state index) and returns a prediction.

Whenever I want to predict the area for a state, I provide this input:

UT <- data.frame(zip = 84321, loc_st_prov_cd = "UT", state_idx = 7)
predict_1 <- predict(databas, UT, estimator = "BMA", interval = "predict", se.fit = TRUE)
data.frame('state' = 'UT', 'estimated area' = predict_1$Ybma)

This gives me the output for that one state. Now suppose I have a list of states with their zip codes and I want to run my model (databas) on the whole list and get the predictions. Doing this one state at a time with the approach above would take too long. Is there another way? With help from another user, I wrote the following code:

 pred <- sapply(1:nrow(first), function(row) { predict(basdata,first[row, ],estimator="BMA", interval = "predict", se.fit=TRUE)$Ybma })

Here basdata is my model and first is the new dataset for which I am predicting the area. The issue I am facing is that this code takes a long time to run: it iterates over every row and computes the prediction for each one separately. There are 150000 rows in my dataset. Can anyone help me optimize the performance of this code?

  • I'm not exactly sure what you're asking. Do you mean that you want to get predictions for many combinations of a state and a zip code? If so, where are you getting those combinations from? Are they already in a list or a data frame? We're more likely to be able to help if you can be more specific about the problem you're trying to solve and the structure of your data. – eipi10 Jun 17 '20 at 22:31
  • Yes, I want to get predictions for many combinations of state, zip code and state index. I am currently reading these combinations from a csv file into a data frame. I am getting only 130000 predictions, although the file has 150000 rows! – Uttasarga Singh Jun 17 '20 at 22:34
  • In the 2nd code snippet, I use a variable UT to supply the zip, state name and state index. I have a csv file with many zips, state names and state indices, and I want to predict the area for each of them with my model, databas. – Uttasarga Singh Jun 17 '20 at 22:43

1 Answer


Something like this will iterate over each row of your data frame of states, zips and indices (let's call it states_and_zips) and return a list of predictions. Each element of this list (which I've called pred) goes with the corresponding row of states_and_zips:

pred <- lapply(seq_len(nrow(states_and_zips)), function(row) {
  # Pass one row of the data frame as newdata (a data frame, not a formula)
  predict(databas, newdata = states_and_zips[row, ],
          estimator = "BMA", interval = "predict", se.fit = TRUE)$Ybma
})

If Ybma is a single value, then use sapply instead of lapply and it will return a vector of predictions, one for each row of states_and_zips, which you can add as a new column to states_and_zips.
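If the row-by-row loop turns out to be too slow on a large data frame, it may help to call predict once on the whole data frame, since predict for bas objects takes a newdata data frame with any number of rows. Here is a minimal sketch on a toy data set. The names train, new_rows, x1 to x3 and estimated_area are made up for illustration, and the default sampling method is used instead of MCMC only to keep the example small:

```r
library(BAS)

set.seed(1)

# Toy training data (hypothetical; stands in for the real training set)
train <- data.frame(x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200))
train$y <- 1 + 2 * train$x1 - train$x2 + rnorm(200)

# Fit with the same prior settings as in the question
fit <- bas.lm(y ~ ., data = train, prior = "ZS-null", modelprior = uniform())

# New rows to predict for (hypothetical; stands in for states_and_zips)
new_rows <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))

# One call for all rows instead of one call per row
pred_all <- predict(fit, newdata = new_rows, estimator = "BMA")

# Ybma holds one prediction per row of newdata; store it as a plain column
new_rows$estimated_area <- as.numeric(pred_all$Ybma)
```

With your objects this would be roughly predict(databas, newdata = states_and_zips, estimator = "BMA")$Ybma. Leaving out se.fit = TRUE when only the point predictions are needed will probably save further time, but that is an expectation, not a measurement.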

eipi10
  • Ybma would be a list of predicted areas for the different states given as input. – Uttasarga Singh Jun 18 '20 at 01:01
  • Hello Sir, my data is stored as a data frame in noArea1, and basdata is my fitted model. I have used sapply instead of lapply as you suggested. I will let you know if this works; could you check whether what I have written is correct? pred <- sapply(1:nrow(noArea1), function(row) { predict(basdata,noArea1[row, ],estimator="BMA", interval = "predict", se.fit=TRUE)$Ybma }) – Uttasarga Singh Jun 18 '20 at 15:01
  • Hello, I used the code you wrote above to predict the area for a small data set of 20 entries. It worked! However, I am now running it on the original data set (150000 entries) and the code is still running. Should it take that long? – Uttasarga Singh Jun 18 '20 at 17:55