
I am doing panel data analysis with 17 variables in R using the package "plm".
I need to eliminate some of these variables while retaining the most significant ones. I am using adjusted R-squared to find the set of variables that best explains my dependent variable. With 17 variables, refitting the model and comparing the results by hand has become cumbersome. This is my code:

attach(pdf) 
pdata <-plm.data(pdf,index=c("country","day")) 
Y <- cbind(DEP_var) 
var_list <- pdf[c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q")]
between_models= list() 
R_Sqrt=c()
for(i in 1:17){
X<-cbind(var_list[,1:i])
between_models[i]=plm(Y~ X, data=pdata, model= "between")
R_Sqrt[i]=coef(between_models[i])["Adj. R-Squared"]
}
print(paste("Model with highest Adj. R-Squared is", which.max(R_Sqrt)))
print(between_models[[which.max(R_Sqrt)]]) # print the model with the highest Adj. R-Squared

What I am trying to do with the above code is add one more variable to X on each iteration and re-estimate the between model, until X contains all 17 variables. Then I want to look at the list of adjusted R-squared values and print the summary of the model with the highest adjusted R-squared. When I run the above code, it gives the following error:

Error in model.frame.default(terms(formula, lhs = lhs, rhs = rhs, data = data,  : invalid type (list) for variable 'X'

In the for loop above, there seems to be a problem with the type of the variable X. Please suggest how to fix it so that the loop runs properly and returns the model with the highest adjusted R-squared.
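
For reference, here is a minimal sketch of how the loop might be written so that it runs. The error most likely arises because cbind() applied to a data frame returns a data frame (a list), which the formula interface cannot treat as a single variable. Building the formula from column names with reformulate() avoids this. The simulated data below is a hypothetical stand-in for the real data set (only the column names come from the question), pdata.frame() is used in place of plm.data(), which newer plm versions deprecate, and the adjusted R-squared is read from the r.squared element of the model summary.

```r
library(plm)

# Hypothetical stand-in data: 30 countries observed over 10 days,
# predictors A..Q and response DEP_var (values are simulated).
set.seed(1)
vars <- LETTERS[1:17]
pdf <- data.frame(country = rep(sprintf("c%02d", 1:30), each = 10),
                  day     = rep(1:10, times = 30))
pdf[vars] <- as.data.frame(matrix(rnorm(nrow(pdf) * length(vars)),
                                  ncol = length(vars)))
pdf$DEP_var <- 2 * pdf$A - pdf$B + rnorm(nrow(pdf))

pdata <- pdata.frame(pdf, index = c("country", "day"))

between_models <- vector("list", length(vars))
adj_r2 <- numeric(length(vars))

for (i in seq_along(vars)) {
  # Build DEP_var ~ A + B + ... from the first i column names
  fml <- reformulate(vars[1:i], response = "DEP_var")
  # Use [[i]] so the whole model object is stored in the list
  between_models[[i]] <- plm(fml, data = pdata, model = "between")
  adj_r2[i] <- summary(between_models[[i]])$r.squared["adjrsq"]
}

best <- which.max(adj_r2)
print(paste("Model with highest adjusted R-squared is", best))
print(summary(between_models[[best]]))
```

Note that this only compares the 17 nested models in the order the variables are listed, so the result depends on that ordering. It does not search over all possible subsets.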

Polar Bear
  • How would one generally write a loop for finding the best feature model based on adjusted R square if it was a simple multivariate regression? – Polar Bear Apr 25 '16 at 03:22
  • It seems that there is a problem with the type of the variable X. Please suggest how to fix it. – Polar Bear Apr 25 '16 at 04:44
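
On the first comment's question about ordinary regression, the same pattern could be sketched with lm(). This is only an illustration: the built-in mtcars data set and the choice of predictors are assumptions standing in for the real data, and the loop again compares nested models in a fixed order rather than all subsets.

```r
# Built-in mtcars data stands in for the real data set (an assumption)
response   <- "mpg"
predictors <- c("wt", "hp", "disp", "drat", "qsec")

models <- vector("list", length(predictors))
adj_r2 <- numeric(length(predictors))

for (i in seq_along(predictors)) {
  # Formula grows by one predictor per iteration: mpg ~ wt, mpg ~ wt + hp, ...
  fml <- reformulate(predictors[1:i], response = response)
  models[[i]] <- lm(fml, data = mtcars)
  adj_r2[i]   <- summary(models[[i]])$adj.r.squared
}

best <- which.max(adj_r2)
print(formula(models[[best]]))
print(adj_r2[best])
```

Adjusted R-squared is only one possible criterion. Base R's step() performs a similar stepwise search using AIC, which may be worth comparing against.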
