Concatenate the column names of a list to build a formula for rpart?

I want to concatenate names(log_data), where log_data is a list of 60 distinct vectors, into a form I can use in an rpart formula in R, like rpart(A ~ B + C + D + E, log_data). That is, I want to build formula="A~B+C+D+E" as a single string, where A, B, C, D, E are the column names extracted from log_data. Or is there a better way to get a tree from the list?
I have tried:

a <- names(log_data)  
rpart(a[1] ~ a[2] + a[3] + a[4], log_data)

but I get an error:

Error in paste(temp, yprob[, i], sep = " ") : subscript out of bounds

where

a[2]

[1] "X.u.crpice..vin20f1..vol.vin20f1v1.r_credit_credshare2...91...90."

a[3]

[1] "X.u.crpice..vin20f1..vol.vin20f1v1.r_credit_credshare2...92...90."

I also tried:

c<-paste(a[1], "~", sep="")

rpart_formula <- as.formula(paste(c, paste(a[2:60], collapse = " + "), sep = ""))

rpart(rpart_formula,log_data)

This seems to go into an infinite loop in rpart, perhaps because the column names are too long or because n = 60.

Can I assign shorter column names with colnames(log_data) <- c(?)? What should I put in place of "?" so that the tree is easy to draw for n = 60?

blahdiblah
Aashu
  • `o(1)`? I assume you mean `O(1)` – amit Nov 24 '12 at 20:43
  • Yes, sir, but it would be OK if I got any suitable method to implement it. – Aashu Nov 24 '12 at 20:45
  • does your new solution work OK for a smaller set of predictors? At what length does it break down? Does `rpart(response~.,data=log_data)` work? (How do you know it's in an infinite loop and not just taking a long time?) – Ben Bolker Nov 24 '12 at 21:48
  • Yes, it works fine if I use short column names and a small number of columns in the R formula. – Aashu Nov 24 '12 at 21:51
  • Actually, I want to replace the long column names in the file with ordered characters, so that there is no array-bounds problem. Thank you. – Aashu Nov 25 '12 at 06:50
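
One way to tell an infinite loop from a slow fit, as asked in the comments, is to time rpart on progressively more predictors and watch how the run time grows. This is only a rough sketch on synthetic data: the data frame, the column names c1 to c12, and the subset sizes are all made up for illustration and are not the asker's log_data.

```r
library(rpart)

# Synthetic stand-in for log_data (assumption: numeric columns, short names).
set.seed(42)
n <- 300
p <- 12
log_data <- data.frame(matrix(rnorm(n * p), ncol = p))
names(log_data) <- paste0("c", seq_len(p))

# Fit with 3, 6, and 11 predictors and record the elapsed time of each fit.
sizes <- c(3, 6, 11)
timings <- sapply(sizes, function(k) {
  form <- reformulate(names(log_data)[2:(k + 1)], response = "c1")
  system.time(rpart(form, data = log_data))[["elapsed"]]
})

print(data.frame(predictors = sizes, seconds = timings))
```

If the time rises steadily with the number of predictors, the full 60-column fit is probably just slow rather than stuck.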

1 Answer

I believe you want

shortnames <- paste0("c",seq(ncol(log_data)))
names(log_data) <- shortnames
form <- reformulate(paste(shortnames[2:4],collapse="+"),
                    response=shortnames[1])
rpart(form,log_data) 
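
To extend this to all of the predictors, here is a minimal sketch under some assumptions: log_data is (or has been converted with as.data.frame() to) a data frame, and the toy data and long column names below are invented stand-ins for the real ones. A formula is not evaluated before it is used, so a[1] ~ a[2] + a[3] refers to the expressions a[1], a[2], a[3] themselves rather than to the columns they name, which is why the formula has to be built from the strings. The name_key lookup is a hypothetical helper for recovering the original names when reading the tree.

```r
library(rpart)

# Toy stand-in for log_data; the long column names are made up.
set.seed(1)
log_data <- data.frame(matrix(rnorm(200 * 6), ncol = 6))
names(log_data) <- paste0("X.u.some.very.long.name...", 91:96, "...90.")

# Keep a lookup from short names back to the original names.
orig_names <- names(log_data)
shortnames <- paste0("c", seq_along(orig_names))
name_key <- setNames(orig_names, shortnames)
names(log_data) <- shortnames

# reformulate() accepts a character vector of terms, so every predictor
# can be passed at once without paste(..., collapse = "+").
form <- reformulate(shortnames[-1], response = shortnames[1])
fit <- rpart(form, data = log_data)

# Equivalent shorthand: "." stands for every column except the response.
fit_dot <- rpart(c1 ~ ., data = log_data)

# Translate a short name back to the original column name.
name_key[["c2"]]
```

With the real data, shortnames[-1] covers columns 2 to 60, and name_key lets the short labels in the plotted tree be mapped back afterwards.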
Ben Bolker
  • Sir, I got this, but it goes into an infinite loop. I have also implemented this: `c<-paste(a[1], "~", sep="")` `rpart_formula <- as.formula(paste(c, paste(a[2:60], collapse = " + "), sep = ""))` – Aashu Nov 24 '12 at 21:30
  • hard to say what's going wrong without a reproducible example. Does your alternative solution also fail, or did that work? – Ben Bolker Nov 24 '12 at 21:41
  • No, it did not, maybe because of the array-bounds error; I was also unable to change the column names to integers. – Aashu Nov 24 '12 at 21:54
  • if this works for you, would you accept it? If you came up with your own solution, you can post it as an answer (it's acceptable to answer your own question ...) – Ben Bolker Nov 27 '12 at 16:17
  • I don't know what the exact problem was, but when I restarted, it worked fine. Thank you, sir. – Aashu Dec 15 '12 at 13:26