
Click here for the output.

I tried the code below to train an rpart model with caret and plot it, but only one leaf is formed. Can anyone tell me why this is happening? I created a caret trainControl object, trained an rpart model using the function shown below, and then plotted the final model with the prp function. The plot contains only a single leaf. The output I got is in the image linked in the first line above.

     structure(list(source = structure(c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 


                    7L, 7L, 7L), .Label = c("IN", "MA", "NR", 
                        "OT", "PA", "P", "R", 
                        "S", "U", "Z"), class = "factor"),age = structure(c(2L, 1L, 1L, 
2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L), .Label = c("L17", 
"U17"), class = "factor"), name = structure(c(3L, 2L, 2L, 
                        1L, 2L, 3L, 1L, 1L, 2L, 2L), .Label = c("f", "l", "s", 
                        "v", "z"), class = "factor"), success = structure(c(1L, 
                        1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor"), 
                            day = structure(c(6L, 6L, 7L, 7L, 5L, 5L, 1L, 1L, 1L, 1L), .Label = c("Friday", 
                            "Monday", "Saturday", "Sunday", "Thursday", "Tuesday", "Wednesday"
                            ), class = "factor"), country = structure(c(6L, 2L, 4L, 2L, 
                            2L, 4L, 1L, 2L, 7L, 2L), .Label = c("A", "C", 
                            "I", "Other", "S", "Ua", "U"
                            ), class = "factor")), row.names = c(NA, -10L), class = c("data.table", 
                        "data.frame"), .internal.selfref = <pointer: 0x0000000000101ef0>)





                   k<-ow
                            > str(k)
                           Classes ‘tbl_df’, ‘tbl’ and 'data.frame':    1898 obs. of  6 variables:
     $ source : Factor w/ 10 levels "I",..: 7 7 7 7 7 7 7 7 7 7 ...
     $ age    : Factor w/ 2 levels "L17","U17": 2 1 1 2 1 1 2 1 1 1 ...
     $ name   : Factor w/ 5 levels "f","l",..: 3 2 2 1 2 3 1 1 2 2 ...
     $ success: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
     $ day    : Factor w/ 7 levels "Fri","Monday",..: 6 6 7 7 5 5 1 1 1 1 ...
     $ country: Factor w/ 7 levels "A","C",..: 6 2 4 2 2 4 1 2 7 2 ...
                             - attr(*, ".internal.selfref")=<externalptr> 
                            > k.label<-k$success
                            > set.seed(37569)
                            > cv.3.folds<-createMultiFolds(k.label,k=3,times=10)
                            > ctrl.3<-trainControl(method = "repeatedcv",number = 3,repeats = 10,index=cv.3.folds)
                            > k.train.1<-k[,c("age","source","day")]
                            # I tried the rpart.oc function, which is defined further down
                            > k.cv<-rpart.oc(94622,k.train.1,k.label,ctrl.3)
                           Warning messages:
        1: In .Internal(gc(verbose, reset, full)) :
          closing unused connection 8 (<-activate.adobe.com:11086)
        2: In .Internal(gc(verbose, reset, full)) :
          closing unused connection 7 (<-activate.adobe.com:11086)
        3: In .Internal(gc(verbose, reset, full)) :
          closing unused connection 6 (<-activate.adobe.com:11086)
        4: In .Internal(gc(verbose, reset, full)) :
          closing unused connection 5 (<-activate.adobe.com:11086)
        5: In .Internal(gc(verbose, reset, full)) :
          closing unused connection 4 (<-activate.adobe.com:11086)
        6: In .Internal(gc(verbose, reset, full)) :
          closing unused connection 3 (<-activate.adobe.com:11086)
        7: Setting row names on a tibble is deprecated.  

                           > prp(k.cv$finalModel,type=0,extra=1,under=TRUE)



                            > View(rpart.oc)
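                            function(seed,training,labels,otrl){
                            # start a 6-worker socket cluster (snow) and register it for foreach (doSNOW)
                            ol<-makeSOCKcluster(6,type="SOCK")
                            registerDoSNOW(ol)
                            # fix the seed, then let caret tune rpart's cp over up to 30 candidate values
                            set.seed(seed)
                            rpart.oc<-train(x=training,y=labels,method="rpart",tuneLength=30,trControl=otrl)
                            # shut the workers down before returning the fitted train object
                            stopCluster(ol)
                            return(rpart.oc)
                            }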
  • We don't have any of your data (*actual* data, not just a printout of its structure) and we can't see any of your output. There's no way for anyone else to know what's going on here. – camille Apr 17 '19 at 20:34
  • @camille Can you take a look and help now? I added the dput of the data; the actual data is confidential. The output I got is in the image link. – raju varasala 1801166 Apr 18 '19 at 05:36

2 Answers

There are times when the CART process cannot find a split that predicts any better than an intercept only model (e.g. the sample mean for regression or mode for classification). It basically means that you have no informative predictors for the CART procedure.

For example:

library(rpart)

dat <- data.frame(y = 1:10, x = rep(1:2, 5))
rpart(y ~ x, data = dat)
#> n= 10 
#> 
#> node), split, n, deviance, yval
#>       * denotes terminal node
#> 
#> 1) root 10 82.5 5.5 *

Created on 2019-04-20 by the reprex package (v0.2.1)
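
Note that the ten-row example above is also smaller than rpart's default `minsplit = 20`, so no split would be attempted there whatever the predictor looks like. As a rough complementary sketch for the classification case, the made-up data below pairs an imbalanced outcome with a predictor generated independently of it. The names `toy`, `success`, and `age` are illustrative assumptions, not the asker's actual data. Cross-tabulating the outcome against a predictor is one quick way to see whether any level separates the classes, and `fit$frame` has one row per node, so a single row indicates a root-only tree.

```r
library(rpart)

set.seed(1)

# Hypothetical data: an imbalanced two-class outcome and a factor predictor
# drawn independently of it, so the predictor carries no real signal.
n <- 300
toy <- data.frame(
  success = factor(sample(c("0", "1"), n, replace = TRUE, prob = c(0.9, 0.1))),
  age     = factor(sample(c("L17", "U17"), n, replace = TRUE))
)

# Class balance of the outcome.
table(toy$success)

# Row-wise proportions: does either level of `age` favour class "1"?
prop.table(table(toy$age, toy$success), margin = 1)

# Fit a classification tree with rpart's default control settings.
fit <- rpart(success ~ age, data = toy, method = "class")

# One row per node; a single row means only the root was kept.
nrow(fit$frame)
```

If the real data look similar (for example, if nearly every `success` value is "0" within every level of `age`, `source`, and `day`), a single leaf from `prp()` would be the expected result rather than a bug.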

topepo
  • But when I tried to form a decision tree, there were some branches: fit <- rpart(survived~., data = data_train, method = 'class'); rpart.plot(fit, extra = 106) – raju varasala 1801166 Apr 22 '19 at 05:34

I had this same problem. Here is a link to the question I asked and received an answer to. Hopefully that helps.

Sintrias