The summary(model) command never completes.

K-fold cross-validation was run as follows (R caret package):

train_control <- trainControl(method="repeatedcv", number=10, repeats=0)  
model <- train(as.factor(OneGM) ~., data=OneT, trControl=train_control, method="C5.0")

The dataset has 3,000 rows of 6 attributes. The train command completes in 10 seconds. When using a single model trained with C5.0 directly, not using caret, the model accuracy is 68%.

How do I debug why the summary(model) command won't complete? The RStudio Stop icon is available, and it works. The command does not complete within 30 minutes.

UPDATE: summary(model$finalModel) never completes either.

Thanks for any help.

  • can you do cat(model$finalModel$output), do you see anything? – StupidWolf Apr 07 '20 at 21:28
  • Yes. It lists a few pages of rules, followed by "Evaluation on training data" and attribute usage percentages. –  Apr 08 '20 at 15:19
  • Yeah, this means the fit was OK. Basically, if you do summary(model$finalModel), it should give you something like cat(model$finalModel$output) – StupidWolf Apr 08 '20 at 15:23
  • After 30 seconds, it hasn't responded. The response to your first suggestion was immediate. –  Apr 08 '20 at 15:29
  • CPU usage on RStudio is running at 17%-20% constantly. Interestingly, Comodo Firewall is running at the same rate, in parallel, and I've never seen that before (with Comodo). I'm going to look at the firewall logs. –  Apr 08 '20 at 15:33
  • There are no factors with a lot of levels. The target is binary. The inputs are integers and percentages, exclusively. –  Apr 08 '20 at 15:34
  • I don't know what summary(model$finalModel) is supposed to list, so I don't know if cat... is giving me all that I need. –  Apr 08 '20 at 15:35
  • All of the Comodo log types were empty - which I have NEVER seen before. I restarted Comodo, and it's started making logs again. Wow. It appears that something in RStudio was hammering the firewall. I have basic RStudio set to allow internet connections, and I don't recall seeing something I denied. –  Apr 08 '20 at 15:38
  • You can refit the model using C5.0 with the optimal parameters. It gives you the same fit. You can run summary on that to see what you need. Unfortunately, that's all I can suggest, because I cannot reproduce your error – StupidWolf Apr 08 '20 at 15:39
  • I'll try that. Do you happen to know the setting to just get the output from caret that tells me the ten-fold cross validation, without optimizing C5.0? (I do like the optimization feature.) –  Apr 08 '20 at 15:43
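
The two suggestions in the comments (refit with C5.0 directly, and get the ten-fold cross-validation result without tuning) could look roughly like the sketch below. It is only an illustration under assumptions: the toy data frame and the names toy, ctrl, grid, fit_cv and fit_direct are made up here as stand-ins for OneT and the original objects, and the one-row tuneGrid values (trials = 1, model = "tree", winnow = FALSE) are just one possible fixed setting, not the asker's. Passing a single-row tuneGrid means caret evaluates only that setting, so no parameter search takes place.

```r
library(caret)  # train() with method = "C5.0" also needs the C50 package
library(C50)

set.seed(1)

# Hypothetical stand-in for OneT: numeric inputs and a binary target.
n <- 300
toy <- data.frame(
  x1 = rnorm(n),
  x2 = runif(n),
  x3 = sample(1:10, n, replace = TRUE)
)
toy$OneGM <- factor(ifelse(toy$x1 + toy$x2 + rnorm(n, sd = 0.5) > 0.5, "yes", "no"))

# Plain 10-fold cross-validation.
ctrl <- trainControl(method = "cv", number = 10)

# One-row grid: caret evaluates only this C5.0 setting (assumed values).
grid <- data.frame(trials = 1, model = "tree", winnow = FALSE)

fit_cv <- train(OneGM ~ ., data = toy, method = "C5.0",
                trControl = ctrl, tuneGrid = grid)

fit_cv$results   # cross-validated Accuracy and Kappa for the single setting
fit_cv$resample  # per-fold Accuracy and Kappa

# Refit with C5.0 directly, using the parameters caret settled on,
# and call summary() on that object instead of on the caret model.
best <- fit_cv$bestTune
fit_direct <- C5.0(OneGM ~ ., data = toy,
                   trials = best$trials,
                   rules = best$model == "rules",
                   control = C5.0Control(winnow = best$winnow))
summary(fit_direct)
```

With a tuned model, the same refit should work by taking best from model$bestTune of the original train() result.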
