
I have a .csv dataset that I'm trying to run a C5.0 decision tree on. It is a fake-news dataset that I modified to remove all special characters. This is my source code:

data <- read.csv("C:/Users/Admin/downloads/fnn_test.csv", stringsAsFactors = TRUE)
data <- na.omit(data)
data <- data[-1]
data <- data[-1]
str(data)

#converting the label column (LBL) to a factor
data$LBL <- factor(data$LBL, levels = c('fake', 'real'),
                               labels = c("Fake", "Real"))

#dividing dataset
data_train <- data[1:50, 1:5]
data_test <- data[51:70, 1:5]

str(data_train)

#defining seed and library
set.seed(123)
library(C50)

#training the C5.0 model
predModel <- C5.0(LBL ~ ., data_train)
summary(predModel)

When I run the decision tree, I get the error "c50 code called exit with value 1", and the summary says "line 17 of `undefined.names': overlength name: check data file formats". I replaced all special characters in the dataset because they caused different errors, and this is the one that remains.
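Since the message mentions an overlength name, one thing that might be worth checking is whether any factor column has very long level names, which can happen when free-text columns are read with `stringsAsFactors = TRUE`. The following is only a minimal sketch of such a check: the `toy` data frame and its `title` column are hypothetical stand-ins for the real file, and whether long levels are actually the cause here is an assumption.

```r
# Hypothetical toy data standing in for the real file
# (the 'title' column and its contents are assumptions, not the actual dataset)
toy <- data.frame(
  title = factor(c("short headline", strrep("very long headline ", 60))),
  LBL   = factor(c("fake", "real"))
)

# For every factor column, report the number of levels
# and the length in characters of the longest level name
factor_cols <- names(toy)[sapply(toy, is.factor)]
level_report <- data.frame(
  column    = factor_cols,
  n_levels  = sapply(factor_cols, function(col) nlevels(toy[[col]])),
  max_chars = sapply(factor_cols, function(col) max(nchar(levels(toy[[col]])))),
  row.names = NULL
)
print(level_report)
```

Running the same report on `data_train` instead of `toy` would show which columns, if any, carry unusually long level names.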

This is the download for the dataset: https://drive.google.com/file/d/112X0cnV7lwkUh8JawPD50_iJ58FJa-OH/view?usp=sharing


  • Have you looked at the previous question [C5.0 decision tree - c50 code called exit with value 1](https://stackoverflow.com/q/22803310/4752675) ? – G5W Jan 02 '21 at 16:16
  • The security on your share link is not set to public. – Ian Campbell Jan 02 '21 at 16:26
