I'm trying to fit a C5.0 decision tree to a .csv dataset. It is a fake-news dataset that I modified to remove all special characters. This is my source code:
data <- read.csv("C:/Users/Admin/downloads/fnn_test.csv", stringsAsFactors = TRUE)
data <- na.omit(data)
# drop the first two columns
data <- data[-(1:2)]
str(data)
# convert the label to a factor
data$LBL <- factor(data$LBL, levels = c("fake", "real"),
                   labels = c("Fake", "Real"))
# split into training and test sets
data_train <- data[1:50, 1:5]
data_test <- data[51:70, 1:5]
str(data_train)
# load the library and set the seed
library(C50)
set.seed(123)
# fit the model
predModel <- C5.0(LBL ~ ., data = data_train)
summary(predModel)
When I fit the decision tree I get the error "c50 code called exit with value 1", and the summary says "line 17 of `undefined.names': overlength name: check data file formats". I had already replaced all special characters in the dataset because they caused other errors, and this is the one error that remains.
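One thing I have not ruled out is that the text columns themselves are the problem. Because I read the file with stringsAsFactors = TRUE, every distinct title or article body becomes a factor level, and C5.0 might be rejecting levels that are too long. That is only a guess on my part. A minimal sketch of how this could be checked, using a made-up toy data frame (the column names title and text are hypothetical stand-ins, not the real columns of my file):

```r
# Hypothetical toy data standing in for the real file;
# the column names "title" and "text" are assumptions for illustration only.
toy <- data.frame(
  title = c("short title", "another title"),
  text  = c(strrep("a", 1500), "brief body"),
  LBL   = c("fake", "real"),
  stringsAsFactors = TRUE
)

# For every factor column, report the length of its longest level.
max_level_len <- sapply(
  Filter(is.factor, toy),
  function(col) max(nchar(levels(col)))
)
print(max_level_len)
```

Running the same sapply() call on data_train instead of toy would show whether any column has unusually long factor levels.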
The dataset can be downloaded here: https://drive.google.com/file/d/112X0cnV7lwkUh8JawPD50_iJ58FJa-OH/view?usp=sharing