1

I am trying to predict the price of used cars in R. I have done all the preprocessing and split the data into training and test sets. I am fitting a regression tree with rpart. When I tried to compute the accuracy, I got the error shown below.

library(rpart)
library(tidyverse)
library(dplyr)
dput(head(train.df, 5))

reg_tree <- rpart(price ~ ., 
            data = train.df,
            method = "anova", minbucket = 1, maxdepth = 30, cp = 0.001)

accuracy(predict(reg_tree, train.df), train.df$price)
structure(list(price = 33990, year = 2018L, manufacturer = structure(1L, .Label = c("acura", 
"alfa-romeo", "aston-martin", "audi", "bmw", "buick", "cadillac", 
"chevrolet", "chrysler", "datsun", "dodge", "ferrari", "fiat", 
"ford", "gmc", "harley-davidson", "honda", "hyundai", "infiniti", 
"jaguar", "jeep", "kia", "land rover", "lexus", "lincoln", "mazda", 
"mercedes-benz", "mercury", "mini", "mitsubishi", "nissan", "pontiac", 
"porsche", "ram", "rover", "saturn", "subaru", "tesla", "toyota", 
"volkswagen", "volvo"), class = "factor"), condition = structure(4L, .Label = c("excellent", 
"fair", "good", "like new", "new", "salvage"), class = "factor"), 
    cylinders = structure(6L, .Label = c("10 cylinders", "12 cylinders", 
    "3 cylinders", "4 cylinders", "5 cylinders", "6 cylinders", 
    "8 cylinders", "other"), class = "factor"), fuel = structure(3L, .Label = c("diesel", 
    "electric", "gas", "hybrid", "other"), class = "factor"), 
    odometer = 22267, title_status = structure(1L, .Label = c("clean", 
    "lien", "missing", "parts only", "rebuilt", "salvage"), class = "factor"), 
    transmission = structure(1L, .Label = c("automatic", "manual", 
    "other"), class = "factor"), drive = structure(2L, .Label = c("4wd", 
    "fwd", "rwd"), class = "factor"), size = structure(3L, .Label = c("compact", 
    "full-size", "mid-size", "sub-compact"), class = "factor"), 
    type = structure(4L, .Label = c("bus", "convertible", "coupe", 
    "hatchback", "mini-van", "offroad", "other", "pickup", "sedan", 
    "SUV", "truck", "van", "wagon"), class = "factor"), paint_color = structure(10L, .Label = c("black", 
    "blue", "brown", "custom", "green", "grey", "orange", "purple", 
    "red", "silver", "white", "yellow"), class = "factor")), row.names = 31113L, class = "data.frame")


Error in UseMethod("accuracy") : 
  no applicable method for 'accuracy' applied to an object of class "c('double', 'numeric')"

Could anyone please help me with this?

Thanks in advance.

rishi
  • Can you post sample data? Please edit **the question** with the output of `dput(train.df)`. Or, if it is too big with the output of `dput(head(train.df, 20))`. And please load the packages you are using with calls to `library()` in the beginning of the script. – Rui Barradas Apr 27 '21 at 19:53

2 Answers

0

I don't know which package your accuracy function comes from (maybe MLmetrics::Accuracy?). In any case, the error comes from using a metric that does not match the type of problem: accuracy is a classification metric, used when the outcome is a factor that can take only a fixed set of values (a discrete variable). Here you are fitting a regression model, because your outcome is numeric: a car's price varies continuously over a range. For regression, one of the most commonly used performance metrics is the root mean square error (RMSE), and the caret package provides an RMSE function. Here is an example with the built-in dataset car.test.frame from the `rpart` package:

library(rpart)
library(tidyverse)
library(dplyr)
library(caret)

data("car.test.frame")
# 80/20 split on Price; list = FALSE returns the row indices as a matrix
ind <- createDataPartition(car.test.frame$Price, p = .8, list = FALSE)
train.df <- car.test.frame[ind, ]
test.df <- car.test.frame[-ind, ]

reg_tree <- rpart(Price ~ ., 
                  data = train.df,
                  method = "anova", minbucket = 1, maxdepth = 30, cp = 0.001)
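
# RMSE on the held-out test set, in the same units as Price
rmse <- RMSE(predict(reg_tree, test.df), test.df$Price)
# RMSE as a percentage of the mean test price
rmse_perc <- rmse / mean(test.df$Price) * 100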

The RMSE can also be reported as a percentage of the average car price, as rmse_perc does above. You can also implement your own rmse function, since it is easy to compute:

rmse <- function (y_pred, y_true) 
{
  RMSE <- sqrt(mean((y_true - y_pred)^2))
  return(RMSE)
}

Note that the rmse function above computes the same value as RMSE from the caret package.
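As a quick sanity check, here is a minimal sketch that runs the hand-written rmse on made-up numbers using base R only. The vectors y_true and y_pred are hypothetical values chosen so the result is easy to verify by hand; they are not taken from the used car data.

```r
# Hand-written RMSE, same formula as above
rmse <- function(y_pred, y_true) {
  sqrt(mean((y_true - y_pred)^2))
}

# Hypothetical observed and predicted prices (made up for illustration).
# Every prediction is off by exactly 2000, so the RMSE should be 2000.
y_true <- c(10000, 20000, 30000)
y_pred <- c(12000, 18000, 32000)

err <- rmse(y_pred, y_true)
# RMSE as a percentage of the mean observed price (mean is 20000)
err_perc <- err / mean(y_true) * 100

print(err)       # 2000
print(err_perc)  # 10
```

Because the RMSE is in the same units as the outcome, dividing by the mean price gives a rough sense of how large the typical error is relative to a typical car.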

Elia
-2

just use

library(forecast)

and it will work!