I am trying to use a categorical variable as the response variable in the glm function. I initially tried this:

logreg_ <- glm(Genre ~ price, data = train)
msummary(logreg_)

However, it only returned the following error, and I don't know how to fix it.

Error in y - mu : non-numeric argument to binary operator

The glimpse() output for these columns looks like this:

$ Genre            <chr> "Strategy", "Strategy", "Early Access", "Early Access",~
$ price            <dbl> 0.00, 0.79, 3.99, 11.39, 5.59, 0.79, 10.99, 5.79, 1.69,~

What should I do?

Lyn
  • It looks like you need to switch Genre and price. The formula specification is `glm(y ~ x, data)`, so try `logreg_ <- glm(price ~ Genre, data = train)` – Brian Fisher Apr 29 '21 at 23:07
  • Would it still do the same thing as a result? – Lyn Apr 29 '21 at 23:12
  • No, sorry, I misread your question; that would predict price based on Genre. To predict a categorical variable from a numeric one, you probably need other methods. The first one that comes to mind is logistic regression, which can predict true/false values. You would want to recode your data so you had a column for each of your categories, then use `glm(category ~ price, data = train, family = "binomial")`. You should probably verify that this is appropriate for your case. – Brian Fisher Apr 29 '21 at 23:56
  • You probably need **multinomial regression**; see questions on this site (e.g. `nnet::multinom()`) – Ben Bolker Apr 30 '21 at 00:53
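
A minimal sketch of the multinomial approach suggested in the last comment, using `nnet::multinom()`. The `toy` data frame and its three genres are made up here to stand in for `train`; they are assumptions for illustration, not the asker's data.

```r
library(nnet)  # recommended package, normally installed with R

set.seed(1)
# Hypothetical stand-in for `train`: a character Genre column and a numeric price
toy <- data.frame(
  Genre = sample(c("Strategy", "Early Access", "Action"), 200, replace = TRUE),
  price = round(runif(200, 0, 20), 2)
)

# Make the response a factor explicitly so the reference (first) level is known
toy$Genre <- factor(toy$Genre)

# One set of coefficients per non-reference genre
fit <- multinom(Genre ~ price, data = toy, trace = FALSE)
summary(fit)

# Predicted class probabilities: one row per price, one column per genre
probs <- predict(fit, newdata = data.frame(price = c(0, 5, 15)), type = "probs")
probs
```

Unlike a binomial `glm()`, this keeps all genres as separate outcomes instead of collapsing them into two groups.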

1 Answer

As @Brian said, for logistic regression you need to specify glm(..., family = "binomial"), and the outcome must be a factor. A toy example:

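set.seed(4)
# stringsAsFactors = TRUE makes y a factor rather than a character column
df <- data.frame(y = sample(letters[1:4], 100, replace = TRUE), x = runif(100), stringsAsFactors = TRUE)
str(df)
# With a factor response, binomial treats the first level as failure and all other levels as success
logreg <- glm(y ~ x, data = df, family = "binomial")
summary(logreg)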
Elia
  • I have tried family="binomial" and did as.factor(train$Genre) before running the regression, but now it gives the error [Error in eval(family$initialize) : y values must be 0 <= y <= 1] – Lyn Apr 30 '21 at 23:36
  • I hope you did `train$Genre <- as.factor(train$Genre)` or at least `glm(as.factor(Genre) ~ ., data = train, family = "binomial")`. Calling `as.factor(train$Genre)` without assigning the result leaves the column unchanged. If this doesn't work for you (which would be really odd), try recoding your outcome as 0/1, e.g. `train$Genre <- ifelse(train$Genre == "Strategy", 1, 0)` if Strategy is the category you want to predict (the comparison is case-sensitive). See also [here](https://stackoverflow.com/questions/53942910/logistic-regression-evalfamilyinitialize-y-values-must-be-0-y-1) and [here](https://stackoverflow.com/questions/47546658/logistic-regression-on-factor-error-in-evalfamilyinitialize-y-values-must) for solutions to the same problem – Elia May 01 '21 at 08:21
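
The 0/1 recoding mentioned in the last comment could look roughly like the sketch below. The `toy` data frame and the `is_strategy` column are hypothetical names standing in for `train`; the sketch assumes the question of interest is "Strategy versus everything else".

```r
set.seed(2)
# Hypothetical stand-in for `train`
toy <- data.frame(
  Genre = sample(c("Strategy", "Early Access", "Action"), 200, replace = TRUE),
  price = round(runif(200, 0, 20), 2)
)

# 1 if the game is "Strategy", 0 otherwise; the match is case-sensitive
toy$is_strategy <- as.integer(toy$Genre == "Strategy")

# A 0/1 response satisfies the binomial family's 0 <= y <= 1 requirement
fit <- glm(is_strategy ~ price, data = toy, family = binomial)
summary(fit)

# Fitted probability that a game at each price is "Strategy"
p <- predict(fit, newdata = data.frame(price = c(0, 5, 15)), type = "response")
p
```

Keeping the indicator in a new column, rather than overwriting `Genre`, leaves the original labels available for fitting other one-vs-rest models.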