Can I perform a linear regression with a character vector? If not how do I change it?

Question

I have a dataset where I want to see whether certain survey scores predict academic performance. The problem is that academic performance is stored as a character vector, because the grades are recorded as percentage bands such as “71-80%” or “Less than 40%”, so R types the column as chr. My lm() call fails when I use Academic.Performance as the DV.

I used the code:

mymodel=lm(Academic.Performance ~ x1 + x2 + x3, data = data)

It then produced this error:

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, …): NA/NaN/Inf in ‘y’
In addition: Warning message:
In storage.mode(v) <- “double” : NAs introduced by coercion

I tried to use this code but it still gave me the same message:

data[is.na(data) | data == "Inf"] = NA
data[is.na(data) | data == "NaN"] = NA
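For context, the warning suggests that the NAs are created inside lm() itself, when the character response is coerced to numeric, not that they are already present in the data. A minimal sketch of that coercion, using made-up labels in the same style as the grade bands (the vector `perf` is hypothetical, not the real column):

```r
# Hypothetical values in the same style as the question's grade bands
perf <- c("71-80%", "Less than 40%", "41-50%")

# lm() needs a numeric response; coercing labels like these yields only NAs
coerced <- suppressWarnings(as.numeric(perf))
print(coerced)

# The character column itself contains no NA, "Inf" or "NaN" entries,
# so replacing such entries beforehand leaves it unchanged
n_special <- sum(is.na(perf) | perf %in% c("Inf", "NaN"))
print(n_special)
```

If this is what happens with the real data, cleaning NA/Inf/NaN values before the call cannot help, because the response still has to be converted to something lm() can use.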

score 0 · Answer 1 · answered Feb 24 '23 at 20:48

You should probably do ordinal regression, e.g.:

## make sure to substitute the correct ordering of levels in ...,
##  e.g. c("less than 40%", "50%", "60%", ">70%")
data$ordperf <- ordered(data$Academic.Performance, levels = c(...))
MASS::polr(ordperf ~ x1 + x2 + x3, data = data)

(polr stands for proportional odds logistic regression, one particular form of ordinal regression ...)
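As a rough end-to-end illustration, here is a sketch on simulated data. Everything about the data is an assumption made for the example: the band labels in `bands`, the cut points, and the way x1, x2 and x3 are generated are all hypothetical, so substitute the real column and its true ordering.

```r
library(MASS)  # provides polr()

set.seed(1)
n <- 200

# Hypothetical predictors standing in for the survey scores
data <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))

# Hypothetical grade bands, listed from lowest to highest
bands <- c("Less than 40%", "41-60%", "61-80%", "81-100%")

# Simulate a character response that depends on x1 (illustration only)
latent <- 0.8 * data$x1 + rlogis(n)
data$Academic.Performance <- as.character(
  cut(latent, breaks = c(-Inf, -1, 0, 1, Inf), labels = bands)
)

# Convert the character column to an ordered factor, then fit the model
data$ordperf <- ordered(data$Academic.Performance, levels = bands)
fit <- polr(ordperf ~ x1 + x2 + x3, data = data, Hess = TRUE)
summary(fit)
```

Hess = TRUE stores the Hessian so that summary() can report standard errors without refitting. Any label missing from `levels` becomes NA in the ordered factor, so it is worth checking table(data$ordperf, useNA = "ifany") before fitting.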
