
I'm trying to run a linear regression model in R and I want to use na.rm to remove all my missing data. Is it possible to use na.rm here? If so, where in the function do I put it? If not, what argument should I use instead?

My dataset is called dta and my variables are standardized. I've been trying to put na.rm after the object, but I honestly have no idea where it should be going or if it will work at all for this.

ex:

Model1 <- lm(dta$CTSO[dta$Year == 1997] ~ scale(dta$VM[dta$Year == 1997]) + scale(dta$AVMT[dta$Year == 1997]) + scale(dta$DDF1[dta$Year == 1997]), na.rm = TRUE)
  • Check your options with `options("na.action")`. If this returns `"na.omit"`, then `lm()` and other functions that rely on that option omit `NA`s by default. By the way, `lm()` doesn't have an `na.rm` argument. The argument is `na.action`. – TimTeaFan Apr 27 '23 at 16:51
  • You can access the `lm` function's help page with `?lm`. In the **Usage** and **Arguments** sections you can see the argument names: `na.rm` isn't there, but `na.action` is, along with a description of the choices for `na.action`. – Gregor Thomas Apr 27 '23 at 17:01
  • `Model1 <- lm(CTSO ~ scale(VM) + scale(AVMT) + scale(DDF1), data = dta, subset = Year == 1997)` should be the model you want, although you should consider scaling before running the model. – Onyambu Apr 27 '23 at 17:05
  • @rawr Why would you regress `Year` against the variables? OP's response is `CTSO`, not `Year`. – Onyambu Apr 27 '23 at 17:07
  • I'm afraid the `na.rm` argument isn't applicable to the `lm()` function in R. Instead, you can use the `na.omit()` function to remove all missing values, if that is acceptable (i.e. only use it if you are not going to lose more than 5% of the data). Something like: `dta_clean <- na.omit(dta[, c("CTSO", "VM", "AVMT", "DDF1")])` – Manoj Kumar Apr 27 '23 at 18:00
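
To make the comments above concrete, here is a minimal sketch of passing `na.action` to `lm()` instead of `na.rm`. The data frame below is a made-up stand-in for `dta` (same column names as in the question, random values, two `NA`s placed by hand), so it only illustrates where the argument goes and is not the asker's actual data or result.

```r
# Hypothetical toy data standing in for `dta`; all values are made up.
set.seed(1)
dta <- data.frame(
  Year = rep(c(1996, 1997), each = 10),
  CTSO = rnorm(20),
  VM   = rnorm(20),
  AVMT = rnorm(20),
  DDF1 = rnorm(20)
)
dta$VM[12]   <- NA  # two missing values, both in 1997 rows
dta$DDF1[15] <- NA

# The global default that lm() falls back on when na.action is not supplied
getOption("na.action")

# lm() has no na.rm argument; incomplete rows are handled through na.action
Model1 <- lm(CTSO ~ scale(VM) + scale(AVMT) + scale(DDF1),
             data = dta, subset = Year == 1997, na.action = na.omit)

nobs(Model1)       # number of rows actually used in the fit
na.action(Model1)  # which rows were dropped because of NAs
```

Comparing `nobs(Model1)` with the number of 1997 rows is a quick way to see how much data the fit discarded before deciding whether dropping incomplete rows is acceptable.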
