I am new to linear regression. I am running a simple linear regression with two variables using lm. The issue is that I am getting different results. I wrote the code twice to check whether the model's output is the same. It isn't, which suggests I have made a mistake in one of the attempts.
In my first attempt, the data overview in the Environment tab shows 12 elements. In my second attempt, it shows 14 elements.
How do I know which output is the right one? Is it the first attempt, since the DV values there are 1-7, whereas in the second attempt the DV also includes the value -1? I assume including -1 is wrong, because it is a "Don't know" code rather than a valid value of an interval-level variable.
How do I go about identifying the mistake? I noticed the difference in the number of elements in the data overview and started by comparing the two. I can't see anything else that differs, but I am guessing it has to do with the values -1 and -999 mentioned above. Is this a good place to start? Are there other, better ways?
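As a starting point, here is a minimal sketch of the check I have in mind. It uses made-up toy data (the `toy` data frame and its values are hypothetical, not my real `df1`) and assumes the only non-substantive codes are -1 and -999, as the value labels suggest. My real column is `haven_labelled`, so it may need converting with `as.numeric()` first.

```r
# Hypothetical toy data standing in for df1; the real survey data will differ.
toy <- data.frame(
  edu.degree.level = c(1L, 0L, 1L, 1L, 0L, 1L, 0L, 0L),
  immig.view       = c(7, 4, -1, 1, 7, -999, 3, 5)
)

# Tabulate the DV to see whether any codes outside 1-7 are present.
print(table(toy$immig.view, useNA = "ifany"))

# Count non-missing values that fall outside the 1-7 scale.
n_bad <- sum(!is.na(toy$immig.view) & !(toy$immig.view %in% 1:7))
print(n_bad)

# Recode the assumed missing-value codes to NA so that lm() drops those rows.
clean <- toy
clean$immig.view[clean$immig.view %in% c(-1, -999)] <- NA

# Fit the same model on the raw and the cleaned data and compare.
fit_raw   <- lm(immig.view ~ edu.degree.level, data = toy)
fit_clean <- lm(immig.view ~ edu.degree.level, data = clean)
print(coef(fit_raw))
print(coef(fit_clean))
print(c(raw = nobs(fit_raw), clean = nobs(fit_clean)))
```

If the two fits use a different number of observations, that would point to the -1 and -999 codes being treated as real values in one of the attempts.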
Many thanks for helping me understand!
Here is the code for my first attempt:
> reg <-lm(immig.view~edu.degree.level,df1)
> reg
Call:
lm(formula = immig.view ~ edu.degree.level, data = df1)
Coefficients:
(Intercept) edu.degree.level
4.1734 0.3464
> dput(head(df1,10))
structure(list(edu.degree.level = c(1L, 0L, 1L, 1L, 0L, 1L, 1L,
1L, 1L, 0L), immig.view = structure(c(7, 4, 5, 1, 7, 5, 7, 1,
3, 1), label = "J1 Do you think immigration is good or bad for Britain's economy?", labels = c(`Not stated` = -999,
`Don't know` = -1, `1 Bad for economy` = 1, `2` = 2, `3` = 3,
`4` = 4, `5` = 5, `6` = 6, `7 Good for economy` = 7), class = "haven_labelled")), row.names = c(NA,
10L), class = "data.frame")
Here is the code for my second attempt:
> reg <-lm(immig.view~edu.degree.level,df1)
> reg
Call:
lm(formula = immig.view ~ edu.degree.level, data = df1)
Coefficients:
(Intercept) edu.degree.levelwithoutdegree
4.5198 -0.3431
> dput(head(df1,10))
structure(list(edu.degree.level = structure(c(1L, 2L, 1L, 1L,
2L, 1L, 1L, 1L, 1L, 2L), .Label = c("withdegree", "withoutdegree"
), class = "factor"), immig.view = structure(c(7, 4, 5, 1, 7,
5, 7, 1, 3, 1), label = "J1 Do you think immigration is good or bad for Britain's economy?", labels = c(`Not stated` = -999,
`Don't know` = -1, `1 Bad for economy` = 1, `2` = 2, `3` = 3,
`4` = 4, `5` = 5, `6` = 6, `7 Good for economy` = 7), class = "haven_labelled")), row.names = c(NA,
10L), class = "data.frame")
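One thing I noticed while comparing the two outputs: in the first attempt edu.degree.level is a 0/1 integer, while in the second it is a factor whose first level is "withdegree", so the reference group appears to be reversed. The first attempt's intercept plus slope (4.1734 + 0.3464 = 4.5198) equals the second attempt's intercept, but -0.3431 is not exactly -0.3464, so the underlying data may also differ slightly. The sketch below uses only the ten rows from the dput output above, with the haven labels dropped, to show how the two codings relate. The names `degree01`, `degree_f` and `view` are hypothetical stand-ins, and the mapping of 1 to "withdegree" is my reading of the two dput outputs.

```r
# The ten rows shown in the dput output, stored as plain vectors.
toy <- data.frame(
  degree01 = c(1L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 0L),
  view     = c(7, 4, 5, 1, 7, 5, 7, 1, 3, 1)
)

# Factor version with "withdegree" as the first (reference) level,
# assuming 1 corresponds to "withdegree" as the dput outputs suggest.
toy$degree_f <- factor(
  ifelse(toy$degree01 == 1L, "withdegree", "withoutdegree"),
  levels = c("withdegree", "withoutdegree")
)

fit_num <- lm(view ~ degree01, data = toy)  # reference group: 0 (no degree)
fit_fac <- lm(view ~ degree_f, data = toy)  # reference group: withdegree

print(coef(fit_num))
print(coef(fit_fac))

# With identical data, the factor intercept equals the numeric intercept
# plus slope, and the slope only changes sign.
same_intercept <- isTRUE(all.equal(unname(coef(fit_fac)[1]),
                                   unname(sum(coef(fit_num)))))
flipped_slope  <- isTRUE(all.equal(unname(coef(fit_fac)[2]),
                                   -unname(coef(fit_num)[2])))
print(c(same_intercept = same_intercept, flipped_slope = flipped_slope))
```

If both checks hold on the full data after the -1 and -999 codes are handled the same way in each attempt, the two models would be the same fit expressed against different reference groups.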
Thanks again.