I am a complete beginner in R/RStudio, coding, and statistics in general.
In R, I am running a GLM where my Y variable is a no/yes (0/1) category and my X variable is a Sex category (female/male).
So I have run the following script:
hello <- read.csv(file.choose())
hello$sexbin <- ifelse(hello$Sex == 'm',0,ifelse(hello$Sex == 'f',1,NA))
modifhello <- subset(hello,hello$Combi_cag_long>=36)
model1 <- glm(modifhello$VAB ~ modifhello$Sex, family = binomial(link = logit),
              na.action = na.exclude, data = modifhello)
summary.lm(model1)
However, in my output, R seems to have split male/female as two separate variables, suggesting that it is not treating them as proper binary variables:
Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)       -3.689      1.009  -3.656 0.000258 ***
modifhello$Sexf    2.506      1.010   2.482 0.013084 *
modifhello$Sexm    2.922      1.010   2.894 0.003820 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
What do I need to add to my script to correct this?
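FOUND THE SOLUTION
I simply needed to use modifhello$VAB~modifhello$sexbin instead of modifhello$VAB~modifhello$Sex (as this is the old column).
Seeing both Sexf and Sexm next to an intercept suggests that the old Sex column contains a third value besides f and m (blank entries, for example), which R used as the baseline level. The sexbin column turns any such rows into NA, so they are dropped from the model.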
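For anyone who runs into the same output, here is a minimal sketch of what was probably going on and how the recoded column avoids it. The data frame `toy` and all its values are invented for illustration, and the blank `""` entries in `Sex` are an assumption about where the third level came from, not something confirmed from the real file. The sketch also refers to columns by name with `data=` instead of `modifhello$...` inside the formula, which gives shorter coefficient names.

```r
# Toy data standing in for the real CSV (all values invented for illustration).
set.seed(1)
toy <- data.frame(
  Sex = c(rep("f", 40), rep("m", 40), rep("", 5)),  # "" mimics blank entries (assumption)
  VAB = rbinom(85, 1, 0.4),
  stringsAsFactors = TRUE
)

# A factor with three levels gets two dummy coefficients plus the intercept,
# which is why both Sexf and Sexm can show up in the output.
sex_levels <- levels(toy$Sex)
print(sex_levels)

# Recode to 0/1; anything that is neither "m" nor "f" becomes NA.
toy$sexbin <- ifelse(toy$Sex == "m", 0, ifelse(toy$Sex == "f", 1, NA))

# Use bare column names in the formula and let data= supply them.
# Rows with NA in sexbin are left out of the fit.
fit <- glm(VAB ~ sexbin, family = binomial(link = "logit"),
           data = toy, na.action = na.exclude)

# summary() picks the glm method automatically.
print(summary(fit))
```

Calling `summary()` rather than `summary.lm()` lets R use the method written for glm objects, which for a binomial model reports z values instead of t values.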