I have a data.frame with 203 rows and 111 numeric variables. I first import my data. At this stage there is no problem: filtering returns what I expect (154 observations for Q5==1 and 11 observations for Q6_6==1):
> cred.res = read_excel("C:/Users/whatever")
> filter(cred.res, Q6_6==1)
# A tibble: 11 x 111
> filter(cred.res, Q5==1)
# A tibble: 154 x 111
Then, as I want to run an MCA, I need most of my variables as factors. I convert every column to a factor and then convert the quantitative ones back to numeric:
cred.res = data.frame(apply(cred.res,2, as.factor))
num.col <- c("Q10_sum", "Q11_sum", "Q12_sum", "NB_DE_BI","NB_DE_B1", "POURC", "POP","E101", "TOTAL_EM", "E110", "ENFANTS_", "DEPENSES", "F314", "F501", "TOTAL_DE","Nbpartenaire")
cred.res[, num.col] = apply(cred.res[, num.col], 2, as.numeric)
That's where the trouble begins. I still get my 154 observations when I filter on Q5, but filtering on Q6_6 now returns 0 observations:
> de=filter(cred.res, Q5==1)
> str(de)
'data.frame': 154 obs. of 111 variables:
> se=filter(cred.res, Q6_6==1)
> str(se)
'data.frame': 0 obs. of 111 variables:
I tried wrapping the column in as.numeric, but that doesn't work either. I now get 38 observations instead of 11:
> ze=filter(cred.res, as.numeric(Q6_6)==1)
> str(ze)
'data.frame': 38 obs. of 111 variables:
However, I do get the expected 11 observations with the operator > 1:
> qe=filter(cred.res, as.numeric(Q6_6)>1)
> str(qe)
'data.frame': 11 obs. of 111 variables:
It seems that converting my variables to factors has changed their values. Can someone explain how this happens? Should I always apply filters before converting the variables?
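To illustrate what I suspect is going on, here is a minimal sketch on a toy vector (the values and the names x, f, codes and values are made up for illustration, they are not my real Q6_6 column). As far as I can tell, as.numeric() on a factor returns the internal level codes (1, 2, ...) rather than the original numbers, so a stored 0 becomes code 1 and a stored 1 becomes code 2:

```r
# Toy stand-in for a 0/1 column such as Q6_6 (hypothetical values)
x <- c(0, 1, 0, NA, 1)

# Converting to a factor stores the values as labelled levels
f <- factor(x)
levels(f)

# as.numeric() on a factor gives the level codes, not the labels
codes <- as.numeric(f)

# Going through as.character() first recovers the original numbers
values <- as.numeric(as.character(f))

print(codes)
print(values)
```

If my real column behaves the same way, as.numeric(Q6_6)==1 would select the rows whose original value was 0, and as.numeric(Q6_6)>1 would select the rows whose original value was 1, which would match the 38 and 11 counts above. It does not by itself explain why Q6_6==1 returns 0 rows.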
I hope this is understandable; I'm not a native English speaker. Thank you!