I have a data frame which contains several numeric variables. I have written a sorting algorithm that sorts the rows by comparing the values in the columns containing the numeric values I'm interested in.
The values are dates stored as numbers in YYYYMMDD format. However, some entries are 0 where they should really be NA. As a result, a comparison between, for instance, 20001224 and 0 is possible even though it makes no sense, because the 0 stands for a missing value.
I could convert the values to dates using strptime, which would get rid of the non-dates. However, since I want to understand how to recode values to NA across several columns of a data frame, I am posting it as a question here.
There must be an easy way (using one of the apply functions) to go column by column and recode all the zeros to NA.
EnrollmentBegin EnrollmentBegin2 EnrollmentBegin3 EnrollmentEnd EnrollmentEnd2 EnrollmentEnd3
       20040129         20130107                0      20060526       20140816              0
       20050829                0                0      20070822              0              0
       20000831                0                0      20020524              0              0
       20080827                0                0      20090526              0              0
Here is the dput of an excerpt of my data:
structure(list(
  EnrollmentBegin = c(20040129, 20050829, 20000831, 20080827),
  EnrollmentBegin2 = c(20130107, 0, 0, 0),
  EnrollmentBegin3 = c(0, 0, 0, 0),
  EnrollmentEnd = c(20060526, 20070822, 20020524, 20090526),
  EnrollmentEnd2 = c(20140816, 0, 0, 0),
  EnrollmentEnd3 = c(0, 0, 0, 0)),
  .Names = c("EnrollmentBegin", "EnrollmentBegin2", "EnrollmentBegin3",
             "EnrollmentEnd", "EnrollmentEnd2", "EnrollmentEnd3"),
  row.names = c("3", "5", "6", "7"),
  class = "data.frame")
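
To make the goal concrete, here is a rough sketch of the kind of recoding I am after, written against the excerpt above. The names `df`, `recoded`, `recoded2` and `cols` are my own placeholders, and I am assuming every 0 in the selected columns really is a missing value. The first variant replaces zeros across the whole data frame at once; the second goes column by column with lapply and can be limited to a subset of columns.

```r
# Rebuild the excerpt shown above (the object name `df` is an assumption).
df <- data.frame(
  EnrollmentBegin  = c(20040129, 20050829, 20000831, 20080827),
  EnrollmentBegin2 = c(20130107, 0, 0, 0),
  EnrollmentBegin3 = c(0, 0, 0, 0),
  EnrollmentEnd    = c(20060526, 20070822, 20020524, 20090526),
  EnrollmentEnd2   = c(20140816, 0, 0, 0),
  EnrollmentEnd3   = c(0, 0, 0, 0),
  row.names = c("3", "5", "6", "7")
)

# Variant 1: index the data frame with a logical matrix and assign NA.
# `recoded == 0` is TRUE wherever a cell equals zero.
recoded <- df
recoded[recoded == 0] <- NA

# Variant 2: column by column with lapply, only for the chosen columns.
# Here `cols` picks every column whose name starts with "Enrollment".
cols <- grep("^Enrollment", names(df), value = TRUE)
recoded2 <- df
recoded2[cols] <- lapply(recoded2[cols], function(x) replace(x, x == 0, NA))
```

Variant 2 seems the safer choice when the data frame also has columns in which 0 is a legitimate value, since only the columns listed in `cols` are touched.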