0

Ciao,

Here is my replicating example.

a=c(1,2,3,4,5,6,7,8)
b=c(1,1,0,0,0,"NA",0,"NA")
c=c(11,7,9,9,5,"NA",7,"NA")
d=c(2012,2011,2012,2014,2014,"NA",2011,"NA")
e=c(1,0,1,0,0,1,"NA","NA")
f=c(10,4,11,10,10,6,"NA","NA")
g=c(2014,2012,2010,2012,2013,2011,"NA","NA")
h=c(1,0,1,0,1,0,1,"NA")
i=c(2,12,12,6,8,11,3,"NA")
j=c(2011,2012,2011,2012,2012,2014,2012,"NA")
k=c(1,1,1,0,1,1,1,"NA")
l=c(11/1/2012,"7/1/2012","11/1/2010",0 ,"8/1/2012","6/1/2012","3/1/2012","NA")

mydata = data.frame(a,b,c,d,e,f,g,h,i,j,k,l)
names(mydata) = c("id","test1","month1","year1","test2","month2","year2","test3","month3","year3","anytest","date")

I am aiming to search through each row and find the first test column that is equal to 1. The new column I am aiming to create is "anytest." This column is 1 if test1 or test2 or test3 equals to 1. If none of them do then it equals to 0. This ignores NA values..if test1 and test2 are NA but test3 equals to 0 then anytest equals to 0. Now I have made progress I think using this code:

anytestTRY = if(rowSums(mydata[,c(test1,test2,test3)] == 1, na.rm=TRUE) > 0],1,0)

But now I am at a crossroads because I am aiming to search through each row to find the first column of test1 test2 or test3 that equals to 1 and then report the month and year for that test. So if test1 equals to 0 and test2 equals to NA and test3 equals to 1 I want the column which I created called date to have the month3 and year3 in analyzable time format. Thanks a million.

bvowe
  • 3,004
  • 3
  • 16
  • 33
  • your data.frame is in a "wide" format, I suggest that you combine them into 5 columns to form a "tidy" format: id, test, month, year, test_type, then you can easily filter the data from there. – TC Zhang Aug 20 '18 at 11:40

1 Answers1

0
a=c(1,2,3,4,5,6,7,8)
b=c(1,1,0,0,0,"NA",0,"NA")
c=c(11,7,9,9,5,"NA",7,"NA")
d=c(2012,2011,2012,2014,2014,"NA",2011,"NA")
e=c(1,0,1,0,0,1,"NA","NA")
f=c(10,4,11,10,10,6,"NA","NA")
g=c(2014,2012,2010,2012,2013,2011,"NA","NA")
h=c(1,0,1,0,1,0,1,"NA")
i=c(2,12,12,6,8,11,3,"NA")
j=c(2011,2012,2011,2012,2012,2014,2012,"NA")

mydata = data.frame(a,b,c,d,e,f,g,h,i,j)
names(mydata) = c("id","test1","month1","year1","test2","month2","year2","test3","month3","year3")


library(tidyverse)
library(lubridate)

mydata %>%
  mutate_all(~as.numeric(as.character(.))) %>%  # update columns to numeric
  group_by(id) %>%                              # for each id
  nest() %>%                                    # nest data
  mutate(date = map(data, ~case_when(.$test1==1 ~ ymd(paste0(.$year1,"-",.$month1,"-",1)),                # get date based on first test that is 1
                                     .$test2==1 ~ ymd(paste0(.$year2,"-",.$month2,"-",1)),
                                     .$test3==1 ~ ymd(paste0(.$year3,"-",.$month3,"-",1)))),
         anytest = map(data, ~as.numeric(case_when(sum(c(.$test1, .$test2, .$test3)==1) > 0 ~ "1",        # create anytest column
                                                   sum(is.na(c(.$test1, .$test2, .$test3))) == 3 ~ "NA",
                                                   TRUE ~ "0")))) %>%
  unnest()                                                                                                 # unnestdata

which returns:

# # A tibble: 8 x 12
#      id date     anytest test1 month1 year1 test2 month2 year2 test3 month3 year3
#   <dbl> <date>     <dbl> <dbl>  <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl>  <dbl> <dbl>
# 1     1 2012-11-01     1     1     11  2012     1     10  2014     1      2  2011
# 2     2 2011-07-01     1     1      7  2011     0      4  2012     0     12  2012
# 3     3 2010-11-01     1     0      9  2012     1     11  2010     1     12  2011
# 4     4 NA             0     0      9  2014     0     10  2012     0      6  2012
# 5     5 2012-08-01     1     0      5  2014     0     10  2013     1      8  2012
# 6     6 2011-06-01     0    NA     NA    NA     1      6  2011     0     11  2014
# 7     7 2012-03-01     0     0      7  2011    NA     NA    NA     1      3  2012
# 8     8 NA            NA    NA     NA    NA    NA     NA    NA    NA     NA    NA
AntoniosK
  • 15,991
  • 2
  • 19
  • 32
  • This actually is exactly perfect. Now I realize I did ask my question wrong so do I make a new one? Do I edit this one? – bvowe Aug 21 '18 at 12:34
  • You should create a new one as it will be viewed by more people and it would be more likely to get an answer fast :) – AntoniosK Aug 21 '18 at 12:38
  • OK, very good I'm very sorry I work for school district and they want to report different information – bvowe Aug 21 '18 at 12:44
  • What would be the difference in this output in the new question? Is it about the `date` column and the NAs? – AntoniosK Aug 21 '18 at 12:54