R - Return column name for row where first given value is found

Question

I am trying to find the first occurrence of a FALSE in a dataframe for each row value. My rows are specific occurrences and the columns are dates. I would like to be able to find the date of first FALSE so that I can use that value to find a return date.

An example structure of my dataframe:

df <- data.frame(ID = c(1,2,3), '2001' = c(TRUE, TRUE, TRUE), 
                 '2002' = c(FALSE, TRUE, FALSE), '2003' = c(TRUE, FALSE, TRUE))

I want to end up with a second dataframe or list that contains the ID and the column name that identifies the first instance of a FALSE.

For example :

ID | Date
1  | 2002
2  | 2003
3  | 2002

I do not know the mechanism to find such a result.

The actual dataframe contains a couple thousand rows so I unfortunately can't do it by hand.

I am a new R user so please don't refrain from suggesting things you might expect a more experienced R user to have already thought about.

Thanks in advance

Duck · Accepted Answer · 2020-11-17T22:35:46.623

Try this using tidyverse functions. You can reshape data to long and then filter for F values. If there are some duplicated rows the second filter can avoid them. Here the code:

library(dplyr)
library(tidyr)
#Code
newdf <- df %>% pivot_longer(-ID) %>%
  group_by(ID) %>%
  filter(value==F) %>%
  filter(!duplicated(value)) %>% select(-value) %>%
  rename(Myname=name)

Output:

# A tibble: 3 x 2
# Groups:   ID [3]
     ID Myname 
  <dbl> <chr>
1     1 2002 
2     2 2003 
3     3 2002

Another option without duplicated values can be using the row_number() to extract the first value (row_number()==1):

library(dplyr)
library(tidyr)
#Code 2
newdf <- df %>% pivot_longer(-ID) %>%
  group_by(ID) %>%
  filter(value==F) %>%
  mutate(V=ifelse(row_number()==1,1,0)) %>%
  filter(V==1) %>%
  select(-c(value,V)) %>% rename(Myname=name)

Output:

# A tibble: 3 x 2
# Groups:   ID [3]
     ID Myname 
  <dbl> <chr>
1     1 2002 
2     2 2003 
3     3 2002

Or using base R with apply() and a generic function:

#Code 3
out <- data.frame(df[,1,drop=F],Res=apply(df[,-1],1,function(x) names(x)[min(which(x==F))]))

Output:

Thank you so much! A true lifesaver. How would I incorporate changing the name of the resulting column in the new dataframe? It defaults to "name". (I know how to change the column name after the fact but I'm curious how you would incorporate it into the code you provided. — JMade, Nov 17 '20 at 22:23
@JMade I have updated the solution. You have to use `rename()` and assign the desired name. Hoping that helps you! — Duck, Nov 17 '20 at 22:36
I posted another question that is a continuation of this that you probably have insight to: [link](https://stackoverflow.com/q/64886149/14266849) @Duck — JMade, Nov 18 '20 at 02:43

score 0 · Answer 2 · answered Nov 18 '20 at 01:01

0

We can use max.col with ties.method = 'first' after inverting the logical values.

cbind(df[1], Date = names(df[-1])[max.col(!df[-1], ties.method = 'first')])
#  ID Date
#1  1 2002
#2  2 2003
#3  3 2002

answered Nov 18 '20 at 01:01

Ronak Shah

377,200
20
156
213

R - Return column name for row where first given value is found

2 Answers2