I have this dataframe:

df <- structure(list(A = 1:5, B = c(1L, 5L, 2L, 3L, 3L)), 
                class = "data.frame", row.names = c(NA, -5L))

  A B
1 1 1
2 2 5
3 3 2
4 4 3
5 5 3

I would like to get this result:

  A B Result
1 1 1      B
2 2 5   <NA>
3 3 2   <NA>
4 4 3   <NA>
5 5 3      B

Strategy:

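  1. For each row, check whether the value of A matches a value of B; if it does, assign "B" to a new column Result, otherwise assign NA.
  2. Compare against B not only in the current row but also in all PREVIOUS rows (never later rows).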

AIM:

I want to learn how to check whether a given value of column A, say in row 5, occurs in the previous rows of column B (e.g. rows 1-4).

ThomasIsCoding
TarJae

4 Answers

I hope the following code also works for your general case:

transform(
  df,
  Result = replace(rep(NA, length(B)), match(A, B) <= seq_along(A), "B")
)

which gives

  A B Result
1 1 1      B
2 2 5   <NA>
3 3 2   <NA>
4 4 3   <NA>
5 5 3      B
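
A step-by-step breakdown may help show why this works. This is only a sketch using the df from the question; first_pos and seen are illustrative names that do not appear in the code above. match(A, B) returns the position of the first occurrence of each A value in B, and comparing that position with the row index tells whether the value has already appeared in B at or before the current row.

```r
df <- data.frame(A = 1:5, B = c(1L, 5L, 2L, 3L, 3L))

# Position of the first occurrence of each A value in B (NA if absent)
first_pos <- match(df$A, df$B)
first_pos
#> [1]  1  3  4 NA  2

# TRUE where that first occurrence is at or before the current row
seen <- first_pos <= seq_along(df$A)
seen
#> [1]  TRUE FALSE FALSE    NA  TRUE

# An NA comparison (value never occurs in B) is treated as "not seen"
df$Result <- ifelse(!is.na(seen) & seen, "B", NA_character_)
```

Checking only the first occurrence is enough: if the first match lies after the current row, no earlier match can exist.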
ThomasIsCoding

Here is a dplyr::rowwise approach:

library(dplyr)

df %>%
  rowwise %>% 
  mutate(result = ifelse(A %in% .[seq(cur_group_rows()),]$B, "B", NA))

#> # A tibble: 5 x 3
#> # Rowwise: 
#>       A     B result
#>   <int> <int> <chr> 
#> 1     1     1 B     
#> 2     2     5 <NA>  
#> 3     3     2 <NA>  
#> 4     4     3 <NA>  
#> 5     5     3 B

Created on 2021-08-26 by the reprex package (v0.3.0)

TimTeaFan

Just some minor changes to @ThomasIsCoding's answer to make it dplyr. It is laid out a bit more, which makes it easier to read in my opinion.

library(tidyverse)
df <- structure(list(A = 1:5, B = c(1L, 5L, 2L, 3L, 3L)), 
                class = "data.frame", row.names = c(NA, -5L))
match(df$A, df$B)
#> [1]  1  3  4 NA  2
df %>% mutate(Result = if_else(match(A, B) <= row_number(), 
                               "B", 
                               NA_character_))
#>   A B Result
#> 1 1 1      B
#> 2 2 5   <NA>
#> 3 3 2   <NA>
#> 4 4 3   <NA>
#> 5 5 3      B

Created on 2021-08-26 by the reprex package (v1.0.0)

Arthur Yip

We can use purrr::map_chr over the row indices, checking each A value against B up to that row:

library(dplyr)
library(purrr)
df %>% 
   mutate(Result = map_chr(row_number(), ~ case_when(A[.x] %in% B[seq(.x)]~ "B")))

-output

  A B Result
1 1 1      B
2 2 5   <NA>
3 3 2   <NA>
4 4 3   <NA>
5 5 3      B
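
The same row-by-row check can also be sketched in base R with vapply, in case loading purrr is not wanted. This is an alternative formulation under the same assumptions as the question's df, not part of the answer above; in_prev is an illustrative name.

```r
df <- data.frame(A = 1:5, B = c(1L, 5L, 2L, 3L, 3L))

# For each row i, test whether A[i] occurs anywhere in B[1:i]
in_prev <- vapply(
  seq_len(nrow(df)),
  function(i) df$A[i] %in% df$B[seq_len(i)],
  logical(1)
)

# Label matching rows with "B", everything else with NA
df$Result <- ifelse(in_prev, "B", NA_character_)
```

To compare against strictly earlier rows only (excluding the current row), seq_len(i) could be replaced with seq_len(i - 1); for the first row this yields an empty vector, so the check is simply FALSE.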
akrun