2

I have two datasets similar to the one below (but with 4m observations) and I want to count the number of matching sample days between the two data frames (see example below).

DF1                   
ID   date             
1    1992-10-15       
1    2010-02-17
2    2019-09-17
2    2015-08-18
3    2020-10-27
3    2020-12-23

DF2 
ID   date
1    1992-10-15
1    2001-04-25
1    2010-02-17
3    1990-06-22
3    2014-08-18
3    2020-10-27

Expected output 
ID   Count
1    2
2    0
3    1

I have tried the aggregate function (though unsure what to put in "which":

test <- aggregate(date~ID, rbind(DF1, DF2), length(which(exact?)))

and the table function:

Y<-table(DF1$ID)
    X <- table(DF2$ID)
    Y2 <- DF1[Y %in% X,]

I am having trouble finding an example to help my situation.

Your help is appreciated!

Tanya
  • 21
  • 2

3 Answers3

1

in Base R

data.frame(table(factor(merge(df1,df2)$ID, unique(df1$ID))))
  Var1 Freq
1    1    2
2    2    0
3    3    1
Onyambu
  • 67,392
  • 3
  • 24
  • 53
0

Using tidyverse

library(dplyr)
library(tidyr)
inner_join(df1, df2) %>% 
  complete(ID = unique(df1$ID)) %>%
  reframe(Freq = sum(!is.na(date)), .by = "ID")

-output

# A tibble: 3 × 2
     ID  Freq
  <int> <int>
1     1     2
2     2     0
3     3     1
akrun
  • 874,273
  • 37
  • 540
  • 662
0

Here is one way to do it with 'dplyr' and 'tidyr':

library(dplyr)
library(tidyr)

DF1 %>% 
  semi_join(DF2) %>% 
  count(ID) %>% 
  complete(ID = DF1$ID,
           fill = list(n = 0))

#> Joining with `by = join_by(ID, date)`
#> # A tibble: 3 × 2
#>      ID     n
#>   <dbl> <int>
#> 1     1     2
#> 2     2     0
#> 3     3     1

data

DF1 <- tibble(ID = c(1,1,2,2,3,3),
              date = c("1992-10-15", "2010-02-17", "2019-09-17",
                       "2015-08-18", "2020-10-27", "2020-12-23"))

DF2 <- tibble(ID = c(1,1,1,3,3,3),
              date = c("1992-10-15", "2001-04-25", "2010-02-17",
                       "1990-06-22", "2014-08-18", "2020-10-27"))

Created on 2023-02-16 with reprex v2.0.2

TimTeaFan
  • 17,549
  • 4
  • 18
  • 39