0

I have two data frames I would like to merge

a<- data.frame(x=c(1,4,6,8,1,6,7,2),ID=c("132","14.","732","2..","132","14.","732","2.."),year=c(1,1,1,1,2,2,2,2))

b<- data.frame(y=c(2,7,5,5,1,1,2,3),ID=c("132","144","732","290","132","144","732","290"),year=c(1,1,1,1,2,2,2,2))

The ID Variable by which I would like to merge the two data frames is not completely known in data set a. I also want to merge by year. They are known up to a fully identifying regular expression. Note, there is a one-to-one match. In this example you would not find an ID "1.." in the data set so there are no ambiguous matches.

I would like to get something like this:

output<-data.frame(y=c(2,7,5,5,1,1,2,3),x=c(1,4,6,8,1,6,7,2),ID=c("132","144","732","290","132","144","732","290"), year=c(1,1,1,1,2,2,2,2))

I tried to remove the regular expression part with substr and then use starts_with in the merge but it does not work.

I get the following error message

Coercing pattern to a plain character vector

when I do this:

df_complete <- regex_inner_join(b,a, by=c("ID","year"))

Thank you stack overflow...

mclofa
  • 33
  • 5

2 Answers2

1

Answer from Comment by @jblood94

With a and b as data.tables: a[, regex_inner_join(b[year == .BY], .SD, by = "ID"), year] – jblood94

Jan
  • 4,974
  • 3
  • 26
  • 43
mclofa
  • 33
  • 5
0

You might want to use the fuzzyjoin package, and then you can use the regex_inner_join() directly:

fuzzyjoin::regex_inner_join(b,a, by="ID") %>% select(x,y,ID=ID.x)

Output:

  x y  ID
1 1 2 132
2 4 7 144
3 6 5 732
4 8 5 290
langtang
  • 22,248
  • 1
  • 12
  • 27
  • Dear Langtang. I sadly simplified the problem too much. There is a second variable by which I merge (year) and that does not allow the merge : Coercing `pattern` to a plain character vector when I do this: df_complete <- regex_inner_join(b,a, by=c("ID","year")) – mclofa Jun 09 '22 at 21:11
  • 2
    @mclofa With `a` and `b` as `data.table`s: `a[, regex_inner_join(b[year == .BY], .SD, by = "ID"), year]` – jblood94 Jun 09 '22 at 21:29
  • nice, @jblood94!! – langtang Jun 09 '22 at 21:39