I'm having trouble finding out if values partially match between 2 columns in different tables so i could update values accordingly. The problem is trying to match a longer value with smaller value and having multiple prefix values. I don't want to use any loops as my dataset is in the 1m range.
df1:
PRODUCT_8 | VAT |
---|---|
72093234 | 0,2 |
30023000 | 0,2 |
12093236 | 0,2 |
30066012 | 0,2 |
df2:
KN |
---|
30022000 |
30023000 |
300660 |
72 |
Desired output:
PRODUCT_8 | VAT |
---|---|
72093234 | 0,9 |
30023000 | 0,9 |
12093236 | 0,2 |
30066012 | 0,9 |
I have tried this:
df1[is.na(pmatch(df1$PRODUCT_8, df2$KN, duplicates.ok = TRUE)) == FALSE,
"VAT"] <- 0.09
but pmatch
doesn't understand if I want to match longer numbers with smaller. It would be perfect if I could just use startswith(df1$PRODUCT_8, df2$KN)
, but sadly you can't have multiple values in the prefix argument. sqldf is not complying with my demands either, don't know what's the problem with that.