
I have a list of French names with some small syntactic differences.

names <- c("Benoit", "Arnoud (son)", "Arnoud", "Arnous", "Archer, Patrice*", "Archer", "Archer (father)", "André" )

"Arnoud (son)", "Arnoud", and "Arnous" all belong to the same family. I would like to create a data frame that groups individuals by family, one family per row:

people1            | people2  | people3  | people4
"Benoit"           | NA       | NA       | NA
"Arnoud (son)"     | "Arnoud" | "Arnous" | NA
"Archer, Patrice*" | "Archer" | "Archer" | "Archer (father)"
"André"            | NA       | NA       | NA
Wilcar
You can try playing around with different methods in the `stringdistmatrix()` function from the `stringdist` package and go further from there. – Wimpel Feb 10 '19 at 10:26
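One way to follow up on the string-distance idea in the comment is sketched below. It is only an illustration under several assumptions, not a definitive solution. It uses base R's `adist()` (Levenshtein distance) in place of `stringdistmatrix()` so that no extra package is needed. It assumes the annotations are always a parenthetical, text after a comma, or an asterisk. It also picks a cut height of 2 by eye for this sample. The variable names `people`, `key`, `groups`, `families`, and `wide` are illustrative and do not come from the question.

```r
# Example vector from the question (renamed from `names` to avoid
# shadowing base::names(); the new name is an illustrative choice).
people <- c("Benoit", "Arnoud (son)", "Arnoud", "Arnous",
            "Archer, Patrice*", "Archer", "Archer (father)", "André")

# Assumption: annotations are a parenthetical, anything after a comma,
# or an asterisk. Strip them to get a bare surname to compare on.
key <- trimws(gsub("\\s*\\([^)]*\\)|,.*$|\\*", "", people))

# Pairwise Levenshtein distances between the cleaned surnames (utils::adist).
d <- as.dist(adist(key))

# Single-linkage clustering; cutting at height 2 keeps every merge made at
# a distance of 2 or less. The threshold is a guess tuned to this sample.
groups <- cutree(hclust(d, method = "single"), h = 2)

# One list element per family, then pad each family with NA to equal length.
families <- split(people, groups)
width <- max(lengths(families))
wide <- as.data.frame(do.call(rbind, lapply(families, `length<-`, width)),
                      stringsAsFactors = FALSE)
colnames(wide) <- paste0("people", seq_len(width))
wide
```

The number of columns follows the largest family, so with the vector above (three Archer entries) the result has `people1` to `people3`. If the `stringdist` package is preferred, `stringdist::stringdistmatrix(key, method = "jw")` should also return a `dist` object when given a single vector and could replace the `adist()` line, with the cut height adjusted to that metric's scale.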

0 Answers