I'm new to R and need to prepare a column of names before imputing sex, but I'm having trouble cleaning the strings. Here is an example of what I have:
Name example:
"alberto eduardo etchegaray de la cerda ."
What I need to do is remove the particles "de", "del", "lo", "los", "la" and "las", as well as double white spaces, trailing white space, and anything else that interferes with the names.
My code so far to clean the string is (I will remove the extra spaces in a second step):
library(stringr)

str_replace_all('alberto eduardo etchegaray de la cerda',
                '\\bdel*\\b|\\blos*\\b|\\blas*\\b|.$',
                replacement = " ")
and the result:
"alberto eduardo etchegaray cerd "
The problem is that some words are getting cut off when I need them complete.
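As far as I can tell, the truncation comes from the last alternative in the pattern: an unescaped `.` matches any character, so `.$` replaces whatever the final character of the string is (here the "a" of "cerda"), not just a literal dot. A minimal sketch of one way around it, assuming stringr 1.3.0 or later (for `str_remove_all` and `str_squish`); the helper name `clean_name` is made up for illustration, and the particle list is only the one given above:

```r
library(stringr)

# Hypothetical helper: strips the particles listed in the question,
# drops punctuation, and normalises white space.
clean_name <- function(x) {
  # Whole-word match only, so "de" inside "eduardo" is left alone.
  particles <- "\\b(?:de|del|lo|los|la|las)\\b"
  x <- str_remove_all(x, particles)
  # Remove punctuation such as a stray "." (assumption: no punctuation
  # in the names needs to be kept).
  x <- str_remove_all(x, "[[:punct:]]")
  # Trim both ends and collapse repeated spaces into one.
  str_squish(x)
}

clean_name("alberto eduardo etchegaray de la cerda .")
# "alberto eduardo etchegaray cerda"
```

Because the functions are vectorised, the same call should work on a whole column, e.g. `clean_name(df$name)` for a hypothetical data frame `df`.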