I have data.table 1 like:
entidad municipio col1
AGUASCALIENTES Aguascalientes 5000
PUEBLA Jes�s Mar�a 3000
VALLE Calvillo 2000
JALISCO Pabell�n de Arteaga 1000
and data.table 2 like :
entidad municipio col2
AGUASCALIENTES Aguascalientes 16369
PUEBLA Jesus Maria 1687
VALLE Calvillo 916
JALISCO Pabellon de Arteaga 774
and I would like to perform a merge using the columns entidad
and municipio
like merge(dt1, dt2, by = c('entidad', 'municipio'))
using a partial match on both columns or just in column municipio
. However the problem is that data.table 1 has accents and the strange symbol �
and thus the municipio
column might differ on 3 different cases 1) by accent, 2) by symbol �
or slightly different name (ej 'ciudad juarez' in dt1 and 'Juárez' in dt2). Since I am not very familiar with working with strings/character columns I was wondering if someone could help me in figuring out how to perform a merge in this case? the result that I would like to obtain is the following, from the previous example:
entidad municipio col1 col2
AGUASCALIENTES Aguascalientes 5000 16369
PUEBLA Jesus Maria 3000 1687
VALLE Calvillo 2000 916
JALISCO Pabellon de Arteaga 1000 774