library(tidyverse)
data <- tibble(
  city  = c('Montreal', 'Montréal', 'Ottawa', 'Ottawa', 'New York', 'Newyork', 'New-York'),
  value = 1:7
)
data %>%
  group_by(city) %>%
  summarise(mean = mean(value))
I'd like to obtain a summary like this with one row per city, but unfortunately it creates 6 groups when there are in fact only 3 cities. My real data set is far larger, with thousands of observations, so correcting the spellings by hand is not practical. Is there a way to automate this, for example with fuzzy string matching?
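To make the goal concrete, here is a minimal sketch of one possible approach, not a definitive solution. It uses only base R's `adist()` (edit distance) and `hclust()`/`cutree()` on top of the tidyverse code above. The names `max_dist`, `key`, `keys`, `clusters` and `result` are my own, and the threshold of 2 edits is an assumption that would need tuning on real data.

```r
library(tidyverse)

data <- tibble(
  city  = c("Montreal", "Montréal", "Ottawa", "Ottawa", "New York", "Newyork", "New-York"),
  value = 1:7
)

# Assumption: spellings within this many edits of each other are the same city.
max_dist <- 2

# Normalise: lower-case, then drop spaces and punctuation ("New-York" -> "newyork").
key  <- gsub("[[:space:][:punct:]]+", "", tolower(data$city))
keys <- unique(key)

# Pairwise edit distances between the distinct keys, then single-linkage
# clustering; cutting the tree at max_dist joins keys that are linked by a
# chain of close spellings. (hclust() needs at least two distinct keys.)
clusters <- unname(cutree(hclust(as.dist(adist(keys)), method = "single"), h = max_dist))

result <- data %>%
  mutate(group = clusters[match(key, keys)]) %>%   # map each row to its cluster
  group_by(group) %>%
  summarise(city = first(city), mean = mean(value)) # label = first spelling seen

result
```

On the sample data this should collapse the six spellings into three groups. The distance matrix is computed on the unique keys only, so the cost depends on the number of distinct spellings rather than the number of rows. A threshold that is too large would merge genuinely different short city names, so the resulting groups are worth checking by eye.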