I have a data.table, dt, with a character column, Description.
I need to perform multiple regex operations on that column, which I have written as:
dt[, Description := sapply(Description, tolower)][
, Description := sapply(Description, gsub, pattern = " $", replacement = "")][
, Description := sapply(Description, gsub, pattern = "  ", replacement = " ")][
, Description := sapply(Description, gsub, pattern = "ões\\>", replacement = "ão")][
, Description := sapply(Description, gsub, pattern = "eis\\>", replacement = "el")][
, Description := sapply(Description, gsub, pattern = "as\\>", replacement = "a")][
, Description := sapply(Description, gsub, pattern = "ais\\>", replacement = "al")][
, Description := sapply(Description, gsub, pattern = "es\\>", replacement = "e")][
, Description := sapply(Description, gsub, pattern = "ns\\>", replacement = "m")][
, Description := sapply(Description, gsub, pattern = "s\\>", replacement = "")]
Apart from the lowercasing and whitespace cleanup, these are all rules for turning Portuguese plural endings into their singular forms.
Is there a more efficient and elegant way of doing this?
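One possible direction, as a minimal sketch rather than a definitive answer: tolower and gsub are already vectorised over character vectors, so the sapply calls should not be needed, and the rules can live in one ordered named vector that is applied in a loop. The names singularize and rules, and the sample data, are hypothetical; the double-space rule is an assumption about what the third step was meant to do.

```r
library(data.table)

# Hypothetical sample data; the real Description values are not shown above.
dt <- data.table(Description = c("Casas Grandes ", "Animais", "Homens"))

# Ordered rules: name = regex pattern, value = replacement.
# Order matters, because each rule sees the output of the previous one,
# exactly as in the chained version.
rules <- c(
  " $"     = "",
  "  "     = " ",   # assumption: collapse a double space into a single one
  "ões\\>" = "ão",
  "eis\\>" = "el",
  "as\\>"  = "a",
  "ais\\>" = "al",
  "es\\>"  = "e",
  "ns\\>"  = "m",
  "s\\>"   = ""
)

# Lowercase once, then apply every rule to the whole vector in turn.
singularize <- function(x, rules) {
  x <- tolower(x)
  for (i in seq_along(rules)) {
    x <- gsub(names(rules)[i], rules[[i]], x)
  }
  x
}

# A single := call replaces the chain of ten.
dt[, Description := singularize(Description, rules)]
```

Keeping the rules in one vector means a new ending is a one-line change, and each gsub call runs once over the whole column instead of once per row.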