I have a dataset blah with a column kw. The column holds tens of thousands of strings, some of which are sentence-length. I already replaced the vast majority of what I want to replace with a for loop, replacing substrings with substring categories. However, I cannot possibly anticipate every substring that needs replacing. Most of the heavy lifting is done, but there are still a fair number of edge cases, and I want to handle them as they arise.
I want to create a function cleanup that takes an oldsubstring and a newsubstring and replaces every instance of oldsubstring in blah$kw with newsubstring.
Here's what I've written so far:
    cleanup <- function(oldstring, newstring) {
      blah$kw[grepl(oldstring, blah$kw)] <- sapply(
        blah$kw[grepl(oldstring, blah$kw)],
        function(x) gsub(oldstring, newstring, x)
      )
    }
I'm quite new to R, so this may well be the wrong approach. I based it on this one-off code I found:
    blah$kw[grepl("oldstring", blah$kw)] <- sapply(
      blah$kw[grepl("oldstring", blah$kw)],
      function(x) gsub("oldstring", "newstring", x)
    )
That one-off version works like a charm. Any help would be hugely appreciated. Thanks!
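
One thing to watch for with a function like cleanup: an assignment to blah$kw inside the function changes only a local copy, so the global blah is left as it was. Below is a minimal sketch of one way around that, not a definitive fix. It assumes a made-up toy data frame and a hypothetical extra argument named data: the data frame is passed in, the modified copy is returned, and the caller assigns the result back. Because gsub is vectorized and returns non-matching strings unchanged, the sketch drops the grepl subsetting and the sapply call.

```r
# Hypothetical toy data standing in for the real blah data frame.
blah <- data.frame(
  kw = c("cheap red shoes", "blue shoes sale", "green hat"),
  stringsAsFactors = FALSE
)

# Sketch: take the data frame as an argument and return the modified copy.
# oldstring is treated as a regular expression, as in the original code;
# adding fixed = TRUE to gsub() would match it as literal text instead.
cleanup <- function(data, oldstring, newstring) {
  data$kw <- gsub(oldstring, newstring, data$kw)
  data
}

# Assign the result back; without this, blah stays unchanged.
blah <- cleanup(blah, "shoes", "FOOTWEAR")
print(blah$kw)
```

Each new edge case then becomes one more call of the form blah <- cleanup(blah, "old", "new").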