Due to R release issues I need to switch between qdap::mgsub() and textclean::mgsub(). The functions are almost the same, except for the order of the arguments:

qdap::mgsub(pattern,replacement,x)
textclean::mgsub(x,pattern,replacement)

I have a lot of code that uses qdap::mgsub(). Unfortunately, I didn't name the arguments when passing them to the function, so I would need to reorder all of them to be able to use textclean::mgsub().

Is there (programmatically) an elegant way to switch between these two functions without having to change the order of the arguments?

rdatasculptor

3 Answers

Thinking @duckmayr's answer over, I came up with another solution to my question:

First run this function:

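# Wrapper that takes qdap's argument order and delegates to textclean::mgsub()
reorder_mgsub <- function(pattern, replacement, x, ...) {
  # Extra options (e.g. fixed = FALSE) are passed through unchanged
  textclean::mgsub(x, pattern, replacement, ...)
}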

Second, find and replace every qdap::mgsub with reorder_mgsub.

This solution may be less elegant because I have to do step 2 by hand, but for me it works very well.
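
A more general version of the same wrapper idea can be sketched as a small function factory that permutes positional arguments. This is only a sketch: `reorder_args()`, `target_style()` and `qdap_style()` are hypothetical names, not part of qdap or textclean, and the stand-in function uses base `gsub()` so the example runs without either package. It assumes the first arguments are passed positionally (unnamed).

```r
# Hypothetical helper (not part of qdap or textclean): returns a wrapper
# around `f` that passes its leading positional arguments in a new order.
reorder_args <- function(f, new_order) {
  function(...) {
    args <- list(...)
    n <- length(new_order)
    # The first n arguments are treated as positional and permuted;
    # anything after them (e.g. named options) is passed through unchanged.
    positional <- unname(args[seq_len(n)])
    extras <- args[-seq_len(n)]
    do.call(f, c(positional[new_order], extras))
  }
}

# Stand-in with the textclean-style order (x, pattern, replacement)
target_style <- function(x, pattern, replacement) {
  gsub(pattern, replacement, x, fixed = TRUE)
}

# Wrapper that accepts the qdap-style order (pattern, replacement, x)
qdap_style <- reorder_args(target_style, c(3, 1, 2))
qdap_style("a", "b", "banana")
#> [1] "bbnbnb"
```

With textclean installed, the same factory could presumably be used as `reorder_mgsub <- reorder_args(textclean::mgsub, c(3, 1, 2))`.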

rdatasculptor

You can use a regular expression to replace the occurrences in the text of every file you call the old function in, using a function like the following:

replace_mgsub <- function(path) {
    file_text <- readr::read_file(path)
    # Capture the three positional arguments and emit them in textclean's order
    file_text <- gsub("qdap::mgsub\\(([^, ]+) *, *([^, ]+) *, *([^,) ]+) *\\)",
                      "textclean::mgsub(\\3, \\1, \\2)", file_text)
    readr::write_file(file_text, path)
}

which you would then call on every relevant path (I assume here you know the list of files you need to call the function on; if not, comment below and I can add some stuff on that). Here's a demo of the gsub() part of the function:

file_text <- "qdap::mgsub(pattern,replacement,x)"
cat(gsub("qdap::mgsub\\(([^, ]+) *, *([^, ]+) *, *([^,) ]+) *\\)",
         "textclean::mgsub(\\3, \\1, \\2)", file_text))
#> textclean::mgsub(x, pattern, replacement)
file_text <- "# I'll have in this part some irrelevant code
# to show it won't interfere with that
y = rnorm(1000)
qdap::mgsub(pattern,replacement,x)
z = rnorm(10)
# And also demonstrate multiple occurrences of the function
# as well as illustrate that it doesn't matter if you have spaces
# between comma separated arguments
qdap::mgsub(pattern, replacement, x)"
cat(gsub("qdap::mgsub\\(([^, ]+) *, *([^, ]+) *, *([^,) ]+) *\\)",
         "textclean::mgsub(\\3, \\1, \\2)", file_text))
#> # I'll have in this part some irrelevant code
#> # to show it won't interfere with that
#> y = rnorm(1000)
#> textclean::mgsub(x, pattern, replacement)
#> z = rnorm(10)
#> # And also demonstrate multiple occurrences of the function
#> # as well as illustrate that it doesn't matter if you have spaces
#> # between comma separated arguments
#> textclean::mgsub(x, pattern, replacement)
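
If the list of files is not known in advance, one possible approach is to scan a directory for R scripts and apply the substitution to each. The following is a minimal sketch using only base R; `swap_mgsub_args()` and `rewrite_dir()` are hypothetical helper names. It assumes each call sits on a single line with simple, comma-free arguments, and it defaults to a dry run that only reports which files would change.

```r
# Hypothetical helper: rewrite qdap::mgsub(a, b, c) calls in a character vector
swap_mgsub_args <- function(text) {
  gsub("qdap::mgsub\\(([^, ]+) *, *([^, ]+) *, *([^,) ]+) *\\)",
       "textclean::mgsub(\\3, \\1, \\2)", text)
}

# Hypothetical helper: apply the rewrite to every .R file below `dir`.
# With dry_run = TRUE nothing is written; the affected paths are only returned.
rewrite_dir <- function(dir = ".", dry_run = TRUE) {
  paths <- list.files(dir, pattern = "\\.[Rr]$", recursive = TRUE,
                      full.names = TRUE)
  changed <- character(0)
  for (path in paths) {
    old <- readLines(path, warn = FALSE)
    new <- swap_mgsub_args(old)   # processed line by line
    if (!identical(old, new)) {
      changed <- c(changed, path)
      if (!dry_run) writeLines(new, path)
    }
  }
  changed
}
```

Calling `rewrite_dir("path/to/project")` first and inspecting the returned paths before rerunning with `dry_run = FALSE` is probably the safer order, ideally with the files under version control.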
duckmayr
  • Nice piece of code! I don't know the list of files, but I guess I can figure that out myself. I was hoping for a solution though that did something like this: `new_mgsub <- reorder(mgsub(),3,2,1)` – rdatasculptor Oct 10 '18 at 10:39
  • @rdatasculptor (1) Yeah, I kind of thought that, but I thought this might actually be a cleaner solution, since even with a solution like that, you'd still have to place that definition at the beginning of every file you're talking about (unless we're dealing with a package here) as well as replace all calls of `library(qdap)` to `library(textclean)` or `qdap::` to (nothing). (2) Is all this code in a package you're building, or just code you have on your machine? (3) What OS are you using (changes advice on identifying the files you need to run the function on)? – duckmayr Oct 10 '18 at 10:51
  • I think your code is an answer to my question, so I will accept it! Thinking your solution over, I came up with another simple piece of code. I will add this as an answer as well. – rdatasculptor Oct 10 '18 at 11:03
Well, you could also reassign the original function in the package to suit your code.

I.e., using the source code of mgsub,

reorder_mgsub <- function(pattern,replacement,x, leadspace = FALSE, trailspace = FALSE, 
fixed = TRUE, trim = FALSE, order.pattern = fixed, safe = FALSE, 
...){
    if (!is.null(list(...)$ignore.case) & fixed) {
        warning(paste0("`ignore.case = TRUE` can't be used with `fixed = TRUE`.\n", 
            "Do you want to set `fixed = FALSE`?"), call. = FALSE)
    }
    if (safe) {
        return(mgsub_regex_safe(x = x, pattern = pattern, replacement = replacement, 
            ...))
    }
    if (leadspace | trailspace) {
        replacement <- spaste(replacement, trailing = trailspace, 
            leading = leadspace)
    }
    if (fixed && order.pattern) {
        ord <- rev(order(nchar(pattern)))
        pattern <- pattern[ord]
        if (length(replacement) != 1) 
            replacement <- replacement[ord]
    }
    if (length(replacement) == 1) {
        replacement <- rep(replacement, length(pattern))
    }
    if (any(!nzchar(pattern))) {
        good_apples <- which(nzchar(pattern))
        pattern <- pattern[good_apples]
        replacement <- replacement[good_apples]
        warning(paste0("Empty pattern found (i.e., `pattern = \"\"`).\n", 
            "This pattern and replacement have been removed."), 
            call. = FALSE)
    }
    for (i in seq_along(pattern)) {
        x <- gsub(pattern[i], replacement[i], x, fixed = fixed, 
            ...)
    }
    if (trim) {
        x <- gsub("\\s+", " ", gsub("^\\s+|\\s+$", "", x, perl = TRUE), 
            perl = TRUE)
    }
    x
}

Followed by

assignInNamespace('mgsub', reorder_mgsub, 'textclean')

which should assign your updated function to the namespace of the textclean package, so that any code calling textclean::mgsub uses your updated function for the rest of the session. This way there is no need to reorder the arguments throughout your code.
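
A lighter variation on the same idea, sketched here as an assumption rather than a tested recipe, is to keep a copy of the original function and patch in a thin wrapper that delegates to it, instead of copying the package source. `original_mgsub` and `patched_mgsub` are hypothetical names, and the block only runs if textclean is installed. Note that any other code that looks up `mgsub` in the textclean namespace during the session would also see the patched version, so restoring the original afterwards seems prudent.

```r
if (requireNamespace("textclean", quietly = TRUE)) {
  ns <- asNamespace("textclean")
  # Keep a copy of the original so the patch can be undone later
  original_mgsub <- get("mgsub", envir = ns)

  # qdap-style argument order, delegating to the saved original
  patched_mgsub <- function(pattern, replacement, x, ...) {
    original_mgsub(x, pattern, replacement, ...)
  }

  # Patch the namespace for the current session only
  utils::assignInNamespace("mgsub", patched_mgsub, ns = "textclean")

  # Legacy-style call: pattern first, text last
  result <- textclean::mgsub("a", "b", "banana")

  # Restore the original definition
  utils::assignInNamespace("mgsub", original_mgsub, ns = "textclean")
}
```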

runr
  • Elegant solution! (I am not sure why yet but I am a little bit reluctant to change package code itself) – rdatasculptor Dec 22 '18 at 07:21
  • @rdatasculptor yup, it can be a useful trick when working with old unmaintained or still under development packages. In a way it's just a fork of the original package, and the function is reassigned only in the current session, so it is not permanent. But yes, I'm often reluctant too with similar function overrides and tend to use it when it's not too intrusive (like your case). If you're adding additional code to the function, it can be useful to also wrap it in a ``tryCatch`` with a very clear error, i.e., ``error = function(e){print("My custom edit crashed the code!")}`` :) – runr Dec 24 '18 at 23:55