1

I'd like to parse an R expression into a list and optionally modify aspects of it before finally turning it into a json object. As an example, I'm trying to get to something like:

{"op": "=",
      "content": {
          "lhs": "gender",
          "rhs": ["male"]
      }
}

I would be starting with an R expression like:

gender == "male"

I can use pryr::ast to get a text version of the tree, but I'd like to get this as a list like:

op: "=="
  [[1]]: "gender"
  [[2]]: "male"

The details of the "format" of the list are not so important, just to be clear. I am simply aiming to get a computable and modifiable parse tree of R expressions.

seandavi
  • 2,818
  • 4
  • 25
  • 52
  • I'm thinking the material in this Q&A would be useful: http://stackoverflow.com/questions/33419325/plotting-expression-trees-in-r although it is not focussed on modification of expressions. – IRTFM Jul 10 '16 at 16:06
  • Notice that the pryr functions distinguish true R `names` (aka `symbols`) from mere character values by prepending with backticks. Also note that `quote` is not really included in the lables but that `()` is really substituting for it. The open parenthesis is an important marker to the R parser and is a function in it own right: `is.function(`\`(`\`)` #[1] TRUE . – IRTFM Jul 10 '16 at 18:34

2 Answers2

5

Is something like this what you're looking for?

expr <- quote(gender == "male")

expr[[1]]
# `==`
expr[[2]]
# gender
expr[[3]]
# "male"
expr[[3]] <- "female"
expr
# gender == "female"
Hong Ooi
  • 56,353
  • 13
  • 134
  • 187
  • That is generally it. I have some more complicated expressions, so I'll need to learn how to work with the nested lists returned by quote, but I think the approach will work. – seandavi Jul 10 '16 at 12:53
2

Here's an approach for the output portion of your request, using modifications of the methods cited in my comment. This is based on Hadley's pkg:pryr. See ?Ops for the list of infix operators. I've seen functions lhs and rhs defined ... IIRC in Hadley's Advanced Programming text. Obviously the only functions labeled as 'ops' will be the infix math and logical, but a more complete labeling of Math(), Complex(), and Summary() functions could be done with the other lists in the ?groupGeneric page:

call_tree2(quote(gender == "male")) # relabeling of items in pryr-functions
#--------
 - call:
   - `op: ==
   - `gender
   -  "male" 

Functions defined below:

library(pryr) # also loads the stringr namespace
# although the `tree` function is not exported, you can see it with:
pryr:::tree   # now for some hacking and adding of logic
tree2<-
function (x, level = 1, width = getOption("width"), branch = " - ") 
{
    indent <- str_c(str_dup("  ", level - 1), branch)
    if (is.atomic(x) && length(x) == 1) {
        label <- paste0(" ", deparse(x)[1])
        children <- NULL
    }
    else if (is.name(x)) {
        x <- as.character(x)
        if (x == "") {
            label <- "`MISSING"
        }
        if (x %in% c("+", "-", "*", "/", "^", "%%", "%/%",
"&", "|", "!","==", "!=", "<", "<=", ">=", ">") ) {
             label <- paste0("`op: ", as.character(x))}
        else {
            label <- paste0("`", as.character(x))
        }
        children <- NULL
    }
    else if (is.call(x)) {
        label <- "call:"
        children <- vapply(as.list(x), tree2, character(1), level = level + 
            1, width = width - 3)
    }
    else if (is.pairlist(x)) {
        label <- "[]"
        branches <- paste("", format(names(x)), "=")
        children <- character(length(x))
        for (i in seq_along(x)) {
            children[i] <- tree2(x[[i]], level = level + 1, width = width - 
                3, branch = branches[i])
        }
    }
    else {
        if (inherits(x, "srcref")) {
            label <- "<srcref>"
        }
        else {
            label <- paste0("", typeof(x), "")
        }
        children <- NULL
    }
    label <- str_trunc(label, width - 3)
    if (is.null(children)) {
        paste0(indent, label)
    }
    else {
        paste0(indent, label, "\n", paste0(children, collapse = "\n"))
    }
}
environment(tree2)<-environment(pryr:::tree)

And now call it with call_tree2:

pryr::call_tree
call_tree2 <- 
function (x, width = getOption("width")) 
{
    if (is.expression(x) || is.list(x)) {
        trees <- vapply(x, tree2, character(1), width = width)
        out <- str_c(trees, collapse = "\n\n")
    }
    else {
        out <- tree2(x, width = width)
    }
    cat(out, "\n")
}
environment(call_tree2)<-environment(pryr::call_tree)
IRTFM
  • 258,963
  • 21
  • 364
  • 487