In the twoway package, I have a twoway.default() method that takes a matrix or data frame and applies Tukey's methods for the analysis of two-way tables.
Example:
> data(taskRT)
> taskRT
       topic1 topic2 topic3 topic4
Easy     2.43   3.12   3.68   4.04
Medium   3.41   3.91   4.07   5.10
Hard     4.21   4.65   5.87   5.69
> twoway(taskRT)
Mean decomposition (Dataset: "taskRT")
Residuals bordered by row effects, column effects, and overall
         topic1    topic2    topic3    topic4      roweff
       + --------- --------- --------- --------- + ---------
Easy   | -0.055833  0.090833  0.004167 -0.039167 : -0.864167
Medium |  0.119167  0.075833 -0.410833  0.215833 : -0.059167
Hard   | -0.063333 -0.166667  0.406667 -0.176667 :  0.923333
       + ......... ......... ......... ......... + .........
coleff | -0.831667 -0.288333  0.358333  0.761667 :  4.181667
I want to extend this with a formula method that takes a data frame and a formula of the form response ~ row + column, reshapes the data from long to wide, and then calls the default method. I know several ways to do this directly in the console, but I can't get any of them to work inside a formula method function.
Thus, for this data in long format, with the cell value called RT and the row and column variables called task and topic, I'd like to get the same results with a call of
twoway(RT ~ task + topic, data=long)
At top level, in the console, I can do this in various ways, starting from a long version of the same data.
library(reshape2)
long <- melt(as.matrix(taskRT))
colnames(long) <- c("task", "topic", "RT")
Convert back to wide format, and call twoway() on that:
# convert long to wide: dcast
(wide <- dcast(long, task ~ topic, value.var="RT"))
twoway(wide[,-1])
# tidyr::spread
library(tidyr)
(wide <- spread(long, key=topic, value=RT))
twoway(wide[,-1])
# base, unstack
wide <- unstack(long, form = RT ~ topic)
rownames(wide) <- unique(long$task)
twoway(wide)
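Another console route that may be relevant, because it accepts the same kind of formula I want the method to take, is base R's xtabs(). This is only a sketch: the long data frame is rebuilt by hand from the taskRT values printed above (standing in for the melt() result), and it assumes exactly one RT per task/topic cell, since xtabs() sums the response within each cell.

```r
# Rebuild the long data by hand from the taskRT values shown above
# (an assumption standing in for the melt() result).
long <- data.frame(
  task  = factor(rep(c("Easy", "Medium", "Hard"), times = 4),
                 levels = c("Easy", "Medium", "Hard")),
  topic = factor(rep(paste0("topic", 1:4), each = 3)),
  RT    = c(2.43, 3.41, 4.21,  3.12, 3.91, 4.65,
            3.68, 4.07, 5.87,  4.04, 5.10, 5.69)
)

# xtabs() sums the response within each row/column cell; with one
# observation per cell the sums are just the cell values.
tab <- xtabs(RT ~ task + topic, data = long)

# Drop the xtabs/table class and the stored call to leave a plain matrix
wide <- unclass(tab)
attr(wide, "call") <- NULL
wide
```

The resulting matrix has tasks as rows and topics as columns, so twoway(wide) should dispatch to the default method in the same way as the other three conversions.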
Below is an initial sketch of a twoway.formula method. The problem is that I can't figure out how to take the pieces obtained by parsing the formula, together with the associated data frame, and construct a call inside the function that produces a wide matrix or data frame suitable for passing to the default method. So far I've been trying various forms of dcast within the function, shown as comments, none of which give me joy.
#' Initial sketch for a twoway formula method
#'
#' Doesn't do anything useful yet, but the idea is to be able to use a
#' formula for a twoway table in long form, e.g.,
#' twoway(response ~ row + col, data=mydata)
#'
#' @param formula A formula of the form \code{response ~ rowvar + colvar}
#' @param data A data frame in long format containing the variables in \code{formula}
#' @param subset An expression to subset the data (unused)
#' @param na.action What to do with NAs? (unused)
#' @param ... other arguments, passed down
#' @importFrom stats terms
#'
twoway.formula <- function(formula, data, subset, na.action, ...) {
if (missing(formula) || !inherits(formula, "formula"))
stop("'formula' missing or incorrect")
if (length(formula) != 3L)
stop("'formula' must have both left and right hand sides")
tt <- if (is.data.frame(data))
terms(formula, data = data)
else terms(formula)
if (any(attr(tt, "order") > 1))
stop("interactions are not allowed")
rvar <- attr(terms(formula[-2L]), "term.labels")
lvar <- attr(terms(formula[-3L]), "term.labels")
rhs.has.dot <- any(rvar == ".")
lhs.has.dot <- any(lvar == ".")
if (lhs.has.dot || rhs.has.dot)
stop("'formula' has '.' in left or right hand sides")
m <- match.call(expand.dots = FALSE)
edata <- eval(m$data, parent.frame())
lhs <- formula[[2]]
rhs <- formula[[3]]
# wide <- dcast(data=edata, formula=as.formula(rhs), value.var=lhs )
# wide <- dcast(data=edata, value.var=lhs)
# wide <- dcast(data=edata, rvar[1] ~ rvar[2], value.var=cvar)
# wide <- dcast(data=edata, list(.(rvar[1], .(rvar[2], .(cvar)))))
#browser()
stop("The formula method is not yet implemented.")
# call the default method on the wide data set
twoway(wide)
}
Can anyone help?
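One direction that may work, shown only as a minimal sketch: pull the variable names out of the formula as character strings with all.vars(), then index the data frame by name instead of passing language objects such as formula[[2]] to the reshaping function. The helper name long2wide is hypothetical and not part of the twoway package, the example data are rebuilt by hand from the taskRT values above, and the sketch assumes one observation per row/column cell (cells with several values would be averaged).

```r
# Hypothetical helper (not part of the twoway package): reshape a long
# data frame to a wide matrix using a formula response ~ row + col.
long2wide <- function(formula, data) {
  vars <- all.vars(formula)   # variable names as character strings
  if (length(vars) != 3L)
    stop("'formula' must have the form response ~ row + col")
  if (!all(vars %in% names(data)))
    stop("all variables in 'formula' must be columns of 'data'")
  # vars[1] is the response; data[vars[2:3]] is a list holding the row
  # and column variables, which tapply() uses as grouping factors.
  tapply(data[[vars[1L]]], data[vars[2:3]], FUN = mean)
}

# Example data, rebuilt by hand from the taskRT values shown above
long <- data.frame(
  task  = factor(rep(c("Easy", "Medium", "Hard"), times = 4),
                 levels = c("Easy", "Medium", "Hard")),
  topic = factor(rep(paste0("topic", 1:4), each = 3)),
  RT    = c(2.43, 3.41, 4.21,  3.12, 3.91, 4.65,
            3.68, 4.07, 5.87,  4.04, 5.10, 5.69)
)

wide <- long2wide(RT ~ task + topic, data = long)
wide
```

Inside twoway.formula, the two placeholder lines at the end could then become something like wide <- long2wide(formula, edata) followed by twoway(wide). The same character-string names should also work with dcast, for example dcast(edata, reformulate(vars[3], response = vars[2]), value.var = vars[1]), though I have not verified that variant here.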