I am plotting continuous data (the y-variable) against several categorical variables (factors, the x-variables) using boxplot and stripchart. The default plotting functions provide a handy formula-based interface for this: I can supply the data as Response ~ Factor1 + Factor2 + ... and get the combinations of Factor1, Factor2, etc. as x-axis positions.
However, I am struggling to find out what these raw coordinate values are for my data, since I want to annotate some values in my plots.
Example:
data(iris)
iris[,"DummyFactor"] <- as.factor(rep(c("First", "Second"), length.out = nrow(iris)))  # alternate the two labels across rows
boxplot(Sepal.Length ~ Species + DummyFactor, data = iris)
stripchart(Sepal.Length ~ Species + DummyFactor, data = iris, vertical = TRUE, add = TRUE, pch = 16)
# y-axis values:
ys <- iris[,"Sepal.Length"]
# x-axis:
# How to obtain the x-axis values on my current plot?
Experimentally I found out that the x-values in this example are:
# Weights 1, 2, 3 pick the species; the extra 3 shifts the "Second" groups to 4, 5, 6
xs <- apply(model.matrix(~ -1 + Species + DummyFactor, data = iris), MARGIN = 1,
            FUN = function(x) sum(c(1, 2, 3, 3)[as.logical(x)]))
# Annotate a few examples, e.g. 7th, 100th and 120th observation
points(x=xs[c(7,100,120)], y=ys[c(7,100,120)], pch=16, col="red", cex=2)
iris[c(7,100,120),]
#    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species DummyFactor
#7            4.6         3.4          1.4         0.3     setosa       First
#100          5.7         2.8          4.1         1.3 versicolor      Second
#120          6.0         2.2          5.0         1.5  virginica      Second
This works, but it hardly seems like the correct way to approach the problem. It seems the formula implementations of boxplot and stripchart are hidden from the user.
Is there an easy way to obtain these coordinates in a general case?
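
For comparison, here is a minimal sketch of one possible general approach. It assumes that the formula interface places one box per level of the interaction of the right-hand-side factors, with the first factor varying fastest, and that the boxes sit at the default positions 1, 2, ..., n (no `at` argument). The names `grp` and `idx` are only illustrative. The group labels returned by `boxplot(..., plot = FALSE)` are used to check that assumption rather than taking it on trust.

```r
data(iris)
iris$DummyFactor <- factor(rep(c("First", "Second"), length.out = nrow(iris)))

# Assumption: each box corresponds to one level of the factor interaction
grp <- interaction(iris$Species, iris$DummyFactor)
xs  <- as.integer(grp)        # level index, taken here as the x position
ys  <- iris$Sepal.Length

# Check the assumed ordering against the labels boxplot itself reports
bp <- boxplot(Sepal.Length ~ Species + DummyFactor, data = iris, plot = FALSE)
stopifnot(all(levels(grp) == bp$names))

boxplot(Sepal.Length ~ Species + DummyFactor, data = iris)
stripchart(Sepal.Length ~ Species + DummyFactor, data = iris,
           vertical = TRUE, add = TRUE, pch = 16)

# Annotate the 7th, 100th and 120th observations
idx <- c(7, 100, 120)
points(x = xs[idx], y = ys[idx], pch = 16, col = "red", cex = 2)
```

If the check fails, or if the boxes are drawn with a custom `at` vector or the strip chart uses jitter, the integer codes will no longer match the plotted x positions and would need to be mapped accordingly.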