A common idiom (found in books, tutorials, and many Stack Overflow questions) is to use df as a sort of throwaway identifier for a data frame. I've done so hundreds of times with seemingly no ill effect, but then ran into the following code:
library(tree)
df <- droplevels(iris[1:100,c(1,2,5)])
tr <- tree(Species ~ ., data = df)
plot(tr)
text(tr)
partition.tree(tr)
This gives the following error message:
Error in as.data.frame.default(data, optional = TRUE) :
  cannot coerce class "function" to a data.frame
I discovered by trial and error that if I simply replace df above by df2, the code works as expected. It is true that df is the name of the density function for the F-distribution, but that doesn't seem to be remotely relevant here. Is this a bug in the tree package, or is it an important cautionary tale whose moral is that I should avoid using df as the name for a dataframe since doing so introduces a name-clash?
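
For reference, here is a minimal base-R sketch of the name clash itself, with no tree package involved. It only shows that the name df can resolve to either my data frame or the F-density function depending on where it is looked up. It does not establish what partition.tree does internally, and the choice of the stats namespace as the lookup environment is my own assumption for illustration.

```r
# In the stats package, `df` is the density function of the F distribution.
class(stats::df)

# Coercing a function to a data frame fails, much like the error above.
res <- tryCatch(
  as.data.frame(stats::df),
  error = function(e) conditionMessage(e)
)
print(res)

# Defining `df` in the workspace masks stats::df for top-level code only.
df <- droplevels(iris[1:100, c(1, 2, 5)])
class(df)

# Looking the same name up from inside the stats namespace (an illustrative
# choice of environment) finds the function first, not the data frame.
found <- eval(quote(df), envir = asNamespace("stats"))
class(found)
```

If a package evaluates the stored name df in an environment where my workspace is not searched first, it would presumably pick up the function in the same way, which would be consistent with the error message.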