I followed the example for cpquery(fitted, event, evidence, cluster, method = "ls", ..., debug = FALSE)
given in the documentation:
#rm(list=ls())
library(bnlearn)
data(learning.test)
fitted = bn.fit(hc(learning.test), learning.test)
# the result should be around 0.025.
cpquery(fitted, (B == "b"), (A == "a"))
# out: 0.02692665
It worked as advertised, but I have some confusion about the syntax. For reference, head(learning.test)
looks like this:
A B C D E F
1 b c b a b b
2 b a c a b b
3 a a a a a a
4 a a a a b b
5 a a b c a a
6 c c a c c a
and the total number of rows is 5000.
Now to my confusion: The second argument to cpquery
, (B == "b")
, is passed to the event
parameter and the third argument (A == "a")
is passed to the evidence
parameter. The documentation says about these two parameters that
The
event
andevidence
arguments must be two expressions describing the event of interest and the conditioning evidence in a format such that, if we denote withdata
the data set the network was learned from,data[evidence, ]
anddata[event, ]
return the correct observations.
I didn't quite understand how the syntax (B == "b")
worked, since I thought B
is not defined in the current scope. So given what was said in the documentation above, I took data
to be learning.test
in my case and then I tried to do learning.test[(B == "b"),]
, which indeed resulted in the error
Error in `[.data.frame`(learning.test, (B == "b"), ) :
object 'B' not found
Given the documentation I thought this line of code should've worked.
I then tried to do learning.test[(learning.test$B=="b"),]
which does seem to return a data frame with all rows where the B
-column has the value "b"
. I then tried to use this syntax in the cpquery
function:
cpquery(fitted, (learning.test$B == "b"), (learning.test$A == "a"))
but to my surprise this resulted in the error
Error in sampling(fitted = fitted, event = event, evidence = evidence, :
logical vector for evidence is of length 5000 instead of 10000.
I don't understand why I get this error (other than that something goes wrong in the logical sampling used by cpquery
), since as far as I can see I am conforming to the quoted passage from the documentation.
So in conclusion I wonder why does the example from the documentation work when it seems to not conform to the quoted passage from the documentation, and why does my modification not work even though it seems to conform to the quoted passage in the documentation? (I am not very familiary with R so maybe this is very basic).