
I need to perform inference on a Bayesian network, such as the example I have created below.

[Figure: Bayesian network]

I want to answer a query such as P(F | A = True, B = True). My initial approach was to do something like

For every possible output of F
  For every state of each observed variable (A,B)
     For every unobserved variable (C, D, E, G)
        // Calculate Probability

But I don't think this will work, because we actually need to go over many variables at once, not one at a time.

I have heard about Pearl's algorithm for message passing but have yet to find a reasonable description that isn't extremely dense. For added information, these Bayesian networks are constrained to have no more than 15-20 nodes, and we have all the conditional probability tables. The code doesn't really have to be fast or efficient.

Basically I am looking for a way to do this, not necessarily the BEST way to do this.

Willem Van Onsem
suphug22
  • Is your graph just an example, or are all top variables observed? – Willem Van Onsem May 26 '15 at 02:19
  • Pearl's message passing algorithm only applies to networks without loops. There are exact algorithms for loopy networks of discrete and Gaussian variables, but they are not simple. My advice is to find some software to do the calculations so all you have to do is enter the network description (variables, connections, and probability tables) and run the queries. There is both commercial and non-commercial software for this; sorry, I don't have a recommendation. – Robert Dodier May 26 '15 at 02:20
  • the graph was just an example, the top variables are not always strictly observed – suphug22 May 26 '15 at 02:29
  • If it's a BN then I assume there are no loops. Correct? – Andrzej Pronobis May 26 '15 at 02:36
  • yes this assumption is correct – suphug22 May 26 '15 at 03:03
  • @Andrzej "If it's a BN then I assume there are no loops." -- I don't understand what you're trying to say. A BN cannot have directed cycles, but it can be multiply connected, which if I'm not mistaken is what people mean when they say "loops". The example given by OP is loopy, for example. – Robert Dodier May 26 '15 at 22:20
  • Indeed, I was a bit quick with that response and the word loops was not the most fortunate. My intention was to verify that the OP has a correct BN with no **cycles**, either directed or undirected (valid in e.g. chain graphs). And you are right about the fact that having a single-connected graph makes exact belief propagation much easier without e.g. junction trees. – Andrzej Pronobis May 26 '15 at 22:56

1 Answer


Your Bayesian network (BN) does not seem to be particularly complex. I think you could easily get away with an exact inference method, such as the junction tree algorithm. Of course, you can still just do brute-force enumeration, but that would be a waste of CPU resources given that there are so many nice libraries out there that implement smarter ways of doing both exact and approximate inference in graphical models.

Since your tag mentions C++, my recommendation would be libDAI. It is a well-written library that implements multiple exact and approximate inference algorithms on generic factor graphs. It does not have any weird dependencies and is very easy to integrate into your project. It is particularly well suited for discrete cases, such as yours, for which you have the probability tables.

Now, you may have noticed that I mentioned factor graphs. If you are not familiar with the concept, I refer you to the Wikipedia article on factor graphs as well as the question What are "Factor Graphs" and what are they useful for?. The principle is very simple: you represent your BN as a factor graph, and then libDAI does the inference for you.
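To make the factor graph idea concrete, here is a minimal sketch in plain Python (chosen for brevity; it is not libDAI's API, and the same structure carries over to C++). It assumes a hypothetical two-node network C -> B with made-up probabilities. Each conditional probability table becomes one factor, i.e. the list of variables it touches plus a lookup table, and the joint probability of a full assignment is the product of all factors.

```python
# Each factor is (scope, table): the variables involved and a table keyed by
# their values in scope order. Network and numbers are hypothetical.
factors = [
    (("C",), {(True,): 0.3, (False,): 0.7}),                  # P(C)
    (("B", "C"), {(True, True): 0.8, (False, True): 0.2,
                  (True, False): 0.1, (False, False): 0.9}),  # P(B | C)
]


def joint_probability(factors, assignment):
    """Multiply every factor's entry for the given full assignment."""
    result = 1.0
    for scope, table in factors:
        key = tuple(assignment[name] for name in scope)
        result *= table[key]
    return result


# P(B=True, C=True) = P(B=True | C=True) * P(C=True)
print(joint_probability(factors, {"B": True, "C": True}))
```

An inference library works on this kind of representation, so converting a BN amounts to listing one factor per node.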

EDIT:

Since CPU resources do not seem to be a problem for you and simplicity is the key, you can always go with brute force enumeration. The idea is straightforward.

Your Bayesian Network represents a joint probability distribution, which you can write down in terms of an equation, e.g.

P(A,B,C) = P(A|B,C) * P(B|C) * P(C) 

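Assuming that you have tables for all your conditional probability distributions, i.e. P(A|B,C), P(B|C), and P(C), you can simply go over all the possible values of the variables A, B, and C and evaluate the joint probability of each assignment. To answer a query, sum the joint over the assignments that agree with the evidence, once for each value of the query variable, and then normalize so the results sum to 1.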
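A minimal sketch of this enumeration, written in Python for brevity (the loops translate directly to C++). It assumes three binary variables with the factorization above; the table values and the helper names `bernoulli`, `joint`, and `query` are made up for illustration.

```python
from itertools import product

# Hypothetical CPTs, each storing the probability of the "True" outcome.
P_C_TRUE = 0.3                                   # P(C=True)
P_B_TRUE_GIVEN_C = {True: 0.8, False: 0.1}       # P(B=True | C)
P_A_TRUE_GIVEN_BC = {(True, True): 0.9, (True, False): 0.6,
                     (False, True): 0.5, (False, False): 0.05}  # key: (B, C)

VARIABLES = ("A", "B", "C")


def bernoulli(p_true, value):
    """Probability of `value` for a binary variable with P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(a, b, c):
    """P(A,B,C) = P(A|B,C) * P(B|C) * P(C)."""
    return (bernoulli(P_A_TRUE_GIVEN_BC[(b, c)], a)
            * bernoulli(P_B_TRUE_GIVEN_C[c], b)
            * bernoulli(P_C_TRUE, c))


def query(target, evidence):
    """Return P(target | evidence) as a dict {value: probability}."""
    totals = {True: 0.0, False: 0.0}
    # Enumerate every full assignment of all variables at once.
    for values in product((True, False), repeat=len(VARIABLES)):
        assignment = dict(zip(VARIABLES, values))
        # Skip assignments that contradict the observed values.
        if any(assignment[name] != value for name, value in evidence.items()):
            continue
        totals[assignment[target]] += joint(**{k.lower(): v for k, v in assignment.items()})
    norm = sum(totals.values())
    return {value: total / norm for value, total in totals.items()}


print(query("C", {"A": True}))
```

The loop visits every full assignment, so the cost grows exponentially with the number of variables: 20 binary nodes give about a million assignments, while 20 three-valued nodes give roughly 3.5 billion, which is where smarter exact methods start to pay off.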

Richard Chambers
Andrzej Pronobis
  • Thanks for the help. I am looking to not use external libraries, and the networks are quite simple. I was wondering if you could elaborate on what you mean by exact inference methods; I am still very new to the topic. Edit: I'm not worried about it being a waste of CPU resources, as this is not part of a larger program, just the program itself, and most of the nodes will only take on 2-3 values, i.e. true, false, maybe. – suphug22 May 26 '15 at 02:33
  • Well, if you are looking for something very very simple, you can simply do a brute force enumeration that you suggested yourself. Do that and come back here if it takes too much time :) – Andrzej Pronobis May 26 '15 at 02:37