
Hello fellow number crunchers,

As the headline suggests, I am looking for a library for learning and inference of Bayesian Networks. I have already found some, but I am hoping for a recommendation.

Requirements in a quick overview:

  • preferably written in Java or Python
  • configuration (including the network structure itself) can be done in code, not solely via a GUI
  • source code available
  • project is still maintained
  • the more powerful, the better
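
To make the "configuration via code" requirement concrete, here is a minimal sketch in plain Python of the kind of thing I mean: the structure and the probability tables are written by hand, and a query is answered by brute-force enumeration. The network (rain, sprinkler, wet grass), its numbers, and the names `parents`, `cpt`, `joint` and `query` are illustrative assumptions, not the API of any library.

```python
from itertools import product

# Hypothetical toy network, defined entirely in code:
# rain -> sprinkler, and both rain and sprinkler -> wet
parents = {
    "rain": [],
    "sprinkler": ["rain"],
    "wet": ["sprinkler", "rain"],
}

# P(node = True | parent values), keyed by the tuple of parent values
# in the order listed in `parents`. The numbers are made up.
cpt = {
    "rain": {(): 0.2},
    "sprinkler": {(True,): 0.01, (False,): 0.4},
    "wet": {
        (True, True): 0.99,
        (True, False): 0.9,
        (False, True): 0.8,
        (False, False): 0.0,
    },
}


def joint(assignment):
    """Probability of one full assignment, via the chain-rule factorisation."""
    p = 1.0
    for node, pars in parents.items():
        p_true = cpt[node][tuple(assignment[q] for q in pars)]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p


def query(target, evidence):
    """P(target = True | evidence), by summing over all consistent assignments."""
    nodes = list(parents)
    totals = {True: 0.0, False: 0.0}
    for values in product([True, False], repeat=len(nodes)):
        assignment = dict(zip(nodes, values))
        if all(assignment[k] == v for k, v in evidence.items()):
            totals[assignment[target]] += joint(assignment)
    return totals[True] / (totals[True] + totals[False])


p_rain_given_wet = query("rain", {"wet": True})
print(round(p_rain_given_wet, 4))
```

Enumeration is exponential in the number of nodes, so this only shows the shape of the interface I am after; a real library would use a proper inference algorithm.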

Which one do you recommend?

steffen
  • I am the OP and I have voted to delete this question, since it is also my personal goal to keep the SE-sites clean. – steffen Aug 20 '12 at 12:55

3 Answers


Have a look at Weka. It's kind of popular in my neck of the woods... It's open source and written in Java.

The Weka manual on Bayesian networks describes what is available. From its abstract:

  • Structure learning of Bayesian networks using various hill climbing (K2, B, etc) and general purpose (simulated annealing, tabu search) algorithms.
  • Local score metrics implemented; Bayes, BDe, MDL, entropy, AIC.
  • Global score metrics implemented; leave one out cv, k-fold cv and cumulative cv.
  • Conditional independence based causal recovery algorithm available.
  • Parameter estimation using direct estimates and Bayesian model averaging.
  • GUI for easy inspection of Bayesian networks.
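
To give a feel for two of the items above (parameter estimation by direct estimates, and a local score metric), here is a rough sketch in plain Python. It is not Weka code and does not use Weka's API; the toy data, the smoothing constant `alpha`, and the function names are assumptions made for illustration. The score is an AIC-style one: log-likelihood minus the number of free parameters, higher is better.

```python
import math
from collections import Counter

# Hypothetical toy data: each row is (rain, wet), both binary.
data = [(1, 1)] * 7 + [(1, 0)] + [(0, 0)] * 7 + [(0, 1)]


def fit_cpt(data, child, parents, alpha=0.5):
    """Estimate P(child | parents) from counts with additive smoothing."""
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    marginal = Counter(tuple(row[p] for p in parents) for row in data)

    def prob(value, parent_values):
        # Binary child, so the smoothing adds alpha for each of 2 outcomes.
        return (joint[(parent_values, value)] + alpha) / (
            marginal[parent_values] + 2 * alpha
        )

    return prob


def log_likelihood(data, structure, alpha=0.5):
    """structure maps a column index to the tuple of its parent indices."""
    total = 0.0
    for child, parents in structure.items():
        prob = fit_cpt(data, child, parents, alpha)
        for row in data:
            total += math.log(prob(row[child], tuple(row[p] for p in parents)))
    return total


def aic_score(data, structure):
    """Log-likelihood penalised by the number of free parameters."""
    # A binary node has one free parameter per parent configuration.
    k = sum(2 ** len(parents) for parents in structure.values())
    return log_likelihood(data, structure) - k


independent = {0: (), 1: ()}    # no edge
dependent = {0: (), 1: (0,)}    # rain -> wet

print(aic_score(data, independent), aic_score(data, dependent))
```

A structure learner such as hill climbing would repeatedly propose edge changes and keep the ones that improve a score like this; here only two hand-written candidates are compared.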
Dr G
  • Weka is indeed powerful; however, it is not clear how complicated it is to define a network structure by hand instead of learning it. The manual indicates that it is possible, though. (+1) for the link to the BN manual. (Off-topic: interested in participating on stats.stackexchange? We need more machine learners over there ;)) – steffen Dec 28 '10 at 07:48
  • I found the Weka learning curve rather steep, but I think it's worth it. I am indeed on stats.stackexchange, just not terribly active :) – Dr G Dec 28 '10 at 14:01

So here I give my subjective answer.

In my experience, anything related to statistics is best solved with R. I have often seen that, in fields related to statistics, R has the most libraries and very often implements the most state-of-the-art algorithms and methods.

Most programmers, like me, prefer to stay with the languages they know, and learning something new is a trade-off, mainly because it is time-consuming.

So if learning a new language is a viable option, R is a good choice; in my opinion, the best.

Take a brief look at the R libraries related to Bayesian networks and Bayesian inference.

Bayesian: http://cran.r-project.org/web/views/Bayesian.html

Graphical Models: http://cran.r-project.org/web/views/gR.html

Machine Learning: http://cran.r-project.org/web/views/MachineLearning.html

The main advantages of R:
- easy to install a library: install.packages("RWeka")
- the help format and style are the same for all libraries
- if you know R, it's easy to switch from one library to the next. So it's easy to test all available libraries and then use the one that fits best

mrsteve
  • Thank you for the reply. I actually know R, but sometimes the question is not only which language is best, but also whether to introduce yet another language into an existing environment (which means that more than one person has to be trained in it). I had my reasons for asking for Java, but nevertheless thank you for the links; I did not know them. – steffen Mar 31 '11 at 06:48

Never used it, but perhaps the MALLET library fits the bill?

Stompchicken
  • (+1): Didn't find that one. It looks promising. – steffen Dec 22 '10 at 14:41
  • Mallet itself is very popular for NLP stuff. I haven't used it but it seems solid, actively developed and is quite well documented. The graphical models stuff seems a bit 'bolted on' to the main package though. – Stompchicken Dec 22 '10 at 15:20
  • @StompChicken How can we create a Bayesian network using MALLET? – Siten Feb 25 '14 at 09:16