I'm planning to use the EM algorithm from the Java Weka library to assign each object a probability of belonging to each cluster, and then work with those probabilities.

The properties of these objects will be loaded from a database, so I would like to pass them to the clusterer directly from memory instead of first dumping them to an ARFF file, as the examples I have found on the web do (e.g. Serialization).

First, I would like to know whether Weka is the right library for this purpose, or whether another one, such as Apache Commons Math, would be a better fit.

Second, is there an example that creates Instances without going through a file?
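
A minimal sketch of what such a file-free setup could look like, assuming Weka 3.7 or later (where `DenseInstance` and the `ArrayList<Attribute>` constructor of `Instances` are available). The class name `InMemoryEm`, the attribute names and the sample rows are hypothetical placeholders for whatever would actually come from the database, and all attributes are assumed to be numeric:

```java
import java.util.ArrayList;
import java.util.Arrays;

import weka.clusterers.EM;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instances;

public class InMemoryEm {

    // Wraps plain double[] rows (e.g. read from a database) in a Weka dataset.
    // No ARFF file is involved; every attribute is treated as numeric.
    static Instances toInstances(double[][] rows, String[] names) {
        ArrayList<Attribute> attributes = new ArrayList<>();
        for (String name : names) {
            attributes.add(new Attribute(name)); // numeric attribute
        }
        Instances data = new Instances("objects", attributes, rows.length);
        for (double[] row : rows) {
            // Weight 1.0; the row is cloned so the dataset does not share the caller's array.
            data.add(new DenseInstance(1.0, row.clone()));
        }
        return data;
    }

    // Trains EM with a fixed number of clusters (an assumption for this sketch).
    static EM cluster(Instances data, int numClusters) throws Exception {
        EM em = new EM();
        em.setNumClusters(numClusters);
        em.buildClusterer(data);
        return em;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample data standing in for database rows.
        double[][] rows = {
            {1.0, 1.1}, {0.9, 1.0}, {1.1, 0.9},
            {8.0, 8.2}, {7.9, 8.1}, {8.1, 7.9}
        };
        Instances data = toInstances(rows, new String[] {"x", "y"});
        EM em = cluster(data, 2);

        // One membership probability per cluster for each object.
        for (int i = 0; i < data.numInstances(); i++) {
            double[] probabilities = em.distributionForInstance(data.instance(i));
            System.out.println(Arrays.toString(probabilities));
        }
    }
}
```

Here `distributionForInstance` returns the per-cluster membership probabilities, which is the part that would then be processed further.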

I would be grateful for any help.

Alex Riley
aloplop85
  • 1
    You have to convert your objects into Instance objects. Once you have valid Instance objects, you can use Weka for your task – user Aug 19 '15 at 17:29
  • I saw this one, which may help: https://ianma.wordpress.com/2010/01/16/weka-with-java-eclipse-getting-started/ – aloplop85 Aug 19 '15 at 17:35
  • 1
    Yes, you can construct your instances like this, but why not create an ARFF file and read all the data in a two-liner ;) – user Aug 19 '15 at 17:39
  • 1
    ELKI has a rather extensible architecture. I haven't played with their EM much, but it may be easier to extend than Weka's. – Has QUIT--Anony-Mousse Aug 19 '15 at 19:40
  • Thank you very much. Another option would be to try R from Java using JRI. For the moment, I think I will continue the Weka way... – aloplop85 Aug 19 '15 at 20:31
  • If you consider using R, you should still consider creating an ARFF file. R is able to read ARFF files too. – user Aug 19 '15 at 20:55

0 Answers