
I am trying Maximum Entropy in Weka for text classification. I am using Logistic Regression in Weka, which is equivalent to MaxEnt. I have read that it is computationally expensive. My current setup allots 2 GB to the JVM, and I keep the word-vector dimension at 10,000 to evaluate MaxEnt; however, it always runs the JVM out of memory. This makes me think I am making some mistake, because a 2 GB heap should be more than enough for any classifier, shouldn't it?
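For context, this is roughly how the heap can be raised when launching Weka or code that uses the Weka API (a minimal sketch; it assumes weka.jar sits in the current directory and that a class named MyTextClassifier exists — adjust the paths and class name to your setup):

```shell
# Raise the JVM maximum heap to 4 GB when launching the Weka GUI
# (weka.jar path is an assumption; point -cp at your actual jar).
java -Xmx4g -cp weka.jar weka.gui.GUIChooser

# Or when running your own code against the Weka API
# (MyTextClassifier is a hypothetical class name for illustration).
java -Xmx4g -cp weka.jar:. MyTextClassifier
```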

1) Has anyone used MaxEnt (Logistic.java) in Weka? Is it supposed to be this slow for text classification?

2) Is there any parameter tuning necessary for MaxEnt which I may be overlooking?

Kashif Khan
  • Are you using the Explorer, or calling it from code? – NLPer Feb 08 '14 at 22:33
  • @NLPer I am using it in code, and the maximum dimension at which I could get results was 1,000. At any dimension beyond 1,000 the JVM eats the whole 2 GB heap and ends up out of memory ... – Kashif Khan Feb 09 '14 at 07:36
  • OK, how big is your ARFF, and how did you change the max heap? I would try changing it in RunWeka.ini and starting the Explorer from RunWeka.bat. For use in code, I would change it in the VM arguments of the run configuration in Eclipse. If those don't work, you may need more than 2 GB. – NLPer Feb 09 '14 at 22:16
  • 1
    @NLPer yes i have tried all options and came to know that its very computation expensive to use Max Entropy because it calculate a matrix of "F Square" where F is your number of features....! I exclude Max Entropy from my options ... – Kashif Khan Feb 10 '14 at 14:43
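The RunWeka.ini change mentioned in the comments looks roughly like this (an illustrative excerpt only; the exact key name and default value vary by Weka version, so check the file shipped with your install):

```ini
# RunWeka.ini excerpt (assumption: Weka 3.6/3.7-era key name)
# Raise the maximum heap passed to the JVM by RunWeka.bat
maxheap=2048m
```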

0 Answers