
I want to preprocess documents (WSDL files) using Mallet in Eclipse. I want to generate feature vectors and then perform classification with Mallet's MaxEnt (maximum entropy) classifier. I am new to Mallet. Can anyone guide me in this regard?

Thanks

sid

1 Answer


If you're referring to Web Services Description Language, I don't know of any specific workflows or packages designed for those documents. I suspect that you might want to create a set of features that combines text (from web service descriptions) and more "categorical" features, like URLs or URL patterns.

The way I would approach this problem is to create a separate package that reads WSDL files and writes out a file in a format that Mallet expects. This adapter could be written in whatever language you are most comfortable with. It would read all the files, get a parsed XML tree for each, extract text and certain other features, and output a file in Mallet's preferred tab-delimited, one-doc-per-line format.
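A minimal sketch of such an adapter, written in Python purely for illustration. The function names (`split_identifier`, `extract_tokens`, `to_mallet_line`, `write_dataset`) are hypothetical, and several choices are assumptions rather than anything Mallet requires: the files are assumed to be WSDL 1.1, the features are just the words found in `name` attributes and `documentation` elements, and each file's class label is assumed to be the name of the subdirectory it sits in.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumption: the documents use the WSDL 1.1 namespace.
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"


def split_identifier(name):
    """Split identifiers such as GetWeatherByZIP into lowercase words."""
    words = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [w.lower() for w in words]


def extract_tokens(wsdl_xml):
    """Collect words from 'name' attributes and <documentation> text."""
    root = ET.fromstring(wsdl_xml)
    tokens = []
    for elem in root.iter():  # document order
        name = elem.get("name")
        if name:
            tokens.extend(split_identifier(name))
        if elem.tag == f"{{{WSDL_NS}}}documentation" and elem.text:
            tokens.extend(re.findall(r"[a-z]+", elem.text.lower()))
    return tokens


def to_mallet_line(doc_name, label, wsdl_xml):
    """Build one 'name<TAB>label<TAB>text' line for a single document."""
    # Name and label must not contain whitespace, or the columns shift.
    safe_name = re.sub(r"\s+", "_", doc_name)
    safe_label = re.sub(r"\s+", "_", label)
    return f"{safe_name}\t{safe_label}\t{' '.join(extract_tokens(wsdl_xml))}"


def write_dataset(input_dir, output_path):
    """Write one line per WSDL file found at input_dir/<label>/<file>.wsdl."""
    with open(output_path, "w", encoding="utf-8") as out:
        for path in sorted(Path(input_dir).glob("*/*.wsdl")):
            # Read bytes so the XML parser honours the file's own encoding.
            line = to_mallet_line(path.stem, path.parent.name, path.read_bytes())
            out.write(line + "\n")
```

The resulting file should then be readable by Mallet's `import-file` command, which by default treats the first field of each line as the instance name, the second as the label, and the rest as the text.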

David Mimno
  • Exception in thread "main" java.lang.NoClassDefFoundError: org.apache.commons.logging.LogFactory at com.predic8.soamodel.AbstractParser.class$(AbstractParser.groovy) at com.predic8.soamodel.AbstractParser.$get$$class$org$apache$commons$logging$LogFactory(AbstractParser.groovy) at com.predic8.soamodel.AbstractParser.(AbstractParser.groovy:25) at com.predic8.wsdl.WSDLParser.(WSDLParser.groovy) at parsing.main(parsing.java:16) – sid Jan 10 '17 at 17:26
  • Thanks for your answer. I am now parsing the WSDL files with Membrane SOA Model in Java (Eclipse), but I am getting the above exception. – sid Jan 10 '17 at 17:30
  • Thanks for your answer! I used this approach, and now I want to test my data using a classifier. I am running a command, but it gives this error: Exception in thread "main" java.lang.IllegalArgumentException: Problem loading classifier from file C:\mallet-2.0.8\training2.mallet: cc.mallet.types.InstanceList cannot be cast to cc.mallet.classify.Classifier at cc.mallet.classify.tui.Text2Classify.main(Text2Classify.java:79) – sid Sep 07 '17 at 09:54
  • My command is C:\mallet-2.0.8>bin\mallet classify-dir --input C:\ShowData_24Aug\Parsed_Comp --output - --classifier C:\mallet-2.0.8\training2.mallet – sid Sep 07 '17 at 09:55