The TermRaider system isn't a single PR; it's a whole application (in fact a Groovy ScriptableController). The TermraiderEnglish resource is just a hook to make that application appear in the "ready-made applications" menu of the GATE Developer GUI.
In embedded code you can load the application using the PersistenceManager:
File termRaiderPlugin = new File(Gate.getPluginsHome(), "TermRaider");
File gappFile = new File(new File(termRaiderPlugin, "applications"),
    "termraider-eng.gapp");
CorpusController trApp =
    (CorpusController) PersistenceManager.loadObjectFromFile(gappFile);
When you run the application over a corpus, it creates new instances of three "termbank" LRs containing the information about the newly discovered terms. The vanilla application is really intended for GUI rather than embedded use, so it doesn't store references to these new LRs anywhere useful; you'll have to interrogate the CreoleRegister to find them. You might prefer to make your own copy of the application and tweak the control script to store the termbank instances as (say) features on the Corpus, by adding something like
corpus.features.tfidfTermbank = termbank0
corpus.features.annotationTermbank = termbank1
corpus.features.hyponymyTermbank = termbank2
to the end of the control script. You could then access them in your Java code via corpus.getFeatures().get("tfidfTermbank") and likewise for the other two feature names.
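Since a GATE FeatureMap is an ordinary java.util.Map<Object, Object>, the lookup is a plain map access followed by a cast. Here is a minimal sketch of a type-checked lookup, written against the standard library only so that it runs without GATE on the classpath. The class name TermbankFeatures and the method featureAs are hypothetical helpers, not part of GATE or TermRaider, and the String value in main is just a stand-in for a real termbank instance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of GATE: type-checked feature lookup.
class TermbankFeatures {

    /**
     * Returns the feature called `name`, cast to `type`.
     * Fails with a descriptive message if the feature is absent
     * or holds a value of an unexpected type.
     */
    static <T> T featureAs(Map<Object, Object> features, String name, Class<T> type) {
        Object value = features.get(name);
        if (value == null) {
            throw new IllegalStateException(
                "No feature named '" + name + "'; was the modified application run on this corpus?");
        }
        if (!type.isInstance(value)) {
            throw new IllegalStateException(
                "Feature '" + name + "' is a " + value.getClass().getName()
                + ", not a " + type.getName());
        }
        return type.cast(value);
    }

    public static void main(String[] args) {
        // Stand-in for corpus.getFeatures(); a real FeatureMap is also a Map<Object, Object>.
        Map<Object, Object> features = new HashMap<>();
        features.put("tfidfTermbank", "placeholder termbank");

        String bank = featureAs(features, "tfidfTermbank", String.class);
        System.out.println(bank);
    }
}
```

In real code you would pass corpus.getFeatures() as the map and the relevant termbank class as the type, which requires that class to be visible to your application's class loader.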
Since these Termbank classes are themselves part of the TermRaider plugin, you'll probably want to add gate-termraider.jar to your main application classpath rather than loading it via the GateClassLoader.
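If you would rather keep the unmodified application, the CreoleRegister route mentioned earlier amounts to scanning the loaded LR instances for the termbanks. The following is only a sketch of that filtering step, again using the standard library alone so that it runs without GATE. TermbankFinder and findByClassNameSuffix are hypothetical names, and matching on a class-name suffix such as "Termbank" is an assumption about how the termbank classes are named. In a real application the collection would come from Gate.getCreoleRegister().getLrInstances().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, not part of GATE: select objects by class-name suffix.
class TermbankFinder {

    /** Returns every non-null element whose fully qualified class name ends with `suffix`. */
    static List<Object> findByClassNameSuffix(Collection<?> resources, String suffix) {
        List<Object> matches = new ArrayList<>();
        for (Object resource : resources) {
            if (resource != null && resource.getClass().getName().endsWith(suffix)) {
                matches.add(resource);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Stand-ins for loaded language resources; real code would pass
        // Gate.getCreoleRegister().getLrInstances() here.
        List<Object> resources = Arrays.asList("a string", Integer.valueOf(1), new StringBuilder("sb"));

        // Only java.lang.String ends with "String"; StringBuilder and Integer do not.
        List<Object> strings = findByClassNameSuffix(resources, "String");
        System.out.println(strings.size());
    }
}
```

Matching on the class name as a string avoids a compile-time dependency on the plugin classes, but calling termbank-specific methods on the results still needs those classes to be visible to your code.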