I have an XML config (ScreenScraper) that works correctly in the executable version of WebHarvest. I am not sure how to execute it from Java.
1 Answer
All you need to do is import a few classes from the library:
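import java.io.FileNotFoundException; // caught when loading the config below
import java.util.List;                // used when reading a list variable below
import org.webharvest.definition.ScraperConfiguration;
import org.webharvest.runtime.Scraper;
import org.webharvest.runtime.variables.Variable;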
Create a ScraperConfiguration object from your config.xml file:
ScraperConfiguration config = null;
try {
config = new ScraperConfiguration("/path/to/config.xml");
} catch (FileNotFoundException e) {
e.printStackTrace();
}
Create a Scraper object, passing the configuration and the path to a working directory:
Scraper scraper = new Scraper(config, "/tmp/");
Then execute the configuration:
scraper.execute();
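The config-loading snippet above only prints the stack trace when the file is missing, so a wrong path would surface later as a null configuration. One way to catch that earlier is to check both paths before building the Scraper. The following is a minimal sketch using only the standard library; the class name ScraperPaths and both method names are hypothetical helpers, not part of WebHarvest.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of WebHarvest): validates the two paths
// that are later handed to ScraperConfiguration and Scraper.
public class ScraperPaths {

    /** Returns the config path if it is a readable regular file, otherwise throws. */
    public static Path requireConfig(String configPath) throws IOException {
        Path p = Paths.get(configPath);
        if (!Files.isRegularFile(p) || !Files.isReadable(p)) {
            throw new IOException("Config file not found or not readable: " + p);
        }
        return p;
    }

    /** Creates the working directory (including parents) if it is missing and returns it. */
    public static Path ensureWorkingDir(String workingDir) throws IOException {
        return Files.createDirectories(Paths.get(workingDir));
    }
}
```

With this in place, you could call ScraperPaths.requireConfig("/path/to/config.xml") and ScraperPaths.ensureWorkingDir("/tmp/") first, and pass the same strings to ScraperConfiguration and Scraper only if neither call throws.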
You can also read variables once the configuration has finished executing:
String stringVar =
((Variable)scraper.getContext().getVar("my_string_var")).toString();
List<Variable> listVar =
((Variable) scraper.getContext().getVar("my_list_var")).toList();
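For context, the variable names used above have to be defined in the configuration itself. The fragment below is only an illustrative sketch of what such a config.xml might look like, assuming the usual Web-Harvest var-def, xpath, html-to-xml and http elements; the URL and the XPath expression are placeholders, and your real config will differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<config charset="UTF-8">
    <!-- Hypothetical scalar variable, read in Java via getVar("my_string_var") -->
    <var-def name="my_string_var">some text</var-def>

    <!-- Hypothetical list variable: every link href found on the placeholder page -->
    <var-def name="my_list_var">
        <xpath expression="//a/@href">
            <html-to-xml>
                <http url="http://example.com/"/>
            </html-to-xml>
        </xpath>
    </var-def>
</config>
```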

Chemik