I am trying to parse a website with HtmlUnit and Jsoup, and I am facing the following problem. I have several pages to parse, and I store the links to these pages in a String array. I want to loop over the array and parse each page, and I proceed as follows:
1) Loop over the array of links 2) Open a new WebClient 3) Create a new HtmlPage from the link with the getPage method 4) Parse the page and extract some elements 5) Close the WebClient 6) Go back to 2).
This way I get the results I want, but the code is a bit slow. So I tried opening and closing the WebClient outside the for loop, like this:
1) Open a new WebClient 2) Loop over the array of links 3) Create a new HtmlPage from the link with the getPage method 4) Parse the page and extract some elements 5) Go back to 2) 6) Close the WebClient.
This is much faster, but I do not get the same results as with the first approach.
Is it wrong to reuse a single WebClient instance in this way?
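The two loop structures described above can be sketched as follows. This is only an illustration of the object lifecycle, not HtmlUnit code: StubClient is a hypothetical stand-in that does no network work, and the names ClientLifecycle, clientPerPage and sharedClient are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class ClientLifecycle {
    // Hypothetical stand-in for a browser-like client; it only records calls.
    static class StubClient implements AutoCloseable {
        static int opened = 0;   // how many clients have been constructed
        boolean closed = false;

        StubClient() { opened++; }

        String getPage(String link) {
            if (closed) throw new IllegalStateException("client already closed");
            return "page for " + link;
        }

        @Override
        public void close() { closed = true; }
    }

    // First way: a fresh client is opened and closed for every link.
    static List<String> clientPerPage(String[] links) {
        List<String> pages = new ArrayList<>();
        for (String link : links) {
            try (StubClient client = new StubClient()) {
                pages.add(client.getPage(link));
            }
        }
        return pages;
    }

    // Second way: one client is shared by all links and closed once at the end.
    static List<String> sharedClient(String[] links) {
        List<String> pages = new ArrayList<>();
        try (StubClient client = new StubClient()) {
            for (String link : links) {
                pages.add(client.getPage(link));
            }
        }
        return pages;
    }

    public static void main(String[] args) {
        String[] links = {"link-a", "link-b"};
        StubClient.opened = 0;
        clientPerPage(links);
        System.out.println("first way opened " + StubClient.opened + " clients");
        StubClient.opened = 0;
        sharedClient(links);
        System.out.println("second way opened " + StubClient.opened + " client(s)");
    }
}
```

In the second variant any state the client keeps (for example the page it currently has loaded) survives from one iteration to the next, which is the main difference between the two approaches.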
EDIT: Here is the code I am testing:
public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException {
    // TODO Auto-generated method stub
    java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(java.util.logging.Level.OFF);
    String[] links = {"http://www.oddsportal.com/tennis/china/atp-beijing/murray-andy-dimitrov-grigor-fTdGYm3q/#cs;2;6",
            "http://www.oddsportal.com/tennis/china/atp-beijing/murray-andy-dimitrov-grigor-fTdGYm3q/#cs;2;9"};
    String bm = null;
    String[] odds = new String[2];
    // Second way
    WebClient webClient = new WebClient(BrowserVersion.CHROME);
    System.out.println("Client opened");
    for (int i = 0; i < links.length; i++) {
        HtmlPage page = webClient.getPage(links[i]);
        System.out.println("Page loaded");
        Document csDoc = Jsoup.parse(page.asXml());
        System.out.println("Page parsed");
        Element table = csDoc.select("table.table-main.detail-odds.sortable").first();
        Elements cols = table.select("td:eq(0)");
        if (cols.first().text().trim().contains("bet365.it")) {
            bm = cols.first().text().trim();
            odds[i] = table.select("tbody > tr.lo").select("td.right.odds").first().text().trim();
        }
        else {
            Elements footTable = csDoc.select("table.table-main.detail-odds.sortable");
            Elements footRow = footTable.select("tfoot > tr.aver");
            odds[i] = footRow.select("td.right").text().trim();
            bm = "AVG";
        }
        webClient.close();
    }
    System.out.println(bm + "\t" + odds[0] + "\t" + odds[1]);
}
If I run this code, the results are correct. If I move webClient.close(); outside the for loop, the results are not correct: in particular, odds[0] is equal to odds[1].
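One detail that may be relevant: the two links in the array differ only in the part after the # (the fragment). The sketch below uses only java.net.URI to confirm this; the class name LinkFragments and the helpers sameDocument and stripFragment are made up for illustration. Whether this is actually what makes a reused WebClient return the same odds twice is an assumption that has not been verified here.

```java
import java.net.URI;

public class LinkFragments {
    // Hypothetical helper: returns the URL text without its fragment (the part after '#').
    static String stripFragment(String url) {
        int hash = url.indexOf('#');
        return hash < 0 ? url : url.substring(0, hash);
    }

    // Hypothetical helper: true when two URLs are identical once the fragment is removed.
    static boolean sameDocument(String a, String b) {
        return stripFragment(a).equals(stripFragment(b));
    }

    public static void main(String[] args) {
        String[] links = {
            "http://www.oddsportal.com/tennis/china/atp-beijing/murray-andy-dimitrov-grigor-fTdGYm3q/#cs;2;6",
            "http://www.oddsportal.com/tennis/china/atp-beijing/murray-andy-dimitrov-grigor-fTdGYm3q/#cs;2;9"
        };
        for (String link : links) {
            URI uri = URI.create(link);
            System.out.println("path: " + uri.getPath() + "  fragment: " + uri.getFragment());
        }
        System.out.println("same document: " + sameDocument(links[0], links[1]));
    }
}
```

If both links do resolve to the same document, a client that keeps its state between getPage calls could plausibly handle the second call differently from a freshly created client.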