I'm using HtmlUnit to parse HTML that contains JavaScript. The structure of the page (as seen in Chrome Developer Tools) is:
And my code is as follows:
WebClient wc = new WebClient(BrowserVersion.INTERNET_EXPLORER_11);
wc.getOptions().setUseInsecureSSL(true);
wc.getOptions().setJavaScriptEnabled(true);
wc.getOptions().setCssEnabled(false);
wc.getOptions().setThrowExceptionOnScriptError(false);
wc.getOptions().setTimeout(10000);
wc.getOptions().setDoNotTrackEnabled(false);
HtmlPage page = wc.getPage(address);
List<HtmlDivision> items = (List<HtmlDivision>) page.getByXPath(
    "/html/body/div[@id='wrapper']/div[@class='content_main']/div[@class='search_result']/div[@id='resultData']");
System.out.println(items);
if (items != null && items.size() > 0) {
    HtmlDivision resultMain = items.get(0);
    List<HtmlDivision> appDivList = (List<HtmlDivision>) resultMain.getByXPath(".//div[contains(@class,'search_one')]");
    System.out.println(appDivList);
    for (HtmlDivision resultItem : appDivList) {
        try {
            DomElement appImgInfo = resultItem.getFirstElementChild();
            List<HtmlDivision> appInfoList = (List<HtmlDivision>) resultItem.getByXPath("./div[@class='one_right']");
            String appName = null;
The problem is that when I debug this code, it works fine. But when I run it normally, the line
List<HtmlDivision> appDivList=(List<HtmlDivision>)resultMain.getByXPath(".//div[contains(@class,'search_one')]");
doesn't work: appDivList is empty. When I step through the same code in the debugger, appDivList is not empty.
Does anyone know why?
Update:
I added a Thread.sleep call before
List<HtmlDivision> appDivList=(List<HtmlDivision>)resultMain.getByXPath(".//div[contains(@class,'search_one')]");
The updated code is:
HtmlDivision resultMain = items.get(0);
try {
    Thread.sleep(10000);
} catch (Exception e) {}
List<HtmlDivision> appDivList = (List<HtmlDivision>) resultMain.getByXPath(".//div[contains(@class,'search_one')]");
System.out.println(appDivList);
It works! Why does this happen?
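My guess (an assumption, I have not confirmed it) is that the search_one divs are inserted by JavaScript that is still running when getPage returns, so the debugger or the sleep simply gives it time to finish. If that is right, a fixed 10-second sleep could be replaced by polling until the query returns something. Below is a minimal sketch of that idea in plain Java; the class PollingWait, the method waitUntilNonEmpty, and the "fake DOM" list are hypothetical names made up for illustration and are not part of HtmlUnit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Supplier;

public class PollingWait {

    /**
     * Re-runs the query until it returns a non-empty list or the timeout elapses.
     * Returns the last result, which may still be empty if the timeout was hit.
     */
    public static <T> List<T> waitUntilNonEmpty(Supplier<List<T>> query,
                                                long timeoutMillis,
                                                long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        List<T> result = query.get();
        while ((result == null || result.isEmpty())
                && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting
                Thread.currentThread().interrupt();
                break;
            }
            result = query.get();
        }
        return result == null ? new ArrayList<>() : result;
    }

    public static void main(String[] args) {
        // Stand-in for the page: a list that a background thread fills after 300 ms,
        // the way asynchronous JavaScript might add elements after the page loads
        List<String> fakeDom = new CopyOnWriteArrayList<>();
        Thread loader = new Thread(() -> {
            try {
                Thread.sleep(300);
            } catch (InterruptedException ignored) {
                return;
            }
            fakeDom.add("search_one");
        });
        loader.start();

        // Reading straight away races with the loader thread
        System.out.println("immediately: " + new ArrayList<>(fakeDom));

        // Polling returns as soon as the element shows up
        List<String> found = waitUntilNonEmpty(() -> fakeDom, 5000, 50);
        System.out.println("after polling: " + found);
    }
}
```

In my code the supplier would wrap the resultMain.getByXPath(...) call instead of the fake list. HtmlUnit also has WebClient.waitForBackgroundJavaScript(long timeoutMillis), which might be the more direct way to wait for pending scripts, but I have not tried it on this page.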