
I have the same problem as described in the question "Call getPage from htmlunit WebClient with JavaScript disabled and setTimeout set to 10000 waits forever".

There is only one relevant (and complicated) answer there, by theytoo. So I was wondering:

  1. Does someone have a simpler answer?
  2. Can someone verify the solution works?
Al Renaud
  • Maybe it'd be a good idea to provide the most simple code that leads to this exception not being thrown and also the HtmlUnit version you're using. – Mosty Mostacho Jan 31 '12 at 21:31
  • Ya, I've got one too (a moustache). We are at HtmlUnit 2.9. How about: webClient = new WebClient(); webClient.setTimeout(180000); page = webClient.getPage("myurl"); in a big try-catch... – Al Renaud Jan 31 '12 at 22:52

1 Answer


Code I used:

package main;

import java.io.IOException;
import java.net.MalformedURLException;

import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.WebClient;

public class Test {

    public static void main(final String[] args) {
        final WebClient webClient = new WebClient();
        webClient.setTimeout(1000);
        try {
            System.out.println("Querying");
            webClient.getPage("http://www.google.com");
            System.out.println("Success");
        } catch (final FailingHttpStatusCodeException e) {
            System.out.println("One");
            e.printStackTrace();
        } catch (final MalformedURLException e) {
            System.out.println("Two");
            e.printStackTrace();
        } catch (final IOException e) {
            System.out.println("Three");
            e.printStackTrace();
        } catch (final Exception e) {
            System.out.println("Four");
            e.printStackTrace();
        }
        System.out.println("Finished");
    }

}

Output (removed all CSS and JS warnings):

Querying
Success
Finished

After changing timeout from 1000 to 1 (I won't hit google in less than 1 ms):

Querying
Three
org.apache.http.conn.ConnectTimeoutException: Connect to www.google.com:80 timed out
    at com.gargoylesoftware.htmlunit.SocksSocketFactory.connectSocket(SocksSocketFactory.java:92)
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:148)
    at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:149)
    at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:121)
    at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:573)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:425)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:776)
    at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:152)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1439)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1358)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:307)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:373)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:358)
    at main.Test.main(Test.java:17)
Finished

Conclusion: I can't reproduce the hang; the timeout works as expected.
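
If getPage does still block in some environment despite setTimeout, one generic fallback is to run the call on a worker thread and bound the wait with Future.get. The following is only a sketch under assumptions, not part of the test above and not verified against HtmlUnit: the class name HardDeadline, the helper callWithDeadline and the sleeping stand-in task are all hypothetical, and the stand-in merely simulates a fetch that never returns.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: enforces a hard deadline around any blocking call.
final class HardDeadline {

    // Runs the task on a worker thread and gives up after timeoutMillis.
    static <T> T callWithDeadline(final Callable<T> task, final long timeoutMillis)
            throws TimeoutException, ExecutionException, InterruptedException {
        final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            final Thread t = new Thread(r, "deadline-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        final Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (final TimeoutException e) {
            // Requests an interrupt; blocking socket I/O may not react to it.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(final String[] args) throws Exception {
        // Stand-in (assumption) for a page fetch that never returns.
        final Callable<String> hangs = () -> {
            Thread.sleep(60_000);
            return "never reached";
        };
        try {
            System.out.println("Querying");
            callWithDeadline(hangs, 200);
            System.out.println("Success");
        } catch (final TimeoutException e) {
            System.out.println("Gave up after deadline");
        }
        System.out.println("Finished");
    }
}
```

In the Test class above, the stand-in task would be replaced by the call to webClient.getPage. The caller regains control after the deadline, but the worker thread may linger if the underlying I/O ignores the interrupt, which is why it is marked as a daemon thread here.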

Mosty Mostacho
  • Ya, well, good effort I guess, but if my question was that easy, I would not have put it on StackOverflow. – Al Renaud Feb 01 '12 at 17:55
  • ... Based on the aforementioned other StackOverflow question and also on the doc at http://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/impl/conn/DefaultClientConnectionOperator.html I will try 2 things (they are config parameters, so that's easy): 1) drastically reduce the timeout from 3 minutes to "some"(?) seconds and 2) put the IP address flat out in the URL instead of the site domain. Thanks for your ideas; will keep you informed, A.R. – Al Renaud Feb 01 '12 at 18:02
  • What version are you using? I am using htmlunit 2.20 and there's no `setTimeout` method in WebClient. – Mateus Viccari Jun 17 '16 at 20:11