
I know that HtmlUnit simulates a browser, while HttpClient doesn't.

In HtmlUnit, when a page is loaded and it contains JavaScript, will the script be executed? If the script sets a cookie, will the cookie be set in HtmlUnit's browser and be accessible from Java code?

Is there anything that can be done using HttpClient but not using HtmlUnit? In HtmlUnit, can we start with a POST request and modify any part of the HTTP request, including the method, URI, HTTP version, headers, and body?

What are the advantages of HttpClient over HtmlUnit?

Ahmed Ashour
Wong mendem

1 Answer


HttpClient is a lower-level library for sending HTTP requests and retrieving responses.
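To make "lower-level" concrete, here is a minimal sketch of a raw request/response round trip in which the caller picks the method, HTTP version, headers, and body. It is only an illustration under stated assumptions: it uses the JDK's built-in `java.net.http.HttpClient` (Java 11+) as a stand-in, not Apache HttpClient, so that it runs without extra dependencies. The class name `LowLevelPostSketch`, the `X-Demo` header, and the local `/echo` endpoint are all hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class LowLevelPostSketch {

    /** Starts a throwaway local server that echoes the method, one header, and the body. */
    static HttpServer startEchoServer() throws Exception {
        // Port 0 asks the OS for any free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/echo", exchange -> {
            String body = new String(exchange.getRequestBody().readAllBytes(), StandardCharsets.UTF_8);
            String demo = exchange.getRequestHeaders().getFirst("X-Demo");
            byte[] reply = (exchange.getRequestMethod() + "|" + demo + "|" + body)
                    .getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, reply.length);
            exchange.getResponseBody().write(reply);
            exchange.close();
        });
        server.start();
        return server;
    }

    /** Sends one POST with an explicit HTTP version, header, and body; returns the response body. */
    static String postRaw(URI uri, String headerValue, String body) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1) // HTTP version chosen by the caller
                .build();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("X-Demo", headerValue)                   // hypothetical custom header
                .POST(HttpRequest.BodyPublishers.ofString(body)) // method and body
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    /** Runs the round trip against the local server and returns what the server saw. */
    static String demo() throws Exception {
        HttpServer server = startEchoServer();
        try {
            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/echo");
            return postRaw(uri, "hello", "a=1&b=2");
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // prints "POST|hello|a=1&b=2"
    }
}
```

Nothing here parses HTML, runs scripts, or tracks page state; the client only hands back the bytes the server returned.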

HtmlUnit sits at a higher level: it uses HttpClient internally to make HTTP requests, and on top of that it handles JavaScript (through Rhino and an internal DOM implementation), XPath (through Xalan), CSS (through CSSParser), malformed HTML (through NekoHtml), WebSockets (through Jetty), etc.

You can modify outgoing requests and incoming responses in HtmlUnit with something like:

// Constructing the wrapper installs it as the WebClient's connection.
new WebConnectionWrapper(webClient) {

    @Override
    public WebResponse getResponse(WebRequest request) throws IOException {
        // Let the wrapped connection perform the real HTTP exchange first.
        WebResponse response = super.getResponse(request);
        if (request.getUrl().toExternalForm().contains("my_url")) {
            String content = response.getContentAsString("UTF-8");

            // change content here

            // Rebuild the response around the modified content.
            WebResponseData data = new WebResponseData(content.getBytes("UTF-8"),
                    response.getStatusCode(), response.getStatusMessage(), response.getResponseHeaders());
            response = new WebResponse(data, request, response.getLoadTime());
        }
        return response;
    }
};

as hinted here.

You can change the HttpClient instance that HtmlUnit uses by overriding HttpWebConnection.createHttpClient().

You can make a POST request with:

WebRequest webRequest = new WebRequest(url, HttpMethod.POST);
HtmlPage page = webClient.getPage(webRequest);
Ahmed Ashour