To answer your question about why it is slow:
HtmlUnit has several things working against it:
- It runs in Java, which lacks many of the native optimisations found in browsers such as Firefox.
- It requires well-formed XML rather than (non-strict) HTML, which means it first has to convert the HTML into XML.
- Then it has to run the JavaScript through a parser, fix any problems with the code, then process that inside Java itself.
- Also, as @Arya pointed out, it requests things one at a time, so a page with many JavaScript files or many images will be slow to load.
To answer your question on how to speed it up:
As a general rule I disable (unless they are needed):
- JavaScript
- Images
- CSS
- Applets
I also got the source code, removed the ActiveX support, and recompiled. If you want to prevent HtmlUnit from loading those extra resources, you can use the code below to return an empty response instead of downloading them from the web.
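    // Package names assume HtmlUnit 2.x; 3.x uses org.htmlunit instead.
    import com.gargoylesoftware.htmlunit.StringWebResponse;
    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.WebRequest;
    import com.gargoylesoftware.htmlunit.WebResponse;
    import com.gargoylesoftware.htmlunit.util.WebConnectionWrapper;
    import java.io.IOException;

    WebClient browser = new WebClient();
    browser.setWebConnection(new WebConnectionWrapper(browser) {
        @Override
        public WebResponse getResponse(final WebRequest request) throws IOException {
            if (/* Perform a test here */) {
                return super.getResponse(request); // Pass the request on to the real connection.
            } else {
                /* Give the program a response, but leave it empty. */
                return new StringWebResponse("", request.getUrl());
            }
        }
    });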
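The test inside the `if` is left open above. One way it could be filled in, purely as an illustration, is a small predicate on the request URL that rejects resources by file extension. The class name `ResourceFilter`, the method `shouldFetch`, and the extension list are all hypothetical and not part of HtmlUnit; the sketch uses only the JDK.

```java
import java.net.URI;
import java.util.Locale;

public class ResourceFilter {
    // Hypothetical list of extensions to skip; adjust to what your pages need.
    private static final String[] SKIPPED_EXTENSIONS = {
        ".js", ".css", ".png", ".jpg", ".jpeg", ".gif", ".ico"
    };

    /** Returns true if the resource at this URL should really be downloaded. */
    public static boolean shouldFetch(String url) {
        // getPath() drops the query string, so "app.js?v=3" is still seen as ".js".
        String path = URI.create(url).getPath();
        if (path == null) {
            return true; // Opaque URIs have no path; let them through.
        }
        String lower = path.toLowerCase(Locale.ROOT);
        for (String extension : SKIPPED_EXTENSIONS) {
            if (lower.endsWith(extension)) {
                return false;
            }
        }
        return true;
    }
}
```

With that in place, the condition could read `ResourceFilter.shouldFetch(request.getUrl().toString())`. Matching on the extension is only a heuristic, since scripts and images are not always served from URLs that end in a recognisable suffix.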
Other things I have noticed:
- HtmlUnit is not thread-safe, so you should probably create a new WebClient for each thread.
- HtmlUnit does actually cache pages.
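On the thread-safety point, one common way to get one instance per thread is a `ThreadLocal`. The holder below is a minimal sketch using only the JDK; the class name `PerThread` is hypothetical, and it is generic so that it does not depend on any particular HtmlUnit version.

```java
import java.util.function.Supplier;

public class PerThread<T> {
    // Each thread that calls get() lazily receives its own instance.
    private final ThreadLocal<T> holder;

    public PerThread(Supplier<T> factory) {
        this.holder = ThreadLocal.withInitial(factory);
    }

    /** Returns the calling thread's instance, creating it on first use. */
    public T get() {
        return holder.get();
    }
}
```

Assuming a no-argument `WebClient` constructor, it could be used as `new PerThread<WebClient>(WebClient::new)`, with each worker thread calling `get()` instead of sharing one client. Each thread would still be responsible for closing its own client when it is done.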