Questions tagged [htmlunit]

HtmlUnit is a "headless browser": it has no browser GUI and does no rendering, though it includes a CSS and JavaScript engine to simulate a real browser. Its primary purposes are testing and information extraction.

HtmlUnit is a "GUI-Less browser for Java programs". It models HTML documents and provides an API that allows you to invoke pages, fill out forms, click links, and so on, just as you do in your "normal" browser.
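As a rough illustration of the API described above, the following sketch loads a page, fills out a form, submits it, and follows a link. It assumes the HtmlUnit 2.x package names (`com.gargoylesoftware.htmlunit`; 3.x releases use `org.htmlunit` instead) and a reasonably recent 2.x release on the classpath. The pages, the `http://example.test/` URL, and the class name `HtmlUnitSketch` are hypothetical, and `MockWebConnection` stands in for a real server so that the example needs no network access.

```java
import com.gargoylesoftware.htmlunit.MockWebConnection;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlAnchor;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;

import java.net.URL;

public class HtmlUnitSketch {

    // Hypothetical base URL; no request ever leaves the JVM.
    static final String BASE = "http://example.test/";

    static String run() throws Exception {
        // Canned responses replace a real web server (assumption for this sketch).
        MockWebConnection connection = new MockWebConnection();
        connection.setResponse(new URL(BASE),
            "<html><head><title>Search</title></head><body>"
            + "<form name='search' action='results'>"
            + "<input type='text' name='q'>"
            + "<input type='submit' name='go' value='Go'>"
            + "</form></body></html>");
        connection.setResponse(new URL(BASE + "about"),
            "<html><head><title>About</title></head><body></body></html>");
        // Any other URL (here: the form submission) gets the results page.
        connection.setDefaultResponse(
            "<html><head><title>Results</title></head><body>"
            + "<a href='about'>About</a></body></html>");

        try (WebClient webClient = new WebClient()) {
            webClient.setWebConnection(connection);

            // Invoke a page.
            HtmlPage searchPage = webClient.getPage(BASE);

            // Fill out the form and submit it by clicking the button.
            HtmlForm form = searchPage.getFormByName("search");
            HtmlTextInput query = form.getInputByName("q");
            query.type("htmlunit");
            HtmlPage resultsPage = form.getInputByName("go").click();

            // Click a link on the page that came back.
            HtmlAnchor aboutLink = resultsPage.getAnchorByText("About");
            HtmlPage aboutPage = aboutLink.click();

            return resultsPage.getTitleText() + "|" + aboutPage.getTitleText();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```

Against a real site, the `MockWebConnection` setup would be dropped and `webClient.getPage(...)` pointed at the actual URL; the form, input, and link names would then have to match that site's markup.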

It has fairly good JavaScript support (which is constantly improving) and is able to work even with quite complex AJAX libraries, simulating Chrome, Firefox or Internet Explorer depending on the configuration used.

It is typically used for testing purposes or to retrieve information from web sites.

HtmlUnit is not a generic unit testing framework. It is specifically a way to simulate a browser for testing purposes and is intended to be used within another testing framework such as JUnit or TestNG.

HtmlUnit is used as the underlying "browser" by several open-source tools, including Canoo WebTest, JWebUnit, Selenium WebDriver, JSFUnit, and Celerity.

HtmlUnit was originally written by Mike Bowler of Gargoyle Software and is released under the Apache 2 license.

1835 questions
4
votes
1 answer

HtmlUnit and Fragment Identities

I'm wondering how to deal with fragment identities: a link that I want to grab information from contains one. HtmlUnit seems to discard the "#/db4mj" part of my URL and therefore loads the original URL. Does…
StartingGroovy
4
votes
1 answer

HtmlUnit get element by class name containing string

I want to find any elements in the HtmlPage that have a class that contains the word 'date', i.e. I want to match any of the following:
August 13 2017
August 12 2017
spark problems
4
votes
2 answers

HtmlUnit & GWT error

I've a GWT application that I try to index. I am using HtmlUnit to get the content of the generated HTML: WebClient webClient = new WebClient(BrowserVersion.FIREFOX_3_6); HtmlPage refDesing =…
Muhammad Hewedy
4
votes
2 answers

HtmlUnit accessing an element without id or Name

How can I access this element: More info: I have to access programmatically a web page and simulate clicking on a button on it, which then will generate a xml file which I hope…
Nikola Kolev
4
votes
1 answer

HttpClient vs HtmlUnit

I know that HtmlUnit simulates a browser, while HttpClient doesn't. In HtmlUnit, when a page is loaded and there is a JavaScript inside, will the script be executed? If the script sets a cookie, will the cookie set in HtmlUnit's browser and…
Wong mendem
4
votes
2 answers

java.lang.OutOfMemoryError: Java heap space with Htmlunit use

I am trying to scrape some websites using HtmlUnit 2.16. The sites' content is fairly heavy, with around 5000 pages. I get a Java heap space error after some pages have been scraped. I have allocated -Xms1500m and -Xmx3000m. But after running…
Sthita
4
votes
1 answer

How to get the element by class? htmlunit

I am using htmlunit to get the webpage data. And I want to get the data with the
. But I can't find the method that is find by the class. How do I get the data? Here is the webpage source:
Capslock10
4
votes
1 answer

Read alerts in HTML unit

I am working on HTML unit with Java to read HTML pages. I have a scenario where I have to read messages from the popup/ alert window. I have an index page page = form.getInputByName("index").click(); After I click on the index page I get the…
Shaik Mujahid Ali
4
votes
2 answers

using HtmlUnit behind proxy

I'm trying to use HtmlUnit behind a proxy : public class App { public static void main(String[] args) throws Exception { System.setProperty("http.proxyHost", "172.23.232.10"); System.setProperty("http.proxyPort", "8080"); final…
K.Ariche
4
votes
1 answer

Loading javascript assets from integration tests (Play/Selenium)

I'm attempting to test our Play 2.4.x application, which makes heavy use of React for rendering tables and similar things. When just running the application normally, all the JavaScript gets processed and output properly. From our integration test…
4
votes
3 answers

How to send a picture as part of a multipart POST request - Java HtmlUnit

I am trying to use Java to submit a captcha to decaptcher.com. Decaptcher doesn't really do a good job of explaining how to use their API's, so I am trying to figure out how to use an HTTP POST request to submit a captcha. Here is the example code I…
Dylan
4
votes
3 answers

How can I test context menu functionality in a web app?

I'm playing with a Grails app that has a context menu (on right-click). The context menu is built using Chris Domigan's jQuery contextmenu plugin. While the context menus do actually work, I want to have automated tests, and I can't work out how to do…
John
4
votes
2 answers

Ajax Crawling on Google App Engine - Does HtmlUnit work?

http://code.google.com/web/ajaxcrawling/docs/html-snapshot.html Does HtmlUnit work on AppEngine? If not, are there any other ways to make my GWT app crawlable by search engines?
Matthew H
4
votes
3 answers

Accessing webpage with Cloudflare protection

First off, I want to apologize in case my question doesn't provide enough context or anything of that matter; I'm typing this up on my phone right now. So I'm working on a project that requires me to automate tasks within a webpage and in…
SirRan
4
votes
1 answer

Javascript based dynamic content using htmlUnit

I have been stuck in getting JavaScript based dynamic content using HtmlUnit. I am expecting to get (Signin, Registration html content) from the page. With the following code, I only get the static content. I am new to HtmlUnit. Any help will be…
Irshad