
I'm trying to use REST Assured to check some properties on an HTML document returned by my server. An SSCCE demonstrating the problem would be as follows:

import static com.jayway.restassured.path.xml.config.XmlPathConfig.xmlPathConfig;
import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertThat;

import org.junit.Test;

import com.jayway.restassured.path.xml.XmlPath;

public class HtmlDocumentTest {

  @Test
  public void titleShouldBeHelloWorld() {
    final XmlPath xml = new XmlPath("<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
      + "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
      + "<head><title>Hello world</title></head><body></body></html>")
      .using(xmlPathConfig().with().feature("http://apache.org/xml/features/disallow-doctype-decl", false));
    assertThat(xml.get("//title[text()]"), is("Hello world"));
  }
}

Now, this attempt ends in com.jayway.restassured.path.xml.exception.XmlPathException: Failed to parse the XML document, caused by, of all the possible errors, java.net.ConnectException: Connection timed out, after some 30 seconds or so!

If I remove the line with xmlPathConfig().with().feature(...), the test fails immediately with: DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true.

If I remove the doctype line from the document the parsing succeeds but the test fails on an assertion error, "Expected: is "Hello world" but: was <Hello worldnull>" -- however, that's a different problem, obviously (but feel free to give instructions on that one, too...). And removing the doctype isn't an option for me anyway.

So, question: how do you check properties of an HTML document with a doctype using REST Assured? It says in the documentation that "REST Assured providers predefined parsers for e.g. HTML, XML and JSON.", but I cannot seem to find any examples on how exactly to activate and work with that HTML parser! There's no "HtmlPath" class like there's XmlPath, for example, and that timeout exception is very puzzling...

ZeroOne

3 Answers


I checked your code. The thing is that REST Assured's XmlPath isn't XPath; it uses a property-access syntax instead. If you add body content to your sample HTML, you will see that your XPath expression doesn't do much. The actual name of the query language is GPath. The following example works. Note also the use of CompatibilityMode.HTML, which has the right configuration for what you need:

import static org.junit.Assert.assertEquals;
import org.junit.Test;
import com.jayway.restassured.path.xml.XmlPath;
import com.jayway.restassured.path.xml.XmlPath.CompatibilityMode;

public class HtmlDocumentTest {

    @Test
    public void titleShouldBeHelloWorld() {
        XmlPath doc = new XmlPath(
                CompatibilityMode.HTML,
                "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
                        + "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                        + "<head><title>Hello world</title></head>"
                        + "<body>some body"
                        + "<div class=\"content\">wrapped</div>"
                        + "<div class=\"content\">wrapped2</div>"
                        + "</body></html>");

        String title = doc.getString("html.head.title");
        String content = doc.getString("html.body.div.find { it.@class == 'content' }");
        String content2 = doc.getString("**.findAll { it.@class == 'content' }[1]");

        assertEquals("Hello world", title);
        assertEquals("wrapped", content);
        assertEquals("wrapped2", content2);
    }
}
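
As for the timeout in the question: a likely explanation (an assumption, I have not traced it through REST Assured itself) is that once the DOCTYPE is allowed, the XML parser tries to download the DTD from www.w3.org, and that connection is what times out. The sketch below shows the same effect with only the JDK's built-in parser, no REST Assured involved. The class and method names are made up for illustration. It answers every external entity lookup with an empty stream, so the document parses without any network access:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of REST Assured.
class OfflineDoctypeDemo {

    // Parses an XHTML string with the JDK parser and returns the text of the first <title>.
    static String titleOf(String xhtml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Resolve every external entity (here: the XHTML DTD) to an empty stream,
        // so the parser never opens a connection to www.w3.org.
        builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
        return doc.getElementsByTagName("title").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
                + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
                + "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<head><title>Hello world</title></head><body></body></html>";
        System.out.println(titleOf(xhtml)); // prints "Hello world"
    }
}
```

This only illustrates the cause. With REST Assured itself, CompatibilityMode.HTML as shown above is the simpler route.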
revau.lt

If you're using the DSL (given/when/then), then XmlPath with CompatibilityMode.HTML is used automatically if the response's Content-Type header contains an HTML-compatible media type (such as text/html). For example, if /index.html contains the following HTML page:

<html>
    <title>My page</title>
    <body>Something</body>
</html>

then you can validate the title and body like this:

when().
        get("/index.html").
then().
        statusCode(200).
        body("html.title", equalTo("My page"), 
             "html.body",  equalTo("Something"));
Johan

Here is sample code using the latest REST Assured APIs, i.e. io.restassured rather than the older com.jayway.restassured. The explanation is in the code comments.

// Demo for an API that returns a JSON string inside HTML. The JSON string is just an array of objects.

import io.restassured.path.json.JsonPath;
import io.restassured.response.Response;

import java.util.List;
import java.util.Map;

import static io.restassured.RestAssured.when;

// The class name is arbitrary; the method just needs to live inside a class to compile.
public class MyApiDemo {

    public void testMyApi() {
        Response response =
                when().
                        // Use a fully qualified URL; a bare path is resolved against the configured base URI.
                        get("http://www.myapi.com/data").
                then().
                        extract().
                        response();

        // Get the text of the body element of the HTML response.
        String bodyTxt = response.htmlPath().getString("body");
        // JsonPath helps us find things in a JSON string.
        JsonPath jsonObj = new JsonPath(bodyTxt);

        // "$" selects the root of the JSON part, here the array of objects.
        List<Map<String, Object>> rootItems = jsonObj.getList("$");

        System.out.println(rootItems);
    }
}
MasterJoe