I'm creating a class using jsoup that will do the following:
- The constructor opens a connection to a url.
- I have a method that will check the status of the page, e.g. 200, 404, etc.
- I have a method to parse the page and return a list of urls.
Below is a rough working of what I am trying to do. Note that it's very rough, as I've been trying a lot of different things:
public class ParsePage {
    private String path;
    Connection.Response response = null;

    private ParsePage(String langLocale){
        try {
            response = Jsoup.connect(path)
                    .userAgent("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.21 (KHTML, like Gecko) Chrome/19.0.1042.0 Safari/535.21")
                    .timeout(10000)
                    .execute();
        } catch (IOException e) {
            System.out.println("io - "+e);
        }
    }

    public int getSitemapStatus(){
        int statusCode = response.statusCode();
        return statusCode;
    }

    public ArrayList<String> getUrls(){
        ArrayList<String> urls = new ArrayList<String>();
    }
}
As you can see, I can get the page status, but I don't know how to get the document to parse from the connection that is already open from the constructor. I tried using:
Document doc = connection.get();
But that's a no-go. Any suggestions, or better ways to go about this?
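For reference, one possible direction, as a rough sketch rather than a definitive answer: jsoup's `Connection.Response` has a `parse()` method that returns a `Document` from the response that `execute()` already fetched, so no second request should be needed. The class name `ParsePageSketch`, the helper `extractUrls`, the shortened user agent, and the `a[href]` selector are my own assumptions for illustration. It also assumes the page is HTML with ordinary anchor tags, and that the constructor receives the full URL.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical rework of the class above; names are illustrative only.
public class ParsePageSketch {
    private final Connection.Response response;

    // Assumption: the caller passes the full URL to fetch.
    public ParsePageSketch(String url) throws IOException {
        response = Jsoup.connect(url)
                .userAgent("Mozilla/5.0")
                .timeout(10000)
                // Without this, jsoup throws on 404 and similar statuses
                // instead of returning a response that can be inspected.
                .ignoreHttpErrors(true)
                .execute();
    }

    // Status code of the response fetched in the constructor.
    public int getStatus() {
        return response.statusCode();
    }

    // Parses the already-fetched response body; no new request is made.
    public List<String> getUrls() throws IOException {
        Document doc = response.parse();
        return extractUrls(doc);
    }

    // Collects the absolute URL of every anchor that has an href attribute.
    public static List<String> extractUrls(Document doc) {
        List<String> urls = new ArrayList<>();
        for (Element link : doc.select("a[href]")) {
            urls.add(link.absUrl("href"));
        }
        return urls;
    }
}
```

Keeping `extractUrls` separate from the network call means the link extraction can be tried on a document built with `Jsoup.parse(html, baseUri)` without opening a connection. If the target is an XML sitemap rather than an HTML page, the selector would need to change, which this sketch does not cover.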