I'm a big fan of Jsoup. I only recently started using it, and it's amazing. I used to write some super hairy regex patterns for this kind of matching because I wanted to avoid SAX like the plague... and that was quite tedious, as you can imagine. Jsoup lets me parse specific items out of a <table> in just a few lines of code.
Let's say I want to take the first 7 rows of a table where the <tr class=...> is GridItem or GridAltItem. Then, let's say we want to print the 1st, 2nd, and 3rd columns as text, followed by the first <a href> link that appears in the row. Sounds goofy, but I had to do this, and I can do it easily:
String page = "... some html markup fetched from somewhere ...";
Document doc = Jsoup.parse(page);
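for (int x = 0; x < 7; x++) {
    // tr[class$=Item] matches rows whose class attribute ends in "Item": GridItem and GridAltItem
    Element gridItem = doc.select("tr[class$=Item]").get(x);
    // columns 1-3 are indexes 0-2; the last piece is the href of the first link in the row
    System.out.println("row: " + gridItem.select("td").get(0).text() + " " + gridItem.select("td").get(1).text() + " " + gridItem.select("td").get(2).text() + " " + gridItem.select("a").get(0).attr("href"));
}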
It's that simple with Jsoup. Make sure you add the Jsoup jar file to your project as a library and import the classes you need. Be careful not to import the wrong Document or Element class (org.w3c.dom has classes with the same names):
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
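If you want something you can paste and run, here is a minimal sketch of the same idea as a complete class. The class name GridRows, the method firstRows, and the sample markup are all made up for illustration. It assumes the Jsoup jar is on your classpath. Compared to the snippet above, it runs the selector once, stops early if the table has fewer rows than requested, and skips rows that are missing cells or a link instead of throwing.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of Jsoup.
public class GridRows {

    // Looks at the first maxRows matching rows and returns one formatted line
    // for each of them that has at least 3 cells and a link.
    static List<String> firstRows(String html, int maxRows) {
        Document doc = Jsoup.parse(html);
        // Run the selector once instead of on every loop iteration.
        Elements rows = doc.select("tr[class$=Item]");
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < rows.size() && i < maxRows; i++) {
            Element row = rows.get(i);
            Elements cells = row.select("td");
            Elements links = row.select("a[href]");
            if (cells.size() < 3 || links.isEmpty()) {
                continue; // row does not have the expected shape, skip it
            }
            lines.add("row: " + cells.get(0).text() + " " + cells.get(1).text() + " "
                    + cells.get(2).text() + " " + links.first().attr("href"));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up sample markup standing in for a fetched page.
        String page = "<table>"
                + "<tr class=\"GridHeader\"><td>Name</td><td>Qty</td><td>Price</td><td></td></tr>"
                + "<tr class=\"GridItem\"><td>Widget</td><td>2</td><td>9.99</td>"
                + "<td><a href=\"/widget\">view</a></td></tr>"
                + "<tr class=\"GridAltItem\"><td>Gadget</td><td>1</td><td>4.50</td>"
                + "<td><a href=\"/gadget\">view</a></td></tr>"
                + "</table>";
        for (String line : firstRows(page, 7)) {
            System.out.println(line);
        }
    }
}
```

With the sample markup, the header row is ignored because its class does not end in "Item", and the two data rows are printed with their first three cells and link target.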
Enjoy!