My code uses jsoup to load a page, reads all the paragraph headings, and saves them to an ArrayList called headingList. Here's the tricky part: I have a map with Strings as keys and ArrayLists as values. The code has to work on more than one page, so the number of headings can vary greatly, and so can the number of paragraphs tied to each heading.

My idea was to use two int values. The first, headAmt, is set after the page is read and holds the number of headings. The second, headCounter, is initialized to 1. I then run a loop like while(headCounter != headAmt + 1) and increment headCounter at the end of each pass, so the loop terminates once every heading has been visited.

Inside the loop, I want to add each paragraph under the current heading to an ArrayList called items and store that list as the value for the first key in the map. Then I clear items, move to the next heading, collect its paragraphs, store them as the value for the second key, and so on.

I have code I can post, but it's confusing: I've rearranged the while loop so many times trying to get it to work.

Edit: Here's the code in case anyone can help:

public class Finder {

    public Finder(String url) {
        String mainURL = "http://www.website.com";
        Map<String, List<String>> headMap = new HashMap<>();
        ArrayList<String> headingList = new ArrayList<>();
        ArrayList<String> items = new ArrayList<>();
        int headCounter = 1;

        String itemList = "div > div:nth-child(1).category > ul:nth-child(2) > li.item > span";
        int headAmt;

        Document doc1 = null;

        ///// Connect to site to get menu /////
        try {
            doc1 = Jsoup.connect(url).userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36")
                    .referrer("http://www.google.com")
                    .get();
        } catch (IOException e) {
            System.out.println("Can't connect to website");
        }

        /////// Get headings ////////
        Elements head = doc1.select("div > div > div > h3");

        ////// Loop through headings and add to ArrayList /////
        for (Element e : head) {
            headingList.add(e.text());
        }
        headAmt = headingList.size();

        /*
        Here is the problem
         */
        while (headCounter != headAmt + 1) {

            Elements elem = doc1.select("div > div:nth-child(" + headCounter + ").category > ul:nth-child(2) > li.item > span");

            for (String key : headingList) {
                for (Element e : elem) {
                    items.add(e.text());
                }

                List<String> value = new ArrayList<>(items);
                headMap.put(key, value);
            }

            items.clear();
            headCounter++;
        }
    }
}
REAL O G

1 Answer

You can try something like this:

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Finder {
    public static void main(String[] args) {
        new Finder("http://www.allmenus.com/ny/new-york/250087-forlinis-restaurant/menu/");
    }

    public Finder(String url) {
        Document doc1;
        try {
            doc1 = Jsoup.connect(url)
                    .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36")
                    .referrer("http://www.google.com")
                    .get();
        } catch (IOException e) {
            System.out.println("Can't connect to website");
            return; // nothing to parse without a document
        }
        // Each ".category" block holds one heading and its own list of items
        Elements elements = doc1.select(".category");
        HashMap<String, ArrayList<List<String>>> menu = new HashMap<>();
        for (Element e : elements) {
            String name = e.select(".category_head>h3").first().text();
            Elements itms = e.select("ul > li");
            // A new list per category, so categories never share items
            ArrayList<List<String>> menuItems = new ArrayList<>();
            for (Element it : itms) {
                // Select the spans once; eq(n).text() is empty if span n is missing
                Elements spans = it.select("span");
                menuItems.add(Arrays.asList(spans.eq(0).text(), spans.eq(1).text()));
            }
            menu.put(name, menuItems);
        }
        // Print each heading followed by its items
        for (String key : menu.keySet()) {
            System.out.println(key);
            ArrayList<List<String>> lst = menu.get(key);
            for (List<String> item : lst) {
                System.out.println("       " + item.get(0) + " " + item.get(1));
            }
            System.out.println("\n");
        }
    }
}
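
If you would rather keep the flat Map<String, List<String>> from the question, the same idea applies: visit each section once and give its heading a fresh list. Below is a minimal sketch that uses plain lists as stand-ins for the jsoup results, so it runs without a network connection. The class name HeadMapSketch, the method buildHeadMap, and the sample data are assumptions made for illustration, not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HeadMapSketch {

    // headings.get(i) is the heading of section i; itemsPerSection.get(i) holds
    // the item texts taken from that same section (a stand-in for a jsoup select).
    static Map<String, List<String>> buildHeadMap(List<String> headings,
                                                  List<List<String>> itemsPerSection) {
        // LinkedHashMap keeps the headings in the order they were added
        Map<String, List<String>> headMap = new LinkedHashMap<>();
        for (int i = 0; i < headings.size(); i++) {
            // A fresh list per heading, so no two keys share or accumulate items
            List<String> items = new ArrayList<>(itemsPerSection.get(i));
            headMap.put(headings.get(i), items);
        }
        return headMap;
    }

    public static void main(String[] args) {
        // Hypothetical sample data
        List<String> headings = Arrays.asList("Appetizers", "Pasta");
        List<List<String>> itemsPerSection = Arrays.asList(
                Arrays.asList("Soup", "Salad"),
                Arrays.asList("Lasagna"));
        System.out.println(buildHeadMap(headings, itemsPerSection));
    }
}
```

The point of the design is that the heading and its items come from the same index, so there is exactly one put per section.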
Titus
  • I posted the code. Don't get hung up on that last while loop; I know it's wrong, I just need help fixing it. – REAL O G Jun 03 '15 at 19:33
  • Sorry, I reverted the code back to what I had, but it still doesn't work properly. Any help is appreciated. I'm pulling my hair out here. – REAL O G Jun 03 '15 at 19:42
  • Hmm, now when I print out the map, the values show up as empty ArrayLists. Should I update the code in my post to show how I've got it now? – REAL O G Jun 03 '15 at 20:00
  • @GD that is probably because the `select(...)` method doesn't return any elements. – Titus Jun 03 '15 at 20:02
  • Well, before I changed it, all of the map values were ArrayLists with actual strings in them. The problem was that they were all the same list. – REAL O G Jun 03 '15 at 20:03
  • @GD that was caused by the extra loop `for (String key : headingList)` – Titus Jun 03 '15 at 20:09
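
To see why the extra loop breaks the map, the sketch below replays the question's loop structure on plain lists instead of a live page. The class name ExtraLoopDemo, the method buildWithExtraLoop, and the sample data are made up for illustration; each entry of sections stands in for the elements selected on one pass of the while loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ExtraLoopDemo {

    // Mirrors the loop nesting from the question, with lists replacing jsoup elements.
    static Map<String, List<String>> buildWithExtraLoop(List<String> headingList,
                                                        List<List<String>> sections) {
        Map<String, List<String>> headMap = new LinkedHashMap<>();
        List<String> items = new ArrayList<>();
        for (List<String> elem : sections) {      // one pass per section, like the while loop
            for (String key : headingList) {      // the extra loop: rewrites EVERY key on each pass
                items.addAll(elem);               // items also keeps growing from key to key
                headMap.put(key, new ArrayList<>(items));
            }
            items.clear();
        }
        return headMap;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: two headings, one item in each section
        List<String> headings = Arrays.asList("A", "B");
        List<List<String>> sections = Arrays.asList(
                Arrays.asList("a1"),
                Arrays.asList("b1"));
        // Every key ends up holding items from the last section only
        System.out.println(buildWithExtraLoop(headings, sections));
    }
}
```

Because the inner loop overwrites every key on every pass, only the last section's items survive. Dropping that loop and putting only the heading that belongs to the current pass gives each key its own section's items.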