My goal is to extract the list of ingredients from a recipe page using JSoup. I can get the first list entry from the website, but my for loop stops after that entry and never gathers the remaining five.

I'm not sure what I'm doing wrong, so I would be grateful if you could look at my code:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.io.IOException;

public class WebScrape {
    public static void main(String[] args) {
        scrapeBBC("https://www.bbcgoodfood.com/recipes/spanish-omelette");
    }

    static void scrapeBBC(String url){
        try{
            Document recipe = Jsoup.connect(url).get();

            for(Element ingredients : recipe.select("section.recipe__ingredients.col-12.mt-md.col-lg-6")){
                //TODO: if problems occur with null entries add if-else as suggested in the video
                int row = 0;
                final String ingredient = ingredients.select(
                        "li.list-item--separator.list-item.pt-xxs.pb-xxs:nth-of-type("+ row++ +")").text();
                System.out.println(row + ingredients.select(
                        "li.list-item--separator.list-item.pt-xxs.pb-xxs:nth-of-type("+ row++ +")").text());

                //System.out.println(row + ingredient);
            }

        }catch(IOException ioe){
            System.out.println("Unable to connect to the URL.");
            ioe.printStackTrace();
        }
    }
}

Thanks in advance!

Zeroid

1 Answer

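Your for loop iterates over the matching sections rather than over the list items, so with a single ingredients section its body runs only once. Select the ingredients section first.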

Element ingredients = recipe.select("section.recipe__ingredients.col-12.mt-md.col-lg-6").first();

Then iterate over the <li> elements present within that section.

int row = 0;
for (Element ingredient : ingredients.select("li.list-item--separator.list-item.pt-xxs.pb-xxs")) {
    System.out.println(++row + " : " + ingredient.text());
}

As an aside, your selectors don't have to be super specific; the following selectors would work just fine.

recipe.select("section.recipe__ingredients")
ingredients.select("li")
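
Putting the pieces together, here is a minimal sketch that can be run without hitting the site. It parses a small inline HTML string instead of the live page; the markup, the ingredient texts, and the names IngredientListDemo, SAMPLE_HTML and extractIngredients are made up for illustration, and the real page may be structured differently. It also guards against the section not being found, since selectFirst returns null when nothing matches.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class IngredientListDemo {

    // Hypothetical markup that only imitates the class names from the question.
    static final String SAMPLE_HTML =
            "<section class=\"recipe__ingredients col-12 mt-md col-lg-6\">"
          + "<ul>"
          + "<li class=\"list-item list-item--separator pt-xxs pb-xxs\">500g new potatoes</li>"
          + "<li class=\"list-item list-item--separator pt-xxs pb-xxs\">1 onion</li>"
          + "<li class=\"list-item list-item--separator pt-xxs pb-xxs\">6 eggs</li>"
          + "</ul>"
          + "</section>";

    // Returns the text of every <li> inside the ingredients section,
    // or an empty list if the section is missing.
    static List<String> extractIngredients(Document recipe) {
        List<String> result = new ArrayList<>();
        Element section = recipe.selectFirst("section.recipe__ingredients");
        if (section == null) {
            return result;
        }
        // Loop over the list items, not over the section.
        for (Element item : section.select("li")) {
            result.add(item.text());
        }
        return result;
    }

    public static void main(String[] args) {
        Document recipe = Jsoup.parse(SAMPLE_HTML);
        int row = 0;
        for (String ingredient : extractIngredients(recipe)) {
            System.out.println(++row + " : " + ingredient);
        }
    }
}
```

To use it against the real page, replace Jsoup.parse(SAMPLE_HTML) with Jsoup.connect(url).get() and keep the IOException handling from your original code.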
Ravi K Thapliyal