I am using Jsoup to scrape a gallery of pictures from this Italian website:

http://www.italiaebraica.org/index.php?option=com_phocagallery&view=category&id=3:famiglia-levi&Itemid=143&lang=it

In an AsyncTask, I use Jsoup to extract all the image URLs from the HTML:

@Override
protected Void doInBackground(String... params) {

    Document doc;

    try {
        ConnectivityManager conMgr = (ConnectivityManager) mActivity
                .getSystemService(Context.CONNECTIVITY_SERVICE);

        if (conMgr.getActiveNetworkInfo() != null
                && conMgr.getActiveNetworkInfo().isAvailable()
                && conMgr.getActiveNetworkInfo().isConnected()) {
            doc = Jsoup
                    .connect(urlReceivedToConnect)
                    .timeout(0).get();
            Elements imgList = doc.getElementsByClass("phocagallery-box-file-third").select("img");
            photoList = new ArrayList<String>();
            ListIterator<Element> post = imgList.listIterator();

            while (post.hasNext()) {
                photoList.add(post.next().attr("abs:src"));
            }
        }
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    return null;
}

Then, in a custom adapter, I take this list of URLs and load each image from its URL, so that I can display it in a GridView later:

private Drawable LoadImageFromURL(String url) {
    try {
        InputStream is = (InputStream) new URL(url).getContent();
        Drawable d = Drawable.createFromStream(is, "src");
        return d;
    } catch (Exception e) {
        System.out.println(e);
        return null;
    }
}

The problem is that some of the pictures load and display fine, but others fail with this error:

06-23 10:06:06.930: I/System.out(493): java.io.FileNotFoundException: http://www.italiaebraica.org/images/phocagallery/famiglia_levi/thumbs/phoca_thumb_m_Famiglia Levi 024.jpg

What's the problem? How can I load all the pictures correctly? I hope the question is clear; I'm a junior developer.

Hovercraft Full Of Eels

2 Answers

 java.io.FileNotFoundException:

is pretty self-explanatory. Print out the URLs so you can see which ones cause the exception. It shouldn't take too long to debug.

I don't know which images exist and which don't, so you're the one who has to figure it out.
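
As a starting point for that debugging, here is a minimal sketch that flags URLs which are not syntactically valid before any download is attempted. The class `UrlCheck` and the method `findMalformed` are hypothetical names made up for this illustration, and the check only catches illegal characters such as raw spaces; it cannot tell whether an image actually exists on the server.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the question's code.
final class UrlCheck {

    // Returns every URL that java.net.URI refuses to parse,
    // for example because it contains an unescaped space.
    static List<String> findMalformed(List<String> urls) {
        List<String> bad = new ArrayList<>();
        for (String u : urls) {
            try {
                new URI(u); // throws if the string has illegal characters
            } catch (URISyntaxException e) {
                bad.add(u);
            }
        }
        return bad;
    }
}
```

Running the scraped list (`photoList` in the question) through `findMalformed` and printing the result would show at a glance whether the failing URLs share a syntax problem.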

Oleksiy
  • I can already see the URLs that cause the exception. The problem is that when I open them in the browser they DO return an image, but Java can't find them. Maybe I should transform the URLs somehow to make them valid, or maybe there is another solution. – user2404626 Jun 23 '13 at 11:46

The problem is that the URL contains spaces. Most browsers detect a space and replace it with %20, so you won't get an error when you open the URL in a browser. java.net.URL does not do this for you, so I would recommend:

private Drawable LoadImageFromURL(String url) {
    // Strings are immutable, so the result of replace() must be assigned back.
    url = url.replace(" ", "%20");
    try {
        InputStream is = (InputStream) new URL(url).getContent();
        Drawable d = Drawable.createFromStream(is, "src");
        return d;
    } catch (Exception e) {
        System.out.println(e);
        return null;
    }
}
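
Replacing spaces by hand covers this particular gallery, but other illegal characters would still slip through. A more general sketch, assuming the scraped URLs are not already percent-encoded, is to let the multi-argument `java.net.URI` constructor do the quoting. `UrlEncoding` and `encodeUrl` are hypothetical names used only for this illustration.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper, not part of the original answer.
final class UrlEncoding {

    // Splits the raw URL into its components and rebuilds it through the
    // multi-argument URI constructor, which percent-encodes illegal
    // characters (a space becomes %20).
    static String encodeUrl(String rawUrl) {
        try {
            URL url = new URL(rawUrl);
            URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(),
                    url.getPort(), url.getPath(), url.getQuery(), url.getRef());
            return uri.toASCIIString();
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException("Cannot encode URL: " + rawUrl, e);
        }
    }
}
```

With this in place, `LoadImageFromURL` could call `new URL(UrlEncoding.encodeUrl(url))` instead of `new URL(url)`. Note that this constructor always quotes a literal `%`, so a URL that already contains `%20` would end up double-encoded; that is why the sketch assumes unencoded input.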
elliereiselt