I'm trying to retrieve images from Wikimedia using the existing API, but there seems to be no logic in what works and what doesn't.

Here's what I'm doing and what I have tried:

I'm getting a list of images from this URL:

http://en.wikipedia.org/w/api.php?action=query&list=allimages&aiprop=url&format=xml&ailimit=10&aifrom=jura

This returns an XML feed, from which I get the image names and URLs:

<img name="Jura.PNG" url="http://upload.wikimedia.org/wikipedia/en/a/ad/Jura.PNG" descriptionurl="http://en.wikipedia.org/wiki/File:Jura.PNG"/>

Then, to get information such as the uploader and license, I use this tool, as linked on Wikimedia:

http://toolserver.org/~magnus/commonsapi.php

It requires the parameter ?image=, followed by a filename. Jura.PNG, from the XML example, works fine. However, most other ones I try just return <error>File does not exist</error>. I've checked, and the files do exist. I can't figure out why one file works and another doesn't.

For testing, another one that works is Calumma_tarzan_01.jpg.

Does anyone know what I'm doing wrong?


Examples that don't work:

Jurassic.jpg
Juramento_de_la_Primera_Junta.jpg
JuraDolois_logo.jpg

PHP code used:

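// List images on the English Wikipedia, starting at $search_term.
$xml_link = "http://en.wikipedia.org/w/api.php?action=query&list=allimages&aiprop=url&format=xml&ailimit=10&aifrom=" . urlencode($search_term);
$xml = simplexml_load_file($xml_link);
// $imgname is one file name taken from the list above, e.g. "Jura.PNG".
$xml_link_data = "http://toolserver.org/~magnus/commonsapi.php?image=" . urlencode($imgname);
$xml_data = simplexml_load_file($xml_link_data);
// Dump the license information for the file, if the response contains any.
var_dump($xml_data->licenses->license);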

For the Jura.PNG example the correct object is dumped, but since the responses for the other files don't contain the licenses element, the result is NULL. I don't think the problem is in this code, however, since entering the URL manually in a browser doesn't return results either.

Damjan Pavlica
Lg102
  • Can you give us an example of one that *doesn't* work? They don't happen to have spaces or other non-URL-friendly characters in them, do they? I'm wondering if you're forgetting to urlencode the filename you're passing... Also, what language are you using to fire off the request to the toolserver.org API? Can you show us a bit of code? – Matt Gibson Sep 09 '11 at 14:44
  • I have added the requested information to the question. As you can see, the `Jurassic.jpg` file doesn't have any fancy characters (except maybe the capital, but that doesn't seem to make any difference), and it still doesn't work. – Lg102 Sep 09 '11 at 14:59
  • So far, looking at [the source](https://svn.toolserver.org/svnroot/magnus/commonsapi.php) it seems that retrieving from `http://commons.wikimedia.org/w/api.php?format=php&action=query&prop=imageinfo&iilimit=500&iiprop=timestamp|user|url|size|sha1|metadata&titles=Image:Jura.PNG` works, returning valid "imageinfo", and `http://commons.wikimedia.org/w/api.php?format=php&action=query&prop=imageinfo&iilimit=500&iiprop=timestamp|user|url|size|sha1|metadata&titles=Image:Jurassic.jpg` doesn't, but I don't know enough about the Commons API to know why. – Matt Gibson Sep 09 '11 at 15:17

1 Answer

Wikimedia Commons is a repository of free images that Wikipedia uses. It is primarily used for sharing images between the different language editions of Wikipedia.

Many images on the English Wikipedia are used under fair use. Those are not free, so they can't be put on Commons.

The tool you have found works only on images from Commons (as its name suggests), so it can't be used for images that are hosted on the English (or any other) Wikipedia.
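
As a rough illustration of this distinction, here is a minimal sketch that reads the project out of an upload URL. It relies on the path layout of upload.wikimedia.org, where Commons files live under `/wikipedia/commons/` and English Wikipedia files under `/wikipedia/en/`. The helper name `upload_project` is hypothetical and not part of any Wikimedia API.

```php
<?php
// Hypothetical helper: returns the project segment of an upload URL,
// e.g. "commons" or "en", or null if the URL has a different shape.
function upload_project(string $url): ?string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    // Capture the first path segment after /wikipedia/.
    if (preg_match('#^/wikipedia/([^/]+)/#', $path, $m) === 1) {
        return $m[1];
    }
    return null;
}

// The URL from the question's XML example.
var_dump(upload_project('http://upload.wikimedia.org/wikipedia/en/a/ad/Jura.PNG'));
```

Only a URL that yields `commons` points at a file the Commons tool can be expected to describe; anything else is hosted on a specific Wikipedia.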

svick
  • That's what I thought at first. However, if you check the XML response, you'll see that every image is in fact on the upload.wikimedia.org subdomain, and not on Wikipedia itself. – Lg102 Sep 09 '11 at 19:49
  • That doesn't say much. It's still the same servers, but it's a different site. Also, the XML lists only images directly on Wikipedia, not those that are on Commons. You can also notice that images from Wikipedia are under `http://upload.wikimedia.org/wikipedia/en/`, while those from Commons are under `http://upload.wikimedia.org/wikipedia/commons/`. – svick Sep 09 '11 at 20:18
  • The *working* `Jura.PNG` is also in the `/wikipedia/en/` location, so I think this is not completely correct. Or am I missing your point here? – Lg102 Sep 09 '11 at 21:34
  • Jura.PNG works only by accident. There are in fact two distinct images: [one on Wikipedia](http://en.wikipedia.org/wiki/File:Jura.PNG) and [one on Commons](http://commons.wikimedia.org/wiki/File:Jura.PNG) with the same name. So it works, but it doesn't work the way you want it to. – svick Sep 09 '11 at 21:41
  • Thank you for this explanation. I think this means I have to use a Wikimedia Commons-only API, or find a way to get the info for the Wikipedia images. I'll add my findings. – Lg102 Sep 10 '11 at 10:50
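
For the second option in the last comment, one possible starting point is the `prop=imageinfo` query quoted in the comments under the question. It is a standard MediaWiki API module, so the same query can be pointed at `en.wikipedia.org` instead of `commons.wikimedia.org`. The sketch below only builds the request URL. The helper name `imageinfo_url` is hypothetical, the reduced `iiprop` list is an illustrative choice, and nothing is assumed here about whether the response carries license details.

```php
<?php
// Hypothetical helper: builds an imageinfo query URL for one file
// on a given wiki host (e.g. en.wikipedia.org or commons.wikimedia.org).
function imageinfo_url(string $host, string $filename): string
{
    $params = [
        'action' => 'query',
        'prop'   => 'imageinfo',
        'iiprop' => 'timestamp|user|url', // upload time, uploader, file URL
        'titles' => 'File:' . $filename,
        'format' => 'xml',
    ];
    // http_build_query() URL-encodes each value, including ":" and "|".
    return 'http://' . $host . '/w/api.php?' . http_build_query($params);
}

// One of the files from the question that the Commons tool rejects.
echo imageinfo_url('en.wikipedia.org', 'Jurassic.jpg'), "\n";
```

The resulting URL can be passed to `simplexml_load_file()` in the same way as the URLs in the question.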