
I'm trying to use the MediaWiki API to get articles from Wikipedia.

However, the returned JSON uses each page's pageid as the key of its object, and because I'm requesting random pages I don't know the pageid in advance.

Here's what my code looks like:

$.getJSON("https://en.wikipedia.org/w/api.php?action=query&generator=random&grnnamespace=0&prop=extracts&exchars=50000&format=json&callback=?", function (data) {
    console.log(data.query.pages);
});

The JSON for the first page that is returned is like this:

37889571: Object
    extract: "<p></p>↵<p></p>↵<p><b>Frank A Delaney IV</b>, Born Feb 4, 1963, Denville New Jersey.</p>↵<p>Mr Delaney is the son of Frank A Delaney III, ..."
    ns: 0
    pageid: 37889571
    title: "Frank A Delaney IV"

I would like to read properties such as the extract or the title, along the lines of console.log(data.query.pages["37889571"].extract);, but since the pageid changes with every request, I'm not sure how to reach them. What would be a good way to accomplish this?

For those who are interested, here is the JSON:

{
    "query":{
        "pages":{
            "1302977":{
                "pageid":1302977,
                "ns":0,
                "title":"Russ O'Hara",
                "extract":"<p><b>Russell Eugene Nealeigh O'Hara</b> is a one-time Los Angeles, USA radio personality whose career began at the forefront of the Boss Radio/Top 40 format.</p>\n<p>O'Hara is best known among LA radio listeners for his work on KKDJ<sup class=\"plainlinks noprint Inline-Template\" style=\"vertical-align:text-top;white-space:nowrap;\">[<i>disambiguation needed</i>]</sup> and his longtime solo work on KRLA where he went by the occasional moniker of \"Russ O'Hungry.\" Highly visible among Los Angeles radio personalities, O'Hara had the privilege of introducing headliners at major concerts, including Janis Joplin at the Hollywood Bowl; he also introduced The Rolling Stones, Jimi Hendrix and Three Dog Night among many others.</p>\n<p>He remains active as a broadcaster and currently hosts afternoons on KDES-FM, an oldies station in Palm Springs, California.</p>\n<p>O'Hara worked at the following Los Angeles stations in the course of his career:</p>\n<ul><li>(KBLA), 1964-65.</li>\n<li>KGFJ, 1968-60.</li>\n<li>KRLA, 1969-72.</li>\n<li>KKDJ<sup class=\"plainlinks noprint Inline-Template\" style=\"vertical-align:text-top;white-space:nowrap;\">[<i>disambiguation needed</i>]</sup>, 1972-74.</li>\n<li>KEZY (Anaheim) 1975-77.</li>\n<li>KROQ-FM, 1978-79.</li>\n<li>Return to KRLA, 1981\u201382; 1992-93.</li>\n<li>KEZN Palm Desert 1993-2001.</li>\n</ul><h2>External links</h2>\n<ul><li>KDES webpage</li>\n</ul><p><br></p>"
            }
        }
    }
}
Hot Licks
theintellects
  • As I read your (rather poor) rendition of the JSON, you have a JSON object that contains more JSON objects. One of the contained objects is named 37889571. That object in turn contains a string named "extract". You can always iterate through all the entries in the outer object and get all the "extract" strings. But obviously to get a particular one you need to know its "name" or some other identifier. – Hot Licks Nov 24 '13 at 03:14
  • A new site is needed for the dozens of nearly identical JSON questions that come in every day: jsonoverflow.com – Dexygen Nov 24 '13 at 03:28
  • Related: [This recent question](http://stackoverflow.com/questions/20010839/reach-a-string-behind-unknown-value-in-json) asks essentially the same thing, but for Python. – Ilmari Karonen Nov 25 '13 at 22:14

2 Answers


I'm able to get to the properties by using:

for(var id in data.query.pages) {
  console.log(data.query.pages[id]);
}

but I'm not certain if that's the best/only way.
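
If only one page comes back (as with the random generator in the question), the loop can be avoided by taking the first key of the pages object. Here is a minimal sketch of that idea. The sample object is hand-written to mirror the JSON in the question, with the extract shortened, and firstPage is a hypothetical helper name, not part of the MediaWiki API or jQuery.

```javascript
// Hand-written sample shaped like the response in the question
// (extract shortened for readability).
const sampleData = {
  query: {
    pages: {
      "1302977": {
        pageid: 1302977,
        ns: 0,
        title: "Russ O'Hara",
        extract: "<p><b>Russell Eugene Nealeigh O'Hara</b> ...</p>"
      }
    }
  }
};

// Hypothetical helper: returns the first page object in the response,
// or undefined when the response contains no pages.
function firstPage(response) {
  const pages = (response.query && response.query.pages) || {};
  const ids = Object.keys(pages); // the page IDs are the object's keys
  return ids.length > 0 ? pages[ids[0]] : undefined;
}

const page = firstPage(sampleData);
console.log(page.title);
console.log(page.extract);
```

Inside the $.getJSON callback, the same helper could be called as firstPage(data). When a request returns several pages, the for..in loop above (or Object.keys with forEach) is still the way to visit all of them.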

theintellects
  • This is the way I would recommend myself. That said, if you _don't_ want to use a _for..in_ loop for some reason, you _can_ also ask MediaWiki to [give you an explicit list of the page IDs](https://www.mediawiki.org/wiki/API:Query#Getting_a_list_of_page_IDs) in a separate part of the response. – Ilmari Karonen Nov 25 '13 at 22:17

You can now add &indexpageids to your request, and the response will include the page IDs as a separate list (query.pageids) alongside query.pages.

For example:
https://en.wikipedia.org/w/api.php?action=query&titles=Leipzig&prop=pageimages&format=json&pithumbsize=1000&indexpageids
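
A rough sketch of how that list might be used: look up each ID from query.pageids in query.pages. The sample response below is hand-written (it reuses the page from the question rather than real output for the Leipzig request), and pagesFromIds is a hypothetical helper name. It assumes the IDs arrive as an array of strings under query.pageids.

```javascript
// Hand-written sample of a response requested with &indexpageids.
// Assumption: the IDs are listed as strings under query.pageids.
const indexedResponse = {
  query: {
    pageids: ["1302977"],
    pages: {
      "1302977": { pageid: 1302977, ns: 0, title: "Russ O'Hara" }
    }
  }
};

// Hypothetical helper: returns the page objects in the order
// given by query.pageids (empty array if nothing was returned).
function pagesFromIds(response) {
  const query = response.query || {};
  const ids = query.pageids || [];
  const pages = query.pages || {};
  return ids.map(function (id) {
    return pages[id];
  });
}

const first = pagesFromIds(indexedResponse)[0];
console.log(first.title);
```

This keeps the lookup explicit: the known-length pageids array supplies the keys, so no for..in loop over the pages object is needed.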

Hunter