
I just downloaded the Wikidata JSON dump and parsed out a snippet of JSON:

A single entity is 50,000 lines of JSON, and the whole file is 80 GB compressed, which is too large to fit in memory. I want to "scrape" this JSON for data, but looking at it I can't tell where anything is. The first item appears to be https://www.wikidata.org/wiki/Q31, but I don't see how to get from the JSON to the data I'm looking for.
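
On the memory problem: as far as I understand the Database_download page, the JSON dump is one large JSON array with each entity on its own line, so every line can be decoded separately instead of loading the whole file. A minimal sketch under that assumption (`parseDumpLine` and `forEachEntity` are hypothetical names, not part of any library):

```javascript
// Decode one line of the dump, assuming the "one entity per line" layout.
// Returns null for the array brackets and for blank lines.
function parseDumpLine(line) {
  const trimmed = line.trim();
  if (trimmed === '' || trimmed === '[' || trimmed === ']') return null;
  // Every entity line except the last one ends with a comma.
  const json = trimmed.endsWith(',') ? trimmed.slice(0, -1) : trimmed;
  return JSON.parse(json);
}

// Call onEntity for each entity found in an (async) iterable of lines.
async function forEachEntity(lines, onEntity) {
  for await (const line of lines) {
    const entity = parseDumpLine(line);
    if (entity !== null) onEntity(entity);
  }
}
```

In Node, an iterable of lines can come from `readline.createInterface({ input: stream, crlfDelay: Infinity })`, where `stream` is a read stream over the decompressed dump, so only one entity is held in memory at a time.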

Can you show a quick JavaScript function for parsing out the "instance of" names (or just IDs) using this data?
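
For a single decoded entity, a sketch of what such a function could look like, assuming the statement layout described in the Wikibase JSON documentation (`claims` keyed by property ID, each statement carrying a `mainsnak` whose `datavalue.value.id` names the target item) and that "instance of" is property P31. `getInstanceOfIds` is a hypothetical name:

```javascript
// Return the item IDs (strings like "Q6256") that an entity is an "instance of".
// Assumes `entity` is one decoded object from the dump.
function getInstanceOfIds(entity) {
  const statements = (entity.claims && entity.claims.P31) || [];
  const ids = [];
  for (const statement of statements) {
    const snak = statement.mainsnak;
    // Only snaks with snaktype "value" carry a datavalue;
    // "somevalue" and "novalue" snaks are skipped.
    if (!snak || snak.snaktype !== 'value' || !snak.datavalue) continue;
    const value = snak.datavalue.value;
    if (!value) continue;
    if (typeof value.id === 'string') {
      ids.push(value.id);
    } else if (value['entity-type'] === 'item' && typeof value['numeric-id'] === 'number') {
      // Fallback for data that only has the numeric form of the ID.
      ids.push('Q' + value['numeric-id']);
    }
  }
  return ids;
}
```

Names are not stored on the statement itself. Each entity carries its own `labels` (for example `entity.labels.en.value`), so turning these IDs into names needs a lookup of the target entities.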

Beyond that, what do the different keys/fields mean, and where do I find the docs? If there are no docs, could you list what the fields mean?

Lance
  • Did you get the dump from https://www.wikidata.org/wiki/Wikidata:Database_download? If so, I'd recommend following the link on that page in the sentence "Please refer to the JSON structure documentation for information about how Wikidata entities are represented.": https://www.mediawiki.org/wiki/Wikibase/DataModel/JSON – Tom Morris Oct 28 '20 at 22:09
  • Ah there it is, I could not find that! – Lance Oct 28 '20 at 22:42

0 Answers