I'm working with WikiData and RDF for the first time. I downloaded the 24 GB WikiData "truthy" dataset (available only in N-Triples `.nt` format), but I'm having a hard time understanding it.
Here are some lines from the `.nt` file related to Jack Bauer, showing (subject, predicate, object) triples:
```
<http://www.wikidata.org/entity/Q24> <http://schema.org/description> "protagonista della serie televisiva americana ''24''"@it .
<http://www.wikidata.org/entity/Q24> <http://schema.org/name> "Jack Bauer"@en .
<http://www.wikidata.org/entity/Q24> <http://www.wikidata.org/prop/direct/P27> <http://www.wikidata.org/entity/Q30> .
<http://www.wikidata.org/entity/Q24> <http://www.wikidata.org/prop/direct/P451> <http://www.wikidata.org/entity/Q284262> .
```
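To make sure I'm reading these lines correctly: two angle-bracketed IRIs (subject, predicate), then an object that is either an IRI or a language-tagged literal, then a terminating dot. A minimal sketch of how I parse them (simplified: it ignores blank nodes and escape sequences):

```python
import re

# Subject and predicate are IRIs in angle brackets; the object is an IRI or
# a literal; the line ends with " .". Blank nodes and escapes are ignored.
TRIPLE_RE = re.compile(r'^(<[^>]+>)\s+(<[^>]+>)\s+(.+?)\s*\.\s*$')

line = ('<http://www.wikidata.org/entity/Q24> '
        '<http://www.wikidata.org/prop/direct/P27> '
        '<http://www.wikidata.org/entity/Q30> .')

subject, predicate, obj = TRIPLE_RE.match(line).groups()
print((subject, predicate, obj))
```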
So my questions are:
- Are all the URIs in the triples resolvable to English labels using only this one giant file, or do I have to make API calls? For example, I want to resolve this triple:

      <http://www.wikidata.org/entity/Q24> <http://www.wikidata.org/prop/direct/P27> <http://www.wikidata.org/entity/Q30> .

  into an English human-readable form like this:

      Jack Bauer, country of citizenship, United States of America

  Does this file contain the information needed to get the English-readable names for `Q24`, `P27`, and `Q30`, or would I have to make separate HTTP API calls to resolve them? (See the first sketch after this list for what I have in mind.)
- I can also get a `.json` dump of the same data. Am I correct in understanding that the `.nt` triples are simply a depth-first traversal of the JSON hierarchy, flattening all the data into triples? (See the second sketch below for how I picture the mapping.)
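For the first question, this is what I'm hoping the file allows: one streaming pass that collects English labels. Two unverified assumptions are baked in (they are really the question itself): that the truthy dump carries `rdfs:label`/`schema:name` triples tagged `@en` for the entities involved, and that a property such as P27 gets its label on its entity URI (`http://www.wikidata.org/entity/P27`) rather than on the `prop/direct` URI used inside statements. The file name is just what I saved the dump as.

```python
# One pass over the dump, keeping only @en label triples for the URIs I need.
LABEL_PREDICATES = {
    '<http://www.w3.org/2000/01/rdf-schema#label>',
    '<http://schema.org/name>',
}

wanted = {
    '<http://www.wikidata.org/entity/Q24>',
    '<http://www.wikidata.org/entity/P27>',  # assumption: P27's label lives here
    '<http://www.wikidata.org/entity/Q30>',
}

labels = {}
with open('latest-truthy.nt', encoding='utf-8') as f:
    for line in f:
        # Subject and predicate are space-free IRIs, so two splits suffice.
        s, p, o = line.rstrip('. \n').split(' ', 2)
        if s in wanted and p in LABEL_PREDICATES and o.endswith('"@en'):
            labels[s] = o[1:-4]  # drop the surrounding quotes and the @en tag

# Hoping for something like:
# {'<http://www.wikidata.org/entity/Q24>': 'Jack Bauer', ...}
print(labels)
```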
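And for the second question, this is how I currently picture one JSON entity record flattening into the truthy triples shown above. The record is a hand-trimmed approximation of what I understand the JSON dump to contain, not copied from the real file:

```python
import json

# Hand-trimmed approximation of one entity record from the JSON dump.
entity = json.loads('''{
  "id": "Q24",
  "labels": {"en": {"language": "en", "value": "Jack Bauer"}},
  "claims": {
    "P27": [{"mainsnak": {"snaktype": "value",
                          "datavalue": {"type": "wikibase-entityid",
                                        "value": {"id": "Q30"}}},
             "rank": "normal"}]
  }
}''')

WD = 'http://www.wikidata.org/entity/'
WDT = 'http://www.wikidata.org/prop/direct/'

subj = f'<{WD}{entity["id"]}>'

# Labels would flatten to schema:name-style triples.
label = entity['labels']['en']['value']
print(f'{subj} <http://schema.org/name> "{label}"@en .')

# Each claim would flatten to one prop/direct triple per statement.
for pid, statements in entity['claims'].items():
    for st in statements:
        snak = st['mainsnak']
        if snak['snaktype'] == 'value' and snak['datavalue']['type'] == 'wikibase-entityid':
            obj = f'<{WD}{snak["datavalue"]["value"]["id"]}>'
            print(f'{subj} <{WDT}{pid}> {obj} .')
```

If that mental model is right, walking the JSON depth-first and emitting one line per leaf value would reproduce the `.nt` file, which is what I'd like confirmed.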