
I'm currently working with well-structured RDF data in an OntoWiki knowledge base. I'd like to import this data into a local Wikidata instance. How can this be done? I couldn't find proper documentation.

AFAIK, Wikidata uses MariaDB as its backend and generates triples from it to power the SPARQL service. Is there a tool to do bulk imports of RDF or JSON files into Wikidata? If so, where is the documentation for this? The amount of data is too large to enter by hand, but the advantage is that the data is well structured.

Wolfgang Fahl
Cyril
  • Years ago, I experienced terrible performance with RDF on MySQL. Please let us know how things go. – Rick James Jun 05 '18 at 16:54
  • Part of the answer (only part): https://www.wikidata.org/wiki/Wikidata:Data_Import_Guide – Gilles-Antoine Nys Jun 06 '18 at 09:52
  • I read the data import guide, but it only proposes imports through data sheets and, at best, semi-automatic imports. What I imagined was transforming my RDF triplestore into a JSON dataset that I could import directly into MariaDB (which, AFAIK, is the real backend of Wikidata/Wikipedia) with a minimum of intervention. But I lack the documentation to determine how to do it, or even to know whether it is possible. Clearly, it is not possible for me to validate my millions of statements by hand. Maybe a personal bot import is a solution, but I haven't found documentation yet. – Cyril Jun 06 '18 at 14:40
  • I read https://www.mediawiki.org/wiki/Wikibase/DataModel/JSON and asked myself: if it is possible to export Wikidata into a JSON dump file, and I have my own local server running Wikidata, how do I import that JSON dump into my own Wikidata server? I spent several hours trying to find this without success. My idea is to generate a file with the same structure as this dump JSON file and import it. – Cyril Jun 06 '18 at 16:12
  • When you mention "JSON", do you mean "JSON-LD"? https://json-ld.org/ – Gilles-Antoine Nys Jun 06 '18 at 19:09
  • Because Wikibase provides [JSON specifications](https://www.mediawiki.org/wiki/Wikibase/DataModel/JSON) for exports, I wonder whether there is a way to import data from the JSON format. The more I search, the more I believe the answer is negative. AFAIK Wikidata works with item pages, the data is structured as pages, and Wikidata is still an SQL-based platform. RDF triples are just extracted and stored in a Blazegraph triplestore to perform SPARQL queries or inferences. So I'm pretty sure it is not possible to import either JSON-LD or RDF. If I'm wrong, it'll be good news. – Cyril Jun 07 '18 at 09:26
  • You won't be able to do an import directly, but tools like https://github.com/SuLab/WikidataIntegrator should be able to help you out. You have to perform the mapping from whatever data you have to Wikibase entities, so it will never be as simple as handing a Wikibase some JSON and having it magically converted. – Addshore Nov 13 '18 at 23:29

1 Answer


Wikidata's SPARQL query service is based on Blazegraph, which is a triplestore. (The people who built Blazegraph were acqui-hired by Amazon to create Amazon Neptune.)

The data storage engine of Wikidata is available as Wikibase, and there is a whole set of different tools available for this environment.

So what might be suitable for you depends a lot on your environment and technology stack. IMHO your main task will be to translate from OntoWiki's view of the world to Wikidata's. You'll find an overview of Wikidata's internal structure at http://wiki.bitplan.com/index.php/WikiData (of which I am the author).
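To illustrate the kind of mapping involved, here is a minimal sketch (the predicate-to-property table and all IDs are hypothetical examples, not part of any library) that turns RDF-style predicate/value pairs into the Wikibase JSON entity structure documented at https://www.mediawiki.org/wiki/Wikibase/DataModel/JSON:

```python
# Hypothetical mapping from RDF predicates to local Wikibase property IDs.
# A real import must resolve these against the target Wikibase instance.
PROP_MAP = {
    "http://example.org/ontology/author": "P50",  # assumed local property
}

def triples_to_entity(label, pairs, lang="en"):
    """Build a new-item JSON skeleton from (predicate, value) pairs.

    Only string datavalues are handled here; items, quantities, and
    times need their own datavalue structures.
    """
    claims = {}
    for predicate, value in pairs:
        prop = PROP_MAP[predicate]
        claims.setdefault(prop, []).append({
            "mainsnak": {
                "snaktype": "value",
                "property": prop,
                "datavalue": {"value": value, "type": "string"},
            },
            "type": "statement",
            "rank": "normal",
        })
    return {
        "labels": {lang: {"language": lang, "value": label}},
        "claims": claims,
    }
```

The hard part in practice is filling `PROP_MAP`: every OntoWiki predicate has to be matched to (or created as) a property on the target Wikibase before statements can be written.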

Here is an example triple as it is stored in multi-language format:

[Image: Example Wikidata entry]
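One way such a JSON skeleton could reach a local Wikibase is through the MediaWiki API's `wbeditentity` action. The sketch below only assembles the request parameters; the endpoint URL is an assumption about a local install, and a real script additionally needs a login session and a CSRF token:

```python
import json
import urllib.parse

def build_wbeditentity_request(entity_json, csrf_token,
                               api_url="http://localhost/w/api.php"):
    """Assemble POST parameters for a wbeditentity call creating a new item."""
    params = {
        "action": "wbeditentity",
        "new": "item",                    # create a fresh item
        "data": json.dumps(entity_json),  # the Wikibase JSON skeleton
        "token": csrf_token,
        "format": "json",
    }
    return api_url, urllib.parse.urlencode(params).encode()

# Actually sending the request requires a logged-in session, e.g.:
# urllib.request.urlopen(url, body)
```

Looping this over millions of entities is exactly what bot frameworks like WikidataIntegrator or Pywikibot wrap for you, including throttling and error handling, so for a bulk import they are preferable to hand-rolled API calls.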

Wolfgang Fahl