We are using spaCy for entity extraction in Python 3 on non-English text. It returns a list of entities, which we want to store in a database. To understand the entities better, we want to use Wikidata or another publicly available source to look up what each entity actually is. Because the spaCy model is trained on a non-English corpus, many words are assigned the same coarse type (for example, a location). If we could match an entity against Wikidata, we could see that a location found by spaCy is actually a city or a point of interest, which would make the stored data much more detailed.
We tried several APIs to answer this. One of them was https://nl.wikipedia.org/w/api.php?action=query&format=json&prop=pageprops&titles=Bologna . We expected that we could use the Q347019 number in a SPARQL query to get the ontology (http://dbpedia.org/ontology/City). But we couldn't find a SPARQL query that returned the results we wanted, and needing multiple requests per entity to get the data is not ideal.
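To make the single-request idea concrete, here is a minimal sketch of what we had in mind. It assumes the public Wikidata Query Service at https://query.wikidata.org/sparql and matches the Dutch Wikipedia page title directly through its sitelink, so the separate pageprops call would not be needed. The function names are ours and purely illustrative. Note that this returns Wikidata classes (the "instance of" property, P31), not DBpedia ontology URLs, so it only covers part of what we want.

```python
import json
import urllib.parse
import urllib.request

# ASSUMPTION: the public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_wikidata_query(title, lang="nl"):
    """Build a SPARQL query that goes from a Wikipedia page title to the
    Wikidata item and its 'instance of' (P31) classes in one request."""
    # Escape backslashes and double quotes for use inside a SPARQL string literal.
    safe = title.replace("\\", "\\\\").replace('"', '\\"')
    return f"""
SELECT ?item ?class ?classLabel WHERE {{
  ?article schema:about ?item ;
           schema:isPartOf <https://{lang}.wikipedia.org/> ;
           schema:name "{safe}"@{lang} .
  ?item wdt:P31 ?class .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
"""


def parse_classes(payload):
    """Flatten a SPARQL JSON result into (item, class, label) tuples."""
    rows = []
    for binding in payload.get("results", {}).get("bindings", []):
        rows.append((
            binding["item"]["value"],
            binding["class"]["value"],
            binding.get("classLabel", {}).get("value", ""),
        ))
    return rows


def fetch_classes(title, lang="nl", timeout=10):
    """Send the query and return the parsed rows (performs one HTTP request)."""
    params = urllib.parse.urlencode(
        {"query": build_wikidata_query(title, lang), "format": "json"}
    )
    request = urllib.request.Request(
        f"{WDQS_ENDPOINT}?{params}",
        # Hypothetical client name; a descriptive User-Agent is good practice.
        headers={"User-Agent": "entity-typing-sketch/0.1"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_classes(json.load(response))
```

Calling fetch_classes("Bologna") would issue one request. The title has to match the Wikipedia page title exactly, which will not always hold for strings coming out of spaCy.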
We also tried http://lookup.dbpedia.org/api/search.asmx/PrefixSearch?QueryClass=&MaxHits=1&QueryString=Bologna , but this API seems to return differently structured responses across the wide range of entities we query, which makes it difficult to automatically pick out the part of the response we want and store it in the database. It does return the ontologies we are looking for.
I am looking for an efficient way to query any Wikipedia / Wikidata / DBpedia source (a non-commercial one) to get the ontology URLs (such as http://dbpedia.org/ontology/City) of an entity based on a string ("Bologna").
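For illustration, the kind of lookup we are after could be sketched as follows. This is only a rough sketch under assumptions we have not fully verified: it uses the public DBpedia SPARQL endpoint at https://dbpedia.org/sparql, and it assumes the entity string is identical to the English DBpedia resource name, which will often not be true for non-English input. All function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# ASSUMPTION: the public DBpedia SPARQL endpoint.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"
ONTOLOGY_PREFIX = "http://dbpedia.org/ontology/"


def resource_uri(name):
    """Guess the DBpedia resource URI for an entity string.

    ASSUMPTION: the string equals the English Wikipedia page title;
    spaces become underscores, other unsafe characters are percent-encoded.
    """
    slug = urllib.parse.quote(name.strip().replace(" ", "_"), safe="_(),'")
    return "http://dbpedia.org/resource/" + slug


def build_type_query(name):
    """Build a SPARQL query returning only rdf:type values in the DBpedia ontology."""
    return (
        "SELECT DISTINCT ?type WHERE { "
        f"<{resource_uri(name)}> a ?type . "
        f'FILTER(STRSTARTS(STR(?type), "{ONTOLOGY_PREFIX}")) '
        "}"
    )


def parse_types(payload):
    """Extract the type URLs from a SPARQL JSON result."""
    bindings = payload.get("results", {}).get("bindings", [])
    return [binding["type"]["value"] for binding in bindings]


def fetch_ontology_types(name, timeout=10):
    """Run the query against DBpedia (performs one HTTP request)."""
    params = urllib.parse.urlencode({
        "query": build_type_query(name),
        "format": "application/sparql-results+json",
    })
    request = urllib.request.Request(
        f"{DBPEDIA_ENDPOINT}?{params}",
        headers={"User-Agent": "entity-typing-sketch/0.1"},  # hypothetical client name
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_types(json.load(response))
```

The response format is the standard SPARQL JSON result structure, so the parsing step stays the same for every entity. What this sketch does not solve is mapping an arbitrary extracted string to the right resource, which is the part we are really asking about.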