
Can I determine (or calculate) the difference between DBpedia and Wikipedia data for, say, Ecuadorian people?

Does DBpedia contain the same Ecuadorian people that exist in Wikipedia? If not, what is the difference (e.g. how can I extract all Ecuadorian people from Wikipedia)?

I could run a SPARQL query to count the Ecuadorian people in DBpedia, but I do not know how to do the same for Wikipedia. What approach should I use?

GML-VS

1 Answer


Wikipedia (specifically, mostly its infoboxes) is the source for the data in DBpedia. And that applies to each specific piece of information too. For example, take the first person on Wikipedia's List of Ecuadorians, Abdón Ubidia (Wikipedia, DBpedia):

  • DBpedia says that his birthPlace is Quito and Ecuador, because Wikipedia says that he was Born in Quito, Ecuador
  • DBpedia says that his nationality is Ecuadorian, because Wikipedia says that his Nationality is Ecuador
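On the DBpedia side, the two infobox fields above can be queried directly. The following is a minimal sketch, not a definitive query: it assumes the fields map to the ontology properties `dbo:birthPlace` and `dbo:nationality` and that the country is `dbr:Ecuador`. The function names are hypothetical. People whose birthplace is recorded only as a city would be missed by the first query, and the public endpoint may cap the number of rows it returns.

```python
import json
import urllib.parse
import urllib.request

# Public DBpedia SPARQL endpoint.
ENDPOINT = "https://dbpedia.org/sparql"

# Assumption: the infobox fields "Born" and "Nationality" end up in these
# DBpedia ontology properties.
PROPERTIES = ("birthPlace", "nationality")


def build_query(prop: str) -> str:
    """Return a SPARQL query listing resources linked to Ecuador via dbo:<prop>."""
    if prop not in PROPERTIES:
        raise ValueError(f"unexpected property: {prop}")
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "PREFIX dbr: <http://dbpedia.org/resource/>\n"
        "SELECT DISTINCT ?person WHERE {\n"
        f"  ?person dbo:{prop} dbr:Ecuador .\n"
        "}"
    )


def build_url(query: str) -> str:
    """Encode the query as a GET request that asks for JSON results."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return ENDPOINT + "?" + urllib.parse.urlencode(params)


def fetch_people(prop: str) -> set[str]:
    """Run the query (needs network access) and return the resource URIs."""
    with urllib.request.urlopen(build_url(build_query(prop))) as resp:
        data = json.load(resp)
    return {row["person"]["value"] for row in data["results"]["bindings"]}
```

Calling `fetch_people("birthPlace")` and `fetch_people("nationality")` gives two sets of resource URIs whose sizes and overlap can then be inspected.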

But Wikipedia is not completely consistent. If you take the following lists from Wikipedia:

  • people that were Born somewhere in Ecuador
  • people whose Nationality is Ecuador
  • people in List of Ecuadorians
  • people in subcategories of Category:Ecuadorian people

then it's very likely you will get 4 different lists of people.
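For the Wikipedia side, one possible approach is the MediaWiki API, which can list the pages in a category. The sketch below is an illustration under assumptions: it reads only the direct members of one category (walking the subcategories would need an extra step with `cmtype=subcat`), and the helper names and the `User-Agent` string are hypothetical. The `unique_to_each` and `in_all` helpers then show how far the lists diverge.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, Iterator, Optional, Set

# English Wikipedia's MediaWiki API endpoint.
API = "https://en.wikipedia.org/w/api.php"


def category_params(category: str, cont: Optional[str] = None) -> Dict[str, str]:
    """Build the parameters for one page of list=categorymembers results."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmtype": "page",   # pages only, no subcategories or files
        "cmlimit": "500",
        "format": "json",
    }
    if cont is not None:
        params["cmcontinue"] = cont
    return params


def category_members(category: str) -> Iterator[str]:
    """Yield the titles of pages directly in `category` (needs network access)."""
    cont = None
    while True:
        url = API + "?" + urllib.parse.urlencode(category_params(category, cont))
        req = urllib.request.Request(
            url, headers={"User-Agent": "list-compare-sketch/0.1"}
        )
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        for member in data["query"]["categorymembers"]:
            yield member["title"]
        # The API returns a continuation token while more results remain.
        cont = data.get("continue", {}).get("cmcontinue")
        if cont is None:
            return


def unique_to_each(lists: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each named list, return the entries that appear in no other list."""
    result = {}
    for name, members in lists.items():
        others = set().union(*(m for n, m in lists.items() if n != name))
        result[name] = members - others
    return result


def in_all(lists: Dict[str, Set[str]]) -> Set[str]:
    """Return the entries common to every list (empty if no lists are given)."""
    if not lists:
        return set()
    return set.intersection(*lists.values())
```

To compare against DBpedia results, page titles and resource URIs need a common form first. DBpedia resource URIs are generally the Wikipedia title with spaces replaced by underscores, appended to `http://dbpedia.org/resource/`.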

svick
  • Also note that DBpedia is an evolving project. Extraction is currently periodic, not continuous, so DBpedia and Wikipedia content can be out of sync for some time after an update to Wikipedia. Eventually, a single properly written query against DBpedia should effectively return a merge of the four Wikipedia lists. – TallTed Sep 04 '15 at 23:12
  • @svick The consistency issue that you mentioned also holds for DBpedia, right? – Daniel Dec 26 '16 at 18:05
  • @Daniel Like I said, Wikipedia is the source of data for DBpedia, so, yes, the issue does apply to DBpedia too. – svick Dec 26 '16 at 18:12