I have an exercise in Semantic Web. I must extract some individuals from DBpedia and insert them into an ontology that I must create. My question is: can I retrieve individuals from DBpedia?

Let me clarify!

When I send this SPARQL query

PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT * WHERE
{
    ?album a dbo:Album . 
} LIMIT 10

I get 10 URIs. Should I get whole instances, i.e. labels, object properties, data properties, etc., in order to insert them into my ontology?

I want them as complete instances. I don't want to add more variables, e.g.

?album dbo:artist ?artist . 

Can I use a Java API, e.g. Jena?

EDIT:

Let me give you an example. Suppose that you get an Album with URI

http://dbpedia.org/resource/...Baby_One_More_Time_(album)

This album has also some properties with their values e.g.

dbo:artist   dbr:Britney_Spears
dbo:releaseDate 1999-01-12 (xsd:date)
...

How could I get all of them in order to create an individual album in my ontology with the properties artist and releaseDate and the values Britney_Spears and 1999-01-12 respectively?

2 Answers

Well, a good place to start is always your requirements! What exactly do you need? There is a plethora of scientific research on Ontology Module Extraction (see for example here).

My rule of thumb: the amount you extract must match the soundness and completeness you need from your results, which in turn depends on your requirements. To make this concrete: in DBpedia, Artist is a subClassOf Person. Suppose you extract all the instances of Artist from DBpedia, but not the statement that Artist is a subClassOf Person. If you then query your dataset for Person, you will get nothing. Is this result sound? Yes. Is it complete? No! However, if you don't care that each Artist is a Person, that's fine. It is also worth mentioning that this depends on the DBpedia endpoint itself and on what kind of reasoning it performs.
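The Artist/Person point can be illustrated with a small in-memory sketch. The `ex:` names and the `instances_of` helper are made up for illustration and are not real DBpedia data or a real reasoner; the helper only follows subClassOf links.

```python
# Toy triples; the names mirror the Artist/Person example, not real DBpedia data.
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# Extraction 1: instances only. Extraction 2: instances plus the schema triple.
instances_only = {("ex:Britney_Spears", TYPE, "ex:Artist")}
with_schema = instances_only | {("ex:Artist", SUBCLASS, "ex:Person")}


def instances_of(triples, cls):
    """Return subjects typed as cls or as any (transitive) subclass of cls."""
    classes = {cls}
    changed = True
    while changed:  # keep collecting subclasses until nothing new is found
        changed = False
        for s, p, o in triples:
            if p == SUBCLASS and o in classes and s not in classes:
                classes.add(s)
                changed = True
    return {s for s, p, o in triples if p == TYPE and o in classes}


print(instances_of(instances_only, "ex:Person"))  # set(): sound but incomplete
print(instances_of(with_schema, "ex:Person"))     # {'ex:Britney_Spears'}
```

Without the subClassOf triple, no amount of reasoning on your side can recover the answer, which is why the schema statements may need to be part of what you extract.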

In conclusion: specify what you really need. You may be fine with a couple of classes and their instances, or you may extract the whole of DBpedia.

Regarding getting the data, there are multiple ways, again depending on your requirements. For simple purposes, you can use Jena TDB for triple storage and access the triples via Jena. You can even store your data in a plain RDF file. You can, for example, run a CONSTRUCT query on the DBpedia endpoint, request the results in an RDF format, and then load them into your RDF engine. Another option, described in this answer, is to use an INSERT query to insert the data into a local graph.
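As a rough sketch of the "CONSTRUCT query plus RDF results format" route, using only the Python standard library: it assumes the public endpoint at https://dbpedia.org/sparql is reachable and honours an `Accept: text/turtle` header. The function names `describe_query`, `build_request` and `save_triples` are hypothetical helpers, not part of any library.

```python
import urllib.parse
import urllib.request

ENDPOINT = "https://dbpedia.org/sparql"  # public DBpedia endpoint (assumed reachable)


def describe_query(resource_uri):
    """CONSTRUCT every outgoing triple of one resource."""
    return f"CONSTRUCT {{ <{resource_uri}> ?p ?o }} WHERE {{ <{resource_uri}> ?p ?o }}"


def build_request(query, accept="text/turtle"):
    """Build a SPARQL-protocol GET request; the Accept header selects the RDF format."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={"Accept": accept})


def save_triples(resource_uri, path):
    """Run the query and write the returned RDF to path (needs network access)."""
    request = build_request(describe_query(resource_uri))
    with urllib.request.urlopen(request) as response:
        data = response.read()
    with open(path, "wb") as out:
        out.write(data)
```

Calling `save_triples("http://dbpedia.org/resource/Britney_Spears", "britney.ttl")` would then leave a Turtle file that Jena, or any other RDF engine, should be able to load.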

Median Hilal
  1. You can retrieve instances from DBpedia with whatever metadata you want, but what you need depends on the ontology you would like to create. Please take a look at this document; it will help you understand some of the notions.
  2. Should you get whole instances? I think you are asking whether you should take all the properties and objects of each subject. Not necessarily. As stated in the first point, it depends on your ontology, and you decide what to take.
  3. Should you use Jena? You can, but you don't have to! If you pose a CONSTRUCT query to the endpoint, you get the data as RDF. As far as I understood, you don't want to add one variable per property, so you can pose a query like the following, which asks for all the metadata of each instance.

    CONSTRUCT { ?album ?p ?o } WHERE { ?album a dbo:Album . ?album ?p ?o }

  4. If you would like to limit the result, you can again add a LIMIT at the end of this query. Note that there it caps the number of ?p ?o matches, not the number of albums.
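If the goal is a fixed number of albums with all of their triples, one option is to put the LIMIT in a subquery that selects the albums first. The sketch below only builds the query string; `albums_query` is a hypothetical helper name, and the `dbo:` prefix is assumed to be the DBpedia ontology namespace.

```python
DBO = "http://dbpedia.org/ontology/"  # DBpedia ontology namespace


def albums_query(max_albums):
    """CONSTRUCT all triples of at most max_albums albums (limit is on albums, not triples)."""
    if max_albums < 1:
        raise ValueError("max_albums must be positive")
    return (
        f"PREFIX dbo: <{DBO}>\n"
        "CONSTRUCT { ?album ?p ?o }\n"
        "WHERE {\n"
        # The subquery picks the albums; the outer pattern fetches their triples.
        f"  {{ SELECT DISTINCT ?album WHERE {{ ?album a dbo:Album }} LIMIT {int(max_albums)} }}\n"
        "  ?album ?p ?o .\n"
        "}"
    )


print(albums_query(10))
```

The printed query can be pasted into the endpoint's query form or sent through Jena or any other SPARQL client.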

Erwarth