I know the basics of programming, but I am completely new to RDF and SPARQL, so I hope what follows is clear. I am trying to download some of the data available at http://data.camera.it/data/en/datasets/, where all the data are published in RDF/XML format and organized in an ontology.
I noticed that the website has an online SPARQL query editor (http://dati.camera.it/sparql), and starting from some of their examples I was able to retrieve and convert some of the data I need in Python. I used the following code and query with SPARQLWrapper:
from SPARQLWrapper import SPARQLWrapper, JSON
sparql = SPARQLWrapper("http://dati.camera.it/sparql")
sparql.setQuery(
'''
SELECT DISTINCT ?deputatoId ?cognome ?nome ?data ?argomento ?titoloSeduta ?testo
WHERE {
?dibattito a ocd:dibattito; ocd:rif_leg <http://dati.camera.it/ocd/legislatura.rdf/repubblica_17>.
?dibattito ocd:rif_discussione ?discussione.
?discussione ocd:rif_seduta ?seduta.
?seduta dc:date ?data; dc:title ?titoloSeduta.
?seduta ocd:rif_assemblea ?assemblea.
?discussione rdfs:label ?argomento.
?discussione ocd:rif_intervento ?intervento.
?intervento ocd:rif_deputato ?deputatoId; dc:relation ?testo.
?deputatoId foaf:firstName ?nome; foaf:surname ?cognome .
}
ORDER BY ?data ?cognome ?nome
LIMIT 100
'''
)
sparql.setReturnFormat(JSON)
results_raw = sparql.query().convert()
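For context, the conversion step I apply to `results_raw` looks roughly like the sketch below. It assumes the standard SPARQL JSON results layout (a `results` object holding a `bindings` list, where each variable maps to a dict with a `value` key). The helper name `bindings_to_rows` and the sample data are made up for illustration.

```python
def bindings_to_rows(results):
    """Flatten a SPARQL JSON result into a list of plain dicts."""
    rows = []
    for binding in results["results"]["bindings"]:
        # Each binding maps a variable name to {"type": ..., "value": ...};
        # keep only the value. Unbound variables are simply absent.
        rows.append({var: cell["value"] for var, cell in binding.items()})
    return rows


# Invented sample with the same shape as the output of sparql.query().convert()
sample = {
    "head": {"vars": ["cognome", "nome"]},
    "results": {
        "bindings": [
            {
                "cognome": {"type": "literal", "value": "ROSSI"},
                "nome": {"type": "literal", "value": "MARIO"},
            }
        ]
    },
}

rows = bindings_to_rows(sample)
```

With the real data I call `bindings_to_rows(results_raw)` instead of using the sample.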
However, the website only allows downloading 10,000 values and, as far as I understand, this limit cannot be changed. I therefore decided to download the datasets to my computer. I tried to work with all these RDF files, but I don't know how to do it, since, as far as I know, SPARQLWrapper does not work with local files.
So my questions are:
- How do I create a dataset containing all the RDF files so that I can work on them as if it were a single object?
- How do I query such an object to retrieve the information I need? Is that possible?
- Is this the right approach to the problem?
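To make the first two questions concrete, this is roughly what I have in mind. It is only a rough sketch: it assumes rdflib is a suitable library for this, and the folder name `rdf_dumps`, the `.rdf` extension, and the helper names are placeholders of mine. The query is a reduced version of the one above that only uses the FOAF properties.

```python
from pathlib import Path


def find_rdf_files(folder):
    """Return the downloaded .rdf files in a folder, sorted by name."""
    return sorted(Path(folder).glob("*.rdf"))


def load_graph(folder):
    """Parse every RDF/XML file in the folder into one rdflib Graph."""
    # Assumption: rdflib is installed (pip install rdflib). It is imported
    # here so the file-listing helper above still works without it.
    from rdflib import Graph

    graph = Graph()
    for path in find_rdf_files(folder):
        graph.parse(str(path), format="xml")  # "xml" means RDF/XML
    return graph


QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?deputatoId ?cognome ?nome
WHERE {
  ?deputatoId foaf:firstName ?nome ; foaf:surname ?cognome .
}
ORDER BY ?cognome ?nome
"""


def main(folder="rdf_dumps"):  # placeholder folder name
    graph = load_graph(folder)
    # Result rows expose the selected variables as attributes
    for row in graph.query(QUERY):
        print(row.deputatoId, row.cognome, row.nome)
```

Calling `main()` would then print one line per deputy found in the local files.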
Any suggestion on how to tackle the problem is appreciated. Thank you!