
I'm trying to retrieve results from the BNCF at this endpoint.

My query (with "ab" as example) is:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?source ?label ?content
                WHERE {
                    ?source a skos:Concept;
                        skos:prefLabel ?label; 
                        skos:scopeNote ?content.
                FILTER regex(str(?label), "ab", "i")
            }

The query itself is correct: if you try to run it, it works. But when I try to get the results from my Python script, I get this error:

SyntaxError: JSON Parse error: Unexpected EOF

This is my python code:

__3store = "http://digitale.bncf.firenze.sbn.it/openrdf-workbench/repositories/NS_03_2014/query"
sparql = SPARQLUpdateStore(queryEndpoint=__3store)
sparql.setReturnFormat(JSON)
result = sparql.query(query_rdf).convert()
print json.dumps(result, separators=(',',':'))

I tried the code above following this answer; before that, my code looked like this:

__3store = "http://digitale.bncf.firenze.sbn.it/openrdf-workbench/repositories/NS_03_2014/query"
sparql = SPARQLWrapper(__3store,returnFormat="json")
sparql.setQuery(query_rdf)
result = sparql.query().convert() 
print json.dumps(result, separators=(',',':'))

but both throw the same error.

Does anyone know how to fix it? Thanks

EDIT:

This is the Python code; I hope it is enough to understand the problem:

import sys
sys.path.append ('cgi/lib')
import rdflib
from rdflib.plugins.stores.sparqlstore import SPARQLUpdateStore, SPARQLStore
import json
from SPARQLWrapper import SPARQLWrapper, JSON

#MAIN
print "Content-type: application/json"
print
prefix_SKOS =       "prefix skos:      <http://www.w3.org/2004/02/skos/core#>"
crlf = "\n"
query_rdf = ""
query_rdf += prefix_SKOS + crlf
query_rdf += '''
            SELECT DISTINCT ?source ?title ?content
                WHERE {
                    ?source a skos:Concept;
                        skos:prefLabel ?title; 
                        skos:scopeNote ?content.
                FILTER regex(str(?title), "ab", "i")
            }

        '''
__3store = "http://digitale.bncf.firenze.sbn.it/openrdf-workbench/repositories/NS_03_2014/query"
sparql = SPARQLWrapper(__3store,returnFormat="json")
sparql.setQuery(query_rdf)
result = sparql.query().convert() 

print result

Running this in Python shell returns:

Content-type: application/json


Warning (from warnings module):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/SPARQLWrapper-1.6.4-py2.7.egg/SPARQLWrapper/Wrapper.py", line 689
RuntimeWarning: Format requested was JSON, but XML (application/sparql-results+xml;charset=UTF-8) has been returned by the endpoint
<xml.dom.minidom.Document instance at 0x105add710>

So I think the result is always XML, even though I specified JSON as the return format.

Gio Bact
  • Can you see the actual JSON that's being returned? – Joshua Taylor Feb 11 '15 at 13:51
  • Also, it'd be nice if you can provide a minimal example that we can actually run to reproduce the problem. It can't be much more than the code you've actually shown. Does your query string also include the necessary prefixes? Sometimes endpoints will define them for interactive/web-based use (though this one doesn't seem to do even that), but typically don't for queries sent to the endpoint. – Joshua Taylor Feb 11 '15 at 13:52
  • No, I can't see the actual JSON; this error is thrown by the `error:` callback of my Ajax call. So it seems to me that it can't execute the query at all, because it never gets a `success:`. – Gio Bact Feb 11 '15 at 13:56
  • Yes, query includes prefixes. Sorry, I'll update the question. – Gio Bact Feb 11 '15 at 13:56
  • Thank you for the prefixes. It'd still be helpful to have a minimal code sample that we can simply copy and run, though. It's hard to debug code that we can't run. – Joshua Taylor Feb 11 '15 at 14:19
  • I think the endpoint always returns an XML result, even if I specify JSON as returnFormat – Gio Bact Feb 11 '15 at 14:44
  • Ok I added some python code to run. – Gio Bact Feb 11 '15 at 14:52

1 Answer

There are a couple of problems playing together here:

First, you should only use SPARQLUpdateStore from rdflib if you want to access a SPARQL store via rdflib's Graph interface (e.g., you can add triples, you can iterate over them, etc.). If you want to write a SPARQL query yourself you should use SPARQLWrapper.

Second, when you ask SPARQLWrapper to return JSON, it actually sends the server a list of the most common and standardized MIME types for what we loosely call "JSON", as shown here and here:

_SPARQL_JSON = ["application/sparql-results+json", "text/javascript", "application/json"]
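
To make the combined header concrete, here is a minimal sketch of how such a list turns into a single Accept value. It assumes the alternatives are simply joined with commas (which matches the combined header tested further down); the variable name accept_header is only for illustration and is not part of SPARQLWrapper's API.

```python
# The list of JSON-ish MIME types quoted above.
_SPARQL_JSON = ["application/sparql-results+json", "text/javascript", "application/json"]

# Assumption: the alternatives are joined with commas into one Accept value.
accept_header = ",".join(_SPARQL_JSON)

print("Accept: " + accept_header)
```

A server that implements content negotiation properly may answer with any one of the listed types.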

It seems your server does understand application/sparql-results+json on its own, but not a combined "give me any of these MIME types" header, which is what SPARQLWrapper builds for maximum interoperability (so your server essentially doesn't fully support HTTP Accept headers):

curl -i -G -H 'Accept: application/sparql-results+json' --data-urlencode 'query=PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?source ?label ?content
WHERE {
 ?source a skos:Concept;
 skos:prefLabel ?label;
 skos:scopeNote ?content.
 FILTER regex(str(?label), "ab", "i")
}' http://digitale.bncf.firenze.sbn.it/openrdf-workbench/repositories/NS_03_2014/query

will return:

HTTP/1.1 200 OK
Date: Mon, 18 May 2015 13:13:45 GMT
Server: Apache/2.2.17 (Unix) PHP/5.3.6 mod_jk/1.2.31
...
Content-Type: application/sparql-results+json;charset=UTF-8

{
  "head" : {
    "vars" : [ ],
    "vars" : [ "source", "label", "content" ],
    "link" : [ "info" ]
  },
  "results" : {
    "bindings" : [ {
      "content" : {
        "type" : "literal",
        "value" : "Il lasciare ingiustificatamente qualcuno o qualcosa di cui si è responsabili"
      },
      "source" : {
        "type" : "uri",
        "value" : "http://purl.org/bncf/tid/12445"
      },
      "label" : {
        "xml:lang" : "it",
        "type" : "literal",
        "value" : "Abbandono"
      }
    },
...

so everything is ok, but if we ask for the combined, more interoperable mime types:

curl -i -G -H 'Accept: application/sparql-results+json,text/javascript,application/json' --data-urlencode 'query=PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?source ?label ?content
WHERE {
 ?source a skos:Concept;
 skos:prefLabel ?label;
 skos:scopeNote ?content.
 FILTER regex(str(?label), "ab", "i")
}' http://digitale.bncf.firenze.sbn.it/openrdf-workbench/repositories/NS_03_2014/query

we get an xml result:

HTTP/1.1 200 OK
Server: Apache/2.2.17 (Unix) PHP/5.3.6 mod_jk/1.2.31
...
Content-Type: application/sparql-results+xml;charset=UTF-8

<?xml version='1.0' encoding='UTF-8'?>
...
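
Because this endpoint can answer with either format, a client can also look at the Content-Type it actually received before parsing, instead of trusting the format it asked for. The following is only a sketch in Python 3 using the standard library; the function name parse_select_results and the flattened row format are my own choices for illustration, not part of SPARQLWrapper.

```python
import json
import xml.etree.ElementTree as ET

# Namespace of the SPARQL Query Results XML Format.
SRX_NS = "{http://www.w3.org/2005/sparql-results#}"


def parse_select_results(body, content_type):
    """Return a list of {variable: value} dicts from a SELECT response body."""
    # Drop parameters such as ";charset=UTF-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()

    if media_type == "application/sparql-results+json":
        data = json.loads(body)
        return [
            {var: cell["value"] for var, cell in row.items()}
            for row in data["results"]["bindings"]
        ]

    if media_type == "application/sparql-results+xml":
        root = ET.fromstring(body)
        rows = []
        for result in root.iter(SRX_NS + "result"):
            row = {}
            for binding in result.findall(SRX_NS + "binding"):
                # Each <binding> has one child: <uri>, <literal> or <bnode>.
                row[binding.get("name")] = binding[0].text
            rows.append(row)
        return rows

    raise ValueError("unexpected Content-Type: " + content_type)
```

This throws away datatype and language information and keeps only the plain values, which is enough to show the dispatch on the returned media type.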

So long story short: it's a bug in the server you're using. The following is a nasty workaround (it seems SPARQLWrapper doesn't simply let us set the headers manually, but unconditionally overrides them in _createRequest), but it works:

In [1]: import SPARQLWrapper as sw

In [2]: sparql = sw.SPARQLWrapper("http://digitale.bncf.firenze.sbn.it/openrdf-workbench/repositories/NS_03_2014/query")

In [3]: sparql.setReturnFormat(sw.JSON)

In [4]: sparql.setQuery('''
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?source ?label ?content
                WHERE {
                    ?source a skos:Concept;
                        skos:prefLabel ?label;
                        skos:scopeNote ?content.
                FILTER regex(str(?label), "ab", "i")
            }
''')

In [5]: request = sparql._createRequest()

In [6]: request.add_header('Accept', 'application/sparql-results+json')

In [7]: from urllib2 import urlopen

In [8]: response = urlopen(request)

In [9]: res = sw.Wrapper.QueryResult((response, sparql.returnFormat))

In [10]: result = res.convert()

In [11]: result
Out[11]:
{u'head': {u'link': [u'info'], u'vars': [u'source', u'label', u'content']},
 u'results': {u'bindings': [{u'content': {u'type': u'literal',
     u'value': u'Il lasciare ingiustificatamente qualcuno o qualcosa di cui si \xe8 responsabili'},
    u'label': {u'type': u'literal',
     u'value': u'Abbandono',
     u'xml:lang': u'it'},
    u'source': {u'type': u'uri', u'value': u'http://purl.org/bncf/tid/12445'}},
   ...
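
If you would rather not rely on the private _createRequest method, the same idea (send only the one Accept type the server handles) can be sketched with the standard library alone. This is an untested-against-the-endpoint sketch for Python 3, not SPARQLWrapper's API; the helper name build_select_request is hypothetical, and the short query is just a placeholder.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://digitale.bncf.firenze.sbn.it/openrdf-workbench/repositories/NS_03_2014/query"


def build_select_request(endpoint, query):
    """Build a GET request that asks only for SPARQL JSON results."""
    # The query goes into the URL as the form-encoded "query" parameter,
    # like curl's -G --data-urlencode above.
    url = endpoint + "?" + urlencode({"query": query})
    # A single media type instead of the combined list the server mishandles.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_select_request(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 1")
print(request.get_header("Accept"))

# To actually send it (needs network access):
#   import json
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       result = json.load(response)
```

Building the request separately from sending it keeps the header logic easy to inspect without touching the network.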
Jörn Hees