I am running a SPARQL query from Python using SPARQLWrapper. The endpoint is UniProt, but about 50% of the time I get an error when executing this code:

from SPARQLWrapper import SPARQLWrapper, JSON

def getReviewProt(accession):
    # Build the VALUES rows, e.g. "(uniprot:Q6GZX4) (uniprot:Q96375)"
    mystring = '(uniprot:' + ') (uniprot:'.join(accession) + ')'
    sparql = SPARQLWrapper("http://sparql.uniprot.org/sparql")

    sparql.setQuery("""
                    PREFIX  up_core: <http://purl.uniprot.org/core/>
                    PREFIX  up_taxonomy: <http://purl.uniprot.org/taxonomy/>
                    PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                    PREFIX  owl:  <http://www.w3.org/2002/07/owl#>
                    PREFIX  apf:  <http://jena.hpl.hp.com/ARQ/property#>
                    PREFIX  xsd:  <http://www.w3.org/2001/XMLSchema#>
                    PREFIX  fn:   <http://www.w3.org/2005/xpath-functions#>
                    PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                    PREFIX  uniprot: <http://purl.uniprot.org/uniprot/>
                    PREFIX  dc:   <http://purl.org/dc/elements/1.1/>
                    SELECT ?is_true
                    WHERE
                        {
                         VALUES (?ac) {"""+mystring+"""}
                         ?ac  up_core:reviewed  ?is_true   
                        }
                    """) 
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    return results

if __name__ == '__main__':
    accession = ["Q6GZX4", "Q96375", "B1XBG2"]
    res = getReviewProt(accession)
    for r in res['results']['bindings']:
        print(r['is_true']['value'])


I get this error:

QueryBadFormed: a bad request has been sent to the endpoint, probably the sparql query is bad formed

When I look at the exact error, here is what I get:

Exception:virtuoso.jdbc4.VirtuosoException: SQ156: Internal Optimized compiler error : sqlo table has no index in sqldf.c:3782.
Please report the statement compiled

The strangest part is that the same code works roughly 50% of the time. I get exactly the same error when I run the query directly in the endpoint at this address: http://sparql.uniprot.org/sparql . Sometimes it works perfectly, so I'm lost, and of course I want my program to work every time I run it. The endpoint runs on Virtuoso, so I guess the problem comes from there, but I don't know how Virtuoso works. I'm new to SPARQL, so it's quite hard for me to understand and resolve these errors. Can anyone help me? Or, if this problem has already been solved, I would be happy to have the link :) Thank you

sparkles
  • the query works most of the time ... It's obviously a very basic query, so maybe some of your entities are just not valid? I mean, you're using the prefixed form of URIs, so maybe some of those do contain illegal chars for the prefixed form? – UninformedUser Jul 20 '20 at 13:18
  • That said, the whole endpoint is a public free service, no guarantee to work all the time. The obvious way is to download the data and load it into your own triple store. That's the only way to guarantee 24/7. I mean, what do you do if they decide to shutdown the server or just do some maintenance for hours/days? But yes, it might also be simply a bug in the Virtuoso backend, though I would check your failing queries first. – UninformedUser Jul 20 '20 at 13:18
  • Hello @UninformedUser, thank you for your answer. I think the problem might also be that I'm using a Python list as a parameter (mystring in the query); maybe that can lead to problems... But I used the same prefixes as the UniProt namespaces, so normally that's supposed to be the right way, no? – sparkles Jul 20 '20 at 13:34
  • my point is, you can't use all chars in the prefixed form of a resource. For example, `uniprot:foo/bar` is invalid because the local part contains an unescaped `/` char. So my question: can you reproduce the error? I mean, you should be able to keep track of the elements of the Python list, right? – UninformedUser Jul 20 '20 at 14:34
  • Yes, sure. The prefix `uniprot:` is used here with an accession number, a string of characters (only upper-case letters and digits) that leads directly to the page of the protein, so there can't be that kind of character. I printed the items of my Python list and there's no problem there; I would have seen it if there was an error, as the list is short. But I'll keep it in mind in case it happens – sparkles Jul 21 '20 at 07:49
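
One way to rule out the prefixed-name concern raised in these comments is to validate each accession and write it as a full IRI instead of `uniprot:...`. The following is only a sketch: the helper name `build_values_rows` and the simple "upper-case letters and digits" pattern are assumptions taken from the description above, not the official UniProt accession format.

```python
import re

# Assumption: accessions are only upper-case letters and digits, as stated
# in the comment thread; the official UniProt pattern is stricter.
ACCESSION_RE = re.compile(r"[A-Z0-9]+")
UNIPROT_NS = "http://purl.uniprot.org/uniprot/"


def build_values_rows(accessions):
    """Return the rows of a SPARQL VALUES block as full IRIs.

    Hypothetical helper: raises ValueError on any accession that does not
    match ACCESSION_RE, so a bad list element is caught before the query
    is sent.
    """
    rows = []
    for acc in accessions:
        if not ACCESSION_RE.fullmatch(acc):
            raise ValueError(f"unexpected accession: {acc!r}")
        # Full IRIs in angle brackets avoid prefixed-name escaping rules.
        rows.append(f"(<{UNIPROT_NS}{acc}>)")
    return " ".join(rows)


if __name__ == "__main__":
    print(build_values_rows(["Q6GZX4", "Q96375", "B1XBG2"]))
```

The returned string could be dropped into the query in place of `mystring`; the `uniprot:` prefix declaration is then no longer needed for the VALUES block.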

1 Answer

The problem is not in your code, but in one of the two servers that run the sparql.uniprot.org endpoint. If your request went to the 'good' machine, it worked; if it went to the 'broken' machine, it failed. Both machines should be good now.
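
For intermittent server-side failures like this one, a client can retry the request a few times before giving up. This is a minimal sketch and not part of the fix described above: `call_with_retries` and its parameters are hypothetical names, and the delays are arbitrary.

```python
import time


def call_with_retries(func, attempts=4, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func() and retry on the given exceptions with exponential backoff.

    Hypothetical helper: waits base_delay, 2*base_delay, 4*base_delay, ...
    between attempts and re-raises the last exception if all attempts fail.
    """
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
```

With the code from the question, this could be used as `call_with_retries(lambda: sparql.query().convert(), retry_on=(QueryBadFormed,))`, where `QueryBadFormed` is imported from `SPARQLWrapper.SPARQLExceptions`. Retrying only hides the symptom; a query that really is malformed will still fail after the last attempt.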

Jerven