
I'm using SPARQLWrapper to query a local SPARQL endpoint (using apache-jena-fuseki), and some of my queries are CONSTRUCT queries.

The queries return valid results in web-based SPARQL interfaces such as YASGUI. With SPARQLWrapper, however, the default query method gives me this error:

Response:
b'Error 400: Failed to write output in RDF/XML: Only well-formed absolute URIrefs can be included in RDF/XML output: <arcp://uuid,00000000-0000-0000-0000-000000000000/> Code: 28/NOT_DNS_NAME in HOST: The host component did not meet the restrictions on DNS names.\n'

(I have replaced the UUID with 0.)

I found this post. Unfortunately, the source data is out of my control, so I cannot easily change its content: it is CWL-Prov, and its standard prescribes this representation. Therefore, I need to use a different return format. I tried the N-Triples and Turtle formats in YASGUI, and they work there.

However, setting the return format in SPARQLWrapper causes a problem. If I set it to anything other than SPARQLWrapper.XML, I get this error (using N3 as an example):

Response:
b"Error 400: Can't determine output content type: n3\n"

(JSON is not supported for CONSTRUCT queries.)

If I use a custom string other than the predefined ones, it automatically falls back to XML (as described in its documentation).

The error is generated by Fuseki, so I suspect I have done something wrong. Has anyone experienced this, and how can it be solved?


The code snippet I'm using to do the query:

import SPARQLWrapper

sparql = SPARQLWrapper.SPARQLWrapper('http://localhost:3030/prov')
query = ''  # The CONSTRUCT query goes here
sparql.setQuery(query)
sparql.setReturnFormat(SPARQLWrapper.N3)
result = sparql.query().convert()  # returned to the caller in my actual code

As suggested by @AndyS, I replaced N3 with Turtle, but the error still occurs. Running Fuseki with -v, here is what I get:

[2020-11-04 17:02:22] Fuseki     INFO  [1]   => User-Agent:          sparqlwrapper 1.8.5 (rdflib.github.io/sparqlwrapper)
[2020-11-04 17:02:22] Fuseki     INFO  [1]   => Connection:          close
[2020-11-04 17:02:22] Fuseki     INFO  [1]   => Host:                127.0.0.1:3030
[2020-11-04 17:02:22] Fuseki     INFO  [1]   => Accept-Encoding:     identity
[2020-11-04 17:02:22] Fuseki     INFO  [1]   => Accept:              application/turtle,text/turtle
[2020-11-04 17:02:22] Fuseki     WARN  SPARQL Query: Unrecognize request parameter (ignored): results
[2020-11-04 17:02:22] Fuseki     INFO  [1] Query = 

MY-ORIGINAL-QUERY-OMITTED

[2020-11-04 17:02:22] Fuseki     INFO  [1]   <= Vary:                Accept,Accept-Encoding,Accept-Charset
[2020-11-04 17:02:22] Fuseki     INFO  [1] 400 Can't determine output content type: turtle (165 ms)

I copied the printed query, and it works in YASGUI. There are also some errors about URI/IRI scheme violations, which I have omitted here.

I saw these extra query parameters at the end of the query URL:

&format=turtle&output=turtle&results=turtle

Maybe they are related to the error? But why doesn't Fuseki complain about format and output (as it does for results), or print them (as it does for query)?

renyuneyun
  • I'm confused - you loaded the data into a local Fuseki triple store, right? If so, which version? And if so, why shouldn't you be able to get any response type supported by standard SPARQL? – UninformedUser Nov 02 '20 at 15:34
  • (1) Run Fuseki with "-v" to see what HTTP headers are actually being sent. (2) Ask for Turtle, not N3. Getting RDF/XML suggests it isn't a MIME type Fuseki supports (it provides the standard ones, and data that is N3-like is Turtle, although SPARQLWrapper does seem to put in Turtle for N3, at least in the version I looked at). – AndyS Nov 02 '20 at 18:37
  • @UninformedUser Yes. I'm using Fuseki 3.13.1, installed from the AUR (on Arch Linux). Fuseki seems to be working correctly, and I can get responses using YASGUI or Fuseki's built-in web interface. The problem happens when I'm using SPARQLWrapper (my system is written in Python). And it doesn't occur with SELECT queries; I'm using some CONSTRUCT queries. – renyuneyun Nov 04 '20 at 09:01
  • @AndyS Thanks for the suggestion. I reran the code, and the error is still there. I updated the description and added the new information at the end. – renyuneyun Nov 04 '20 at 09:19
  • Ignore the WARN - that's just noticing "results", which is not a query parameter Fuseki understands. It's a warning; it does not change the processing other than to ignore the parameter. Fuseki uses "format" and "output", but the ideal is not to have them at all and to use the Accept header. – AndyS Nov 04 '20 at 16:50
  • The real problem is that "turtle" isn't a legal value for "output" (or "format"); "ttl" is. This isn't really what this mechanism is for, and ideally SPARQLWrapper would use "Accept:" rather than the alternative, non-standard "?format". SPARQLWrapper has `setOnlyConneg`, so try setting that to true. Then the results/output/format query params should not be added. Or, in bad style: `Wrapper._returnFormatSetting = ()`. Or fix the data, because "arcp://" means the next part is a host name, and "uuid,00000000-0000-0000-0000-000000000000" isn't a legal host name (the comma is bad). – AndyS Nov 04 '20 at 17:16
  • BTW: Fuseki 3.17.0 will work because I'll add "turtle" to the short name handling but the `setOnlyConneg` is a much better solution. – AndyS Nov 04 '20 at 17:21
  • Please report whether `setOnlyConneg` works because if so, I'll turn this into an answer for the record. – AndyS Nov 04 '20 at 17:22
  • @AndyS Thanks a lot. That works, though the return type is not rdflib.Graph, which was a bit surprising at first. I later found out from the documentation how to convert it to a Graph. – renyuneyun Nov 05 '20 at 11:03
  • @AndyS For changing the original data, sorry that's not possible for my case. As also mentioned in the original question, the data is CWL-Prov, and this strange thing was in the standard (which they mentioned is from "arcp" by IETF -- see the text [here](https://github.com/common-workflow-language/cwlprov/blob/main/prov.md#cwlprov-namespaces)). – renyuneyun Nov 05 '20 at 11:11
  • Thanks - answer written, and hopefully anyone else running into this issue will be helped. – AndyS Nov 05 '20 at 17:35

1 Answer

SPARQLWrapper defaults to adding

&format=turtle&output=turtle&results=turtle

to the request.

SPARQLWrapper has a method, setOnlyConneg, that turns off these additional query string parameters.

  1. The WARN SPARQL Query: Unrecognize request parameter (ignored): results happens because Fuseki does not understand results and logs a warning about it. It is just a warning; the parameter is ignored.

  2. format is a mechanism to override the proper HTTP content negotiation mechanism, because in some situations it is hard to set the HTTP headers. This does not apply to SPARQLWrapper, which does correctly set Accept:.

  3. format=turtle isn't in the list of names for a CONSTRUCT query. ttl is. (turtle can be added to a future version of Fuseki for completeness.)
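
To illustrate points 2 and 3, here is a minimal sketch (standard library only, and not SPARQLWrapper's actual implementation) of a request that relies purely on content negotiation: the format is requested through the Accept header, and no format/output/results parameters appear in the URL. The function name build_construct_request and the example query are hypothetical; the endpoint URL is the one from the question.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_construct_request(endpoint, query, accept="text/turtle"):
    """Build a GET request that asks for Turtle via the Accept header only.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    # Only the standard "query" parameter goes into the query string.
    url = endpoint + "?" + urlencode({"query": query})
    # The desired serialization is negotiated through the Accept header.
    return Request(url, headers={"Accept": accept})


request = build_construct_request(
    "http://localhost:3030/prov",
    "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 10",
)
print(request.full_url)
print(request.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would send it to a running endpoint; that step is left out so the sketch stays runnable without Fuseki.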

The best approach is to call setOnlyConneg so that the non-standard query string parameters are not sent at all. SPARQLWrapper correctly sets the "Accept:" header in the request, and Fuseki supports content negotiation and will work with that header.

AndyS