
I have triples like this, where the object is an xsd:anyURI-typed string representation of a CURIE. I would like to construct triples whose object is a true CURIE or IRI.

@prefix source: <https://example.org/source> .
@prefix external: <https://example.org/external> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

source:sample1 source:external_identifiers "external:0110680"^^xsd:anyURI .
  • IRI(?o) returns nothing.
  • IRI(str(?o)) returns <external:0110680>
    • but I want <https://example.org/external/0110680>
  • This question mentions tarql:expandPrefixedName, but when I try that (with the prefix or just as bare expandPrefixedName) I get the following error message in arq or GraphDB. I assume that's because the tarql functions aren't available in those tools?

MALFORMED QUERY: Lexical error at line 12, column 28. Encountered: '40' (40), after prefix "expandPrefixedName"

I would prefer to do this in SPARQL, but would also try a Python solution using something like rdflib.

Mark Miller
  • Answered by https://stackoverflow.com/questions/76964596/how-to-determine-or-filter-datatype-of-a-rdflib-literal – Mark Miller Aug 23 '23 at 22:38

2 Answers


To convert it to an IRI, you could use:

BIND( IRI(REPLACE(STR(?o), "^external:", STR(external:))) AS ?o_iri ) .
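  • REPLACE() replaces the string "external:" (i.e., the prefix label; ^ anchors the match to the beginning of the value) in STR(?o) with STR(external:)
  • STR(?o) converts ?o ("external:0110680"^^xsd:anyURI) to a string ("external:0110680")
  • STR(external:) takes the prefix IRI (<https://example.org/external>) and converts it to a string ("https://example.org/external")
  • IRI() converts the replaced string to an IRI
  • Note that with the prefix declared as in the question (no trailing slash), the result is <https://example.org/external0110680>; declare the prefix as <https://example.org/external/> to get <https://example.org/external/0110680>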
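For the Python route the question mentions, the same steps can be sketched with the standard library alone. This is only an illustration of the string manipulation, not an rdflib recipe: `expand_single_prefix` is a hypothetical helper name, and the namespace string with a trailing slash is an assumption (the question's @prefix declaration has none).

```python
import re

# Assumed namespace. The trailing slash is an assumption; the question's
# @prefix declaration does not include one.
EXTERNAL_NS = "https://example.org/external/"


def expand_single_prefix(value: str, label: str = "external",
                         namespace: str = EXTERNAL_NS) -> str:
    """Roughly mirror IRI(REPLACE(STR(?o), "^external:", STR(external:)))."""
    # Anchor the label at the start of the value, as "^external:" does.
    pattern = "^" + re.escape(label) + ":"
    # A callable replacement inserts the namespace verbatim, with no
    # backslash or group-reference processing.
    return re.sub(pattern, lambda _match: namespace, value, count=1)


print(expand_single_prefix("external:0110680"))
# prints https://example.org/external/0110680
```

A value that does not start with the label is returned unchanged, just as REPLACE() leaves a non-matching string alone.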

If you have a few different prefixes, you could use something like this:

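{
  # Repeat the pattern that binds ?o inside each branch (placeholder shown):
  # a UNION branch is evaluated on its own and does not see outer bindings.
  ?s ?p ?o .
  FILTER( STRSTARTS(STR(?o), "foo:") ) .
  BIND( IRI(REPLACE(STR(?o), "^foo:", STR(foo:))) AS ?o_iri ) .
}
UNION
{
  ?s ?p ?o .
  FILTER( STRSTARTS(STR(?o), "bar:") ) .
  BIND( IRI(REPLACE(STR(?o), "^bar:", STR(bar:))) AS ?o_iri ) .
}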

(Instead of a FILTER, you could use IF inside the BIND.)

Another option could be COALESCE with nested IFs.
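If a Python post-processing step is acceptable, the per-prefix branching can be replaced by a lookup table. The following is a minimal sketch under stated assumptions: the `PREFIXES` map, its namespaces, and the `expand_curie` helper are all hypothetical and use only the standard library.

```python
from typing import Dict, Optional

# Hypothetical prefix map; the labels and namespaces are illustrative only.
PREFIXES: Dict[str, str] = {
    "foo": "https://example.org/foo/",
    "bar": "https://example.org/bar/",
}


def expand_curie(curie: str,
                 prefixes: Dict[str, str] = PREFIXES) -> Optional[str]:
    """Return the expanded IRI string, or None if the label is unknown."""
    # Split on the first colon only, so colons in the local part survive.
    label, sep, local = curie.partition(":")
    if not sep or label not in prefixes:
        return None
    return prefixes[label] + local


print(expand_curie("foo:123"))
print(expand_curie("baz:123"))
```

Returning None for an unknown label plays the same role as ?o_iri staying unbound when no branch matches.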

  • Thanks! I like how you broke out the steps. Unfortunately I have multiple prefixes to handle. Apologies for oversimplifying. Should I use SPARQL `if`? I always find that stressful. – Mark Miller Aug 23 '23 at 22:08
  • I can easily include all of the prefix expansions in my data and/or SPARQL statement if there's a way to use that. – Mark Miller Aug 23 '23 at 22:09
  • @MarkMiller: I updated the answer with an example for multiple prefixes that doesn’t use nested `IF`s. I don’t think SPARQL allows a more elegant way than listing all prefix labels again in the `WHERE` clause somehow. If there were a way for SPARQL to print a prefix label as-is (instead of as string or as IRI), it should be possible to have one generic solution that works for all prefixes. But I don’t think SPARQL allows printing prefix labels. – Stefan - brox IT-Solutions Aug 24 '23 at 10:10

If your triple store supports backreferences, you can prepare the prefix mapping beforehand and replace them all without complicating the query itself:

BIND("b:part" AS ?curi)
BIND("^a: urn:a: ^b: urn:b:" AS ?prefixes)

BIND(CONCAT(?prefixes, "|", ?curi) AS ?src)
BIND(REPLACE(?src, "\\^([^:]*:) ([^ ]+).*\\|\\1", "|$2") AS ?replaced)
BIND(REPLACE(?replaced, "^.*\\|", "") AS ?result)

This prepares a string in the form (^prefix: URI)...|CURIE and looks for a prefix subsequently appearing in the CURIE, replacing it with the full URI. The characters ^ | are chosen as delimiters because they are disallowed in URIs. Lastly, cleanup is performed to remove the (rest of the) prefix mapping.

If no prefix is found, this will leave the CURIE as it is. In case you want to detect that, you can change "|$2" to "`$2" and "^.*\\|", "" to "^.*([|`])", "$1". The result will then start with ` if a prefix was expanded or with | if none was found, ready for more checks.
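To check the regex logic outside a triple store, the two REPLACE steps can be transcribed into Python's `re` module. This is a sketch, not the SPARQL itself: Python writes the group reference in the replacement as `\2` rather than `$2`, and `expand_with_backreference` is a hypothetical helper name.

```python
import re


def expand_with_backreference(curie: str, prefixes: str) -> str:
    """Expand a CURIE using a '^label: namespace ...' mapping string."""
    # Build the combined string: (^prefix: URI)...|CURIE
    src = f"{prefixes}|{curie}"
    # Find the mapping entry whose label (group 1) reappears right after
    # the | delimiter, and keep only its namespace (group 2).
    replaced = re.sub(r"\^([^:]*:) ([^ ]+).*\|\1", r"|\2", src)
    # Clean up: drop everything up to and including the | delimiter.
    return re.sub(r"^.*\|", "", replaced)


mapping = "^a: urn:a: ^b: urn:b:"
print(expand_with_backreference("b:part", mapping))  # prints urn:b:part
print(expand_with_backreference("c:part", mapping))  # prints c:part
```

The second call shows the fallback described above: with no matching label, the CURIE comes back unchanged.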

IS4