I have a problem with writing an XQuery. I have a collection of letters encoded in TEI, in which every person, organisation and place mentioned in the text has to be wrapped in <persName>, <orgName> or <placeName>. I also have an XML key list with entries like

<place xml:id="O_02">
    <placeName>
        <settlement>Kairo</settlement>
        <settlement>Cairo</settlement>
    </placeName>
    <link target="https://de.wikipedia.org/wiki/Kairo"/>
</place>

Most of these names are already annotated in the text, but I now have to write an XQuery that finds the passages in the letters that have an entry in the key list but are not yet tagged with one of the elements above. I don't have any clue how to approach this. Thanks in advance for your help!

Leo Wörteler
grillz

1 Answer

I think that analyze-string can help, here is a prototype using XSLT 3:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:fn="http://www.w3.org/2005/xpath-functions"
    expand-text="yes"
    exclude-result-prefixes="#all"
    version="3.0">

  <xsl:param name="key-list">
<place xml:id="O_02">
    <placeName>
        <settlement>Kairo</settlement>
        <settlement>Cairo</settlement>
    </placeName>
    <link target="https://de.wikipedia.org/wiki/Kairo"/>
</place>
  </xsl:param>

  <xsl:key name="ref" match="place/placeName/*" use="."/>

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="*[not(self::persName | self::placeName | self::orgName)]/text()">
      <xsl:apply-templates select="analyze-string(., '\p{L}+')" mode="wrap"/>
  </xsl:template>

  <xsl:template mode="wrap" match="fn:match[key('ref', ., $key-list)]">
      <xsl:element name="{key('ref', ., $key-list)/../node-name()}">{.}</xsl:element>
  </xsl:template>

</xsl:stylesheet>

https://xsltfiddle.liberty-development.net/bwe3bL

I don't have my XQuery hat on at the moment, but analyze-string is supported in XQuery as well, and XSLT's apply-templates can be emulated with recursive functions that dispatch on the node kind with typeswitch:

declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

declare option output:method 'xml';

declare variable $key-list as document-node() := document {
<place xml:id="O_02">
    <placeName>
        <settlement>Kairo</settlement>
        <settlement>Cairo</settlement>
    </placeName>
    <link target="https://de.wikipedia.org/wiki/Kairo"/>
</place>    
};

declare function local:key($value, $key-list as document-node()) as element()? {
  $key-list/place/placeName[settlement = $value]  
};

declare function local:apply-templates($nodes as node()*) as node()* {
  for $node in $nodes
  return typeswitch ($node)
        case element() return element { node-name($node) } { local:apply-templates($node!(@*, node())) }
        case text() return 
            if (not($node[parent::placeName | parent::persName | parent::orgName]))
            then local:wrap(analyze-string($node, '\p{L}+'))
            else $node
        default return $node
};

declare function local:wrap($nodes as node()*) as node()* {
   for $node in $nodes
   return typeswitch($node)
        case element(fn:match) return
            let $ref := local:key($node, $key-list)
            return
                if ($ref)
                then element { node-name($ref) } { data($node) } (: keep the matched word as content, as the XSLT version does :)
                else text { data($node) }
        case text() return $node
        default return local:wrap($node/node())
};

local:apply-templates(node())

https://xqueryfiddle.liberty-development.net/3Nzd8bN
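
Since the question asks to find the untagged names rather than to wrap them, the same analyze-string idea can also be turned into a plain report. The following is only a minimal sketch under some assumptions: $letter is a made-up sample standing in for one of the letters, the hit element with its key and parent attributes is invented for illustration, and the elements are assumed to be in no namespace, as in the prototypes above. Real TEI documents are in the namespace http://www.tei-c.org/ns/1.0, so the element names would need a prefix or a default element namespace declaration.

```xquery
declare variable $key-list as document-node() := document {
  <place xml:id="O_02">
    <placeName>
      <settlement>Kairo</settlement>
      <settlement>Cairo</settlement>
    </placeName>
    <link target="https://de.wikipedia.org/wiki/Kairo"/>
  </place>
};

(: hypothetical sample letter, not taken from the question :)
declare variable $letter :=
  <p>Von <placeName>Kairo</placeName> reisten wir weiter; Cairo war heiss.</p>;

(: look only at text that is not already inside one of the name elements :)
for $text in $letter//text()[not(ancestor::persName | ancestor::placeName | ancestor::orgName)]
(: split the text into words and test each word against the key list :)
for $match in analyze-string($text, '\p{L}+')/fn:match
let $entry := $key-list/place[placeName/settlement = $match]
where exists($entry)
return <hit key="{$entry/@xml:id}" parent="{name($text/..)}">{string($match)}</hit>
```

With this sample, the tagged Kairo should be skipped and only the untagged Cairo reported, together with the xml:id of its key list entry. As in the prototypes above, the pattern \p{L}+ only matches single words, so names consisting of several words would need a different matching strategy.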

Martin Honnen