
I have an XML file and want to extract its text as HTML, but the result is empty. I am trying to get the text from the ab elements, and it works fine when I delete the beginning of the XML and start the file at an inner tag. Here is the beginning of the XML file:

<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:vg="http://www.vangoghletters.org/ns/">
    <teiHeader xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <fileDesc>
            <titleStmt>
                <title>book title</title>
            </titleStmt>
            <publicationStmt>
                <publisher>
                    <name> name of the publisher </name>
                </publisher>
                <date type="first" when="2021">2021</date>
                <availability status="restricted">
                    <licence target="http://creativecommons.org/licenses/by-nc-sa/4.0/ https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode">
                        <p>Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) </p>
                    </licence>
                </availability>
                <ptr target="http://vangoghletters.org/orig/let001"/>
            </publicationStmt>
            <sourceDesc>
                <vg:letDesc>
                    <vg:letIdentifier>
                        <idno type="jlb">001</idno>
                        <idno type="collectedletters">1</idno>
                        <idno type="brieven1990">001</idno>
                    </vg:letIdentifier>

                    <vg:letContents>
                        <p>book name, chapter</p>
                    </vg:letContents>
                    <note type="sourceStatus" xml:id="sourceStatus">
                        <p> handwriting </p>
                    </note>
                    <note type="additionalDetail" xml:id="additionalDetail">
                        <p> some text</p>
                    </note>
                </vg:letDesc>
            </sourceDesc>
        </fileDesc>
    </teiHeader>
    
    <text xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <body>
            <div type="original" xml:lang="ka">
            
                <pb f="1r" n="1" xml:id="pb-orig-1r-1" facs="#zone-pb-1r-1"/>
                <lb n="2" xml:id="l-1"/>
                <ab>There <rs type="pers" key="320"><supplied reason="lost">ეს</supplied>[7125.1]არისთა</rs>,
                    <rs type="pers" key="1643">მეფისა </rs>
                    
                    <rs type="pers" key="838">ასუ<supplied reason="lost">რასტა</supplied>ნისათა</rs>,
                    ...

Here is my XQuery code:

declare function app:text_orig($node as node(), $model as map(*))
{
    for $resource in collection('/db/apps/oshki/data')
        for $i in $resource//div[@type="original"]/ab//text()
            return
            <p>  {$i} </p>
};

Any idea why this happens?

David Denenberg
nina

2 Answers


Your root element <TEI> is in the namespace with the URI "http://www.tei-c.org/ns/1.0", so your div is in that namespace as well. See, e.g., this answer on how to use eXist-db with namespaces.
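
To see the effect in isolation, here is a minimal sketch that can be pasted into an XQuery editor such as eXide. The inline document is a made-up stand-in for one stored file, and the tei prefix is an arbitrary choice; only the namespace URI has to match the one on the root element.

```xquery
xquery version "3.1";

declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: Made-up stand-in for one stored document; only the namespace matters here. :)
let $doc :=
    <TEI xmlns="http://www.tei-c.org/ns/1.0">
        <text>
            <body>
                <div type="original"><ab>sample text</ab></div>
            </body>
        </text>
    </TEI>
return (
    (: unprefixed name: looks for div in no namespace, finds nothing :)
    count($doc//div[@type = "original"]),
    (: prefixed name: looks for div in the TEI namespace, finds the element :)
    count($doc//tei:div[@type = "original"])
)
```

The query should return the sequence (0, 1): the unprefixed step matches nothing, while the prefixed step matches the single div.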

Siebe Jongebloed

Elements in the TEI vocabulary all come in an XML namespace, as indicated by the xmlns attribute - a reserved attribute used for declaring XML namespace bindings:

<TEI xmlns="http://www.tei-c.org/ns/1.0">

An XML-aware application such as eXist-db has various facilities for querying namespaced XML. Most commonly in XQuery, you will add a "namespace declaration" to your query's prolog, which binds the namespace URI to a namespace prefix:

declare namespace tei="http://www.tei-c.org/ns/1.0";

Then you can use the tei namespace prefix in your query:

//tei:div[@type="original"]/tei:ab

When you removed the <TEI> root element, you also removed the namespace declaration that applied to the inner elements. They then appeared to eXist as being in no namespace (the "empty" namespace), which is what unprefixed element names in an XQuery path match by default. This is why your query worked without specifying namespaces in that case.
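
Applied to the query from the question, the change might look like the sketch below. It is written as a standalone main module with a hypothetical local:text-orig function, so the templating parameters ($node, $model) of the original app:text_orig are left out. In the real app module you would add the same namespace declaration to the module's prolog and prefix the element names inside the existing function.

```xquery
xquery version "3.1";

(: Bind the TEI namespace URI to a prefix; the prefix name itself is arbitrary. :)
declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: Hypothetical standalone variant of app:text_orig from the question. :)
declare function local:text-orig() as element(p)* {
    for $resource in collection('/db/apps/oshki/data')
    (: div and ab now carry the tei: prefix; attributes such as @type
       are in no namespace and stay unprefixed :)
    for $i in $resource//tei:div[@type = "original"]/tei:ab//text()
    return
        <p>{ $i }</p>
};

local:text-orig()
```

As in the original, this emits one p element per text node under each ab, including whitespace-only text nodes between child elements.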

Joe Wicentowski
  • That was very useful, it worked. Thank you! – nina Jul 01 '21 at 08:28
  • Great to hear. Feel free to join the [eXist-db Community Slack](https://github.com/eXist-db/exist#:~:text=Slack%3A) and/or the [E-Editiones Slack](https://teipublisher.com/index.html#:~:text=discuss%20with%20us%20on%20the%20%23community%20room%20on%20the%20e-editiones%20slack%20or%20write%20to%20the%20mailing%20list.) for the TEI Publisher community. – Joe Wicentowski Jul 02 '21 at 21:05