
I want to transform an XML file using XSLT. During the transformation, new attributes are added to the output file, and I can't work out where they come from.

Input XML file (abbr.):

<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
    <teiHeader>
        <fileDesc>
            <titleStmt>
                <title/>
            </titleStmt>
            <publicationStmt>
                <publisher/>
            </publicationStmt>
            <sourceDesc>
                <p/>
            </sourceDesc>
        </fileDesc>
    </teiHeader>
    <text>
        <body>
            <pb n="1"/>
            <p xml:id="uuid_f770d0a9-277a-42d5-a759-a02b05a29a49">
                <lb xml:id="uuid_cf21b2a4-e5b2-4ced-a1b2-a4e5b29ced0a"/>This is 
                <lb xml:id="uuid_dffd4def-3f7a-4d5b-bd4d-ef3f7abd5bf9"/> <rs type="person">an example</rs>.
            </p>
        </body>
    </text>
</TEI>

XSLT stylesheet, run with Saxon PE 9.9.1.7, that removes the unwanted whitespace between <lb/> and <rs> (second line of the <p> in the input file above):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:tei="http://www.tei-c.org/ns/1.0"
  exclude-result-prefixes="xs"
  version="2.0">
  
  <xsl:output method="xml" version="1.0" indent="no"/>
    
    <xsl:template match="@* | node()" name="identity-copy">
      <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
      </xsl:copy> 
    </xsl:template>
    
    <!-- lb followed by rs -->
    <xsl:template match="text()[preceding-sibling::*[1][self::*:lb[@*]]][following-sibling::*[1][self::*:rs[@*]]][string-length(normalize-space()) = 0]"/>
  
</xsl:stylesheet>

The output XML file contains new attributes on elements that my stylesheet doesn't even address. For example, @status='draft' is added to <revisionDesc>/<change> (not in the input example) and @part='N' is added to <p>. I could list more examples, but I think it's a general problem. How can I avoid this?

Thanks in advance!

l.surname
  • Does the real XML input reference a DTD with DOCTYPE? That way the XML parser might add attributes declared in there that provide default values. – Martin Honnen Jan 31 '22 at 13:08
  • There are an RNG schema and a Schematron schema embedded in the real XML input. However, these newly inserted default values are not defined there. – l.surname Jan 31 '22 at 13:17
  • @MartinHonnen I just tried the transformation without the schemas and you are right, it looks like it has something to do with them. Unfortunately, it is not an option for me to first remove the schemas, then transform the files and afterwards add the schemas again. Do you have any idea how I can adjust the schemas accordingly? – l.surname Jan 31 '22 at 13:22
  • How do you run the transformation with Saxon, from the command line? Can you show the command line arguments? Inside oXygen? With a particular oXygen predefined TEI transformation scenario where a DTD or schema with default attributes is picked up? – Martin Honnen Jan 31 '22 at 13:23
  • Inside oXygen. I just created a new transformation scenario with my stylesheet, named it, defined the paths and chose Saxon PE. I didn't change anything else and also didn't tick the box to use stylesheets from the xml-stylesheet declaration, because there are none. – l.surname Jan 31 '22 at 13:40
  • It might help us understand the cause of the problem if you show us a minimal but complete XML (i.e. with the RNG/Schematron) where that happens. Also, if the XSLT doesn't matter, reduce it to the identity transformation. – Martin Honnen Jan 31 '22 at 13:42
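
To illustrate the DTD mechanism raised in the first comment, here is a minimal sketch. The document and its internal DTD subset are hypothetical and much simpler than the real TEI declarations; they only show how a declared default can surface as an attribute that was never written in the source.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical minimal document; this is NOT the real TEI DTD. -->
<!DOCTYPE body [
  <!ELEMENT body (p+)>
  <!ELEMENT p (#PCDATA)>
  <!-- Declares a default value for @part on every <p> -->
  <!ATTLIST p part (Y | N) "N">
]>
<body>
  <!-- No part attribute is written here, yet a parser that applies
       DTD defaults reports this element as <p part="N">. -->
  <p>This is an example.</p>
</body>
```

With such a declaration in place, an identity transformation would typically copy the defaulted attribute to the output, because the XSLT processor receives it from the parser like any other attribute.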

2 Answers

2

If you're using Saxon, there is an option to suppress the expansion of default attribute values defined in a schema or DTD, although it doesn't work with every XML parser because some don't offer this option. In the Oxygen "configure transformation scenario" dialog, it's shown as the checkbox 'Expand attribute defaults ("-expand")'; on the Saxon command line the equivalent is -expand:off.
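
Outside Oxygen, the same switch can presumably also be set in a Saxon configuration file. The following is a sketch only: it assumes the expandAttributeDefaults attribute on the global element and the edition value shown, both of which should be checked against the configuration file documentation for the Saxon version in use.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a Saxon configuration file. The attribute name
     expandAttributeDefaults is an assumption to verify against the
     Saxon documentation for your version. -->
<configuration xmlns="http://saxon.sf.net/ns/configuration" edition="PE">
  <!-- Ask Saxon not to expand DTD- or schema-defined attribute defaults -->
  <global expandAttributeDefaults="false"/>
</configuration>
```

Such a file would be passed to Saxon with the -config option. As noted above, the setting only has an effect if the underlying XML parser supports it.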

Michael Kay
1

In the Oxygen "Relax NG" preferences page there is an "Add default attributes" checkbox which you can uncheck: https://www.oxygenxml.com/doc/versions/23.1/ug-editor/topics/relax-ng-preferences-page.html
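
For context, a RELAX NG schema can carry attribute defaults through DTD compatibility annotations, which is the kind of information this checkbox acts on. The fragment below is a hypothetical, heavily reduced schema, not an excerpt from the TEI schema; it assumes the defaults are declared with a:defaultValue.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">
  <start>
    <element name="p">
      <optional>
        <!-- a:defaultValue declares "N" as the default for @part;
             a tool that adds default attributes inserts part="N"
             wherever the attribute is absent. -->
        <attribute name="part" a:defaultValue="N">
          <choice>
            <value>Y</value>
            <value>N</value>
          </choice>
        </attribute>
      </optional>
      <text/>
    </element>
  </start>
</grammar>
```

Searching the associated schema for a:defaultValue is one way to check whether values such as part="N" or status="draft" originate there.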

Radu Coravu