I'm trying to secure a piece of code against XXE attacks. The code uses the FOP library, and mimeFormat is application/pdf.

The original code works well:

    protected static void transformTo(Result result, Source src, String mimeFormat, String sFileNameXsl)
            throws FOPException {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();

            File myXslFile = new File(sFileNameXsl);
            StreamSource xsltSource = new StreamSource(myXslFile);
            Transformer transformer = factory.newTransformer(xsltSource);
            transformer.setParameter("fop-output-format", mimeFormat);
            transformer.transform(src, result);
        } catch (Exception e) {
            throw new FOPException(e);
        }
    }

The application uses the Apache implementation from xalan-2.7.2 org.apache.xalan.processor.TransformerFactoryImpl.

When I tried to disable external DTDs and stylesheets, I was forced to switch implementations because of the error "not supported property accessExternalDTD".

So I changed the code to use the JDK 8 implementation, com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl:

    TransformerFactory factory = TransformerFactory.newInstance(
            "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl",
            ClassLoader.getSystemClassLoader());
    factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
    factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");

At this point, the properties were supported, but a new message appeared:

FATAL ERROR: Cannot convert data type 'int' in 'node-set'.

The message is unexpected since I didn't change any structure or processing, just tried to secure the transformer with some properties.

The xslt is a modified version of this one https://www.antennahouse.com/hubfs/uploads/XSL%20Sample/xhtml2fo.xsl?hsLang=en

The data type error is due to a misuse of the | operator, such as:

<xsl:variable name="numcolumns" select="count(./html:tr/*)|count(./html:TR/*)"/>

<xsl:if test="ancestor::html:table[1]/@rules = 'cols'|ancestor::html:TABLE[1]/@rules = 'cols'">

I managed to make things work by giving up the uppercase matchers:

<xsl:variable name="numcolumns" select="count(./html:tr/*)"/>

<xsl:if test="ancestor::html:table[1]/@rules = 'cols'">

So ultimately the question is: how do I match both uppercase and lowercase element names in these expressions?

Sybuser
  • I don't understand. First you tell us that you switched from the Apache processor to the JDK8 processor, then you tell us that you didn't change anything. I think you need to forget the history and focus on debugging the new error message from first principles. You haven't given us any information that we can use to help you do this. – Michael Kay Aug 02 '23 at 15:29
  • But there is an alternative: stick with Apache, and instead of using the ACCESS_EXTERNAL_DTD property, write your own EntityResolver to prevent external access. This will be perfectly safe, the only problem might be that the security inspection tools aren't smart enough to recognise it as safe. – Michael Kay Aug 02 '23 at 15:35
  • What I meant to say is that I didn't change the structure of the input and the processing of the xslt. – Sybuser Aug 02 '23 at 16:44
  • Try to get a line number where the error occurs. Also, does the XML input use a DTD so that the input changes if you set that property XMLConstants.ACCESS_EXTERNAL_DTD? – Martin Honnen Aug 02 '23 at 21:50
  • I've put an exception breakpoint on TypeCheckError and I found expression `union(funcall(count, [ParentLocationPath(step("child", 40), step("child", 1))]), funcall(count, [ParentLocationPath(step("child", 81), step("child", 1))]))` which makes me think it is this one ``... OR operator on count doesn't seem to make sense, right? – Sybuser Aug 03 '23 at 08:44
  • also this seems invalid – Sybuser Aug 03 '23 at 08:59
  • I agree that `count(./html:tr/*)|count(./html:TR/*)` looks wrong, a union `|` works for node-sets, not for numbers. However, I don't find that code in the stylesheet you linked to. – Martin Honnen Aug 03 '23 at 10:31
  • Indeed I shared a public version, I didn't realize they modified it at our end – Sybuser Aug 03 '23 at 10:47
  • correct syntax seems to be this one `` and `` – Sybuser Aug 03 '23 at 15:53
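
Michael Kay's suggestion in the comments above (keep the Apache implementation and block external access with a custom EntityResolver) could look roughly like the sketch below. The class and method names are hypothetical, the XML is passed as a string for brevity, and the identity transform only stands in for the real stylesheet; none of this is taken from the original application.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper, not part of FOP or Xalan.
public class BlockingResolverSketch {

    // Answers every external DTD or entity request with empty content,
    // so the parser never opens the referenced file or URL.
    static final EntityResolver BLOCK_ALL =
            (publicId, systemId) -> new InputSource(new StringReader(""));

    // Wraps XML text in a SAXSource whose reader uses the blocking resolver.
    static SAXSource secureSource(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setEntityResolver(BLOCK_ALL);
        return new SAXSource(reader, new InputSource(new StringReader(xml)));
    }

    // Identity transform (assumption: stands in for the real XSLT).
    static String copy(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        t.transform(secureSource(xml), new StreamResult(out));
        return out.toString();
    }
}
```

Resolving to empty content keeps documents that carry a DOCTYPE parseable; throwing a SAXException from the resolver would reject them outright instead. Note that xsl:import, xsl:include and document() are resolved through a URIResolver rather than the EntityResolver, so the stylesheet side would presumably need a similar guard via TransformerFactory.setURIResolver and Transformer.setURIResolver.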

1 Answer

The issue is that the Apache parser tolerates this misuse of the union operator, whereas the JDK parser rejects it: `|` is only defined for node-sets, not for numbers or booleans.

The following union expressions:

<xsl:variable name="numcolumns" select="count(./html:tr/*)|count(./html:TR/*)"/>
<xsl:if test="ancestor::html:table[1]/@rules = 'cols'|ancestor::html:TABLE[1]/@rules = 'cols'">

had to be rewritten to:

<xsl:variable name="numcolumns" select="count(./html:tr/*|./html:TR/*)"/>
<xsl:if test="(ancestor::html:table|ancestor::html:TABLE)[1]/@rules = 'cols'">
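
One caveat worth checking with nested tables: in XPath 1.0 a predicate on a parenthesized expression counts in document order, so `(ancestor::html:table|ancestor::html:TABLE)[1]` selects the outermost matching table, while the original `ancestor::html:table[1]` selected the nearest one. A form such as `ancestor::*[self::html:table or self::html:TABLE][1]` keeps the nearest-ancestor meaning. The sketch below illustrates the difference with the JDK's javax.xml.xpath API; the sample document and class name are made up, and the html: namespace prefix is dropped to keep the example short.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; element names carry no namespace for brevity.
public class UnionPredicateSketch {

    // Outer upper-case TABLE wrapping an inner lower-case table.
    static final String XML =
        "<TABLE rules='cols'><TR><TD>"
      + "<table rules='none'>"
      + "<tr><td id='cell'/><td/></tr>"
      + "<TR><TD/></TR>"
      + "</table>"
      + "</TD></TR></TABLE>";

    static final XPath XPATH = XPathFactory.newInstance().newXPath();

    static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
    }

    // Evaluates expr as a string, with the td marked id='cell' as context node.
    static String atCell(String expr) throws Exception {
        Node cell = (Node) XPATH.evaluate("//td[@id='cell']", parse(), XPathConstants.NODE);
        return XPATH.evaluate(expr, cell);
    }

    // Counts cells of both lower-case and upper-case rows of the inner table.
    static double countCells() throws Exception {
        Node inner = (Node) XPATH.evaluate("//table", parse(), XPathConstants.NODE);
        return (Double) XPATH.evaluate("count(./tr/* | ./TR/*)", inner, XPathConstants.NUMBER);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countCells());
        // Union in parentheses: [1] is the first table in document order (outermost).
        System.out.println(atCell("(ancestor::table|ancestor::TABLE)[1]/@rules"));
        // Predicate on the ancestor step: [1] is the nearest matching ancestor.
        System.out.println(atCell("ancestor::*[self::table or self::TABLE][1]/@rules"));
    }
}
```

If tables are never nested in the input, both forms give the same result and the rewrite above is sufficient.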
Sybuser