
I have scoured the forums and am still not clear on this. I'm very new to the topic.

I have HTML output that is being sent to a browser email client (Outlook). Outlook overwrites the characteristics of the <p> tag and introduces large spacing.

I would like to set up a template-match to replace all <p> tags with <div> or <span>.

For complicated reasons which will not be addressed in this post, I cannot stop the HTML from being rendered with <p> tags in it.

So let's say that I have:

<p xmlns="http://www.w3.org/1999/xhtml">
   <span>Some text</span>
</p>

I would want the output to be

<span>Some text</span>

with the <p> tags removed.

If I have

<p xmlns="http://www.w3.org/1999/xhtml">
  <b>Some other text</b>
</p>

then I would be happy with either:

<b>Some other text</b>

or

<span>
   <b>Some other text</b>
</span>

Just as long as it gets rid of the <p> tags.

It would also need to recognize <p> without any attributes.

I thought about something like

<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
</xsl:template>
<xsl:template match="p">
    <span>
        <xsl:apply-templates select="@*|node()" />
    </span>
</xsl:template>

but this does not work. The <p> tags still appear.

It is possible to write an adapter that intercepts the HTML before it is sent to the SMTP server and manipulates it, but that approach has considerable difficulties which I am looking to avoid.

Is it even possible to do what I am attempting? Any help greatly appreciated.

solardb
  • Did you perhaps tag this question with `xsl-fo` by mistake? XSL-FO is different from XSLT. – Mathias Müller Jan 20 '16 at 09:14
  • The system which generates the input XML is xsl-fo based ... but you are right, that is irrelevant to this question. Removing the tag. Cheers – solardb Jan 21 '16 at 04:28

1 Answer

Your input documents, this sample for instance:

<p xmlns="http://www.w3.org/1999/xhtml">
   <span>Some text</span>
</p>

have a default namespace. That is a good thing, because a valid XHTML document must be in the XHTML namespace.

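This means that the namespace applies to all elements in the document by default, and you have to account for this in your XSLT stylesheet. In XSLT 1.0, an unprefixed name in a match pattern, such as `p`, only matches elements in no namespace, which is why your `match="p"` template never fires. Redeclare the namespace in the stylesheet, bound to a prefix: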

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml">

and whenever you refer to an element from the input document, prefix its name with `xhtml:`, for example:

<xsl:template match="xhtml:p">
    <span>
        <xsl:apply-templates select="@*|node()" />
    </span>
</xsl:template>
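
Putting the pieces together, a complete stylesheet could look like the sketch below. It assumes XSLT 1.0 and simply combines your identity template with the namespace-aware template above. The `exclude-result-prefixes` attribute and the extra `| p` alternative (to also catch `p` elements that are in no namespace) are additions for illustration, not something your input is known to require.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xhtml">

  <!-- Identity template: copies every node that no other template matches -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- p in the XHTML namespace (xhtml:p) or in no namespace (p):
       emit a span instead and keep processing attributes and children -->
  <xsl:template match="xhtml:p | p">
    <span>
      <xsl:apply-templates select="@*|node()"/>
    </span>
  </xsl:template>

</xsl:stylesheet>
```

Elements copied from the input keep their XHTML namespace, while the literal `span` written in this stylesheet is in no namespace.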

This should get you started. You have not told us yet if the output document should also be in a namespace or not.
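
If the replacement `span` should be in the XHTML namespace as well, one option is to create it with `xsl:element` and an explicit `namespace` attribute. This is only a sketch under the assumption that you want the output to stay in the same namespace as the input:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xhtml">

  <!-- Identity template: copies every node that no other template matches -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace xhtml:p with a span that is itself in the XHTML namespace -->
  <xsl:template match="xhtml:p">
    <xsl:element name="span" namespace="http://www.w3.org/1999/xhtml">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```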

With only the two templates you already have (adjusted for the namespace), a structure like

<p>
  <span/>
</p>

will end up as

<span>
  <span/>
</span>

Is this acceptable for you? If not, there must be an additional rule (template) about p elements that contain span elements:

<xsl:template match="xhtml:p[xhtml:span]">

or, to also cover span elements nested deeper inside the p,

<xsl:template match="xhtml:p[.//xhtml:span]">
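
As a sketch of what such an additional rule might do: the stylesheet below drops the `p` wrapper entirely when the `p` has a `span` child, and falls back to wrapping in a `span` otherwise. It assumes XSLT 1.0. The more specific pattern with a predicate has a higher default priority than plain `xhtml:p`, so it wins for those elements without an explicit `priority` attribute.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xhtml">

  <!-- Identity template: copies every node that no other template matches -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- p with a span child: output only the children, without any wrapper.
       Attributes of the p are dropped because there is no element to hold them. -->
  <xsl:template match="xhtml:p[xhtml:span]">
    <xsl:apply-templates select="node()"/>
  </xsl:template>

  <!-- Any other p: replace it with a span -->
  <xsl:template match="xhtml:p">
    <span>
      <xsl:apply-templates select="@*|node()"/>
    </span>
  </xsl:template>

</xsl:stylesheet>
```

If you do not care about a wrapper at all, the first of these two `p` templates alone, with `match="xhtml:p"`, would remove every `p` and keep its content.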
Mathias Müller
  • Thanks. The input HTML I am working with is auto-generated by a different system and cannot be changed. Unfortunately, this system forcibly adds paragraph tags which we do not want. I am very interested when you say that the p tags will change to span tags. That's exactly what I thought my template-match would do, but it isn't doing it. The p tags are still there. There must be more going on that I'm not seeing. I will continue to investigate. Thanks – solardb Jan 21 '16 at 04:24
  • To answer your question, yes I would be fine with nested span tags. At this point I no longer care about doing things properly, as long as the p tags are gone - they are causing a headache in certain email readers because of how Outlook overwrites the CSS behavior for them. – solardb Jan 21 '16 at 04:34
  • @solardb So, did you try what I have suggested? Declaring the XHTML namespace in the stylesheet? I am not suggesting that you change the input document. – Mathias Müller Jan 21 '16 at 08:10
  • I have xsl:stylesheet xmlns:xsl="http:/ /www.w3.org/1999/XSL/Transform", and the html tag has html xmlns="http:/ /www.w3.org/1999/xhtml" (both without the whitespace between the forward slashes). I will try changing that. – solardb Jan 22 '16 at 06:09
  • @solardb A stylesheet with `http:/ /www.w3.org/1999/XSL/Transform` (with a whitespace in it) will cause an error - why do you have whitespace in there? Same goes for the XHTML namespace. Those namespace URIs have meaning and cannot be changed. [Click here](http://xsltransform.net/jyRYYiK) for a complete example of the transformation. – Mathias Müller Jan 22 '16 at 09:48
  • As I said in my comment: without the whitespace. The whitespace was added so I could post the URL properly from my phone. The namespace isn't the issue. The issue is removing the p tags. – solardb Jan 23 '16 at 12:04
  • @solardb You did not tell me that the whitespace was added by your phone. I'd like to ask you one last time: did you read and try what I suggest above, and look at the complete example I have provided in the comments? If it still does not work I suggest you give _more_ information: a complete sample of the input and your complete XSLT stylesheet. – Mathias Müller Jan 23 '16 at 12:39