I have an xhtml file that I'm attempting to transform such that:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN"    "http://www.w3.org/TR/MathML2/dtd/xhtml-math11-f.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>yada</title>
    <meta.....>
</head>
<body>
    <p>Something</p>
    <p>awesome</p>
</body>
</html>

becomes:

<title>yada</title>
<meta.....>

<p>Something</p>
<p>awesome</p>

The key point is that the <html>, <head> and <body> wrapper tags are removed from the document while their contents are kept. I don't want to run this through sed or awk to remove them.

Everything I've tried either outputs the whole thing as HTML or converts it all into plain text.

Background on the problem: I have a backup of my blog written in MultiMarkdown. I'm hoping to convert the posts into a different format, but I need to get past this issue first.

Note: I started off with the identity template.

Mandaris
  • You need to be more specific. What should be included in the output? What should be excluded? I can attempt an answer, but it's hard to tell if it's right without more information. – Wayne Mar 16 '11 at 20:50

3 Answers

Something like this? (Bear with me, it's been ages since I've actively done XSL.)

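<!-- Assumes the context node is the html element. For namespaced XHTML input,
     head and body need a prefix bound to http://www.w3.org/1999/xhtml. -->
<xsl:for-each select="head">
  <!-- node() copies the children; "." would copy the head element itself -->
  <xsl:copy-of select="node()"/>
</xsl:for-each>

<xsl:for-each select="body">
  <xsl:copy-of select="node()"/>
</xsl:for-each>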
scunliffe

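Sounds like you want the identity transform for everything inside head and body, with the html, head and body wrappers themselves dropped. The x prefix below is bound to the XHTML namespace because the input declares it as its default namespace:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:x="http://www.w3.org/1999/xhtml">
    <!-- Drop the html, head and body wrappers but keep processing their children -->
    <xsl:template match="/x:html|/x:html/x:head|/x:html/x:body">
        <xsl:apply-templates/>
    </xsl:template>
    <!-- Identity template: copy everything else unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>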
Wayne

Are you sure this isn't the usual namespace problem? Does the input really look like you showed us, or did you leave out the namespaces because you didn't realise they made all the difference?

Michael Kay
  • I'm just starting off with xslt so I'm not familiar with what the "usual namespace problem" is. The tool that I'm using doesn't specify a namespace in the raw xhtml. – Mandaris Mar 16 '11 at 23:19
  • @mandaris: As an example of Dr. Kay's advice, see http://stackoverflow.com/questions/297239/why-doesnt-xpath-work-when-processing-an-xhtml-document-with-lxml-in-python –  Mar 17 '11 at 02:54
  • I see that the input has now been edited to include a namespace declaration. So yes, it's the usual namespace problem. When using XPath to select elements in a namespace, you need to use qualified names. – Michael Kay Mar 20 '11 at 21:38
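
To make the point about qualified names concrete, here is a minimal sketch of a stylesheet that selects the namespaced elements from the question's input. The prefix x is an arbitrary choice of mine, not something from the original posts; any prefix works as long as it is bound to the same namespace URI that the input declares on <html>.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:x="http://www.w3.org/1999/xhtml">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:template match="/">
    <!-- An unprefixed name such as "html/head" selects nothing here:
         in XPath 1.0 it means "no namespace", and the default xmlns
         declared on the input's html element is not taken into account. -->

    <!-- Qualified names: copy the children of head, then of body -->
    <xsl:copy-of select="x:html/x:head/node()"/>
    <xsl:copy-of select="x:html/x:body/node()"/>
  </xsl:template>

</xsl:stylesheet>
```

The copied elements stay in the XHTML namespace, so expect the serializer to write xmlns="http://www.w3.org/1999/xhtml" on each top-level element of the output. The whitespace text nodes between the children are copied as well.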